
10 Key/Value Store, Distributed, Open Source Databases

Posted on November 4, 2013

Riak

HTTP API
Master-less, so remains operational even if multiple nodes fail
Near linear scalability
Same architecture for both large and small clusters
Key/value model, flat namespace, can store anything
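
To make the "master-less" point concrete, here is a minimal sketch of a consistent hash ring, the general technique masterless stores use so that any node can work out which nodes own a key. It is an illustration under assumptions, not Riak's implementation: the class name HashRing, the vnodes and replicas parameters, and the node names are all hypothetical.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    """Map a string to a position on the ring (a large integer)."""
    return int(hashlib.sha1(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Hypothetical masterless ring: any node can compute a key's owners."""

    def __init__(self, nodes, vnodes=64):
        self._vnodes = vnodes
        self._ring = []  # sorted list of (position, node_name)
        for node in nodes:
            self.add(node)

    def add(self, node):
        # Each physical node claims several positions to spread load evenly.
        for i in range(self._vnodes):
            bisect.insort(self._ring, (_hash(f"{node}#{i}"), node))

    def remove(self, node):
        # Simulates a node failure: its positions disappear from the ring.
        self._ring = [(pos, n) for pos, n in self._ring if n != node]

    def owners(self, key, replicas=3):
        """Walk clockwise from the key's position, collecting distinct nodes."""
        if not self._ring:
            return []
        start = bisect.bisect(self._ring, (_hash(key), ""))
        found = []
        for offset in range(len(self._ring)):
            node = self._ring[(start + offset) % len(self._ring)][1]
            if node not in found:
                found.append(node)
                if len(found) == replicas:
                    break
        return found
```

With five nodes and three replicas per key, removing two of a key's owners still leaves the third owner at the front of the new owner list, which is why the cluster can keep serving that key without a master deciding anything.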

Redis

Key/value. Can store data types such as lists, sets, sorted sets, and hashes, and perform operations on them such as set intersection and incrementing a value in a hash.
In-memory dataset
Easy to set up, master/slave replication
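
The two operations mentioned above, set intersection and incrementing a value in a hash, can be sketched without a server. The snippet below is a plain-Python stand-in, not a Redis client: the function names mirror the Redis commands SADD, SINTER, and HINCRBY, but the store dictionary and the key names are assumptions made for illustration.

```python
# Plain-Python stand-in for a Redis keyspace; no Redis server or client is used.
store = {}


def sadd(key, *members):
    """Add members to the set at key; return how many were newly added."""
    target = store.setdefault(key, set())
    before = len(target)
    target.update(members)
    return len(target) - before


def sinter(*keys):
    """Return the intersection of the sets stored at the given keys."""
    sets = [store.get(key, set()) for key in keys]
    return set.intersection(*sets) if sets else set()


def hincrby(key, field, amount=1):
    """Increment an integer field inside the hash at key; return the new value."""
    fields = store.setdefault(key, {})
    fields[field] = fields.get(field, 0) + amount
    return fields[field]


# Hypothetical usage: two tag sets and a per-page counter.
sadd("tags:post:1", "nosql", "redis", "cache")
sadd("tags:post:2", "redis", "queue")
common = sinter("tags:post:1", "tags:post:2")
views = hincrby("stats:post:1", "views", 5)
```

The point of the sketch is that the server, not the application, performs the operation on the stored structure, so the client never has to read a whole set or hash, modify it, and write it back.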

Hibari

Very simple data model with 5 attributes: keys, values, timestamps, expiry date, flags for metadata
Chain replication across geographically dispersed nodes. No single point of failure
Excellent performance for large-batch (~200k) read/write operations
Runs on commodity hardware or blades. Does not require SAN
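
Chain replication is easier to picture with a small model. The sketch below follows the textbook scheme (writes enter at the head and are acknowledged by the tail, reads are served by the tail) and is not taken from Hibari's code; the class names, method names, and node names are hypothetical.

```python
class ChainNode:
    """One replica in the chain."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.next = None  # successor in the chain, None for the tail

    def write(self, key, value):
        # Apply locally, then forward down the chain.
        self.data[key] = value
        if self.next is not None:
            return self.next.write(key, value)
        return self.name  # the tail acknowledges the write


class Chain:
    """Hypothetical chain: writes go to the head, reads go to the tail."""

    def __init__(self, names):
        self.nodes = [ChainNode(name) for name in names]
        self._relink()

    def _relink(self):
        for node, successor in zip(self.nodes, self.nodes[1:]):
            node.next = successor
        if self.nodes:
            self.nodes[-1].next = None

    def put(self, key, value):
        return self.nodes[0].write(key, value)

    def get(self, key):
        # The tail only holds writes that every replica has already applied.
        return self.nodes[-1].data.get(key)

    def fail(self, name):
        # Drop the failed node and reconnect its neighbours.
        self.nodes = [node for node in self.nodes if node.name != name]
        self._relink()
```

Because every acknowledged write has passed through all replicas, losing any one node (including the tail) leaves the remaining chain with a complete copy, which is how the scheme avoids a single point of failure.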

Hypertable

High performance, massively scalable, modeled after Google’s Bigtable
Runs on top of a distributed file system such as Apache Hadoop HDFS, GlusterFS, or the Kosmos File System
Data model is a traditional but huge table, physically stored in sorted order by primary key
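
Storing rows in primary-key order is what makes range scans cheap. The following is a minimal in-memory sketch of that idea, assuming string row keys; SortedTable and its methods are hypothetical names and say nothing about how Hypertable lays data out on disk.

```python
import bisect


class SortedTable:
    """Hypothetical table whose rows are kept in primary-key order."""

    def __init__(self):
        self._keys = []  # row keys, always sorted
        self._rows = {}  # row key -> row contents

    def put(self, key, row):
        if key not in self._rows:
            bisect.insort(self._keys, key)
        self._rows[key] = row

    def scan(self, start, end):
        """Yield (key, row) pairs with start <= key < end, in key order."""
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, end)
        for key in self._keys[lo:hi]:
            yield key, self._rows[key]


# Hypothetical usage: keys sharing a prefix end up next to each other.
table = SortedTable()
table.put("com.example/b", {"title": "B"})
table.put("org.test/x", {"title": "X"})
table.put("com.example/a", {"title": "A"})
example_rows = list(table.scan("com.", "com/"))
```

A scan locates its start and end with two binary searches and then reads a contiguous slice, so related rows (here, everything under the "com." prefix) come back together without touching the rest of the table.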

Voldemort

High scalability due to allowing only very simple key/value data access.
Used by LinkedIn
Not an object or a relational database. Just a big, distributed, fault-tolerant, persistent hash table
Includes in-memory caching, so a separate caching tier isn’t required

MemcacheDB

High-performance persistent storage that’s compatible with the memcached protocol

Tarantool

NoSQL database with messaging server
All data maintained in RAM. Persistence via a write ahead log.
Asynchronous replication and hot standby
Supports stored procedures
Data model: tuples (unique key plus any number of other fields); spaces (multiple tuples)
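
The combination of an in-RAM dataset, a write-ahead log, and the tuple/space model can be sketched in a few lines. This is an illustration under assumptions, not Tarantool's API or log format: the Space class, its methods, and the JSON-lines log file are all hypothetical.

```python
import json
import os
import tempfile


class Space:
    """Hypothetical in-memory space: tuples indexed by their first field."""

    def __init__(self, wal_path):
        self.wal_path = wal_path
        self.tuples = {}  # key (first field) -> full tuple
        self._replay()

    def _replay(self):
        # Rebuild the in-memory state by re-applying every logged operation.
        if not os.path.exists(self.wal_path):
            return
        with open(self.wal_path, encoding="utf-8") as wal:
            for line in wal:
                self._apply(json.loads(line))

    def _apply(self, record):
        fields = tuple(record["tuple"])
        if record["op"] == "replace":
            self.tuples[fields[0]] = fields
        elif record["op"] == "delete":
            self.tuples.pop(fields[0], None)

    def _log(self, record):
        with open(self.wal_path, "a", encoding="utf-8") as wal:
            wal.write(json.dumps(record) + "\n")
            wal.flush()
            os.fsync(wal.fileno())

    def replace(self, *fields):
        record = {"op": "replace", "tuple": list(fields)}
        self._log(record)    # make the change durable first
        self._apply(record)  # then update memory

    def delete(self, key):
        record = {"op": "delete", "tuple": [key]}
        self._log(record)
        self._apply(record)

    def get(self, key):
        return self.tuples.get(key)


# Hypothetical usage: write, "restart" by reopening the log, read back.
wal_file = os.path.join(tempfile.mkdtemp(), "space.wal")
users = Space(wal_file)
users.replace(1, "alice", 30)
users.replace(2, "bob")
users.delete(2)
recovered = Space(wal_file)
```

All reads are answered from memory; the log is only appended to on writes and only read on startup, which is the usual reason this design pairs RAM-speed reads with durability.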

Apache Cassandra

Can use a massive cluster of commodity servers with no single point of failure. Can be deployed across multiple data centers.
Was used by Facebook for Inbox Search until 2010
Read/write scales linearly with number of nodes
Data replicated across multiple nodes
Supports MapReduce, Pig, and Hive
Offers the SQL-like query language CQL, making it a hybrid between a key/value store and a tabular database
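
One way to picture the key/value and tabular hybrid is a table whose rows are grouped by a partition key and ordered within each partition by a clustering key, which is how CQL primary keys are structured. The sketch below models only that shape in plain Python; WideTable and its methods are hypothetical, and no Cassandra driver or CQL parsing is involved.

```python
import bisect
from collections import defaultdict


class WideTable:
    """Hypothetical model: partition key -> rows sorted by clustering key."""

    def __init__(self):
        self._partitions = defaultdict(list)  # key -> [(clustering_key, columns)]

    def insert(self, partition_key, clustering_key, **columns):
        rows = self._partitions[partition_key]
        keys = [ck for ck, _ in rows]
        i = bisect.bisect_left(keys, clustering_key)
        if i < len(rows) and rows[i][0] == clustering_key:
            rows[i][1].update(columns)  # existing row: merge columns (upsert)
        else:
            rows.insert(i, (clustering_key, dict(columns)))

    def select(self, partition_key, start=None, end=None):
        """Return rows of one partition with start <= clustering key < end."""
        rows = self._partitions.get(partition_key, [])
        return [
            (ck, dict(cols))
            for ck, cols in rows
            if (start is None or ck >= start) and (end is None or ck < end)
        ]


# Hypothetical usage: one partition per user, messages ordered by date.
inbox = WideTable()
inbox.insert("alice", "2013-11-04", subject="hello")
inbox.insert("alice", "2013-11-01", subject="first")
inbox.insert("alice", "2013-11-04", read=True)
inbox.insert("bob", "2013-11-02", subject="other")
```

Looking up a partition is a key/value operation, while filtering and ordering rows inside it is what gives queries their tabular feel.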

HyperDex

NoSQL key/value that provides lower latency and higher throughput than some alternatives
Replicates data to multiple nodes
Very easy to administer and maintain
Data model: key plus zero or more attributes

Lightcloud

Great performance even on small clusters with millions of keys
Nodes replicated via master-to-master replication. Hot backups and restores
Very small client footprint
Built on top of Tokyo Tyrant



Posted in apache, Cassandra, database, Facebook, Hibari, Hive, HyperDex, Hypertable, Lightcloud, LinkedIn, MapReduce, MemcacheDB, NoSQL, Pig, Redis, Riak, scalability, SQL, Tarantool, Voldemort

Tagged basho.com, cassandra.apache.org, github.com, hyperdex.org, hypertable.com, memcachedb.org, project-voldemort.com, redis.io, tarantool.org, toolsjournal.com