NOSQL DATABASE OVERVIEW – Part 3

Before reading this post, have a look at NOSQL DATABASE OVERVIEW – Part 1 and NOSQL DATABASE OVERVIEW – Part 2.

This post outlines some fundamental concepts, techniques and patterns that are common among NoSQL datastores and are not specific to one class of non-relational database or to a single NoSQL store.

a) CAP THEOREM: A distributed database has three main desirable properties:

  • Consistency: Every machine holds the same data. A distributed system is typically considered consistent if, after a writer completes an update, all readers see that update in the shared data source.
  • Availability: The system is designed and implemented so that it continues to operate, and to answer requests, when some of its nodes fail.
  • Partition Tolerance: Unlike the other two requirements, this property is a statement about the underlying system: communication among the servers is not reliable, and the servers may be split into groups that cannot communicate with each other. A partition-tolerant system keeps working despite such a split.

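The CAP theorem states that a distributed system can guarantee at most two of these three properties at the same time. It applies to any distributed datastore, not only to NoSQL systems. Since network partitions cannot be ruled out in practice, the real choice is usually between consistency and availability while a partition lasts.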
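
To make the trade-off concrete, here is a minimal sketch of a toy store with two replicas during a network partition. The class ToyStore, its mode parameter and the partitioned flag are hypothetical names invented for this illustration; they are not the API of any real database. In "CP" mode a write is refused while the replicas cannot reach each other, and in "AP" mode it is accepted locally, so the replicas diverge.

```python
class ToyStore:
    """Two replicas of one key/value map (hypothetical, for illustration only)."""

    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.replicas = [{}, {}]
        self.partitioned = False  # True while the replicas cannot communicate

    def write(self, replica_id, key, value):
        """Return True if the write was accepted, False if it was refused."""
        if not self.partitioned:
            # Healthy network: the update reaches every replica.
            for replica in self.replicas:
                replica[key] = value
            return True
        if self.mode == "CP":
            # Prefer consistency: refuse the write, giving up availability.
            return False
        # Prefer availability: accept the write on the reachable replica only.
        self.replicas[replica_id][key] = value
        return True

    def read(self, replica_id, key):
        return self.replicas[replica_id].get(key)


cp = ToyStore("CP")
cp.write(0, "x", 1)
cp.partitioned = True
cp_accepted = cp.write(0, "x", 2)  # refused; both replicas still hold 1

ap = ToyStore("AP")
ap.write(0, "x", 1)
ap.partitioned = True
ap_accepted = ap.write(0, "x", 2)  # accepted; replica 0 and replica 1 now differ
```

The CP store stays consistent but rejects the request, while the AP store keeps answering but returns different values depending on which replica is asked.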

b) BASE: BASE stands for Basically Available, Soft state, and Eventual consistency. BASE is often presented as the counterpart of the ACID properties that are central to RDBMS systems: it gives up strict consistency in exchange for availability. Because a NoSQL database is distributed, data may be only partially available when some parts of the database are not operational or cannot be reached; hence the term Basically Available. Soft State means that the state of the system may change over time even without new input, as updates propagate between replicas. Eventual Consistency guarantees that, if no new updates arrive, the data will become consistent at some point in the future, but not necessarily immediately after an operation.
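
Eventual consistency can be sketched with two replicas that accept writes independently and reconcile later. This is only an illustration under stated assumptions: the Replica class, the sync function and the per-key version numbers are hypothetical, and "highest version wins" is just one of several possible reconciliation strategies.

```python
class Replica:
    """One copy of the data; each key maps to a (version, value) pair."""

    def __init__(self):
        self.data = {}

    def put(self, key, value, version):
        # Keep the incoming entry only if it is newer than what we hold.
        current = self.data.get(key)
        if current is None or version > current[0]:
            self.data[key] = (version, value)

    def get(self, key):
        entry = self.data.get(key)
        return entry[1] if entry is not None else None


def sync(a, b):
    """Exchange entries in both directions; the higher version wins."""
    for key, (version, value) in list(a.data.items()):
        b.put(key, value, version)
    for key, (version, value) in list(b.data.items()):
        a.put(key, value, version)


r1, r2 = Replica(), Replica()
r1.put("cart", ["book"], version=1)
stale = r2.get("cart")   # the update has not reached r2 yet
sync(r1, r2)
fresh = r2.get("cart")   # after synchronization both replicas agree
```

Between the write and the call to sync, a reader of r2 sees old data; that window is the "soft state" during which the replicas are allowed to disagree.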

c) Query Model: The querying capabilities of NoSQL databases differ widely from one class of datastore to another. Key/value stores by design provide only a lookup by primary key or some other id field and cannot query on any other field. Document stores, in contrast, offer complex queries on the fields of a document. Column family stores provide only range queries and operations such as “in”, “and/or” and regular expressions, and only when these are applied to row keys or indexed values. Graph databases can be queried in two different ways. Graph pattern matching tries to find the parts of the original graph that match a defined graph pattern. Graph traversal starts from a chosen node and follows edges to reach connected nodes.
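
The difference between the first three query models can be sketched with plain Python data structures. Nothing here is a real datastore API: the sample data and the helper functions find and range_scan are made up for illustration, and a real store would serve such queries from indexes rather than from in-memory lists.

```python
from bisect import bisect_left, bisect_right

# Key/value store: values are opaque, so the only access path is the key.
kv_store = {"user:1": '{"name": "Ann", "city": "Oslo"}',
            "user:2": '{"name": "Bob", "city": "Rome"}'}
by_key = kv_store["user:2"]

# Document store: the store understands the fields and can filter on them.
documents = [{"_id": 1, "name": "Ann", "city": "Oslo", "age": 31},
             {"_id": 2, "name": "Bob", "city": "Rome", "age": 45},
             {"_id": 3, "name": "Cy", "city": "Oslo", "age": 52}]


def find(docs, **criteria):
    """Return the documents whose fields equal every given criterion."""
    return [d for d in docs
            if all(d.get(field) == value for field, value in criteria.items())]


oslo_names = [d["name"] for d in find(documents, city="Oslo")]

# Column family store: rows are kept sorted by row key, which makes
# range scans over the row key cheap.
row_keys = sorted(["2024-01-03", "2024-01-01", "2024-02-10", "2024-01-20"])


def range_scan(keys, start, end):
    """Return the row keys k with start <= k <= end from a sorted list."""
    return keys[bisect_left(keys, start):bisect_right(keys, end)]


january = range_scan(row_keys, "2024-01-01", "2024-01-31")
```

Finding all users in Oslo in the key/value store would require fetching and parsing every value on the client side, which is exactly the capability the store itself lacks.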

d) Partitioning: When the data in a large-scale system exceeds the capacity of a single server, it has to be partitioned across multiple machines in a cluster, which also improves read/write performance. NoSQL databases differ in how they distribute data across machines. Because the data models of key/value stores, document stores and column family stores are key-oriented, their data can be partitioned across the cluster based on the key, typically by applying a hash function to the key and using the result to choose the partition. Graph databases are harder to partition, because graph information is gained not by simple key lookups but by analyzing relationships between entities. On the one hand, nodes should be distributed evenly across many servers; on the other hand, heavily linked nodes should not be placed far apart, since traversals between them would incur a large performance penalty from heavy network load. One therefore has to trade off these two requirements.
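
Key-based partitioning can be sketched in a few lines. This is a minimal illustration, assuming a fixed list of servers and simple modulo placement; the function partition_for and the server names are hypothetical, and real stores often use more elaborate schemes such as consistent hashing, because modulo placement moves most keys when the number of servers changes.

```python
import hashlib


def partition_for(key, num_partitions):
    """Map a string key to a partition number in [0, num_partitions)."""
    # hashlib gives the same result in every process, unlike Python's
    # built-in hash(), which is randomized per process for strings.
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


servers = ["node-a", "node-b", "node-c"]  # hypothetical server names
keys = ["user:1", "user:2", "order:17", "order:18"]

# Every client computes the same placement without asking a coordinator.
placement = {key: servers[partition_for(key, len(servers))] for key in keys}
```

Because the placement depends only on the key and the number of servers, any client can locate a record with a single hash computation, which is what makes key-oriented data models easy to distribute.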
