The Problem with Relational Databases
Most of the time, the data shown to the user on a page as one logical unit is chopped into pieces and saved to different tables in a relational database.
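To make that concrete, here is a minimal sketch of one order being split into rows for three tables. The order shape, the table names (`customers`, `orders`, `order_lines`) and the helper `to_relational_rows` are made up for illustration and do not come from any particular schema.

```python
# Hypothetical order as the page shows it: one logical unit.
order = {
    "id": 1001,
    "customer": {"id": 7, "name": "Ann"},
    "lines": [
        {"product": "keyboard", "qty": 1, "price": 49.0},
        {"product": "mouse", "qty": 2, "price": 19.5},
    ],
}


def to_relational_rows(order):
    """Split one order into rows for three hypothetical tables."""
    customers = [(order["customer"]["id"], order["customer"]["name"])]
    orders = [(order["id"], order["customer"]["id"])]
    order_lines = [
        (order["id"], line["product"], line["qty"], line["price"])
        for line in order["lines"]
    ]
    return {"customers": customers, "orders": orders, "order_lines": order_lines}


rows = to_relational_rows(order)
```

Reading the order back means joining those three tables again, which is the mismatch this section is describing.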
When the application gets a lot of traffic and it’s time to scale, a relational database mostly leaves you with scaling up (getting a bigger box) rather than scaling out (running a large cluster of little boxes).
The Term “NoSQL”
Johan Oskarsson, a guy in London, proposed a meet-up where people could discuss various non-relational database ideas, and he just needed a flashy Twitter hashtag for it. Someone else came up with “#NoSQL”.
Today, “NoSQL” is a general term for databases that do not use a relational data model:
Data Model Type | Examples |
---|---|
Document (mostly JSON-style) | MongoDB, RavenDB |
Graph | Neo4j |
Column | Cassandra |
Key-value | Redis |
Document & Key-Value Data Model
These two types can actually be very similar. Many key-value data stores let you store metadata alongside the value, which is close to the document model. On the other hand, a document database usually gives each document an ID, and lookups often go by that ID, which works more or less like a key-value model.
Martin Fowler uses the term “Aggregate Oriented Database” to describe both. The idea is that a complex object (the aggregate) is treated as a single unit when it is persisted or retrieved.
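The idea of storing and fetching an aggregate as one unit can be sketched with a toy in-memory store. `AggregateStore` and the `order:1001` key are hypothetical names used only for this example; a real key-value or document database would persist the value and add much more.

```python
import json


class AggregateStore:
    """Toy in-memory store: one key maps to one whole aggregate."""

    def __init__(self):
        self._data = {}

    def put(self, key, aggregate):
        # The whole aggregate is serialized and written as a single value.
        self._data[key] = json.dumps(aggregate)

    def get(self, key):
        # The whole aggregate comes back in one lookup, with no joins.
        raw = self._data.get(key)
        return None if raw is None else json.loads(raw)


store = AggregateStore()
store.put("order:1001", {"customer": "Ann", "lines": [{"product": "mouse", "qty": 2}]})
loaded = store.get("order:1001")
```

Whether you call the value a "document" or just a "value" matters little here: the lookup goes by key and returns the complete object.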
Column Data Model
The column data model is a more elaborate kind of Aggregate Oriented Database: a row key identifies the aggregate, and the data under it is grouped into column families.
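One way to picture row keys and column families is as nested maps. The sketch below uses plain Python dictionaries; the row key `user:7`, the families `profile` and `orders`, and both helper functions are assumptions made for illustration, not the API of Cassandra or any other store.

```python
# row key -> column family -> column name -> value
table = {
    "user:7": {
        "profile": {"name": "Ann", "city": "London"},
        "orders": {"1001": "keyboard, mouse", "1002": "monitor"},
    }
}


def read_family(table, row_key, family):
    """Return all columns of one column family for one row."""
    return table.get(row_key, {}).get(family, {})


def read_column(table, row_key, family, column):
    """Return a single column value, or None if it is absent."""
    return read_family(table, row_key, family).get(column)
```

The row as a whole is still the aggregate, but a reader can pull back just one column family instead of the entire thing.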
Graph Database
A graph database is not aggregate-oriented. It stores nodes and arcs, and it is very good at moving across the relationships between things.
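As a rough illustration of "moving across relationships", here is a small breadth-first traversal over nodes and arcs held in a dictionary. The names in `arcs` and the function `reachable_within` are invented for the example; a real graph database would expose this through its own query language.

```python
from collections import deque

# Hypothetical directed arcs: who is a friend of whom.
arcs = {
    "Ann": ["Bob", "Cara"],
    "Bob": ["Dan"],
    "Cara": ["Dan"],
    "Dan": [],
}


def reachable_within(arcs, start, max_hops):
    """Return nodes reachable from start in at most max_hops arcs (BFS)."""
    seen = {start: 0}  # node -> number of hops from start
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if seen[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in arcs.get(node, []):
            if neighbour not in seen:
                seen[neighbour] = seen[node] + 1
                queue.append(neighbour)
    return {n for n in seen if n != start}
```

A query like "friends of friends" is a two-hop traversal here, whereas in a relational schema it would usually need a self-join per hop.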
Consistency
In relational databases, transactions are used to ensure atomic and isolated updates. Graph databases also follow ACID updates. Aggregate Oriented Databases don’t need transactions as much as the others do, because the aggregate itself is more or less the transaction boundary.
Often you will have to choose between consistency and availability, and that choice can only be made based on the business rules.
CAP Theorem
The theorem basically says that of consistency, availability and partition tolerance, you can only pick two.
In other words, a distributed system will get network partitions sooner or later. When that happens, you have to pick between consistency and availability.
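The trade-off can be sketched with a toy register replicated on two nodes. Everything here is an assumption made for illustration: the class names, the `"CP"`/`"AP"` mode labels, and the `partitioned` flag that stands in for a real network failure. It is not how any particular database implements replication.

```python
class Replica:
    """One node holding a copy of the value."""

    def __init__(self):
        self.value = None


class TwoNodeRegister:
    """Toy register replicated on two nodes; mode is 'CP' or 'AP'."""

    def __init__(self, mode):
        self.mode = mode
        self.nodes = [Replica(), Replica()]
        self.partitioned = False  # True means the nodes cannot talk to each other

    def write(self, node_index, value):
        if not self.partitioned:
            # Healthy network: every replica gets the write.
            for node in self.nodes:
                node.value = value
            return True
        if self.mode == "CP":
            # Favor consistency: refuse the write rather than let replicas diverge.
            return False
        # Favor availability: accept the write locally; replicas may now disagree.
        self.nodes[node_index].value = value
        return True
```

During a partition, the `"CP"` register stays consistent but rejects the request, while the `"AP"` register keeps answering but its two nodes can hold different values until they are reconciled.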
Why Pick NoSQL over Relational
- You have more data than can fit on a single database server
- Development may go more easily, because the business model is often a natural aggregate that fits an Aggregate Oriented Database