Introduction to NoSQL

Problems with Relational Databases

Most of the time, the data shown to the user on a page as one logical unit is chopped into pieces and saved to different tables in a relational database.
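To make that concrete, here is a minimal sketch using Python's built-in sqlite3 module. The order, the table names and the load_order helper are all made up for illustration; the point is only that one logical unit has to be split on the way in and stitched back together on the way out.

```python
import sqlite3

# Hypothetical order, shaped the way a page would display it.
order = {
    "id": 1,
    "customer": "Ann",
    "lines": [
        {"product": "book", "qty": 2},
        {"product": "pen", "qty": 5},
    ],
}

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT)")
conn.execute("CREATE TABLE order_lines (order_id INTEGER, product TEXT, qty INTEGER)")

# Saving: the one logical unit is split across two tables.
conn.execute("INSERT INTO orders VALUES (?, ?)", (order["id"], order["customer"]))
conn.executemany(
    "INSERT INTO order_lines VALUES (?, ?, ?)",
    [(order["id"], line["product"], line["qty"]) for line in order["lines"]],
)


def load_order(order_id):
    """Reassemble the order from its rows (illustrative helper)."""
    row = conn.execute(
        "SELECT id, customer FROM orders WHERE id = ?", (order_id,)
    ).fetchone()
    lines = conn.execute(
        "SELECT product, qty FROM order_lines WHERE order_id = ? ORDER BY rowid",
        (order_id,),
    ).fetchall()
    return {
        "id": row[0],
        "customer": row[1],
        "lines": [{"product": p, "qty": q} for p, q in lines],
    }
```

Calling load_order(1) gives back the same dictionary we started with, but only after two queries and some glue code.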

When the application gets a lot of traffic and it’s time to scale, with SQL you can really only scale up (get a bigger box) rather than scale out (get a large cluster of little boxes).
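One common way to scale out is to spread the data over the little boxes by key. The sketch below is only an illustration of that idea: the box names and the box_for function are assumptions of mine, and real databases use more careful schemes than a plain hash-and-modulo.

```python
import hashlib


def box_for(key, boxes):
    """Pick a box for a key by hashing it (hypothetical helper)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Use the first 8 bytes of the hash as a number, then wrap around the box list.
    return boxes[int.from_bytes(digest[:8], "big") % len(boxes)]


# Made-up cluster of three little boxes.
boxes = ["box-a", "box-b", "box-c"]
placement = {key: box_for(key, boxes) for key in ("order:1", "order:2", "customer:7")}
```

The same key always lands on the same box, so a lookup by key only has to talk to one machine.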

The Term “NoSQL”

Johan Oskarsson, a guy in London, proposed a meet-up where people could discuss various non-relational database ideas, and he just needed a flashy Twitter hashtag for it. Another guy came up with the term “#NoSQL”.

Now “NoSQL” is generally used to describe databases that do not use the relational data model:

Data Model Type                 Examples
Document (mostly JSON-style)    MongoDB, RavenDB
Graph                           Neo4j
Column                          Cassandra
Key-value                       Redis

Document & Key-Value Data Model

Actually, these two types can be very similar. Many key-value data stores let you store metadata alongside the value, which makes them look like the document model. On the other hand, a document database usually gives each document an ID, and you often look things up by that ID, which works more or less like a key-value model.

Martin Fowler uses the term “Aggregate Oriented Database” to describe them. An aggregate is a complex object that we treat as a single unit when persisting or retrieving it.
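Here is a rough in-memory sketch of how close the two models are. The class and method names are invented for illustration and are not the API of any real database: both stores fetch a whole aggregate by ID, and the only difference is whether the store can look inside it.

```python
import json


class KeyValueStore:
    """Values are opaque blobs; the only way in is the key."""

    def __init__(self):
        self._data = {}

    def put(self, key, aggregate):
        self._data[key] = json.dumps(aggregate)

    def get(self, key):
        return json.loads(self._data[key])


class DocumentStore(KeyValueStore):
    """Same lookup by ID, but the store can also query fields of the aggregate."""

    def find(self, field, value):
        docs = (json.loads(blob) for blob in self._data.values())
        return [doc for doc in docs if doc.get(field) == value]


store = DocumentStore()
store.put("order:1", {"customer": "Ann", "lines": [{"product": "book", "qty": 2}]})
store.put("order:2", {"customer": "Bob", "lines": [{"product": "pen", "qty": 5}]})
```

Here store.get("order:1") behaves exactly like a key-value lookup, while store.find("customer", "Bob") is the kind of query only the document flavour can answer.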

Column Data Model

The column data model is an advanced type of Aggregate Oriented Database that uses row keys and column families.
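One way to picture row keys and column families is as a map of maps. This is only a mental model with made-up data and helper names, not how Cassandra actually stores things: a row key leads to column families, and each family holds its own columns.

```python
from collections import defaultdict

# row key -> column family -> column name -> value
table = defaultdict(lambda: defaultdict(dict))


def put(row_key, family, column, value):
    table[row_key][family][column] = value


def get_family(row_key, family):
    """Read one column family of one row without touching the others."""
    return dict(table[row_key][family])


put("customer:1", "profile", "name", "Ann")
put("customer:1", "profile", "city", "London")
put("customer:1", "orders", "order:17", "2 books")
```

Reading get_family("customer:1", "profile") pulls back just the profile columns, leaving the orders family alone.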

Graph Database

A graph database is not aggregate oriented. It contains nodes and arcs, and it’s very good at moving across the relationships between things.
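A small sketch of what “moving across relationships” means, using plain Python instead of a real graph database. The people, the arcs and the reachable function are all invented for the example.

```python
from collections import deque

# Hypothetical data: each arc is (from_node, relationship, to_node).
arcs = [
    ("Ann", "friend", "Bob"),
    ("Bob", "friend", "Cid"),
    ("Cid", "friend", "Dee"),
    ("Ann", "likes", "NoSQL"),
]


def reachable(start, relationship, max_hops):
    """Nodes reachable from start by following one kind of arc, within max_hops."""
    seen = {start}
    found = []
    queue = deque([(start, 0)])
    while queue:
        node, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not walk any further from this node
        for src, rel, dst in arcs:
            if src == node and rel == relationship and dst not in seen:
                seen.add(dst)
                found.append(dst)
                queue.append((dst, hops + 1))
    return found
```

For instance, reachable("Ann", "friend", 2) walks Ann's friends and then their friends. In a relational database each extra hop would be another join.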

Consistency

In relational databases, transactions are used to ensure atomic and isolated updates. Graph databases do follow ACID updates. Aggregate Oriented Databases don’t need transactions as much as the others do, because the aggregate itself acts as the transaction boundary.
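A toy illustration of the aggregate as the transaction boundary, assuming a made-up in-memory store where a whole aggregate is swapped in with a single assignment. Real databases differ in the guarantees they give, so treat this as a sketch of the idea only.

```python
import copy

store = {}  # hypothetical aggregate store: key -> whole aggregate


def save(key, aggregate):
    # One assignment swaps in the whole aggregate: a reader sees the old
    # version or the new one, never a half-updated mix.
    store[key] = copy.deepcopy(aggregate)


def add_line(order_id, product, qty):
    """Change several fields of one aggregate and write them back together."""
    order = copy.deepcopy(store[order_id])
    order["lines"].append({"product": product, "qty": qty})
    order["total_qty"] = sum(line["qty"] for line in order["lines"])
    save(order_id, order)


save("order:1", {"lines": [], "total_qty": 0})
add_line("order:1", "book", 2)
add_line("order:1", "pen", 5)
```

The line list and the total always change together because they live in the same aggregate. An update that spans two different aggregates would need two separate saves, and that is where you would miss a transaction.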

Often you will have to choose between consistency and availability, and that choice can only be made by looking at the business rules.

CAP Theorem

It’s basically saying that out of consistency, availability and partition tolerance, you can only pick two.

In other words, in a distributed system you will get network partitions, so when one happens you have to pick between consistency and availability.
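The trade-off can be sketched with two toy replicas. Everything here (the Cluster class, the "CP" and "AP" mode names, the partitioned flag) is an assumption made for illustration and is far simpler than a real distributed database.

```python
class Replica:
    def __init__(self):
        self.value = None


class Cluster:
    """Two replicas; mode is the choice made when they cannot talk to each other."""

    def __init__(self, mode):
        self.mode = mode  # "CP" keeps consistency, "AP" keeps availability
        self.a = Replica()
        self.b = Replica()
        self.partitioned = False

    def write(self, replica, value):
        other = self.b if replica is self.a else self.a
        if self.partitioned:
            if self.mode == "CP":
                # Stay consistent by refusing the write: not available.
                raise RuntimeError("unavailable: cannot reach the other replica")
            # Stay available by accepting locally: replicas may now disagree.
            replica.value = value
            return
        replica.value = value
        other.value = value
```

With Cluster("CP"), a write during a partition fails and both replicas keep the same value. With Cluster("AP"), the write succeeds but the two replicas hold different values until the partition heals and someone reconciles them.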

Why Pick NoSQL over Relational

  1. You have more data than can fit on a single database server
  2. Development goes more easily, because the business model may form a natural aggregate that fits an Aggregate Oriented Database
