What is covered in this presentation?
A brief history of databases
NoSQL WHY, WHAT & WHEN?
Characteristics of NoSQL databases
Aggregate data models
CAP theorem
Ashwani Kumar
16 February 2018
NOSQL Databases
Introduction 3
Database - Organized collection of data
DBMS - Database Management System: a software
package of computer programs that control the
creation, maintenance and use of a database
Databases are created to handle large quantities of
information by inputting, storing, retrieving, and
managing that information.
A brief history 4
Relational databases 5
• Benefits of Relational databases:
Designed for all purposes
ACID
Strong consistency, concurrency, recovery
Mathematical background
Standard Query language (SQL)
Lots of tools to use with it, e.g. reporting services, entity
frameworks, ...
SQL databases 6
RDBMS 7
NoSQL why, what and when? 8
But...
Relational databases were not built
for distributed applications.
Because...
Joins are expensive
Hard to scale horizontally
Impedance mismatch occurs
Expensive (product cost, hardware,
maintenance)
NoSQL why, what and when? 9
And...
They are weak in:
Speed (performance)
High availability
Partition tolerance
Why NoSQL now? Answer: driving trends 11
Side note: RDBMS performance 12
But... what's NoSQL? 13
A NoSQL database provides a mechanism
for storage and retrieval of data that
employs less constrained consistency
models than traditional relational databases.
NoSQL systems are also referred to as
"Not only SQL" to emphasize that they do in
fact allow SQL-like query languages to be
used.
Characteristics of NoSQL databases 14
NoSQL avoids:
Overhead of ACID transactions
Complexity of SQL query
Burden of up-front schema design
DBA presence
Transactions (they should be handled at the
application layer)
Provides:
Easy and frequent changes to DB
Fast development
Large data volumes (e.g. Google)
Schema-less
NoSQL why, what and when? 10
NoSQL is getting more & more popular 15
What is a schema-less datamodel? 16
In relational Databases:
You can’t add a record which does
not fit the schema
You need to add NULLs to unused
items in a row
We must respect the datatypes,
e.g. you can’t add a string to an
integer field
You can’t add multiple items in a
field (You should create another
table: primary-key, foreign key,
joins, normalization, ... !!!)
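The constraints above can be sketched with Python's built-in sqlite3 module. This is only an illustration under assumptions: the person and phone tables and their columns are hypothetical, not taken from the slides.

```python
import sqlite3

# Hypothetical schema: one person can have several phone numbers,
# so the numbers need their own table and a foreign key.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT NOT NULL, nickname TEXT)"
)
conn.execute(
    "CREATE TABLE phone ("
    " person_id INTEGER NOT NULL REFERENCES person(id),"
    " number TEXT NOT NULL)"
)

# The unused nickname column still exists in the row and holds NULL.
conn.execute("INSERT INTO person (id, name, nickname) VALUES (1, 'Alice', NULL)")
conn.executemany(
    "INSERT INTO phone (person_id, number) VALUES (?, ?)",
    [(1, "555-0100"), (1, "555-0101")],
)

# A record that does not fit the schema (unknown column "age") is rejected.
try:
    conn.execute("INSERT INTO person (id, name, age) VALUES (2, 'Bob', 30)")
    rejected = False
except sqlite3.OperationalError:
    rejected = True

# Reading the multi-valued data back requires a join.
rows = conn.execute(
    "SELECT p.name, ph.number FROM person p"
    " JOIN phone ph ON ph.person_id = p.id ORDER BY ph.number"
).fetchall()
```

Here rejected ends up True, and rows holds one (name, number) pair per phone number.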
What is a schema-less datamodel? 17
In NoSQL Databases:
There is no schema to consider
There is no unused cell
There is no datatype (implicit)
Most considerations are handled in the
application layer
We gather all items in an aggregate (document)
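A minimal sketch of the schema-less idea, using plain Python dictionaries as stand-ins for documents. The people list and the phones_of helper are hypothetical names chosen for illustration; no particular NoSQL product is implied.

```python
# A hypothetical "collection": a plain list of documents (aggregates).
# Each document carries only the fields it needs, so there are no unused cells.
people = [
    {"name": "Alice", "phones": ["555-0100", "555-0101"]},
    {"name": "Bob", "age": 30, "address": {"city": "Delhi"}},
]


def phones_of(doc):
    """Return the phone list of a document, or an empty list if it has none.

    The application layer, not the database, decides how to treat a missing field.
    """
    return doc.get("phones", [])


alice_phones = phones_of(people[0])
bob_phones = phones_of(people[1])
```

Alice's phone numbers live inside her document, so no second table or join is needed; Bob simply has no phones field.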
Aggregate Data Models 18
NoSQL databases are classified into four major
data models:
• Key-value
• Document
• Column family
• Graph
Each DB has its own query language
Key-value data model 19
Simplest NoSQL databases
The main idea is the use of a
hash table
Access data (values) by strings
called keys
Data has no required format;
it may have any format
Data model: (key, value) pairs
Basic Operations:
Insert(key,value),
Fetch(key),
Update(key,value),
Delete(key)
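The four basic operations can be sketched on top of a Python dict, which is itself a hash table. The KeyValueStore class is a hypothetical in-memory illustration, not the API of any real key-value database; raising KeyError on updating a missing key is an assumption.

```python
class KeyValueStore:
    """Toy in-memory key-value store backed by a hash table (a dict)."""

    def __init__(self):
        self._table = {}

    def insert(self, key, value):
        # The value has no required format: any Python object is accepted.
        self._table[key] = value

    def fetch(self, key):
        # Returns None when the key is absent.
        return self._table.get(key)

    def update(self, key, value):
        # Assumption: updating a missing key is an error.
        if key not in self._table:
            raise KeyError(key)
        self._table[key] = value

    def delete(self, key):
        # Deleting a missing key is silently ignored.
        self._table.pop(key, None)


store = KeyValueStore()
store.insert("user:1", {"name": "Alice"})
store.update("user:1", {"name": "Alice", "city": "Delhi"})
fetched = store.fetch("user:1")
store.delete("user:1")
after_delete = store.fetch("user:1")
```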
Column family data model 20
The column is the smallest
unit of data.
It is a tuple that contains a
name, a value and a timestamp
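A column as described above can be sketched as a named tuple. The newest helper, which keeps the column with the later timestamp, is an assumption added for illustration; the slides do not say how timestamps are used.

```python
from typing import Any, NamedTuple


class Column(NamedTuple):
    """Smallest unit of data: a (name, value, timestamp) tuple."""

    name: str
    value: Any
    timestamp: int


def newest(a: Column, b: Column) -> Column:
    # Assumption for this sketch: the column with the later timestamp wins.
    return a if a.timestamp >= b.timestamp else b


# A hypothetical row: a row key mapping to named columns.
row = {"user:42": {"email": Column("email", "a@example.com", 1)}}

incoming = Column("email", "alice@example.com", 2)
row["user:42"]["email"] = newest(row["user:42"]["email"], incoming)
```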
Graph data model 22
Based on Graph Theory.
Scale vertically, no clustering.
You can use graph algorithms easily
Transactions
ACID
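As an example of applying a graph algorithm, here is a minimal breadth-first search for a shortest path. The adjacency-list dictionary and the node names are hypothetical; a real graph database would expose this through its own query language.

```python
from collections import deque

# Hypothetical "follows" graph stored as an adjacency list.
graph = {
    "alice": ["bob", "carol"],
    "bob": ["dave"],
    "carol": ["dave"],
    "dave": ["erin"],
    "erin": [],
}


def shortest_path(graph, start, goal):
    """Breadth-first search: return a shortest path as a list of nodes, or None."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbor in graph.get(node, []):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(path + [neighbor])
    return None


path = shortest_path(graph, "alice", "erin")
```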
Document based data model 23
• Pair each key with a complex data
structure known as a document.
• Indexes are done via B-Trees.
• Documents can contain many different
key-value pairs, or key-array pairs, or
even nested documents.
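A small sketch of one such document, stored under a key and serialized as JSON. The store dictionary, the order:1001 key and the field names are hypothetical, chosen only to show the three kinds of content listed above.

```python
import json

# Hypothetical document store: each key maps to one serialized document.
store = {}

order = {
    "customer": "Alice",                              # key-value pair
    "items": ["keyboard", "mouse"],                   # key-array pair
    "shipping": {"city": "Delhi", "express": True},   # nested document
}

# Save the whole aggregate under a single key.
store["order:1001"] = json.dumps(order)

# Load it back and reach into the nested document.
loaded = json.loads(store["order:1001"])
city = loaded["shipping"]["city"]
```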
Document based data model 24
SQL vs NOSQL 25
What we need ? 26
• We need a distributed database system with the following features:
– Fault tolerance
– High availability
– Consistency
– Scalability
Which is impossible,
according to the CAP theorem!
CAP theorem 27
We cannot achieve all three properties (consistency, availability,
partition tolerance) at once in a distributed database system
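The trade-off can be illustrated with a toy model of two replicas. Everything here is an assumption made for the sketch: the Replica class, the write function and the "CP"/"AP" mode labels are hypothetical and do not model any real database.

```python
class Replica:
    """A toy replica holding a single value."""

    def __init__(self):
        self.value = None


def write(replicas, value, partitioned, mode):
    """Write via the first replica; return True if the write was accepted.

    mode "CP": refuse writes during a partition (consistent, not available).
    mode "AP": accept writes locally during a partition (available, not consistent).
    """
    if not partitioned:
        # No partition: the write reaches every replica.
        for replica in replicas:
            replica.value = value
        return True
    if mode == "CP":
        return False
    replicas[0].value = value
    return True


cp = [Replica(), Replica()]
cp_accepted = write(cp, "x", partitioned=True, mode="CP")

ap = [Replica(), Replica()]
ap_accepted = write(ap, "x", partitioned=True, mode="AP")
ap_consistent = ap[0].value == ap[1].value
```

During the partition the CP system rejects the write, while the AP system accepts it and its replicas disagree.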
CAP theorem 28