NOSQL Databases

The document discusses different types of NoSQL data stores, including key-value stores, document stores, and column family stores, with examples such as Amazon Dynamo, MongoDB, Cassandra and HBase. It outlines reasons for using NoSQL databases, such as scalability, schema flexibility and the ability to spread large volumes of data across multiple servers. It also covers the challenges of scaling relational databases and how NoSQL stores approach replication, partitioning, and eventual consistency.

Uploaded by

paresh

NOSQL Data Stores

Not Only SQL

Tuesday, September 21, 2010

Data Store

Superset of:
Relational Databases
Key Value Stores
Document Stores
Column Family Stores

Design This Schema
Entities: Student, Address, Course, Score

Scalable huh??

Use case: this schema has to serve the entire student community of the world
One big server? How big?
More than one server? How will that work?

WHY NOSQL ?

Scalability : Horizontal
Relational databases do not distribute well
NOSQL : Distributed, Flexible Schema, Relaxing Consistency

Issues with Relational DB

Scalability
Replication : Scaling by duplication
Partitioning(Sharding) : Scaling by division

Replication
Master - Slave
1 write = N writes (N is the number of slaves)
Faster reads (can read from N nodes)
Critical reads go to the master (application aware)
Limited for high volumes of data

Replication
Multi - Master
Adding more masters
Conflict resolution costs O(n^2) or O(n^3) in the number of masters

Partitioning(Sharding)
Scales Read as well as Writes
Application needs to be Partition Aware
Broken Relationships : Cartesian products across shards ??
Referential Integrity is no more
Rebalancing

Consistent Hashing
Hash Ring (Or Clock Face)

Balanced Distribution After Adding a new Node
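
A minimal sketch of the hash ring idea, assuming MD5 for ring positions and 100 virtual points per node. The names `HashRing`, `ring_position` and the `student:` keys are illustrative and not taken from any particular library.

```python
import bisect
import hashlib


def ring_position(value: str) -> int:
    """Map a string to a position on the ring (any stable hash would do)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    def __init__(self, nodes=(), replicas=100):
        self.replicas = replicas   # virtual points per node, smooths the distribution
        self._positions = []       # sorted ring positions
        self._owner = {}           # ring position -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        for i in range(self.replicas):
            pos = ring_position(f"{node}#{i}")
            self._owner[pos] = node
            bisect.insort(self._positions, pos)

    def get_node(self, key: str) -> str:
        """Walk clockwise from the key's position to the first node point."""
        pos = ring_position(key)
        idx = bisect.bisect_right(self._positions, pos) % len(self._positions)
        return self._owner[self._positions[idx]]


keys = [f"student:{i}" for i in range(1000)]
ring = HashRing(["node-a", "node-b", "node-c"])
before = {k: ring.get_node(k) for k in keys}

ring.add_node("node-d")
after = {k: ring.get_node(k) for k in keys}

# Only keys taken over by the new node change owner; the rest stay put.
moved = [k for k in keys if before[k] != after[k]]
```

With plain `hash(key) % number_of_nodes`, adding a node would remap most keys; on the ring, only the keys that fall just before the new node's points move.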

Common Sharding Schemes
Vertical Partitioning
Range Based Partitioning
Hash Based Partitioning
Directory Based Partitioning
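
The four schemes can be sketched as routing functions. This is an illustration only: the shard names, the id boundaries and the directory contents are made-up assumptions.

```python
import zlib

SHARDS = ["shard-0", "shard-1", "shard-2"]

# Vertical partitioning: different tables live on different servers.
VERTICAL = {"student": "db-students", "course": "db-courses"}


def range_shard(student_id: int) -> str:
    """Range based: contiguous id ranges map to shards (boundaries are made up)."""
    if student_id < 1000:
        return "shard-0"
    if student_id < 2000:
        return "shard-1"
    return "shard-2"


def hash_shard(key: str) -> str:
    """Hash based: a stable hash of the key, modulo the shard count."""
    return SHARDS[zlib.crc32(key.encode("utf-8")) % len(SHARDS)]


# Directory based: an explicit lookup table that can be edited to move a key.
DIRECTORY = {"alice": "shard-2", "bob": "shard-0"}


def directory_shard(key: str) -> str:
    return DIRECTORY[key]
```

Range-based routing keeps neighbouring keys together (good for range scans, prone to hot spots), hash-based routing spreads load evenly, and a directory trades an extra lookup for the freedom to place each key anywhere.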

Can live without !!
UPDATE and DELETE
Loss of Information
Can be modeled as INSERT with versioning
Filter out inactive records
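
A minimal sketch of modeling UPDATE and DELETE as versioned INSERTs, assuming an in-memory append-only list. `VersionedStore` and its methods are hypothetical names used for illustration.

```python
import itertools


class VersionedStore:
    """Append-only store: updates and deletes become new versions."""

    def __init__(self):
        self._rows = []                      # (key, version, value, active)
        self._versions = itertools.count(1)  # monotonically increasing version ids

    def put(self, key, value):
        # An "update" is just another insert with a higher version.
        self._rows.append((key, next(self._versions), value, True))

    def delete(self, key):
        # A "delete" inserts an inactive marker; nothing is erased.
        self._rows.append((key, next(self._versions), None, False))

    def get(self, key):
        """Return the latest active value, or None if missing or deleted."""
        for k, _version, value, active in reversed(self._rows):
            if k == key:
                return value if active else None
        return None

    def history(self, key):
        """Every version ever written for the key, oldest first."""
        return [(v, val, act) for k, v, val, act in self._rows if k == key]


store = VersionedStore()
store.put("s1", {"score": 70})
store.put("s1", {"score": 85})   # replaces 70 for readers, but 70 is still kept
latest = store.get("s1")
store.delete("s1")
after_delete = store.get("s1")
```

No information is lost: `history("s1")` still returns all three versions, and readers filter out the inactive one.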

Avoid JOINS
Expensive, Fails with partitions
How to avoid?
De - normalize
Storage is cheap now
Burden of Consistency shifts to application
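
To make the trade-off concrete, here is a small sketch using the student/course/score entities from the earlier slide. The sample data and function names are assumptions for illustration.

```python
# Normalized: three "tables" that must be joined at read time.
students = {1: {"name": "Asha"}}
courses = {10: {"title": "Databases"}}
scores = [{"student_id": 1, "course_id": 10, "score": 91}]


def report_with_join(student_id):
    return [
        {
            "student": students[s["student_id"]]["name"],
            "course": courses[s["course_id"]]["title"],
            "score": s["score"],
        }
        for s in scores
        if s["student_id"] == student_id
    ]


# Denormalized: one record per student holds everything a read needs.
student_docs = {
    1: {"name": "Asha", "scores": [{"course": "Databases", "score": 91}]}
}


def report_denormalized(student_id):
    doc = student_docs[student_id]
    return [{"student": doc["name"], **s} for s in doc["scores"]]


def rename_course(old, new):
    """Consistency is now the application's job: fix every embedded copy."""
    for doc in student_docs.values():
        for s in doc["scores"]:
            if s["course"] == old:
                s["course"] = new
```

The denormalized read touches a single record, so it works on one shard; the price is that a course rename has to visit every student record that embeds the title.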

Still need ACID ??
Atomicity : Only Single key is enough
Consistency : CAP Theorem
Can only get any two of Consistency, Availability, Partition Tolerance
Isolation : Not more than Read - Committed (Single Key)
Durability : Node failures. Peer Replication

Fixed Schema
Schema comes before Data
Modifying Schema is essential
Adding new features
Modifying Schema is hard
Locking of rows(Add/Modify a column)
Locking of table(Add/Remove index)

Model this!!
Hierarchical Data
Graphs

Desired Characteristics
High Scalability
Add nodes incrementally
No Diminishing Returns
High Availability
No single point of failure
Agnostic to node failures

Desired Characteristics
High Performance
Fast operations
Non - Blocking Writes
Consistency
No need for strong consistency
Eventual Consistency, Read - Your - Write Consistency

Desired Characteristics
Deployment Flexibility
Add/Remove node automatically
NO DFS or shared storage
Should work with commodity heterogeneous hardware
Modeling Flexibility
Key - Value Pairs, Hierarchical and Graph Data

Desired Characteristics
Query Flexibility
Multi Gets
Range Queries
Upserts

Inspiration
Memcached
In-memory Key Value
Blazing Fast
Infinite Horizontal Scalability

Key Value Stores
Simple Data Model
Amazon Dynamo
Amazon S3
Project Voldemort
Redis
Scalaris and many others

Amazon Dynamo
Internal to Amazon
Distributed K-V store
Opaque Values
Partitioning
A variant of consistent hashing
Hash Ring division

Amazon Dynamo
Partitioning
Mappings communicated via gossip protocol
Eventually consistent view of mappings
Replication
Each key is replicated on N nodes
Preference List

Amazon Dynamo
Replication
Read/Write through Coordinator nodes
Configurations
N = number of replicas
W = min. nodes that must ACK the receipt of a WRITE
R = min. nodes contacted for a READ
R + W > N ensures that read and write quorums overlap
Tuning (N,R,W)
Increasing W means more replicas must acknowledge each write (more durable, slower writes)
Increasing R means higher consistency but slower reads
Typical values for Amazon Apps (N,R,W)= (3,2,2)
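
A toy model of the (N, R, W) settings, assuming in-memory replicas and a simple integer version per write. `ReplicatedValue` is a hypothetical name, and taking the next version from all replicas is a simplification that a real coordinator could not rely on.

```python
def quorums_overlap(n: int, r: int, w: int) -> bool:
    """R + W > N means every read set shares a replica with every write set."""
    return r + w > n


class ReplicatedValue:
    """N in-memory replicas of a single key, each storing (version, value)."""

    def __init__(self, n, r, w):
        self.n, self.r, self.w = n, r, w
        self.replicas = [(0, None)] * n

    def write(self, value, reachable):
        """Write to the reachable replicas; succeed only if at least W acknowledge."""
        if len(reachable) < self.w:
            return False
        # Simplification: the next version is derived from all replicas.
        version = max(v for v, _ in self.replicas) + 1
        for i in reachable:
            self.replicas[i] = (version, value)
        return True

    def read(self, reachable):
        """Contact R replicas and return the value with the highest version."""
        if len(reachable) < self.r:
            raise RuntimeError("not enough replicas for a read quorum")
        contacted = list(reachable)[: self.r]
        return max((self.replicas[i] for i in contacted), key=lambda pair: pair[0])[1]


strict = ReplicatedValue(n=3, r=2, w=2)      # the (3,2,2) setting from the slide
strict.write("v1", reachable=[0, 1])
fresh = strict.read(reachable=[1, 2])        # overlaps the write set at replica 1

loose = ReplicatedValue(n=3, r=1, w=1)       # R + W <= N: overlap not guaranteed
loose.write("v1", reachable=[0])
stale = loose.read(reachable=[2])            # replica 2 never saw the write
```

With (3,2,2) any two readers include at least one replica that took the write; with (3,1,1) a read can land on a replica that has not seen it yet.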

Amazon Dynamo
Consistency
Eventually consistent
Uses Object versioning via Vector Clocks
Consistency Protocol
Return all versions
Reconcile divergent versions
The reconciled version, which supersedes the current ones, is written back
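
A minimal sketch of vector-clock versioning, with clocks as plain dicts mapping node name to counter. The node names `sx`, `sy`, `sz` and the helper names are assumptions for illustration.

```python
def increment(clock: dict, node: str) -> dict:
    """Return a copy of the clock with this node's counter advanced."""
    updated = dict(clock)
    updated[node] = updated.get(node, 0) + 1
    return updated


def descends(a: dict, b: dict) -> bool:
    """True if version a has seen everything version b has."""
    return all(a.get(node, 0) >= count for node, count in b.items())


def compare(a: dict, b: dict) -> str:
    if descends(a, b) and descends(b, a):
        return "equal"
    if descends(a, b):
        return "a supersedes b"
    if descends(b, a):
        return "b supersedes a"
    return "conflict"   # divergent versions: return both and let the client reconcile


def merge(a: dict, b: dict) -> dict:
    """Clock for a reconciled version: element-wise maximum."""
    return {node: max(a.get(node, 0), b.get(node, 0)) for node in a.keys() | b.keys()}


v1 = increment({}, "sx")    # first write, handled by node sx
v2 = increment(v1, "sx")    # second write, also via sx
v3 = increment(v2, "sy")    # one client updates v2 via sy
v4 = increment(v2, "sz")    # another client updates v2 via sz
verdict = compare(v3, v4)   # neither has seen the other's write

# The client reconciles and writes back through sx.
reconciled = increment(merge(v3, v4), "sx")
```

The reconciled clock descends from both divergent versions, so it supersedes them once written.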
Handling Temporary Failures
Hinted Handoff
Handling Permanent Failures
Node Sync
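
Hinted handoff can be sketched roughly as follows, assuming in-memory nodes and a single fallback node that parks writes for replicas that are down. The class and function names are hypothetical.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.up = True
        self.data = {}
        self.hints = []   # (intended_node_name, key, value) kept for down nodes


def write(key, value, preference_list, fallback):
    """Write to each replica; park writes for down replicas on the fallback node."""
    for node in preference_list:
        if node.up:
            node.data[key] = value
        else:
            fallback.hints.append((node.name, key, value))


def hand_off(holder, nodes_by_name):
    """Deliver parked writes to any intended owner that is back up."""
    remaining = []
    for owner_name, key, value in holder.hints:
        owner = nodes_by_name[owner_name]
        if owner.up:
            owner.data[key] = value
        else:
            remaining.append((owner_name, key, value))
    holder.hints = remaining


a, b, c, d = Node("a"), Node("b"), Node("c"), Node("d")
b.up = False                                    # temporary failure
write("cart:42", ["book"], preference_list=[a, b, c], fallback=d)
hints_while_down = list(d.hints)

b.up = True                                     # node recovers
hand_off(d, {"a": a, "b": b, "c": c, "d": d})
```

The write stays available while `b` is down, and `b` catches up as soon as the hint is delivered.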

Amazon Dynamo
Ring membership
Add/Remove node needs rebalancing
Failure Detection
Gossip about failures
Periodically check availability and gossip the result

Other K-V Stores
Check out others too. Worth a read and try.
S3,Voldemort,Redis,Scalaris.

Document Stores
A step beyond K-V stores
Value is a full-blown record (document)
Document is not opaque (it exposes a structure to perform operations on)
Each document can have a different schema, e.g. JSON
Relations are possible
One-to-many and many-to-many
Mostly similar to a relational DB (except for the upfront schema)
Amazon Simple DB
Apache CouchDB
Riak
Mongo DB
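
To illustrate the document model in general terms, here is a small sketch with plain Python dicts standing in for JSON documents. The collection contents and the `find` helper are assumptions and do not reflect the API of any of the stores listed.

```python
import json

# Two documents in one collection with different fields (no upfront schema).
collection = {
    "student:1": {"name": "Asha", "courses": ["db101", "os201"]},
    "student:2": {"name": "Ravi", "address": {"city": "Pune"}, "courses": ["db101"]},
}


def find(coll, field, value):
    """Because documents are not opaque, the store can filter on a field."""
    matches = []
    for doc_id, doc in coll.items():
        stored = doc.get(field)
        if stored == value or (isinstance(stored, list) and value in stored):
            matches.append(doc_id)
    return sorted(matches)


# One-to-many: each student document lists the ids of its courses.
enrolled_in_db101 = find(collection, "courses", "db101")

# Documents serialize naturally to JSON.
serialized = json.dumps(collection["student:2"])
```

A pure key-value store could only fetch `student:2` by key; exposing the structure is what makes the field lookup possible.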

Mongo DB
We use Mongo in a large automated translation application
Data Model
Key - Value, the value being binary serialized JSON (BSON)
4 MB limit on a BSON document
For larger objects use GridFS
Collections: roughly like tables
B-trees are used for indexes
Storage
Uses Memory Mapped Files(Cache controlled by OS VMM)
Writes
In place updates
partial updates
Single Document Atomic updates

Mongo DB
Queries
JSON-style query syntax (powered by a JS engine)
Support for conditional operators, regex, etc.
Cursor support
Query optimizers
Map-Reduce over a collection
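
The flavour of JSON-style queries can be imitated in a few lines of pure Python. This is a toy matcher, not the MongoDB driver API: it borrows the operator names `$gt`, `$lt`, `$in` and `$regex`, and everything else (function names, sample data) is an assumption.

```python
import re

# A tiny subset of conditional operators, keyed by their MongoDB-style names.
OPERATORS = {
    "$gt": lambda field, arg: field is not None and field > arg,
    "$lt": lambda field, arg: field is not None and field < arg,
    "$in": lambda field, arg: field in arg,
    "$regex": lambda field, arg: isinstance(field, str) and re.search(arg, field) is not None,
}


def matches(doc, query):
    """True if the document satisfies every field condition in the query."""
    for field, condition in query.items():
        value = doc.get(field)
        if isinstance(condition, dict):
            if not all(OPERATORS[op](value, arg) for op, arg in condition.items()):
                return False
        elif value != condition:   # plain value means equality
            return False
    return True


def find(collection, query):
    return [doc for doc in collection if matches(doc, query)]


docs = [
    {"name": "Asha", "score": 91},
    {"name": "Ravi", "score": 64},
    {"name": "Arun", "score": 78},
]
high = find(docs, {"score": {"$gt": 70}})
a_below_80 = find(docs, {"name": {"$regex": "^A"}, "score": {"$lt": 80}})
```

The query itself is just a document, which is why it can be built, stored and passed around like any other data.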

Mongo DB
Replication
Master Slave
Replica Pairs
Master - Master

Mongo DB
Partitioning
Auto-sharding done through chunks (50 MB max)
Easy node addition
Auto balancing
No single point of failure
Automatic Failover

Column Family Stores
Sparse, Distributed, Persistent, Multi-Dimensional sorted Map
Column Keys are grouped into sets called column-families
BigTable
HBase
Cassandra

Big Table Column Family

Cassandra
Combines the distributed architecture of Dynamo with the column-family data model of Bigtable

Cassandra
Data Model: multi-dimensional map indexed by a key
Each app has its own keyspace
Key can be an arbitrarily long string; indexed by Cassandra
Column: an attribute of a record, timestamped
Column family: a grouping of columns, similar to a relational table
Super column: a list of columns
Data Model
A column family contains either columns or super columns
KeySpace.ColumnFamily.Key.[SuperColumn].Column
Sorting
Data is sorted at write time
Columns are sorted within their row by column name
(pluggable sorting providers)
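
The KeySpace.ColumnFamily.Key.Column path can be pictured as nested maps. The sketch below is an assumption-laden toy: it leaves out super columns, sorts columns when reading rather than at write time to stay short, and uses hypothetical names throughout.

```python
from collections import defaultdict


class ColumnFamily:
    def __init__(self):
        # row key -> {column name: (value, timestamp)}
        self.rows = defaultdict(dict)

    def insert(self, row_key, column, value, timestamp):
        current = self.rows[row_key].get(column)
        # Each column is timestamped; the newest timestamp wins.
        if current is None or timestamp >= current[1]:
            self.rows[row_key][column] = (value, timestamp)

    def get_slice(self, row_key, start="", end=None):
        """Return (column, value) pairs in column-name order within [start, end]."""
        row = self.rows.get(row_key, {})
        return [
            (name, row[name][0])
            for name in sorted(row)
            if name >= start and (end is None or name <= end)
        ]


keyspace = {"Students": ColumnFamily()}   # keyspace -> column families
cf = keyspace["Students"]
cf.insert("s1", "name", "Asha", timestamp=1)
cf.insert("s1", "city", "Pune", timestamp=1)
cf.insert("s1", "name", "Asha K", timestamp=2)   # newer write replaces the value
cf.insert("s1", "name", "stale", timestamp=1)    # older write is ignored

whole_row = cf.get_slice("s1")
from_d = cf.get_slice("s1", start="d")
```

Because columns come back ordered by name, a range of columns within a row can be read as a contiguous slice.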
Partitioning : Mostly Like Dynamo
Consistent hashing with an order-preserving hash function
Uses the Chord approach to load balancing (Dynamo uses virtual nodes)

Cassandra
Replication
Coordinator nodes and preference list as in Dynamo
Datacenter aware, rack aware, rack unaware
Rack aware uses ZooKeeper
Membership based on Scuttlebutt, an anti-entropy gossip protocol

Cassandra
Failure Detection
Modified version of Accrual failure detection
Failure Handling
Same as hinted handoff in Dynamo
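
The idea behind accrual failure detection is to output a suspicion level (phi) instead of a yes/no verdict. The sketch below assumes heartbeat gaps are exponentially distributed, so that phi has a closed form; the class name and the threshold of 8 are illustrative assumptions, not a description of Cassandra's actual code.

```python
import math


class AccrualDetector:
    def __init__(self):
        self.intervals = []          # observed gaps between heartbeats
        self.last_heartbeat = None

    def heartbeat(self, now: float) -> None:
        if self.last_heartbeat is not None:
            self.intervals.append(now - self.last_heartbeat)
        self.last_heartbeat = now

    def phi(self, now: float) -> float:
        """Suspicion level: -log10 of the chance that a heartbeat is merely late."""
        if not self.intervals:
            return 0.0               # nothing to compare against yet
        mean = sum(self.intervals) / len(self.intervals)
        elapsed = now - self.last_heartbeat
        # With exponential gaps, P(gap > elapsed) = exp(-elapsed / mean),
        # so -log10 of that probability is elapsed / (mean * ln 10).
        return elapsed / (mean * math.log(10))


detector = AccrualDetector()
for t in (0.0, 1.0, 2.0, 3.0):       # heartbeats arrive one second apart
    detector.heartbeat(t)

slightly_late = detector.phi(4.0)    # one second of silence
long_silence = detector.phi(30.0)    # 27 seconds of silence
suspected = long_silence > 8         # assumed threshold
```

The application chooses the threshold: a low one detects failures quickly but misjudges slow nodes, a high one is more conservative.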

Cassandra
Write
Write to the commit log, followed by an update to the memtable
Dedicated disk for the commit log (makes writes sequential)
No seeks, always sequential, so blazing fast
Atomic within a column family
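
The write path can be sketched as an append to a log file followed by an in-memory update. This is a minimal illustration, assuming a JSON-lines log in a temporary directory; `WriteNode` and `recover` are hypothetical names.

```python
import json
import os
import tempfile


class WriteNode:
    def __init__(self, log_path):
        self.log_path = log_path
        self.memtable = {}           # row key -> {column: value}

    def write(self, key, column, value):
        # 1. Append to the commit log: a sequential write, no seek needed.
        with open(self.log_path, "a", encoding="utf-8") as log:
            log.write(json.dumps([key, column, value]) + "\n")
        # 2. Only then update the in-memory table.
        self.memtable.setdefault(key, {})[column] = value

    @classmethod
    def recover(cls, log_path):
        """Rebuild the memtable after a crash by replaying the commit log."""
        node = cls(log_path)
        with open(log_path, encoding="utf-8") as log:
            for line in log:
                key, column, value = json.loads(line)
                node.memtable.setdefault(key, {})[column] = value
        return node


path = os.path.join(tempfile.mkdtemp(), "commit.log")
node = WriteNode(path)
node.write("s1", "name", "Asha")
node.write("s1", "score", 91)

# Simulate a crash: the memtable is gone, the log is still on disk.
recovered = WriteNode.recover(path)
```

Since the log is only ever appended to, a write costs one sequential disk operation plus a memory update, which is why the slide calls it fast.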

Cassandra
Read
Similar to Dynamo in choosing which nodes serve the read
Similar to Bigtable at the storage level

Thanks!!!
Due regards to Reddy Raja for this invite.
