Module 3

Uploaded by

nehal1103sharma

NoSQL

Module 3
SQL Databases
RDBMS Database
Era of Distributed Computing
But...
❑ Relational databases were not built for distributed applications.
Because...
❑ Joins are expensive
❑ Hard to scale horizontally
❑ Impedance mismatch occurs
❑ Expensive (product cost, hardware, maintenance)
And...
It’s weak in:
❑ Speed (performance)
❑ High availability
❑ Partition tolerance
New Trends…
Characteristics Required – Use Cases
• Massive write performance.
• Fast key-value lookups.
• Flexible schema and data types.
• No single point of failure.
• Fast prototyping and development.
• Out-of-the-box scalability.
• Easy maintenance.
Performance of RDBMS
What went wrong with RDBMS?
• Nothing. One size fits all? Not really.
• Impedance mismatch.
• Object-relational mapping doesn't work quite well.
• Rigid schema design.
• Harder to scale.
• Replication.
• Joins across multiple nodes? Hard.
• How does an RDBMS handle data growth? Hard.
• Need for a DBA.
Introduction to NoSQL

NoSQL stands for Not Only SQL
What is NoSQL?
It’s more than rows in tables
It’s free of joins
It’s schema-free
It works on many processors
It uses shared-nothing commodity computers
It supports linear scalability
It’s innovative
What NoSQL is NOT?
It’s not about the SQL language
It’s not only open source
It’s not only big data
It’s not about cloud computing
It’s not about a clever use of RAM and SSD
It’s not an elite group of products
NoSQL Business Drivers
Volume
Velocity
Variability
Agility
• The most complex part of building applications using RDBMSs is the process of putting data into and getting data out of the database. If your data has nested and repeated subgroups of data structures, you need to include an object-relational mapping layer.
ACID Semantics
• Atomicity: All or nothing.
• Consistency: Consistent state of data and transactions.
• Isolation: Transactions are isolated from each other.
• Durability: When the transaction is committed, the state will be durable.
Any data store can achieve Atomicity, Isolation and Durability but do you always need consistency? No.
By giving up ACID properties, one can achieve higher performance and scalability.
Brewer’s CAP Theorem
A distributed system can support only two of the following characteristics:
• Consistency
• Availability
• Partition tolerance
• Proven by Seth Gilbert and Nancy Lynch (MIT).
• http://www.cs.berkeley.edu/~brewer/cs262b-2004/PODC-keynote.pdf
Consistency
• Consistency: Clients should read the same data. There are many levels of consistency.
  – Strict Consistency – RDBMS.
  – Tunable Consistency – Cassandra.
  – Eventual Consistency – Amazon Dynamo.
• Client perceives that a set of operations has occurred all at once – Pritchett
• More like Atomic in ACID transaction properties

14 August 2024
Availability
• Availability: Data to be available.
• Node failures do not prevent survivors from continuing to operate – Wikipedia
• Every operation must terminate in an intended response – Pritchett
Partition Tolerance
• Partition Tolerance: Data to be partitioned across network segments due to network failures.
• The system continues to operate despite arbitrary message loss – Wikipedia
• Operations will complete, even if individual components are unavailable – Pritchett
A Clash of cultures
➢ ACID:
• Strong consistency.
• Less availability.
• Pessimistic concurrency.
• Complex.
➢ BASE:
• Availability is the most important thing. Willing to sacrifice for this (CAP).
• Weaker consistency (Eventual).
• Best effort.
• Simple and fast.
• Optimistic.
Why NoSQL?

NoSQL stands for Not Only SQL
NoSQL Databases
▪ A new class of databases emerged, which mainly follow the BASE properties
  ▪ These were dubbed NoSQL databases
  ▪ E.g., Amazon’s Dynamo and Google’s Bigtable
▪ Main characteristics of NoSQL databases include:
  ▪ No strict schema requirements
  ▪ No strict adherence to ACID properties
  ▪ Consistency is traded in favor of Availability
NoSQL Data Architecture Patterns
• Key-Value Store – Stores data as values in a hash table of keys
• Column Store – Each storage block contains data from only one column
• Document Store – Stores documents made up of tagged elements
• Graph Databases – Stores data as nodes and relationships that can be traversed
Key-Value Stores
▪ Keys are mapped to (possibly) more complex values (e.g., lists)
▪ Keys can be stored in a hash table and can be distributed easily
▪ Such stores typically support regular CRUD (create, read, update, and delete) operations
  ▪ That is, no joins and aggregate functions
▪ E.g., Amazon DynamoDB and Apache Cassandra

Storing RDBMS data as Key-Value pairs
Employee Table (Name – employees)
Format for Key-value representation:
$table_name:$primary_key_value:$attribute_name = $value
employee:$employee_id:$attribute_name = $value
Key-Value form representation of Employee Table:
employee:1:first_name = "John"
employee:1:last_name = "Doe"
employee:1:address = "New York"
employee:2:first_name = "Benjamin"
employee:2:last_name = "Button"
employee:2:address = "Chicago"
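The flattening above can be sketched in Java. This is a minimal illustration and not part of the original slides: the class name KeyValueEncodingSketch, the method toKeyValuePairs, and the use of an in-memory TreeMap instead of a real key-value store are all assumptions made for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: flattens one relational row into key-value pairs
// using the key format $table_name:$primary_key_value:$attribute_name.
class KeyValueEncodingSketch {

    static Map<String, String> toKeyValuePairs(String table, int primaryKey,
                                               Map<String, String> row) {
        // A TreeMap keeps keys sorted, so all attributes of one row sit together.
        Map<String, String> pairs = new TreeMap<>();
        for (Map.Entry<String, String> column : row.entrySet()) {
            String key = table + ":" + primaryKey + ":" + column.getKey();
            pairs.put(key, column.getValue());
        }
        return pairs;
    }

    public static void main(String[] args) {
        Map<String, String> john = new LinkedHashMap<>();
        john.put("first_name", "John");
        john.put("last_name", "Doe");
        john.put("address", "New York");

        // Prints one line per attribute in the form key = "value".
        toKeyValuePairs("employee", 1, john)
            .forEach((k, v) -> System.out.println(k + " = \"" + v + "\""));
    }
}
```

Because the primary key is part of every key, all attributes of one employee share a common prefix, which is what makes the prefix scan in the retrieval example possible.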
Retrieving data from Key-value store
• Consider the SQL query:
SELECT employee_id FROM employees WHERE address = 'New York';
• In a key-value store, the equivalent is a method call: getEmployeeIDList("address", "New York");
• You have to implement this Java function yourself to achieve this functionality:
    // Imports assume the iq80 LevelDB Java port.
    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.List;
    import org.iq80.leveldb.DBIterator;
    import static org.iq80.leveldb.impl.Iq80DBFactory.asString;
    import static org.iq80.leveldb.impl.Iq80DBFactory.bytes;

    // levelDBStore is an already opened LevelDB handle held by the enclosing class.
    public List<Integer> getEmployeeIDList(String attribute, String value) throws IOException {
        List<Integer> employeeIDs = new ArrayList<>();
        DBIterator keyIterator = levelDBStore.iterator();
        try {
            // Move the iterator to the keys starting with "employee".
            keyIterator.seek(bytes("employee"));
            while (keyIterator.hasNext()) {
                // Key arrangement: employee:$employee_id:$attribute_name
                String key = asString(keyIterator.peekNext().getKey());
                String[] keySplit = key.split(":");
                // Breaking condition: prefix is not "employee" (checked before parsing the id).
                if (!keySplit[0].equals("employee")) break;
                int employeeID = Integer.parseInt(keySplit[1]);
                if (keySplit[keySplit.length - 1].equals(attribute)) { // check the attribute
                    String storedValue = asString(levelDBStore.get(bytes(key)));
                    if (storedValue.equals(value)) { // check the value
                        employeeIDs.add(employeeID); // both checks passed
                    }
                }
                keyIterator.next();
            }
        } finally {
            keyIterator.close();
        }
        return employeeIDs; // resulting employee ids
    }
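To experiment with the scan logic without a LevelDB installation, the same idea can be sketched over a sorted in-memory map. This is only an approximation under stated assumptions: java.util.TreeMap stands in for the store, and the class name InMemoryEmployeeScan and its put method are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class InMemoryEmployeeScan {

    // Sorted map standing in for the key-value store (an assumption of this sketch).
    private final TreeMap<String, String> store = new TreeMap<>();

    void put(String key, String value) {
        store.put(key, value);
    }

    List<Integer> getEmployeeIDList(String attribute, String value) {
        List<Integer> employeeIDs = new ArrayList<>();
        // tailMap starts the scan at the first key >= "employee:".
        for (Map.Entry<String, String> entry : store.tailMap("employee:").entrySet()) {
            String[] keySplit = entry.getKey().split(":");
            if (!keySplit[0].equals("employee")) {
                break; // left the "employee" key range
            }
            if (keySplit[keySplit.length - 1].equals(attribute)
                    && entry.getValue().equals(value)) {
                employeeIDs.add(Integer.parseInt(keySplit[1]));
            }
        }
        return employeeIDs;
    }

    public static void main(String[] args) {
        InMemoryEmployeeScan db = new InMemoryEmployeeScan();
        db.put("employee:1:address", "New York");
        db.put("employee:2:address", "Chicago");
        System.out.println(db.getEmployeeIDList("address", "New York")); // prints [1]
    }
}
```

Note that the scan still visits every employee key: without a secondary index, a lookup by attribute value in a key-value store costs a full pass over the key range.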
Columnar Databases
▪ Columnar databases are a hybrid of RDBMSs and Key-Value stores
▪ Values are stored in groups of zero or more columns, but in Column-Order (as opposed to Row-Order)
▪ Values are queried by matching keys
▪ E.g., HBase and Vertica
Figure: the records (Alice, 3, 25), (Bob, 4, 19) and (Carol, 0, 45) stored in three layouts:
Row-Order: Alice 3 25 | Bob 4 19 | Carol 0 45
Columnar (or Column-Order): Alice Bob Carol | 3 4 0 | 25 19 45
Columnar with Locality Groups: Column A = Group A: Alice Bob Carol | Column Family {B, C}: 3 25 | 4 19 | 0 45
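The difference between the row-order and column-order layouts in the figure can be sketched in a few lines of Java. This is an illustrative sketch only: the class name StorageLayoutSketch and the flat Object[] used to represent a storage layout are assumptions, and real columnar stores add compression and block structure that is not shown.

```java
import java.util.Arrays;

// Hypothetical sketch: the same three records laid out row by row and column by column.
class StorageLayoutSketch {

    // Records from the figure, with columns (A, B, C).
    static final Object[][] RECORDS = {
        {"Alice", 3, 25},
        {"Bob", 4, 19},
        {"Carol", 0, 45},
    };

    // Row-order: all fields of record 1, then all fields of record 2, and so on.
    static Object[] rowOrder(Object[][] records) {
        int columns = records[0].length;
        Object[] out = new Object[records.length * columns];
        int i = 0;
        for (Object[] record : records) {
            for (int c = 0; c < columns; c++) {
                out[i++] = record[c];
            }
        }
        return out;
    }

    // Column-order: all values of column A, then column B, then column C.
    static Object[] columnOrder(Object[][] records) {
        int columns = records[0].length;
        Object[] out = new Object[records.length * columns];
        int i = 0;
        for (int c = 0; c < columns; c++) {
            for (Object[] record : records) {
                out[i++] = record[c];
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(rowOrder(RECORDS)));
        // [Alice, 3, 25, Bob, 4, 19, Carol, 0, 45]
        System.out.println(Arrays.toString(columnOrder(RECORDS)));
        // [Alice, Bob, Carol, 3, 4, 0, 25, 19, 45]
    }
}
```

In the column-order output the values of one column are contiguous, so a query that reads only column B touches one run of the layout instead of every record.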
Row versus Columnar Databases
• Representing RDBMS data in HBase or Cassandra
Column Family
Query on Columnar DB – Example HBase
• To create a new table: specify the table name and the ColumnFamily name
  • create 'test', 'cf'
• list and describe are used to obtain information about a table and its description, respectively
  • list 'test'
  • describe 'test'
• To insert data into a table: use the put command
  • put 'test', 'row1', 'cf:a', 'value1'
  • put 'test', 'row2', 'cf:b', 'value2'
• Retrieval of data using the get command
  • get 'test', 'row1'
  • Output:
    COLUMN CELL
    cf:a timestamp=6782168192, value=value1
• Other shell commands: disable, enable, drop
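The shell session above can be mirrored with a small in-memory model of the table: row key, then column (family:qualifier), then value. This is a sketch of the data model only, assuming a nested sorted map is an adequate stand-in. It does not use the HBase client API, it omits timestamps and cell versions, and the class name ColumnFamilyTableSketch is hypothetical.

```java
import java.util.Collections;
import java.util.Map;
import java.util.TreeMap;

class ColumnFamilyTableSketch {

    private final String columnFamily;
    // row key -> ("family:qualifier" -> value), both levels kept sorted.
    private final Map<String, Map<String, String>> rows = new TreeMap<>();

    ColumnFamilyTableSketch(String columnFamily) {
        this.columnFamily = columnFamily;
    }

    // Mirrors: put 'test', 'row1', 'cf:a', 'value1'
    void put(String rowKey, String column, String value) {
        if (!column.startsWith(columnFamily + ":")) {
            throw new IllegalArgumentException("Unknown column family in " + column);
        }
        rows.computeIfAbsent(rowKey, k -> new TreeMap<>()).put(column, value);
    }

    // Mirrors: get 'test', 'row1'
    Map<String, String> get(String rowKey) {
        return rows.getOrDefault(rowKey, Collections.emptyMap());
    }

    public static void main(String[] args) {
        ColumnFamilyTableSketch test = new ColumnFamilyTableSketch("cf");
        test.put("row1", "cf:a", "value1");
        test.put("row2", "cf:b", "value2");
        System.out.println(test.get("row1")); // prints {cf:a=value1}
    }
}
```

The column family is fixed when the table is created, while qualifiers such as a and b can differ from row to row, which is why row1 and row2 above hold different columns.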
Document Stores
▪ Documents are stored in some standard format or encoding (e.g., XML, JSON, PDF or Office Documents)
  ▪ These are typically referred to as Binary Large Objects (BLOBs)
▪ Documents can be indexed
  ▪ This allows document stores to outperform traditional file systems
▪ E.g., MongoDB and CouchDB (both can be queried using MapReduce)
Sample JSON document
• Here you can see that the JSON document holds primitive types as values as well as other JSON objects and array types.
• JSON documents allow you to create a hierarchy of embedded JSON objects to an unlimited level.
• It's completely up to the user what shape they want to give to the data stored in a NoSQL document database.
Document Database as a Collection
• Data is structured in the form of documents and collections.
• A document can be a PDF, Microsoft Word doc, XML or JSON file.
• A document contains key-value pairs. Each document does not have to be in the same structure as other documents.
• Simply add more documents without having to change the structure of the entire database.
• Documents are grouped into collections, which serve a similar purpose to a relational table.
• Separation of collections by entity (orders and customer profiles).
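The idea that documents in one collection need not share a structure can be sketched with plain Java collections. This is a minimal illustration and not a real document database: the class name DocumentCollectionSketch, its insert and find methods, and the sample customer data are all hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class DocumentCollectionSketch {

    // A collection is a list of documents; each document is a map of key-value pairs.
    private final List<Map<String, Object>> documents = new ArrayList<>();

    void insert(Map<String, Object> document) {
        documents.add(document);
    }

    // Returns documents whose field equals the given value;
    // documents that lack the field are simply skipped.
    List<Map<String, Object>> find(String field, Object value) {
        List<Map<String, Object>> matches = new ArrayList<>();
        for (Map<String, Object> document : documents) {
            if (value.equals(document.get(field))) {
                matches.add(document);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        DocumentCollectionSketch customers = new DocumentCollectionSketch();

        Map<String, Object> alice = new LinkedHashMap<>();
        alice.put("name", "Alice");
        alice.put("city", "Chicago");
        customers.insert(alice);

        // A second document with a different shape: a nested array and no "city".
        Map<String, Object> bob = new LinkedHashMap<>();
        bob.put("name", "Bob");
        bob.put("phones", List.of("555-0100", "555-0101"));
        customers.insert(bob);

        System.out.println(customers.find("city", "Chicago"));
        // prints [{name=Alice, city=Chicago}]
    }
}
```

Adding the second document required no schema change, which is the point made in the bullets above. The price is that queries must tolerate missing fields.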
Query with Document Store – Example MongoDB
Graph Databases
▪ Data are represented as vertices and edges
▪ Some don’t consider it under NoSQL
▪ E.g., Neo4j and VertexDB
▪ The Resource Description Framework (RDF) of the WWW uses a triple form (subject, predicate, object; SPO), which is a type of graph store.
▪ SPARQL is a semantic query language for retrieving/manipulating data in RDF format.
▪ Graph databases are powerful for graph-like queries (e.g., find the shortest path between two elements)
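The shortest-path query mentioned above can be sketched with breadth-first search over an adjacency list. This is a generic sketch and not how any particular graph database implements traversal: the class name GraphSketch, the vertex names A to E, and the choice to treat edges as undirected are assumptions.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;
import java.util.Queue;

class GraphSketch {

    // Adjacency list: vertex -> neighbours (edges treated as undirected here).
    private final Map<String, List<String>> adjacency = new HashMap<>();

    void addEdge(String a, String b) {
        adjacency.computeIfAbsent(a, k -> new ArrayList<>()).add(b);
        adjacency.computeIfAbsent(b, k -> new ArrayList<>()).add(a);
    }

    // Breadth-first search; returns the vertices on a shortest path, or an empty list.
    List<String> shortestPath(String start, String goal) {
        Map<String, String> previous = new HashMap<>();
        Queue<String> queue = new ArrayDeque<>();
        previous.put(start, null);
        queue.add(start);
        while (!queue.isEmpty()) {
            String current = queue.remove();
            if (current.equals(goal)) {
                // Walk the predecessor links back to the start.
                LinkedList<String> path = new LinkedList<>();
                for (String v = goal; v != null; v = previous.get(v)) {
                    path.addFirst(v);
                }
                return path;
            }
            for (String next : adjacency.getOrDefault(current, Collections.emptyList())) {
                if (!previous.containsKey(next)) {
                    previous.put(next, current);
                    queue.add(next);
                }
            }
        }
        return Collections.emptyList();
    }

    public static void main(String[] args) {
        GraphSketch g = new GraphSketch();
        g.addEdge("A", "B");
        g.addEdge("B", "C");
        g.addEdge("C", "D");
        g.addEdge("A", "E");
        g.addEdge("E", "D");
        System.out.println(g.shortestPath("A", "D")); // prints [A, E, D]
    }
}
```

Expressing the same query over relational tables would take a self-join per hop, which is why traversals of this kind are the usual argument for a graph store.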
Graph Model Components
Figure: three nodes with their properties:
Node 1 – Id: 1, Name: Alice, Age: 27
Node 2 – Id: 2, Name: Bob, Age: 34
Node 3 – Id: 3, Name: Chess, Type: Group
Graph Store - Example
RDF Triple and SPARQL query - Example
Figure: graph of the triples below, with resource nodes id1, id2, id7 and id11 and literal values Slumdog Millionaire, 2008, Danny Boyle, Latika and Freida Pinto.
➢ RDF Triples
(id1, hasTitle, "Slumdog Millionaire"),
(id1, releaseYear, "2008"),
(id1, directedBy, id7),
(id7, hasName, "Danny Boyle"),
(id1, hasCasting, id2),
(id2, roleName, "Latika"),
(id2, actor, id11),
(id11, hasName, "Freida Pinto"), …
➢ SPARQL query
Select ?title Where { ?p <hasTitle> ?title.
  ?p <hasCasting> ?s. ?s <actor> ?c.
  ?c <hasName> "Freida Pinto" }
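How the query pattern is matched against the triples can be sketched with nested lookups over an in-memory list. This is a hand-written evaluation of this one query, not a SPARQL engine: the class name TripleStoreSketch and its helper methods are hypothetical, and only the triples relevant to the query are loaded.

```java
import java.util.ArrayList;
import java.util.List;

class TripleStoreSketch {

    // Each entry is one (subject, predicate, object) triple from the example.
    static final List<String[]> TRIPLES = List.of(
        new String[] {"id1", "hasTitle", "Slumdog Millionaire"},
        new String[] {"id1", "directedBy", "id7"},
        new String[] {"id7", "hasName", "Danny Boyle"},
        new String[] {"id1", "hasCasting", "id2"},
        new String[] {"id2", "roleName", "Latika"},
        new String[] {"id2", "actor", "id11"},
        new String[] {"id11", "hasName", "Freida Pinto"}
    );

    // Objects o such that (subject, predicate, o) is in the store.
    static List<String> objects(String subject, String predicate) {
        List<String> result = new ArrayList<>();
        for (String[] t : TRIPLES) {
            if (t[0].equals(subject) && t[1].equals(predicate)) {
                result.add(t[2]);
            }
        }
        return result;
    }

    // Subjects s such that (s, predicate, object) is in the store.
    static List<String> subjects(String predicate, String object) {
        List<String> result = new ArrayList<>();
        for (String[] t : TRIPLES) {
            if (t[1].equals(predicate) && t[2].equals(object)) {
                result.add(t[0]);
            }
        }
        return result;
    }

    // Binds the query variables starting from the literal:
    // ?c hasName name, ?s actor ?c, ?p hasCasting ?s, ?p hasTitle ?title.
    static List<String> titlesWithActor(String actorName) {
        List<String> titles = new ArrayList<>();
        for (String c : subjects("hasName", actorName)) {
            for (String s : subjects("actor", c)) {
                for (String p : subjects("hasCasting", s)) {
                    titles.addAll(objects(p, "hasTitle"));
                }
            }
        }
        return titles;
    }

    public static void main(String[] args) {
        System.out.println(titlesWithActor("Freida Pinto")); // prints [Slumdog Millionaire]
    }
}
```

Each nested loop corresponds to one triple pattern in the Where clause, and the shared variables (?c, ?s, ?p) are what chain the patterns together.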
Primary and Simplest Version to Structured Version
Application Areas of NoSQL
