NoSQL Database

The document discusses NoSQL databases and MongoDB. It introduces NoSQL and its categories including key-value stores, document stores, wide-column stores and graph stores. It then focuses on MongoDB as an example of a document store, explaining its data model using JSON-like documents and collections.

Uploaded by

chloegao

Lecture 9

NoSQL Database

IIMT 3601 Database Management


HKU Business School
Instructor: Dr. Shengjun Mao
Agenda
• Introduction to NoSQL
• Categories of NoSQL Databases
• Example: MongoDB
Summary of RDBMS
• RDBMSs have been around for decades
• Data stored in tables
• Schema-based, i.e., structured tables
• Each row (data item) in a table has a primary key that is unique within that table
• Relationships between entities are realized by primary-foreign keys
• Queried using SQL, sometimes also called SQL databases
Popular RDBMS
Requirements of Today’s Workloads
• Speed and volume
• Large and unstructured data
• Incremental scalability
• Adaptation to changes in data structure
• Web-based applications in particular cause spikes in workload
• A rapid reduction in storage cost
• Hooking an RDBMS up to web-based applications becomes troublesome
NoSQL Data Model
• NoSQL: short for “Not Only SQL”
• A category of data storage and retrieval technologies that are not based on the relational model
• NoSQL DBMSs provide opportunities for “schema on read” instead of “schema on write”
• Schema on write: the data model is predefined, before any data is stored
• Schema on read: how the data is organized for reporting and analysis is determined at the time the data is used
• NoSQL DBMSs allow “scaling out” instead of “scaling up”
• Scale up = grow capacity by replacing machines with more powerful ones
• Scale out = grow cluster capacity incrementally by adding more machines
• Most come from open-source communities
• Designed for big data
NoSQL Data Model
• Schema on write vs Schema on read
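The contrast can be sketched in plain JavaScript. This is only an illustration under stated assumptions: `ORDER_SCHEMA`, `insertWithSchema`, and `totalQuantity` are hypothetical names, and no particular database product is implied.

```javascript
// Schema on write: the data model is fixed before any data is stored,
// so a record that does not fit the schema is rejected at insert time.
const ORDER_SCHEMA = { item: "string", qty: "number" }; // hypothetical schema

function insertWithSchema(table, record) {
  const fields = Object.keys(ORDER_SCHEMA);
  const fitsSchema =
    Object.keys(record).length === fields.length &&
    fields.every((f) => typeof record[f] === ORDER_SCHEMA[f]);
  if (!fitsSchema) {
    throw new Error("record does not match the predefined schema");
  }
  table.push(record);
}

// Schema on read: anything can be stored; the structure needed for a
// report is decided (and checked) only when the data is used.
function totalQuantity(store) {
  return store
    .filter((doc) => typeof doc.qty === "number") // interpret at read time
    .reduce((sum, doc) => sum + doc.qty, 0);
}

const table = [];
insertWithSchema(table, { item: "journal", qty: 25 });

const store = [];
store.push({ item: "journal", qty: 25 });
store.push({ item: "paper", qty: 100, size: { h: 8.5, w: 11 } }); // extra field is accepted
store.push({ note: "no quantity recorded" });                     // missing field is accepted

console.log(totalQuantity(store));
```

With schema on write, the cost of a structural change is paid when the schema is altered; with schema on read, it is paid in every piece of code that reads the data.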
NoSQL Data Model
• Key features:
• Non-relational
• Do not require schema
• Horizontally scalable
• Data are replicated to multiple nodes and can be partitioned
• Failed nodes are easily replaced
• No single point of failure
• Cheap, easy to implement (open-source)
• Massive write performance
• Fast key-value access
• …
Benefits of NoSQL
• Elastic Scaling
• RDBMSs scale up: a bigger load needs a bigger server
• NoSQL databases scale out: data is distributed across multiple hosts seamlessly
• DBA Specialists
• An RDBMS requires highly trained experts to monitor the database
• NoSQL databases require less management, thanks to automatic repair and simpler data models
• Big Data
• Data volumes have grown hugely; RDBMS capacity and data-volume constraints are at their limits
• NoSQL databases are designed for big data
Benefits of NoSQL
• Flexible data models
• Schema changes in an RDBMS have to be carefully managed
• NoSQL databases are more relaxed about the structure of data
• Database schema changes do not have to be managed as one complicated change unit
• Applications are already written to handle an amorphous schema
• Economics
• RDBMSs rely on expensive proprietary servers to manage data
• NoSQL databases use clusters of cheap commodity servers to manage the data and transaction volumes
• Cost per gigabyte or per transaction/second can be lower for NoSQL than for an RDBMS
Drawbacks of NoSQL
• Support
• RDBMS vendors provide a high level of support to clients
• Stellar reputation
• NoSQL databases are mostly open-source projects supported by startups
• Reputation not yet established
• Maturity
• RDBMSs are mature products: stable and dependable
• Mature also means old: no longer cutting-edge or interesting
• NoSQL products are still implementing their basic feature sets
Drawbacks of NoSQL
• Administration
• RDBMS administrator is a well-defined role
• NoSQL’s goal: no administrator necessary; in practice, NoSQL still requires effort to maintain
• Lack of Expertise
• There is a whole workforce of trained and seasoned RDBMS developers
• Developers are still being recruited to the NoSQL camp
• Analytics and Business Intelligence
• RDBMSs are designed to support decision-making
• NoSQL databases are designed to meet the needs of Web 2.0 applications, not for ad hoc querying of the data
• Tools are being developed to address this need
• More flexible to include new and unstructured data
NoSQL Databases

See more at https://db-engines.com/en/ranking

Who is using them?
NoSQL categories
• Key-value stores
• Examples: Redis, Amazon DynamoDB, Microsoft Azure Cosmos DB
• Document stores
• Examples: MongoDB, Amazon DynamoDB
• Wide-column stores
• Examples: Cassandra, HBase, Google Bigtable
• Graph stores
• Examples: Neo4j, Microsoft Azure Cosmos DB
Key-value Stores
• NoSQL databases generally rely on a key-value store.
• Format: key: value
• Examples:
• Business: Key → Value
• twitter.com: tweet id → information about tweet
• amazon.com: item number → information about it
• facebook.com: user id → user profile, photos, etc.
• kayak.com: flight number → information about flight, e.g., availability
• yourbank.com: account number → account balances, transaction histories
Key-value stores
• Data model: a collection of key-value pairs

Example: Redis
• A standard key-value store
• Values can be strings, lists, sets, hashes, etc.
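As a rough illustration of the key-value data model, here is a minimal in-memory sketch in JavaScript. It is not the Redis API; `KeyValueStore` and the key names are hypothetical, chosen only to echo the business examples above.

```javascript
// Minimal in-memory key-value store: every value is reached through its key.
class KeyValueStore {
  constructor() {
    this.data = new Map();
  }
  set(key, value) {
    this.data.set(key, value);
  }
  get(key) {
    // Returns null when the key is not present
    return this.data.has(key) ? this.data.get(key) : null;
  }
  del(key) {
    return this.data.delete(key); // true if the key existed
  }
}

const store = new KeyValueStore();
store.set("tweet:42", "information about the tweet");          // string value
store.set("user:1001", { name: "Ada", photos: ["a.jpg"] });    // hash-like value
store.set("flight:XY123:seats", ["12A", "12B"]);               // list-like value

console.log(store.get("tweet:42"));
console.log(store.get("user:1001").name);
console.log(store.get("missing"));
```

The store never inspects the value, which is what makes lookups by key fast and queries on the contents of values awkward.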
Document-based
• Can model more complex objects
• Data model: collection of documents

• Document: a structured set of data formatted using a standard such as JSON
• A document has its own structure
• Contents can be accessed and modified based on that structure
• The document itself is accessed via a “key”
Document-based
• Example: (MongoDB) document
{
Name: "Jaroslav",
Address: "Malostranske nám. 25, 118 00 Praha 1",
Grandchildren: { Claire: "7", Barbara: "6", Magda: "3",
Kirsten: "1", Otis: "3", Richard: "1" },
Phones: ["123-456-7890", "234-567-8963"]
}
Wide column stores
• Tables similar to those in an RDBMS, but able to handle semi-structured data
• Each row can have a different column structure
• Data model:
• Collection of Column Families
• Column family = (key, value), where value = set of related columns
• One column family can have variable numbers of columns
• Example: Cassandra
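The column-family model can be sketched with nested maps in JavaScript. This is an illustration of the data model only, not how Cassandra is programmed (Cassandra is queried through CQL); `putColumn`, `getRow`, and the sample rows are hypothetical.

```javascript
// A column family modelled as: row key -> (column name -> value).
// Rows in the same family may carry different sets of columns.
const users = new Map(); // hypothetical column family "users"

function putColumn(family, rowKey, column, value) {
  if (!family.has(rowKey)) family.set(rowKey, new Map());
  family.get(rowKey).set(column, value);
}

function getRow(family, rowKey) {
  const row = family.get(rowKey);
  return row ? Object.fromEntries(row) : null;
}

putColumn(users, "u1", "name", "Alice");
putColumn(users, "u1", "email", "alice@example.com");
putColumn(users, "u2", "name", "Bob");
putColumn(users, "u2", "phone", "123-456-7890");
putColumn(users, "u2", "city", "Hong Kong");

console.log(getRow(users, "u1")); // two columns
console.log(getRow(users, "u2")); // three columns, a different set
```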
Graph-based
• Designed for modeling connected data
• Based on graph theory (vertices and edges)
• Data model: (property graph) nodes and relationships
• Nodes have “names” (labels)
• Relationships have “names” (types), with properties
• Nodes and relationships are associated with properties, which are key-value pairs
• Collections of properties associated with each node may vary
• Example: Neo4j
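A property graph can be sketched in a few lines of JavaScript. This is a minimal sketch of the data model, not the Neo4j API; `addNode`, `addRelationship`, `neighbours`, and the sample data are hypothetical.

```javascript
// Property graph: nodes carry a label and properties; relationships carry
// a type, a direction (from -> to), and their own properties.
const nodes = new Map();
const relationships = [];

function addNode(id, label, properties = {}) {
  nodes.set(id, { id, label, properties });
}

function addRelationship(from, type, to, properties = {}) {
  relationships.push({ from, type, to, properties });
}

// Follow the outgoing relationships of a given type from one node.
function neighbours(id, type) {
  return relationships
    .filter((r) => r.from === id && r.type === type)
    .map((r) => nodes.get(r.to));
}

addNode(1, "Person", { name: "Ann", age: 30 });
addNode(2, "Person", { name: "Bob" });          // a different set of properties
addNode(3, "Company", { name: "Acme" });
addRelationship(1, "KNOWS", 2, { since: 2015 });
addRelationship(1, "WORKS_AT", 3, { role: "analyst" });

console.log(neighbours(1, "KNOWS").map((n) => n.properties.name));
console.log(neighbours(1, "WORKS_AT").map((n) => n.properties.name));
```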
Example: MongoDB
• Developed by 10gen (now MongoDB, Inc.), founded in 2007
• Document-based NoSQL database
• Hash-based, schema-less database
• No Data Definition Language
• In practice, this means you can store hashes with any keys and values that you choose
• Keys are a basic data type, but in reality they are stored as strings
• A document identifier (_id) is created for each document; this field name is reserved by the system
• Uses the BSON format
• Written in C++
• Supports APIs (drivers) in many programming languages
• JavaScript, Python, Ruby, Perl, Java, Scala, C#, C++, Haskell, Erlang

For more details, refer to https://www.mongodb.com/docs/manual/
MongoDB: Hierarchical Objects
• A MongoDB instance may have zero or more ‘databases’
• A database may have zero or more ‘collections’
• A collection may have zero or more ‘documents’
• A document may have one or more ‘fields’
[Figure: a MongoDB instance holds zero or more databases; each database holds collections, each collection holds documents, and each document holds fields]
MongoDB: Hierarchical Objects
• Document
• BSON format: binary-encoded JSON (Binary JSON), i.e., JSON-like documents
• Enclosed in a pair of curly brackets {}
• Key-value pairs are stored
• Fields are separated from each other by “,”
• An array is stored in square brackets []
• An “_id” field is required in every document
• Can have embedded documents

{
name: "travis",
salary: 30000,
designation: "Computer Scientist",
teams: [ "front-end", "database" ]
}
MongoDB: Hierarchical Objects
• Collection
• A set of documents that are intended to be stored together
• Each document in the collection may have different structures.
• The only requirement is that _id be unique within the collection

{
_id : <ObjectId2>,
username : "John Backus",
birth : ISODate("1924-12-03T05:00:00Z")
}
MongoDB Concepts to RDBMS

RDBMS         MongoDB
Database      Database
Table, View   Collection
Row           Document (BSON)
Column        Field
Index         Index
Join          Embedded document

• A collection is not strict about what it stores (schema-less)
• Hierarchy is evident in the design (embedded documents)
MongoDB Processes and Configuration
• mongod – Database instance
• Replica set consists of multiple mongod servers
• Replica set members are mirrors of each other
• One is primary
• Others are secondary
• mongos – Sharding process
• Analogous to a database router (or work allocator)
• Processes all requests
• Decides how many and which mongods should receive the query
• mongos collates the results and sends them back to the client
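The routing role described above can be sketched as follows. This is a simplified illustration, not MongoDB's implementation: the range-based partitioning on a shard key named `userId`, the shard names, and the `route` function are all assumptions made for the example.

```javascript
// Each "shard" stands in for a mongod (or replica set) holding part of the data.
// Assumption: documents are partitioned by ranges of a shard key called userId.
const shards = [
  { name: "shardA", min: 0, max: 100,
    docs: [{ userId: 7, city: "HK" }, { userId: 42, city: "NY" }] },
  { name: "shardB", min: 100, max: 200,
    docs: [{ userId: 150, city: "HK" }] },
];

function matches(doc, query) {
  return Object.entries(query).every(([field, value]) => doc[field] === value);
}

// The router plays the role of mongos: choose the target shards, forward
// the query, then collate the partial results into a single answer.
function route(query) {
  const targets =
    typeof query.userId === "number"
      ? shards.filter((s) => query.userId >= s.min && query.userId < s.max) // targeted
      : shards;                                                             // broadcast
  const results = targets.flatMap((s) => s.docs.filter((d) => matches(d, query)));
  return { contacted: targets.map((s) => s.name), results };
}

console.log(route({ userId: 150 })); // only one shard needs to be asked
console.log(route({ city: "HK" }));  // no shard key in the query: ask every shard
```

A query that includes the shard key can be sent to a single shard, while a query without it has to be sent to all of them and merged.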

MongoDB Processes and Configuration
• mongosh – MongoDB Shell, an interactive shell (a client)
• Fully functional JavaScript and Node.js environment for interacting with a MongoDB deployment

Querying MongoDB
• Basic CRUD Operations
• Create
• db.collection.insertOne(<document>)
• db.collection.insertMany(<documents>)
• Read
• db.collection.find(<query>, <projection>)
• Update
• db.collection.updateOne(<query>, <update>, <options>)
• db.collection.updateMany(<query>, <update>, <options>)
• Delete
• db.collection.deleteMany(<query>)
Create Operations
• db.collection specifies the collection or the ‘table’ to store the document
• db.collection.insertOne( <document> )
• db.collection.insertMany( [ <document1>, <document2>, ... ] )
• Omit the _id field to have MongoDB generate a unique key
• Example:
• db.example.insertOne( {type: "screwdriver", quantity: 15 })
• db.example.insertOne({_id:10, type: "hammer", quantity: 1 })
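The _id behaviour can be imitated in plain JavaScript without a running server. This is a sketch under stated assumptions: the in-memory `example` map and the counter-based identifier are stand-ins, and the generated value is not a MongoDB ObjectId.

```javascript
// Stand-in for a collection: documents keyed by their _id.
const example = new Map();
let nextId = 1; // assumption: a simple counter instead of a real ObjectId

function insertOne(collection, doc) {
  // If the caller omits _id, generate a unique one.
  const _id = "_id" in doc ? doc._id : "auto-" + nextId++;
  if (collection.has(_id)) {
    throw new Error("duplicate _id: " + _id); // _id must be unique in a collection
  }
  collection.set(_id, { _id, ...doc });
  return { acknowledged: true, insertedId: _id };
}

console.log(insertOne(example, { type: "screwdriver", quantity: 15 })); // generated _id
console.log(insertOne(example, { _id: 10, type: "hammer", quantity: 1 })); // caller's _id
```

Inserting a second document with `_id: 10` would fail here, which mirrors the uniqueness requirement on _id.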

Create Operations
• Create and insert data in collection “inventory”
db.inventory.insertMany([
{ item: "journal", qty: 25, size: { h: 14, w: 21, uom: "cm" }, status: "A" },
{ item: "notebook", qty: 50, size: { h: 8.5, w: 11, uom: "in" }, status: "A" },
{ item: "paper", qty: 100, size: { h: 8.5, w: 11, uom: "in" }, status: "D" },
{ item: "planner", qty: 75, size: { h: 22.85, w: 30, uom: "cm" }, status: "D" },
{ item: "postcard", qty: 45, size: { h: 10, w: 15.25, uom: "cm" }, status: "A" }
]);
Read Operations
• db.collection.find( <query doc>, <projection doc> )
• Provides functionality similar to the SELECT command
• <query> corresponds to the WHERE condition; <projection> corresponds to the fields in the SELECT list
• Empty doc parameters: db.collection.find({}) returns all data (SELECT * FROM collection)
• Query document
• Equality condition example { <field1>: <value1>, ... }
• Example: SELECT * FROM inventory WHERE status = 'D'
• db.inventory.find( { status: "D" } )
Read Operations
• db.collection.find( <query doc>, <projection doc> )
• Query Operators
• Specify conditions: {<field1>:{<operator1>: <value1> }, ... }
• Example: SELECT * FROM inventory WHERE status IN ("A", "D")
• db.inventory.find( { status: { $in: [ "A", "D" ] } } )
Read Operations
• db.collection.find( <query doc>, <projection doc> )
• Query Operators
• Specify AND Conditions {<field1>:<value1>, <field2>:<value2>... }
• Example: SELECT * FROM inventory WHERE status = "A" AND qty < 30
• db.inventory.find( { status: "A", qty: { $lt: 30 } } )
Read Operations
• db.collection.find( <query doc>, <projection doc> )
• Query Operators
• Specify OR Conditions { $or: [ <condition1>, <condition2>, ... ] }
• Example: SELECT * FROM inventory WHERE status = "A" OR qty < 30
• db.inventory.find( { $or: [ { status: "A" }, { qty: { $lt: 30 } } ] } )
Read Operations
• db.collection.find( <query doc>, <projection doc> )
• Project Fields: <projection doc>
• <field>: 1 to include a field in the returned documents
• <field>: 0 to exclude a field in the returned documents
• Example: SELECT item, qty, size FROM inventory WHERE status = 'D';
• Note: _id is returned by default. To exclude it from the results, set it to 0 in the projection doc.
• db.inventory.find( { status: "D" }, { item: 1, qty: 1, size: 1 } );
Read Operations
• db.collection.find( <query doc>, <projection doc> )
• Chain .sort(<order doc>) onto find() to sort the result
• <field>: 1 sorts the result in ascending order
• <field>: -1 sorts the result in descending order
• Example: SELECT _id, item, qty, size FROM inventory WHERE status = 'D' ORDER BY qty;
• db.inventory.find( { status: "D" }, { item: 1, qty: 1, size: 1 } ).sort( { qty: 1 } );
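What projection and sorting do to the returned documents can be imitated on a plain array. This is a sketch only: `project` and `sortBy` are hypothetical helpers, the numeric _id values are made up for the example, and only inclusion-style projections and single-field numeric sorts are handled.

```javascript
const inventory = [
  { _id: 1, item: "paper", qty: 100, size: { h: 8.5, w: 11, uom: "in" }, status: "D" },
  { _id: 2, item: "planner", qty: 75, size: { h: 22.85, w: 30, uom: "cm" }, status: "D" },
  { _id: 3, item: "journal", qty: 25, size: { h: 14, w: 21, uom: "cm" }, status: "A" },
];

// Inclusion-style projection: keep fields marked 1; keep _id unless it is set to 0.
function project(doc, projection) {
  const out = {};
  if (projection._id !== 0 && "_id" in doc) out._id = doc._id;
  for (const [field, flag] of Object.entries(projection)) {
    if (flag === 1 && field in doc) out[field] = doc[field];
  }
  return out;
}

// Sort by a single numeric field: 1 = ascending, -1 = descending.
function sortBy(docs, order) {
  const [[field, direction]] = Object.entries(order);
  return [...docs].sort((a, b) => (a[field] - b[field]) * direction);
}

const selected = inventory.filter((d) => d.status === "D");
const result = sortBy(selected, { qty: 1 })
  .map((d) => project(d, { item: 1, qty: 1, size: 1 }));

console.log(result); // planner (qty 75) comes before paper (qty 100); status is dropped
```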
Query Operators
Name        Description
$eq         Matches values that are equal to a specified value
$gt, $gte   Matches values that are greater than (or equal to) a specified value
$lt, $lte   Matches values that are less than (or equal to) a specified value
$ne         Matches values that are not equal to a specified value
$in         Matches any of the values specified in an array
$nin        Matches none of the values specified in an array
$or         Joins query clauses with a logical OR; returns all documents that match either clause
$and        Joins query clauses with a logical AND
$not        Inverts the effect of a query expression
$exists     Matches documents that have a specified field
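To make the semantics of query documents concrete, here is a small in-memory matcher in JavaScript. It is a teaching sketch, not MongoDB's query engine: `matches` and `find` are hypothetical helpers, `$not` and `$exists` are left out, and nested documents, arrays, and type ordering are ignored.

```javascript
// Data taken from the inventory example in this lecture (size omitted).
const inventory = [
  { item: "journal", qty: 25, status: "A" },
  { item: "notebook", qty: 50, status: "A" },
  { item: "paper", qty: 100, status: "D" },
  { item: "planner", qty: 75, status: "D" },
  { item: "postcard", qty: 45, status: "A" },
];

// One function per comparison operator: (field value, operand) -> boolean.
const OPERATORS = {
  $eq: (v, x) => v === x,
  $ne: (v, x) => v !== x,
  $gt: (v, x) => v > x,
  $gte: (v, x) => v >= x,
  $lt: (v, x) => v < x,
  $lte: (v, x) => v <= x,
  $in: (v, xs) => xs.includes(v),
  $nin: (v, xs) => !xs.includes(v),
};

function matches(doc, query) {
  // Top-level entries of a query document are combined with AND.
  return Object.entries(query).every(([key, cond]) => {
    if (key === "$or") return cond.some((q) => matches(doc, q));
    if (key === "$and") return cond.every((q) => matches(doc, q));
    const value = doc[key];
    if (cond !== null && typeof cond === "object" && !Array.isArray(cond)) {
      // { field: { $op: operand, ... } }: every listed operator must hold
      return Object.entries(cond).every(([op, x]) => OPERATORS[op](value, x));
    }
    return value === cond; // { field: value } is an implicit equality test
  });
}

const find = (docs, query = {}) => docs.filter((d) => matches(d, query));

console.log(find(inventory, { status: "D" }).map((d) => d.item));
console.log(find(inventory, { status: { $in: ["A", "D"] } }).length);
console.log(find(inventory, { status: "A", qty: { $lt: 30 } }).map((d) => d.item));
console.log(find(inventory, { $or: [{ status: "A" }, { qty: { $lt: 30 } }] }).map((d) => d.item));
```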

Update Operations
• db.collection.updateOne( <query>, <update>, <options> )
• Updates the first document that satisfies the query conditions
• <query doc> is the same as the query doc in find
• <update doc>
• The $set operator updates the values of the specified fields.

Update Operations
• Example: for the first “paper” item, set size.uom to “cm” and status to “P”, and add a lastModified field to show that it was updated.
• The $currentDate operator sets the value of a field to the current date
db.inventory.updateOne(
{ item: "paper" },
{
$set: { "size.uom": "cm", status: "P" },
$currentDate: { lastModified: true }
}
)
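The effect of $set with a dotted field name such as "size.uom" can be sketched on plain objects. This is an illustration under stated assumptions, not the server's update logic: `setPath` and this `updateOne` are hypothetical helpers, the match condition is passed as a function, the second “paper” document is made up to show that only the first match changes, and modifiedCount is simplified to 1 for any match.

```javascript
// Set a value at a dotted path such as "size.uom", creating
// intermediate objects when they are missing.
function setPath(doc, path, value) {
  const keys = path.split(".");
  let target = doc;
  for (const key of keys.slice(0, -1)) {
    if (typeof target[key] !== "object" || target[key] === null) target[key] = {};
    target = target[key];
  }
  target[keys[keys.length - 1]] = value;
}

// Apply the update to the FIRST document for which isMatch returns true.
function updateOne(docs, isMatch, update) {
  const doc = docs.find(isMatch);
  if (!doc) return { matchedCount: 0, modifiedCount: 0 };
  for (const [path, value] of Object.entries(update.$set ?? {})) {
    setPath(doc, path, value);
  }
  for (const field of Object.keys(update.$currentDate ?? {})) {
    doc[field] = new Date(); // $currentDate: store the current date in the field
  }
  return { matchedCount: 1, modifiedCount: 1 };
}

const inventory = [
  { item: "paper", qty: 100, size: { h: 8.5, w: 11, uom: "in" }, status: "D" },
  { item: "paper", qty: 10, size: { h: 8.5, w: 11, uom: "in" }, status: "D" },
];

updateOne(inventory, (d) => d.item === "paper", {
  $set: { "size.uom": "cm", status: "P" },
  $currentDate: { lastModified: true },
});

console.log(inventory[0].size.uom, inventory[0].status, inventory[1].status);
```

Only the embedded field size.uom changes; the sibling fields h and w inside size are left as they were.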

Update Operations
• db.collection.updateMany( <query>, <update>, <options> )
• Updates all documents that satisfy the query conditions
• Example: for items with fewer than 50 in stock, set size.uom to “in” and status to “P”
db.inventory.updateMany(
{ "qty": { $lt: 50 } },
{
$set: { "size.uom": "in", status: "P" },
$currentDate: { lastModified: true }
}
)

Update Operations
• db.collection.updateMany( <query>, <update>, <options> )
• Example:
• UPDATE inventory SET status = 'P' WHERE qty < 50;
db.inventory.updateMany(
{ "qty": { $lt: 50 } },
{
$set: {status: "P" }
}
)

Delete Operations
• db.collection.deleteMany( <query> )
• Removes all documents that match the filter from a collection
• <query> is the same as find query
• Example: DELETE FROM inventory WHERE status = 'A';
db.inventory.deleteMany({ status : "A" })

• Example: DELETE FROM inventory;


db.inventory.deleteMany({})
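The two delete examples can be imitated on a plain array to show how the filter determines what is removed. This is a sketch only: this `deleteMany` is a hypothetical helper that takes a predicate function rather than a query document, and the sample documents are abbreviated.

```javascript
// Remove every document the predicate matches and report how many were removed.
// With no predicate, everything matches (like an empty query document {}).
function deleteMany(docs, isMatch = () => true) {
  const before = docs.length;
  const kept = docs.filter((d) => !isMatch(d));
  docs.length = 0;     // mutate in place so that callers see the change
  docs.push(...kept);
  return { acknowledged: true, deletedCount: before - kept.length };
}

const inventory = [
  { item: "journal", status: "A" },
  { item: "paper", status: "D" },
  { item: "postcard", status: "A" },
];

console.log(deleteMany(inventory, (d) => d.status === "A")); // like DELETE ... WHERE status = 'A'
console.log(deleteMany(inventory));                          // like DELETE FROM inventory
console.log(inventory.length);
```

Deleting every document leaves an empty collection behind; the collection itself is not removed.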

Announcement
• Final Exam
• Datetime & Venue: Next lecture session
• 9:30 – 11:30am, April 26
• MBG07, this lecture room
• written exam
• Closed books and notes
• Pencils, erasers, pens
• Assignment 2
• Due today, 11:59pm, April 19.
• SFTL: Student Feedback on Teaching and Learning
• Please complete the SFTL online form at http://sftl.hku.hk/ before May 5.
• Submit a screenshot in Moodle when finished.
• It counts once toward class participation.
• The End
• Thanks
