Exercise 10: Document Stores
Introduction
This exercise will cover document stores. As a representative of document stores, MongoDB was chosen for the
practical exercises. Instructions are provided to install it on the Azure Portal.
1. Document stores
A record in a document store is a document. Document encoding schemes include XML, YAML, JSON, and BSON, as
well as binary forms like PDF and Microsoft Office documents (MS Word, Excel, and so on). MongoDB documents are
similar to JSON objects. Documents are composed of field-value pairs, written as field: value.
The values of fields may include other documents, arrays, and arrays of documents. Data in MongoDB has a flexible
schema: documents in the same collection do not need to have the same set of fields or structure, and common
fields in a collection's documents may hold different types of data.
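The flexible-schema point can be illustrated with a small sketch. It uses plain JavaScript objects in an array as a stand-in for a collection, so no MongoDB connection is assumed. The sample documents and the helper name typesOfField are made up for illustration.

```javascript
// A stand-in for a collection: an array of documents (plain objects).
// The documents and field names below are invented for illustration.
var patrons = [
  // A document with an embedded document and an array of documents.
  {
    _id: "joe",
    name: "Joe Bookreader",
    address: { city: "Faketon", state: "MA" },
    loans: [ { title: "Book A" }, { title: "Book B" } ]
  },
  // Same collection, different set of fields, and "name" holds a
  // different type (an embedded document instead of a string).
  {
    _id: "ann",
    name: { first: "Ann", last: "Reader" },
    email: "ann@example.com"
  }
];

// Hypothetical helper: list the distinct types stored under one field
// across all documents that have that field.
function typesOfField(docs, field) {
  var seen = [];
  docs.forEach(function (doc) {
    if (Object.prototype.hasOwnProperty.call(doc, field)) {
      var t = Array.isArray(doc[field]) ? "array" : typeof doc[field];
      if (seen.indexOf(t) === -1) { seen.push(t); }
    }
  });
  return seen;
}

// "name" is a string in one document and an embedded document in the other.
console.log(typesOfField(patrons, "name"));
// "email" exists in only one of the two documents.
console.log(typesOfField(patrons, "email"));
```

A relational table would reject both differences; here they are simply two documents in the same collection.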
Questions
1. What are the advantages of document stores over relational databases?
2. Can the data in document stores be normalized?
3. How does denormalization affect performance?
4. How does a large number of small documents affect performance?
5. What makes document stores different from key-value stores?
a.
{
name: "O'Reilly Media",
founded: 1980,
location: "CA",
books: [123456789, 234567890, ...]
}
{
_id: 123456789,
title: "MongoDB: The Definitive Guide",
author: [ "Kristina Chodorow", "Mike Dirolf" ],
published_date: ISODate("2010-09-24"),
pages: 216,
language: "English"
}
{
_id: 234567890,
title: "50 Tips and Tricks for MongoDB Developer",
author: "Kristina Chodorow",
published_date: ISODate("2011-05-06"),
pages: 68,
language: "English"
}
b.
{
_id: "joe",
name: "Joe Bookreader"
}
{
patron_id: "joe",
street: "123 Fake Street",
city: "Faketon",
state: "MA",
zip: "12345"
}
c.
{
_id: "joe",
name: "Joe Bookreader"
}
{
patron_id: "joe",
street: "123 Fake Street",
city: "Faketon",
state: "MA",
zip: "12345"
}
{
patron_id: "joe",
street: "1 Some Other Street",
city: "Boston",
state: "MA",
zip: "12345"
}
d.
{
_id: "joe",
name: "Joe Bookreader",
address: {
street: "123 Fake Street",
city: "Faketon",
state: "MA",
zip: "12345"
}
}
e.
{
_id: "oreilly",
name: "O'Reilly Media",
founded: 1980,
location: "CA"
}
{
_id: 123456789,
title: "MongoDB: The Definitive Guide",
author: [ "Kristina Chodorow", "Mike Dirolf" ],
published_date: ISODate("2010-09-24"),
pages: 216,
language: "English",
publisher_id: "oreilly"
}
{
_id: 234567890,
title: "50 Tips and Tricks for MongoDB Developer",
author: "Kristina Chodorow",
published_date: ISODate("2011-05-06"),
pages: 68,
language: "English",
publisher_id: "oreilly"
}
f.
{
_id: "joe",
name: "Joe Bookreader",
addresses: [
{
street: "123 Fake Street",
city: "Faketon",
state: "MA",
zip: "12345"
},
{
street: "1 Some Other Street",
city: "Boston",
state: "MA",
zip: "12345"
}
]
}
3. MongoDB
Set up an Azure VM
1. Click on Create a Resource, then in the search box type Ubuntu Server 18.04 LTS and then click Create.
2. In the next page, choose your subscription, a resource group and the virtual machine name.
3. In the Administrator account section choose Password and set up your username/password.
4. Finally in the Inbound port rules select Allow selected ports and from the dropdown list choose SSH (22).
Press Next.
5. In the Disks and Networking tabs press Next without changing anything.
6. In the Management tab set Boot diagnostics to Off.
7. Press Next until you reach the Review+Create tab. If you have done everything correctly, you can press
Create. Otherwise, fix the errors. Wait for your VM to spawn (< 3 minutes).
8. To connect to your virtual machine, find it in the dashboard, click on it and then on the Connect button.
In the pop-up modal, copy the Login using VM Account command and connect using SSH from a terminal.
Once you are connected to the VM, type the following command
wget https://raw.githubusercontent.com/mongodb/docs-assets/primer-dataset/primer-dataset.json
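to retrieve the data. Use mongoimport to insert the documents into the restaurants collection in the test database.
If the collection already exists in the test database, the operation will drop the restaurants collection first.
For example:
mongoimport --db test --collection restaurants --drop --file primer-dataset.json
Then start the mongo shell: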
mongo --shell
In the mongo shell connected to a running MongoDB instance, switch to the test database.
use test
Try to insert a document into the restaurants collection. In addition, you can see the structure of the documents in
the collection.
db.restaurants.insert(
{
"address" : {
"street" : "2 Avenue",
"zipcode" : "10075",
"building" : "1480",
"coord" : [ -73.9557413, 40.7720266 ]
},
"borough" : "Manhattan",
"cuisine" : "Italian",
"grades" : [
{
"date" : ISODate("2014-10-01T00:00:00Z"),
"grade" : "A",
"score" : 11
},
{
"date" : ISODate("2014-01-16T00:00:00Z"),
"grade" : "A",
"score" : 17
}
],
"name" : "Vella",
"restaurant_id" : "41704620"
}
)
db.restaurants.find()
db.restaurants.findOne()
To format the printed result, you can add .pretty() to the operation, as in the following:
db.restaurants.find().limit(1).pretty()
Query Documents
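For the db.collection.find() method, you can specify the following optional fields: a query filter that selects which
documents to return, and a projection that selects which fields of the matching documents to return.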
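As a rough illustration of the first two arguments of find(), a query filter and a projection, the sketch below mimics them on an in-memory array. It is not how MongoDB evaluates queries. The helper findInMemory is hypothetical and supports only equality on top-level fields and inclusion projections. The first sample document is cut down from the "Vella" document inserted above and the second is invented. The shell call it imitates is given in a comment.

```javascript
// Two sample documents: the first is cut down from the "Vella" document
// inserted above, the second is invented for contrast.
var restaurants = [
  { _id: 1, name: "Vella", borough: "Manhattan", cuisine: "Italian" },
  { _id: 2, name: "Example Diner", borough: "Queens", cuisine: "American" }
];

// Hypothetical helper imitating find(filter, projection) for the
// simplest case: equality on top-level fields, inclusion projection.
function findInMemory(docs, filter, projection) {
  filter = filter || {};
  return docs
    .filter(function (doc) {
      // Keep documents whose fields equal every value in the filter.
      return Object.keys(filter).every(function (key) {
        return doc[key] === filter[key];
      });
    })
    .map(function (doc) {
      if (!projection) { return doc; }   // no projection: whole document
      var out = {};
      // _id is returned unless the projection says _id: 0.
      if (projection._id !== 0) { out._id = doc._id; }
      Object.keys(projection).forEach(function (key) {
        if (key !== "_id" && projection[key] === 1 && key in doc) {
          out[key] = doc[key];
        }
      });
      return out;
    });
}

// Imitates: db.restaurants.find({ borough: "Manhattan" }, { name: 1, _id: 0 })
var result = findInMemory(restaurants, { borough: "Manhattan" }, { name: 1, _id: 0 });
console.log(JSON.stringify(result));   // [{"name":"Vella"}]
```

Omitting both arguments returns every document in full, which matches the behaviour of db.restaurants.find() used earlier.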
3.4 Questions
Write queries in MongoDB that return the following:
1. All restaurants in borough (a town) "Brooklyn" and cuisine (a style of cooking) "Hamburgers".
2. The number of restaurants in the borough "Brooklyn" and cuisine "Hamburgers".
3. All restaurants with zipcode 11225.
4. Names of restaurants with zipcode 11225 that have at least one grade "C".
5. Names of restaurants with zipcode 11225 that have as first grade "C" and as second grade "A".
6. Names and streets of restaurants that don't have an "A" grade.
7. All restaurants for which at least one rating has a grade C and a score greater than 50.
8. All restaurants with a grade C or a score greater than 50.
9. A table with zipcode and number of restaurants that are in the borough "Queens" and have "Brazilian" cuisine.
https://docs.mongodb.com/getting-started/shell/query/
https://docs.mongodb.com/getting-started/shell/aggregation/
https://docs.mongodb.com/manual/aggregation/
4. Indexing in MongoDB
Indexes support the efficient resolution of queries. Without indexes, MongoDB must scan every document of a
collection to select those documents that match the query statement. Scans can be highly inefficient and require
MongoDB to process a large volume of data.
Indexes are special data structures that store a small portion of the data set in an easy-to-traverse form. The index
stores the value of a specific field or set of fields, ordered by the value of the field as specified in the index.
MongoDB supports indexes that contain either a single field or multiple fields depending on the operations that this
index type supports.
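The difference between a collection scan and an index lookup can be sketched in plain JavaScript. This is only a simplified model: it assumes an index is a list of (value, position) entries sorted by value and searched with binary search, whereas MongoDB's actual indexes are B-tree structures. The sample data and the function names scan, buildIndex and lookup are invented for illustration.

```javascript
// Invented sample collection: only the indexed field matters here.
var docs = [
  { name: "R1", borough: "Queens" },
  { name: "R2", borough: "Brooklyn" },
  { name: "R3", borough: "Manhattan" },
  { name: "R4", borough: "Brooklyn" },
  { name: "R5", borough: "Bronx" },
  { name: "R6", borough: "Staten Island" }
];

// Collection scan: every document is examined.
function scan(docs, field, value) {
  var examined = 0, matches = [];
  docs.forEach(function (doc) {
    examined++;
    if (doc[field] === value) { matches.push(doc); }
  });
  return { matches: matches, examined: examined };
}

// Build a simplified ascending index: entries sorted by field value,
// each pointing back to the document's position in the collection.
function buildIndex(docs, field) {
  return docs
    .map(function (doc, pos) { return { key: doc[field], pos: pos }; })
    .sort(function (a, b) { return a.key < b.key ? -1 : a.key > b.key ? 1 : 0; });
}

// Index lookup: binary search for the first entry with the key, then
// walk forward while the key still matches. "examined" counts the
// documents fetched, not the index entries probed during the search.
function lookup(docs, index, value) {
  var lo = 0, hi = index.length;
  while (lo < hi) {
    var mid = (lo + hi) >> 1;
    if (index[mid].key < value) { lo = mid + 1; } else { hi = mid; }
  }
  var examined = 0, matches = [];
  while (lo < index.length && index[lo].key === value) {
    matches.push(docs[index[lo].pos]);
    examined++;
    lo++;
  }
  return { matches: matches, examined: examined };
}

var boroughIndex = buildIndex(docs, "borough");
var byScan = scan(docs, "borough", "Brooklyn");
var byIndex = lookup(docs, boroughIndex, "Brooklyn");
console.log(byScan.examined, byIndex.examined);   // 6 2
```

Both approaches return the same two documents; the scan touches all six documents, while the index lookup fetches only the two that match.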
By default, MongoDB creates the _id index, which is an ascending unique index on the _id field, for all collections
when the collection is created. You cannot remove the index on the _id field.
db.restaurants.find({"borough" : "Brooklyn"}).explain()
In the mongo shell, you can create an index by calling the createIndex() method.
db.restaurants.createIndex( { borough : 1 })
db.restaurants.find({"borough" : "Brooklyn"}).explain()
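When comparing the two explain() outputs, the part to look at is the stage of the winning plan: a collection scan is reported as COLLSCAN and an index scan as IXSCAN, typically beneath a FETCH stage. The helper below collects the stage names from a plan. The name planStages is hypothetical, and the two plan objects are hand-written, heavily abbreviated stand-ins rather than real output. The layout (queryPlanner.winningPlan with nested inputStage fields) is assumed to follow the classic explain format and may differ between MongoDB versions.

```javascript
// Hypothetical helper: walk a winning plan and collect its stage names,
// outermost first, by following the nested "inputStage" field.
function planStages(plan) {
  var stages = [];
  while (plan) {
    stages.push(plan.stage);
    plan = plan.inputStage;
  }
  return stages;
}

// Hand-written, abbreviated examples of the assumed shape (not real output).
var beforeIndex = { queryPlanner: { winningPlan: { stage: "COLLSCAN" } } };
var afterIndex = {
  queryPlanner: {
    winningPlan: {
      stage: "FETCH",
      inputStage: { stage: "IXSCAN", indexName: "borough_1" }
    }
  }
};

// Without the index: a single COLLSCAN stage.
console.log(planStages(beforeIndex.queryPlanner.winningPlan).join(" <- "));
// With the index: FETCH reading from IXSCAN.
console.log(planStages(afterIndex.queryPlanner.winningPlan).join(" <- "));

// In the mongo shell the same helper could be applied to real output:
//   planStages(db.restaurants.find({ "borough": "Brooklyn" }).explain().queryPlanner.winningPlan)
```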
The value of the field in the index specification describes the kind of index for that field. For example, a value of 1
specifies an index that orders items in ascending order. A value of -1 specifies an index that orders items in
descending order.
To remove all indexes, you can use db.collection.dropIndexes() . To remove a specific index you can use
db.collection.dropIndex() , such as db.restaurants.dropIndex({ borough : 1 }) .
Questions
1. Write an index that will speed up the following query: db.restaurants.find({"borough" : "Brooklyn"})
2. We have an index on the address field as follows:
db.restaurants.createIndex( { address : -1 })
db.restaurants.find({"address.zipcode" : "11225" })
1.
2.
3.
4.
5.
6.
7.
SELECT COUNT(user_id)
FROM users
a.
db.users.find(
{ age: { $gt: 25, $lte: 50 } }
)
b.
db.users.find(
{ },
{ user_id: 1, status: 1, _id: 0 }
)
c.
db.createCollection( "users")
d.
db.users.insert(
{ user_id: "bcd001", age: 45, status: "A" }
)
e.
f.
db.users.find(
{ $or: [ { status: "A" } ,
{ age: 50 } ] }
)
g.
db.users.find()
6. True or False
Say if the following statements are true or false.
1. In document stores, you must determine and declare a table's schema before inserting data.
2. Document stores are not subject to data modeling and support only one denormalized data model.
3. Different relationships between data can be represented by references and embedded documents.
4. MongoDB provides the capability to validate documents during updates and insertions.
5. There are no joins in MongoDB.