NoSQL Database
Overview
NoSQL (Not Only SQL / Non-SQL) databases are a relatively new set of database technologies that provide a means to
store and retrieve data without relying on relational tables. The term was coined by Carlo Strozzi in 1998 as the
name of his Strozzi NoSQL database, which was still relational but did not expose an SQL interface. The term was
reintroduced by Johan Oskarsson in 2009 to describe open-source non-relational databases, and it is in this context
that it became popular and widely recognized. More recently, many non-relational databases have added support for
SQL, and hence the term NoSQL is also read as an abbreviation of "Not only SQL".
2. Availability
NoSQL databases are usually implemented as distributed systems. When one node of the distributed network goes
down, data requests are handled by the other nodes in the system. This ensures continuous system availability.
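The failover behaviour described above can be sketched in a few lines. This is a simplified in-memory illustration, not how any particular NoSQL product implements it; the names Node and readWithFailover are hypothetical, and every node is assumed to hold a full copy of the data.

```javascript
// Hypothetical in-memory node that holds a full copy of the data.
class Node {
  constructor(name, data) {
    this.name = name;
    this.data = new Map(Object.entries(data));
    this.up = true;
  }

  read(key) {
    if (!this.up) throw new Error(`${this.name} is down`);
    return this.data.get(key);
  }
}

// Try each node in turn and return the first successful answer.
function readWithFailover(nodes, key) {
  for (const node of nodes) {
    try {
      return { value: node.read(key), servedBy: node.name };
    } catch (err) {
      // Node unavailable: fall through to the next one.
    }
  }
  throw new Error("no node available");
}

const seed = { "user:1": "Ada" };
const cluster = [new Node("n1", seed), new Node("n2", seed), new Node("n3", seed)];

cluster[0].up = false; // simulate a node failure
const result = readWithFailover(cluster, "user:1");
console.log(result); // { value: 'Ada', servedBy: 'n2' }
```

The client never sees the failure of n1; the request is simply answered by the next node that is still up.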
3. Scalability
NoSQL databases support horizontal scalability. As mentioned earlier, NoSQL databases are implemented using a
distributed architecture. As the amount of data and the number of data operations increase, system capacity can
be increased by simply adding more nodes to the system.
5. Replication
Database replication is the electronic copying of data from a database on one computer or server to a database on
another, so that all users share the same level of information (Rouse). Because a traditional RDBMS typically
scales vertically, setting up replication tends to be tedious and often requires a semi-manual process. Most NoSQL
databases, by contrast, replicate data automatically, which provides near-seamless availability and improves the
chances of recovering from a disaster.
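As a rough illustration of automatic replication, the sketch below copies every write to a set of replicas and promotes a replica when the primary is lost. The class ReplicatedStore and its methods are hypothetical, and the copy is done synchronously for simplicity; real systems usually replicate asynchronously and elect a new primary through a consensus or election protocol.

```javascript
// Hypothetical store that copies each write from a primary to its replicas.
class ReplicatedStore {
  constructor(replicaCount) {
    this.primary = new Map();
    this.replicas = Array.from({ length: replicaCount }, () => new Map());
  }

  write(key, value) {
    this.primary.set(key, value);
    // Synchronous copy to every replica (a simplification).
    for (const replica of this.replicas) replica.set(key, value);
  }

  // Simulate losing the primary: promote the first replica in its place.
  failover() {
    if (this.replicas.length === 0) throw new Error("no replica to promote");
    this.primary = this.replicas.shift();
  }

  read(key) {
    return this.primary.get(key);
  }
}

const store = new ReplicatedStore(2);
store.write("order:42", { total: 99 });
store.failover(); // the original primary is gone
const recovered = store.read("order:42");
console.log(recovered); // { total: 99 }
```

Because the replica already holds the data, no manual restore step is needed after the failover.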
6. Auto-sharding
NoSQL databases support auto-sharding, meaning that data is automatically spread across the servers that make up
the database. This feature can prove quite valuable when a server goes down: because data is evenly distributed,
only a portion of it is affected, and information from the failed server can be quickly recovered from the other
operating servers with minimal downtime and no application disruption.
Since NoSQL databases are designed with horizontal scaling in mind, sharding and replication work in conjunction
with one another. These automation characteristics of NoSQL provide transparency and simple configuration for
today's infrastructure.
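One common way to spread data across servers is to hash each key and use the hash to pick a shard. The sketch below assumes this hash-based scheme; hashKey and ShardedStore are hypothetical names, and production systems add rebalancing and replication on top of this idea.

```javascript
// Simple deterministic string hash (not cryptographic; illustration only).
function hashKey(key) {
  let h = 0;
  for (const ch of key) {
    h = (h * 31 + ch.codePointAt(0)) >>> 0; // keep the value as unsigned 32-bit
  }
  return h;
}

// Hypothetical store that routes each key to one of several shards.
class ShardedStore {
  constructor(shardCount) {
    this.shards = Array.from({ length: shardCount }, () => new Map());
  }

  shardFor(key) {
    return hashKey(key) % this.shards.length;
  }

  put(key, value) {
    this.shards[this.shardFor(key)].set(key, value);
  }

  get(key) {
    return this.shards[this.shardFor(key)].get(key);
  }
}

const sharded = new ShardedStore(3);
for (let i = 0; i < 9; i++) sharded.put(`user:${i}`, { id: i });

// Number of documents that landed on each shard.
const sizes = sharded.shards.map((shard) => shard.size);
console.log(sizes);
console.log(sharded.get("user:4"));
```

The application only calls put and get; which server holds a given key is decided by the store, which is what makes the sharding transparent.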
2. No standard interface
Unlike relational DBMSs, NoSQL databases do not have a standard interface. Interfaces, support for SQL and other
querying techniques, and operating system and programming language compatibility are vendor-specific.
3. Security issues
NoSQL databases generally do not provide the same level of security and authentication as relational databases.
Users of NoSQL databases often have to create their own security mechanisms to protect the database. Being
loosely coupled with other systems, these databases are also susceptible to injection attacks.
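One form of injection that affects document databases is operator injection: if an application copies user-supplied JSON straight into a query filter, an attacker can send an operator object such as {"$ne": null} where a string was expected. The sketch below shows one possible application-level guard; safeLoginFilter is a hypothetical helper, and the plain-text password field is only for illustration (real systems compare password hashes).

```javascript
// Reject any value that is not a plain string, so operator objects such as
// { "$ne": null } cannot be smuggled into a query filter.
function safeLoginFilter(input) {
  for (const field of ["username", "password"]) {
    if (typeof input[field] !== "string") {
      throw new TypeError(`${field} must be a string`);
    }
  }
  return { username: input.username, password: input.password };
}

// Request bodies as they might arrive from a client.
const honest = JSON.parse('{"username": "ada", "password": "s3cret"}');
const malicious = JSON.parse('{"username": "ada", "password": {"$ne": null}}');

const filter = safeLoginFilter(honest);
console.log(filter); // { username: 'ada', password: 's3cret' }

let rejected = false;
try {
  safeLoginFilter(malicious);
} catch (err) {
  rejected = err instanceof TypeError;
}
console.log(rejected); // true
```

Without the type check, the malicious filter would match any user whose password is not null, which is why such validation has to be built by the application when the database does not enforce it.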
4. Transactional integrity
NoSQL databases achieve high performance and scalability by relaxing consistency through the concept of soft
state. Making these databases transaction-friendly would require complex constraints that degrade both
performance and scalability. Hence they are generally less suitable for applications of a transactional nature.
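The effect of soft state can be illustrated with a replica that lags behind the primary. In this sketch (the class EventuallyConsistentPair is hypothetical and the synchronization is triggered by hand), a read from the replica returns stale data until pending writes have been applied, which is exactly what a transactional application cannot tolerate.

```javascript
// Hypothetical primary/replica pair with delayed (asynchronous) replication.
class EventuallyConsistentPair {
  constructor() {
    this.primary = new Map();
    this.replica = new Map();
    this.pending = []; // writes not yet applied to the replica
  }

  write(key, value) {
    this.primary.set(key, value);
    this.pending.push([key, value]);
  }

  readFromReplica(key) {
    return this.replica.get(key);
  }

  // Apply all pending writes to the replica, in order.
  sync() {
    for (const [key, value] of this.pending) this.replica.set(key, value);
    this.pending = [];
  }
}

const pair = new EventuallyConsistentPair();
pair.write("balance", 100);

const beforeSync = pair.readFromReplica("balance"); // replica has not caught up
pair.sync();
const afterSync = pair.readFromReplica("balance"); // replica is now consistent

console.log(beforeSync, afterSync); // undefined 100
```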
2. Document databases
These are a more complex version of key-value stores. The values in this case are documents. These documents
can be structured, semi-structured or unstructured. Some databases also support JSON or XML document storage.
Storing documents in these formats makes it easier to query them. Examples include Couchbase, MongoDB and
IBM Cloudant.
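To make the idea concrete, the sketch below stores two JSON-like documents with different shapes in the same array and queries them by a nested field. The sample data and the helpers getPath and findByField are hypothetical; a real document database would use indexes rather than a linear scan.

```javascript
// Two documents with different fields living side by side.
const documents = [
  { _id: 1, type: "book", title: "Dune", author: { name: "Frank Herbert" }, tags: ["sci-fi"] },
  { _id: 2, type: "movie", title: "Alien", year: 1979 },
];

// Resolve a dotted path such as "author.name" inside a document.
function getPath(doc, path) {
  return path
    .split(".")
    .reduce((value, key) => (value == null ? undefined : value[key]), doc);
}

// Return every document whose value at the given path equals `expected`.
function findByField(docs, path, expected) {
  return docs.filter((doc) => getPath(doc, path) === expected);
}

const hits = findByField(documents, "author.name", "Frank Herbert");
console.log(hits.map((doc) => doc.title)); // [ 'Dune' ]
```

Because the structure travels with each document, the query can reach into nested fields without a predefined schema.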
3. Column stores
In these databases, data is stored on the basis of columns instead of rows. It is like inverting a relational
database table. Column compression techniques are used, which allows queries to run faster than they would on a
standard relational database table. Examples of column stores include Hadoop/HBase, Amazon SimpleDB, Google
Bigtable and Cassandra.
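The "inverted table" idea can be sketched by converting a small set of rows into one array per column. The sample data and the helper toColumns are hypothetical; the point is that an aggregate over one column only has to touch that column's values.

```javascript
// Row-oriented layout: one object per record.
const rows = [
  { id: 1, city: "Oslo", temp: 4 },
  { id: 2, city: "Rome", temp: 18 },
  { id: 3, city: "Oslo", temp: 8 },
];

// Column-oriented layout: one array per field.
function toColumns(records) {
  const columns = {};
  for (const record of records) {
    for (const [name, value] of Object.entries(record)) {
      if (!columns[name]) columns[name] = [];
      columns[name].push(value);
    }
  }
  return columns;
}

const columns = toColumns(rows);

// Averaging the temperature reads only the `temp` column.
const avgTemp = columns.temp.reduce((sum, t) => sum + t, 0) / columns.temp.length;

console.log(columns.city); // [ 'Oslo', 'Rome', 'Oslo' ]
console.log(avgTemp); // 10
```

Values within a column share a type and often repeat (as "Oslo" does here), which is what makes column compression effective.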
4. Graph databases
Graph databases are used to store large amounts of data with a high degree of interrelation between the objects
the data represents. A graph contains nodes and edges. The nodes represent the objects and have properties
associated with each node. The edges represent the relationships between pairs of nodes. Queries can thus
traverse a huge number of nodes starting from a single node or a small set of nodes. Examples include Neo4j and
Titan; navigation services such as Google Maps work with graph-structured data of this kind.
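A traversal that starts from a single node can be sketched with a small in-memory graph. The data and the function reachableWithin are hypothetical; graph databases provide dedicated query languages and storage for this, but the breadth-first idea is the same.

```javascript
// Nodes carry properties; edges are directed relationships between nodes.
const graph = {
  nodes: { alice: { age: 30 }, bob: { age: 25 }, carol: { age: 35 }, dave: { age: 40 } },
  edges: [
    { from: "alice", to: "bob", type: "FRIEND" },
    { from: "bob", to: "carol", type: "FRIEND" },
    { from: "carol", to: "dave", type: "FRIEND" },
  ],
};

// Breadth-first traversal: every node reachable from `start` in at most `maxHops` edges.
function reachableWithin(g, start, maxHops) {
  const visited = new Set([start]);
  let frontier = [start];
  for (let hop = 0; hop < maxHops; hop++) {
    const next = [];
    for (const edge of g.edges) {
      if (frontier.includes(edge.from) && !visited.has(edge.to)) {
        visited.add(edge.to);
        next.push(edge.to);
      }
    }
    frontier = next;
  }
  visited.delete(start); // do not report the starting node itself
  return [...visited];
}

const friendsOfFriends = reachableWithin(graph, "alice", 2);
console.log(friendsOfFriends); // [ 'bob', 'carol' ]
```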
NoSQL applications:
Following are some applications of NoSQL databases:
Financial applications
Online File storage systems
Hybrid systems
Internet of things
Data mining
Big Data applications
Social networking websites
Navigation apps
Exploring MongoDB
MongoDB is an open-source document database that provides high performance, high availability, and automatic
scaling. MongoDB obviates the need for an Object Relational Mapping (ORM) to facilitate development.
Databases
MongoDB allows users to create logical databases in the system to store collections and documents. Your
application might have different modules, and each module might require its own database due to overlapping
entity names and other constraints.
The following are some database-related commands:
show databases / show dbs: Shows a list of the available databases in the system.
use databasename : Switches to the specified database. If the database does not exist, it is created when data
is first stored in it.
db.runCommand({dropDatabase:1}): Deletes the current database.
Collections
MongoDB stores documents in collections. Collections are analogous to tables in relational databases. Unlike a table,
however, a collection does not require its documents to have the same schema.
Following are some collection related commands:
db.createCollection(name, options): Creates a collection with the specified name and options. Options, as the
name suggests, are optional and include features such as enabling a capped collection, maximum collection size,
maximum number of documents allowed, auto-indexing, etc.
e.g.: db.createCollection("log", { capped : true, size : 5242880, max : 5000 } )
db.collectionName.drop(): Drops the specified collection from the database.
Documents
A record in MongoDB is a document, which is a data structure composed of field and value pairs. MongoDB documents
are similar to JSON objects. The values of fields may include other documents, arrays, and arrays of documents.
Following are some document related commands:
db.collectionName.insert( {JSON Document} ): Inserts the document given in the parentheses into the
collection.
db.collectionName.remove({key:value},{options}): Removes the document or documents matching the
specified key-value pair. Options specify whether to remove one document or multiple documents.
db.collectionName.find({key:value}) : Finds the documents with the specified key-value pair. Can also be used
to perform complex searches using AND, OR, greater-than and less-than operators.
db.collectionName.update({search key value pair}, {$set:{key value pair}}): Uses the search key-value pair to
find the document and updates the key value specified in the $set operator. By default, only the first matching
document is updated.
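To show how a filter with AND, OR, greater-than and less-than conditions is structured, the sketch below evaluates a MongoDB-style filter object against an in-memory array. The operators $or, $and, $gt and $lt are MongoDB query operators, but the matches function is a hypothetical stand-in written for this example and covers only that small subset; it is not MongoDB's implementation. Against a real server, the same filter object would be passed to db.collectionName.find(...).

```javascript
// Minimal evaluator for a small subset of MongoDB-style filters.
function matches(doc, filter) {
  return Object.entries(filter).every(([key, cond]) => {
    if (key === "$or") return cond.some((sub) => matches(doc, sub));
    if (key === "$and") return cond.every((sub) => matches(doc, sub));

    const value = doc[key];
    if (cond !== null && typeof cond === "object") {
      // Operator object such as { $gt: 30, $lt: 40 }: all operators must hold.
      return Object.entries(cond).every(([op, operand]) => {
        if (op === "$gt") return value > operand;
        if (op === "$lt") return value < operand;
        throw new Error(`unsupported operator: ${op}`);
      });
    }
    return value === cond; // plain key-value equality
  });
}

const people = [
  { name: "Ada", age: 36, city: "London" },
  { name: "Linus", age: 28, city: "Helsinki" },
  { name: "Grace", age: 45, city: "New York" },
];

// "Lives in Helsinki, OR is older than 30 AND younger than 40".
const query = { $or: [{ city: "Helsinki" }, { age: { $gt: 30, $lt: 40 } }] };

const found = people.filter((person) => matches(person, query)).map((person) => person.name);
console.log(found); // [ 'Ada', 'Linus' ]
```

Several conditions placed in the same filter object are combined with AND, which is why $and is rarely needed explicitly.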
Modelling relationships:
MongoDB documents can model 1:1 and 1:M relationships just like relational databases. However, the way data is
stored depends on application constraints. Every document in MongoDB has a unique _id field which can be used to
identify the document. In a 1:1 or 1:M relationship we have the option to either use these ids as reference fields
or embed entire documents in the main document.
Example of 1:1 embedded modelling:
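As a hedged illustration (the field names and values here are hypothetical and chosen only for this sketch), a 1:1 relationship between a person and an address could be stored either embedded or by reference:

```javascript
// Option 1: embed the address inside the person document.
const personEmbedded = {
  _id: 1,
  name: "Ada",
  address: { street: "12 Byron Rd", city: "London" },
};

// Option 2: keep addresses in their own collection and store a reference id.
const addresses = [{ _id: 100, street: "12 Byron Rd", city: "London" }];
const personReferenced = { _id: 1, name: "Ada", addressId: 100 };

// With references, a second lookup is needed to resolve the related document.
function resolveAddress(person, addressCollection) {
  return addressCollection.find((address) => address._id === person.addressId);
}

const cityEmbedded = personEmbedded.address.city;
const cityReferenced = resolveAddress(personReferenced, addresses).city;

console.log(cityEmbedded, cityReferenced); // London London
```

Embedding returns everything in a single read, while referencing avoids duplicating data that is shared or updated independently; which option fits better depends on the application constraints mentioned above.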
MongoDB can also be used to model tree hierarchies and a few other application-specific structures.
Conclusion
NoSQL databases are still evolving, and more organizations are moving from traditional relational databases to
non-relational databases. Given their capacity for horizontal scalability, we expect the future of NoSQL to lie in a
growing variety of database tools and in wider-scope projects involving large, unstructured, distributed data with
high scaling requirements. NoSQL databases are also still at an early stage, which requires database designers to
have a good understanding of their architecture.
The goal of this study was to define and understand the term NoSQL and how it has affected RDBMS. From this
research and our evaluation, we conclude that NoSQL has created an alternative to RDBMS and that application
designers now have options depending on the type of application they are building.
Discussion Questions
1. Do you think NoSQL databases will take the place of traditional RDBMS?
2. What does the future hold for MongoDB?
3. When should you use NoSQL over SQL/RDBMS?
References:
a) Websites:
1. http://nosql-database.org/
2. http://www.christof-strauch.de/nosqldbs.pdf
3. http://www.couchbase.com/
4. http://www.planetcassandra.org/
5. http://www.tutorialspoint.com/mongodb
6. https://www.mongodb.com/
Rouse, Margaret, Database replication definition, TechTarget.com, April 2012,
Asadulla Khan Zaki, NoSQL DATABASES: NEW MILLENNIUM DATABASE FOR BIG DATA, BIG USERS, CLOUD
COMPUTING AND ITS SECURITY CHALLENGES, International Journal of Research in Engineering and
Technology. eISSN: 2319-1163 | pISSN: 2321-7308
Jaroslav Pokorny, NoSQL databases: a step to database scalability in web environment, DOI:
10.1145/2095536.2095583, The 13th International Conference on Information Integration and Web-based
Applications and Services, 5-7 December 2011, Ho Chi Minh City, Vietnam.