10gen Top 5 NoSQL Considerations
Table of Contents
Introduction
Data Model
Document Model
Graph Model
Key-Value and Wide Column Models
Query Model
Document Database
Graph Database
Key Value and Wide Column Databases
Consistency Model
Consistent Systems
Eventually Consistent Systems
APIs
Idiomatic Drivers
Thrift or RESTful APIs
SQL-Like APIs
Conclusion
We Can Help
Introduction
Relational databases have a long-standing position in most
organizations, and for good reason. Relational databases
underpin existing applications that meet current business
needs; they are supported by an extensive ecosystem of
tools; and there is a large pool of labor qualified to
implement and maintain these systems.
But organizations are increasingly considering alternatives
to legacy relational infrastructure. In some cases the
motivation is technical, such as a need to handle new,
multi-structured data types or to scale beyond the capacity
constraints of existing systems. In other cases the
motivation is the desire to identify viable
alternatives to expensive proprietary database software
and hardware. A third motivation is agility or speed of
development, as companies look to adapt to the market
more quickly and embrace agile development
methodologies.
These motivations apply both to analytical and operational
applications. Companies are shifting workloads to Hadoop
for their bulk analytical workloads, and they are building
online, operational applications with a new class of
so-called NoSQL or non-relational databases.
Data Model
The primary way in which non-relational databases differ
from relational databases is the data model. Although there
are dozens of non-relational databases, they primarily fall
into one of the following three categories:
Document Model
Whereas relational databases store data in rows and
columns, document databases store data in documents.
These documents typically use a structure that is like
JSON (JavaScript Object Notation), a format popular
among developers. Documents provide an intuitive and
natural way to model data that is closely aligned with
object-oriented programming: each document is
effectively an object. Documents contain one or more
fields, where each field contains a typed value, such as a
string, date, binary, or array. Rather than spreading out a
record across multiple columns and tables connected with
foreign keys, each record and its associated (i.e., related)
data are typically stored together in a single document.
This simplifies data access and, in many cases, eliminates
Graph Model
Graph databases use graph structures with nodes, edges
and properties to represent data. In essence, data is
modeled as a network of relationships between specific
elements. While the graph model may be counter-intuitive
and takes some time to understand, it can be useful for a
specific class of queries. Its main appeal is that it makes it
easier to model and navigate relationships between entities
in an application.
Applications: Graph databases are useful in cases where
traversing relationships is core to the application, like
navigating social network connections, network topologies
or supply chains.
Examples: Neo4j and Giraph.
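To make the idea of traversing relationships concrete, here is a minimal sketch in plain Python. It is not the API of Neo4j, Giraph, or any other product; the `FOLLOWS` data and the `within_hops` helper are hypothetical names invented for illustration. It finds every node reachable within a fixed number of hops, the kind of "friends of friends" question a graph model is meant to answer.

```python
from collections import deque

# Hypothetical social graph: each node maps to the nodes it is connected to.
FOLLOWS = {
    "ana": ["ben", "cho"],
    "ben": ["dev"],
    "cho": ["dev", "eli"],
    "dev": [],
    "eli": ["ana"],
}


def within_hops(graph, start, max_hops):
    """Return {node: hops} for nodes reachable from start in at most max_hops edges."""
    distances = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distances[node] == max_hops:
            continue  # do not expand past the hop limit
        for neighbor in graph.get(node, []):
            if neighbor not in distances:
                distances[neighbor] = distances[node] + 1
                queue.append(neighbor)
    del distances[start]  # the starting node is not part of the result
    return distances


friends_of_friends = within_hops(FOLLOWS, "ana", 2)
```

A breadth-first search is used here only because it is the simplest way to show hop-limited traversal; a real graph database would typically expose this through its own query language.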
Key-Value and Wide Column Models
Applications: Key value stores and wide column stores
are useful for a narrow set of applications that only query
data by a single key value. The appeal of these systems is
their performance and scalability, which can be highly
optimized due to the simplicity of the data access patterns
and opacity of the data itself.
Examples: Riak and Redis (Key-Value); HBase and
Cassandra (Wide Column).
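The difference between the two models can be sketched in a few lines of Python. The classes below are toy, in-memory stand-ins with hypothetical names; they do not reflect the interfaces of Riak, Redis, HBase, or Cassandra. The key-value store treats each value as an opaque blob that only the application can decode, while the wide column store lets a caller read individual named columns under a row key.

```python
import json


class KeyValueStore:
    """Toy key-value store: values are opaque bytes, so only the key is queryable."""

    def __init__(self):
        self._data = {}

    def put(self, key, value: bytes):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)


class WideColumnStore:
    """Toy wide column store: a row key maps to named columns readable one by one."""

    def __init__(self):
        self._rows = {}

    def put(self, row_key, column, value):
        self._rows.setdefault(row_key, {})[column] = value

    def get(self, row_key, columns=None):
        row = self._rows.get(row_key, {})
        if columns is None:
            return dict(row)
        return {c: row[c] for c in columns if c in row}


kv = KeyValueStore()
kv.put("user:42", json.dumps({"name": "Ana", "city": "Lisbon"}).encode())
profile = json.loads(kv.get("user:42"))  # the application decodes the whole blob

wc = WideColumnStore()
wc.put("user:42", "name", "Ana")
wc.put("user:42", "city", "Lisbon")
city_only = wc.get("user:42", columns=["city"])  # fetch a single column by row key
```

In both cases the lookup still starts from a single key; neither sketch can answer "which users live in Lisbon?" without scanning every record.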
TAKEAWAYS
All of these data models provide schema flexibility.
The key-value and wide-column data models are opaque to
the system: only the primary key can be queried.
The document data model has the broadest applicability.
The document data model is the most natural and most
productive because it maps directly to objects in
modern object-oriented languages.
The wide column model provides more granular access
to data than the key value model, but less flexibility than
the document data model.
Query Model
Each application has its own query requirements. In some
cases, it may be acceptable to have a very basic query
model in which the application only accesses records
based on a primary key. For most applications, however, it is
important to have the ability to query based on several
different values in each record. For instance, an application
that stores data about customers may need to look up not
only specific customers, but also specific companies, or
Document Database
Document databases provide the ability to query on any
field within a document. Some products, such as
MongoDB, provide a rich set of indexing options to
optimize a wide variety of queries, including text indexes,
geospatial indexes, compound indexes, sparse indexes,
time to live (TTL) indexes, unique indexes, and others.
Furthermore, some of these products provide the ability to
analyze data in place, without it needing to be replicated to
dedicated analytics or search engines. MongoDB, for
instance, provides both the Aggregation Framework for
providing real-time analytics (along the lines of the SQL
GROUP BY functionality), and a native MapReduce
implementation for other types of sophisticated analyses.
To update data, MongoDB provides a find and modify
method so that values in documents can be updated in a
single statement to the database, rather than making
multiple round trips.
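The three capabilities described above (querying on any field, grouping in the style of SQL GROUP BY, and updating in a single call) can be illustrated with a small in-memory sketch. This is plain Python over a list of dictionaries, not MongoDB's query language or driver API; `CUSTOMERS`, `get_path`, `find`, `sum_by`, and `find_and_modify` are hypothetical names, and a real database would use indexes rather than scanning every document.

```python
from collections import defaultdict

# Hypothetical customer documents with a nested "address" sub-document.
CUSTOMERS = [
    {"_id": 1, "name": "Ana", "company": "Acme", "address": {"zip": "10001"}, "orders": 3},
    {"_id": 2, "name": "Ben", "company": "Acme", "address": {"zip": "94301"}, "orders": 5},
    {"_id": 3, "name": "Cho", "company": "Initech", "address": {"zip": "10001"}, "orders": 1},
]


def get_path(doc, path):
    """Resolve a dotted path such as 'address.zip' inside a nested document."""
    value = doc
    for part in path.split("."):
        if not isinstance(value, dict) or part not in value:
            return None
        value = value[part]
    return value


def find(docs, criteria):
    """Return documents whose fields equal every value in criteria."""
    return [d for d in docs if all(get_path(d, k) == v for k, v in criteria.items())]


def sum_by(docs, group_field, sum_field):
    """Group documents by one field and total another, similar to SQL GROUP BY."""
    totals = defaultdict(int)
    for d in docs:
        totals[get_path(d, group_field)] += get_path(d, sum_field)
    return dict(totals)


def find_and_modify(docs, criteria, field, increment):
    """Increment a top-level field on the first match and return the updated document."""
    matches = find(docs, criteria)
    if not matches:
        return None
    matches[0][field] += increment
    return matches[0]


in_zip = find(CUSTOMERS, {"address.zip": "10001"})       # query on a nested field
orders_by_company = sum_by(CUSTOMERS, "company", "orders")  # group and total
```

The point of the sketch is that no field is privileged: the same `find` call works on `name`, `company`, or a nested field, which is what distinguishes this query model from key-only access.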
Graph Database
These systems tend to provide rich query models where
simple and complex relationships can be interrogated to
make direct and indirect inferences about the data in the
system. Relationship-type analysis tends to be very
efficient in these systems, whereas other types of analysis
may be less optimal. As a result, graph databases are rarely
used for more general purpose operational applications.
TAKEAWAYS
The biggest difference between non-relational
databases lies in the ability to query data efficiently.
Document databases provide the richest query
functionality, which allows them to address a wide
variety of operational and real-time analytics
applications.
Key-value stores and wide column stores provide a
single means of accessing data: by primary key. This
can be fast, but they offer very limited query
functionality and may impose additional development
costs and application-level requirements to support
anything more than basic query patterns.
Consistency Model
Consistent Systems
Each application has different requirements for data
consistency. For many applications, it is imperative that the
data be consistent at all times. As development teams have
worked under a model of consistency with relational
databases for decades, this approach is more natural and
familiar. In other cases, eventual consistency is an
acceptable trade-off for the flexibility it allows in the
system's availability.
Document databases and graph databases can be
consistent or eventually consistent. MongoDB provides
tunable consistency. By default, data is consistent: all
writes and reads access the primary copy of the data. As
an option, read queries can be issued against secondary
copies, where data may be eventually consistent if the write
operation has not yet been synchronized with the
secondary copy; the consistency choice is made at the
query level.
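The trade-off between reading from the primary copy and reading from a secondary copy can be sketched with a toy model. The `ReplicaSet` class below is a hypothetical, in-memory illustration and is not how MongoDB or any other database implements replication; replication is triggered by hand so that the window of staleness is visible.

```python
class ReplicaSet:
    """Toy primary/secondary pair with explicit, delayed replication."""

    def __init__(self):
        self.primary = {}
        self.secondary = {}
        self._pending = []  # writes applied to the primary but not yet replicated

    def write(self, key, value):
        self.primary[key] = value
        self._pending.append((key, value))

    def read(self, key, from_secondary=False):
        # The caller chooses the consistency level per read.
        source = self.secondary if from_secondary else self.primary
        return source.get(key)

    def replicate(self):
        """Apply all pending writes to the secondary, in order."""
        for key, value in self._pending:
            self.secondary[key] = value
        self._pending.clear()


rs = ReplicaSet()
rs.write("balance", 100)
consistent = rs.read("balance")                     # primary read sees the write
stale = rs.read("balance", from_secondary=True)     # secondary has not caught up yet
rs.replicate()
caught_up = rs.read("balance", from_secondary=True)  # secondary now agrees
```

Reads against the primary always reflect the latest write, while reads against the secondary may briefly return older data; choosing between them per query is what "tunable" means in this context.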
TAKEAWAYS
Most applications and development teams expect
consistent systems.
Different consistency models pose different trade-offs
for applications in the areas of consistency and
availability.
MongoDB provides tunable consistency, defined at the
query level.
Eventually consistent systems provide some advantages
for inserts at the cost of making reads, updates and
deletes more complex, while incurring performance
overhead through read repairs and compactions.
APIs
There is no standard for interfacing with non-relational
systems. Each system presents different designs and
capabilities for application development teams. The
maturity of the API can have major implications for the time
and cost required to develop and maintain the application
and database.
Idiomatic Drivers
There are a number of popular programming languages,
and each provides different paradigms for working with
data and services. Idiomatic drivers are created by
development teams that are experts in the given language
and that know how programmers prefer to work within that
language. This approach can also benefit from its ability to
leverage specific features in a programming language that
might provide efficiencies for accessing and processing
data.
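As a rough illustration of what "idiomatic" means in practice, the sketch below wraps a generic command-style client behind the mapping protocol that Python programmers already use for dictionaries. Both classes are hypothetical and exist only for this example; they are not taken from any real driver.

```python
class RawClient:
    """Hypothetical low-level client that exposes generic command calls."""

    def __init__(self):
        self._store = {}

    def execute(self, command, *args):
        if command == "SET":
            key, value = args
            self._store[key] = value
            return "OK"
        if command == "GET":
            return self._store.get(args[0])
        if command == "KEYS":
            return sorted(self._store)
        raise ValueError(f"unknown command: {command}")


class IdiomaticCollection:
    """Wraps RawClient behind indexing, membership tests, and iteration."""

    def __init__(self, client):
        self._client = client

    def __setitem__(self, key, value):
        self._client.execute("SET", key, value)

    def __getitem__(self, key):
        value = self._client.execute("GET", key)
        if value is None:
            raise KeyError(key)  # behave like a dict on a missing key
        return value

    def __contains__(self, key):
        return self._client.execute("GET", key) is not None

    def __iter__(self):
        return iter(self._client.execute("KEYS"))


users = IdiomaticCollection(RawClient())
users["ana"] = {"city": "Lisbon"}
users["ben"] = {"city": "Porto"}
names = list(users)
```

The wrapper adds no capability the raw client lacks; its value is that a developer can use indexing, `in`, and `for` loops without first learning a new command vocabulary.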
For programmers, idiomatic drivers are easier to learn and
use, and they reduce the onboarding time for teams to
begin working with the underlying database. For example,
idiomatic drivers provide direct interfaces to set and get
SQL-Like APIs
Some non-relational databases have attempted to add a
SQL-like access layer to the database, in the hope this will
reduce the learning curve for those developers and DBAs
already skilled in SQL. It is important to evaluate these
implementations before serious development begins,
considering the following:
Most of these implementations fall a long way short
compared to the power and expressivity of SQL, and will
demand SQL users learn a feature-limited dialect of the
language.
Most support queries only, with no support for write
operations. Therefore developers will still need to learn
the database's native query language.
SQL-based BI, reporting, and ETL tools will not be
compatible with a custom SQL implementation.
While some of the syntax may be familiar to SQL
developers, data modeling will not be. Trying to impose a
relational model on any non-relational database will
have disastrous consequences for performance and
application maintenance.
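The data-modeling point in the last item above can be illustrated with a small sketch. It stores the same order twice in plain Python structures: once spread across three relational-style "tables" and once as a single document with its related data embedded. All names and data here are hypothetical and chosen only for illustration.

```python
# Relational-style layout: one logical order is spread across three "tables".
orders = [{"order_id": 1, "customer_id": 7}]
customers = [{"customer_id": 7, "name": "Ana"}]
line_items = [
    {"order_id": 1, "sku": "A-100", "qty": 2},
    {"order_id": 1, "sku": "B-200", "qty": 1},
]


def load_order_relational(order_id):
    """Reassemble an order by joining the three tables in application code."""
    order = next(o for o in orders if o["order_id"] == order_id)
    customer = next(c for c in customers if c["customer_id"] == order["customer_id"])
    items = [
        {"sku": i["sku"], "qty": i["qty"]}
        for i in line_items
        if i["order_id"] == order_id
    ]
    return {"order_id": order_id, "customer": {"name": customer["name"]}, "items": items}


# Document-style layout: the order and its related data live in one document.
order_documents = {
    1: {
        "order_id": 1,
        "customer": {"name": "Ana"},
        "items": [{"sku": "A-100", "qty": 2}, {"sku": "B-200", "qty": 1}],
    }
}


def load_order_document(order_id):
    """Fetch the whole order with a single lookup."""
    return order_documents[order_id]
```

Both functions return the same structure, but the relational-style version needs three lookups and a join that the application must perform itself when the database has no native join support.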
TAKEAWAYS
The maturity and functionality of APIs vary significantly
across non-relational products.
MongoDB's idiomatic drivers minimize onboarding time
for new developers and simplify application
development.
Not all SQL is created equal. Carefully evaluate the
SQL-like APIs offered by non-relational databases to
ensure they can meet the needs of your application and
developers.
Commercial Support
TAKEAWAYS
Community size and commercial strength are an
important part of evaluating non-relational databases.
MongoDB has the largest commercial backing; the
largest and most active community; support teams
spread across the world providing 24x7 coverage;
user-groups in most major cities; and extensive
documentation.
Community Strength
There are significant advantages to having a strong
community around a technology, particularly databases. A
database with a strong community of users makes it easier
to find and hire developers that are familiar with the
product. It makes it easier to find best practices,
documentation and code samples, all of which reduce risk
Figure 1: MongoDB Nexus Architecture, blending the best
of relational and NoSQL technologies
Expressive query language. Users should be able to
access and manipulate their data in sophisticated ways
with powerful query, projection, aggregation and update
operators, to support both operational and analytical
applications.
Conclusion
We Can Help
We are the MongoDB experts. Over 2,000 organizations
rely on our commercial products, including startups and
more than a third of the Fortune 100. We offer software
and services to make your life easier:
MongoDB Enterprise Advanced is the best way to run
MongoDB in your data center. It's a finely tuned package
of advanced software, support, certifications, and other
services designed for the way you do business.
MongoDB Cloud Manager is the easiest way to run
MongoDB in the cloud. It makes MongoDB the system you
worry about the least and like managing the most.
MongoDB Professional helps you manage your
deployment and keep it running smoothly. It includes
support from MongoDB engineers, as well as access to
MongoDB Cloud Manager.
Development Support helps you get up and running quickly.
It gives you a complete package of software and services
for the early stages of your project.
Resources
For more information, please visit mongodb.com or contact
us at sales@mongodb.com.
Case Studies (mongodb.com/customers)
Presentations (mongodb.com/presentations)
Free Online Training (university.mongodb.com)
Webinars and Events (mongodb.com/events)
Documentation (docs.mongodb.org)
MongoDB Enterprise Download (mongodb.com/download)
New York Palo Alto Washington, D.C. London Dublin Barcelona Sydney Tel Aviv
US 866-237-8815 INTL +1-650-440-4474 info@mongodb.com
© 2015 MongoDB, Inc. All rights reserved.