
Comparison Between NoSQL and RDBMS


Comparison Between NoSQL and RDBMS (5 Marks)

| Feature        | NoSQL Databases                                    | RDBMS                                           |
|----------------|----------------------------------------------------|-------------------------------------------------|
| Schema         | Schema-less; flexible structure                    | Fixed schema; structured tables                 |
| Data Model     | Document, key-value, column-family, or graph-based | Table-based (rows and columns)                  |
| Scalability    | Horizontally scalable; easy to add servers         | Vertically scalable; limited horizontal scaling |
| Query Language | Limited query capabilities; often lacks joins      | SQL, supports complex queries and joins         |
| Consistency    | Eventual consistency; prioritizes availability     | Strong consistency with ACID properties         |
| Performance    | Optimized for high-speed read/write                | Optimized for transactions and complex queries  |
| Best Use Cases | Big data, unstructured data, real-time analytics   | Financial applications, ERP, CRM systems        |

Comparison Between MongoDB and RDBMS (5 Marks)

| Feature            | MongoDB                                                                                  | RDBMS                                             |
|--------------------|------------------------------------------------------------------------------------------|---------------------------------------------------|
| Data Model         | Document-oriented (JSON/BSON documents)                                                  | Table-based (rows and columns)                    |
| Schema Flexibility | Flexible; schema-less                                                                    | Fixed schema; requires predefined schema          |
| Scalability        | High horizontal scalability with sharding                                                | Vertical scaling, with limited horizontal scaling |
| Consistency        | Consistent reads from the primary by default; eventual consistency on secondary reads    | Strong consistency with ACID compliance           |
| Query Language     | MongoDB Query Language (simple)                                                          | SQL (structured queries with joins)               |
| Indexing           | Supports indexing on fields for faster queries                                           | Comprehensive indexing for optimized queries      |
| Best Use Cases     | Dynamic data applications, real-time analytics                                           | Financial, transactional, relational data         |
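
The data-model row can be made concrete with a small sketch. It uses Python's built-in sqlite3 module to stand in for an RDBMS and a plain dictionary to stand in for a MongoDB document; the table names, field names and sample values are assumptions made up for illustration.

```python
import sqlite3

# Relational side: two tables linked by a key, read back with a JOIN.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE courses  (student_id INTEGER REFERENCES students(id),
                           title TEXT NOT NULL);
    INSERT INTO students VALUES (1, 'Asha');
    INSERT INTO courses  VALUES (1, 'Databases'), (1, 'Big Data');
""")
rows = conn.execute(
    "SELECT s.name, c.title FROM students s "
    "JOIN courses c ON c.student_id = s.id ORDER BY c.title"
).fetchall()

# Document side: the related data is embedded in one JSON-like document,
# so reading a student with all courses needs no join.
student_doc = {
    "_id": 1,
    "name": "Asha",
    "courses": [{"title": "Databases"}, {"title": "Big Data"}],
}
doc_titles = sorted(course["title"] for course in student_doc["courses"])
```

Both shapes hold the same information. The relational form keeps each fact in one place and relies on the join, while the document form trades some duplication for a single read.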

NoSQL data stores and their characteristic features


Apache's HBase
HDFS-compatible, open-source, non-relational data store written in Java. A column-family
based NoSQL data store that provides BigTable-like capabilities: scalability, strong
consistency, versioning, and configurable, maintainable data store characteristics.
MongoDB
HDFS compatible; master-slave distribution model (Section 3.5.1.3); document-oriented
data store with JSON-like documents and dynamic schemas; open-source, NoSQL, scalable
and non-relational database; used at the backend by websites such as Craigslist, eBay and
Foursquare. (MongoDB is developed by MongoDB, Inc.; it is not an Apache project.)
Apache's Cassandra
HDFS-compatible; open-source; decentralized, peer-to-peer distribution model; NoSQL;
scalable, non-relational, column-family based and fault-tolerant, with tunable consistency;
used by Facebook and Instagram.
Apache's CouchDB
An Apache project that is also widely used as a database for the web. CouchDB is a
document store. It uses the JSON data exchange format to store its documents, JavaScript
for indexing, combining and transforming documents, and HTTP APIs.
Oracle NoSQL
Oracle's step towards a NoSQL data store; a distributed key-value data store that provides
transactional semantics for data manipulation, horizontal scalability, and simple
administration and monitoring.
Riak
An open-source key-value store written in Erlang; offers high availability (through
replication), fault tolerance, operational simplicity and scalability.
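
Tunable consistency, mentioned for Cassandra above, is usually explained with the quorum rule: with N replicas, a read that waits for R replicas is guaranteed to see the latest write acknowledged by W replicas when R + W > N. The function below is only a sketch of that rule; its name and parameters are illustrative and are not part of any database's API.

```python
def is_strongly_consistent(n_replicas, write_acks, read_acks):
    """Return True when every read quorum overlaps every write quorum (R + W > N)."""
    for acks in (write_acks, read_acks):
        if not 1 <= acks <= n_replicas:
            raise ValueError("acknowledgement counts must be between 1 and N")
    return read_acks + write_acks > n_replicas
```

For example, with three replicas, writing to two and reading from two overlaps in at least one replica, whereas writing to one and reading from one favours speed and availability and gives only eventual consistency.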
Explain Apache Sqoop's Import and Export Methods with Diagram.
Sqoop is a tool designed to transfer data between Hadoop and relational databases.

Sqoop is used to:

- import data from a relational database management system (RDBMS) into the
Hadoop Distributed File System (HDFS),

- transform the data in Hadoop, and

- export the data back into an RDBMS.


Sqoop import method:

The data import is done in two steps:

1) Sqoop examines the database to gather the necessary metadata for the data to be imported.

2) Sqoop submits a map-only Hadoop job, which transfers the actual data using the metadata.

The imported data are saved in an HDFS directory.

By default, Sqoop names the directory after the imported table; the user can instead specify an
alternative directory where the files should be written. By default, these files contain
comma-delimited fields, with new lines separating different records.
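
The two import steps can be sketched in plain Python. This is a simulation, not Sqoop itself: sqlite3 stands in for the source RDBMS, a list of strings stands in for the files in the HDFS directory, and `make_splits`, `import_table` and the `students` table are hypothetical names. Splitting on the minimum and maximum of a key column mirrors how Sqoop typically divides work among map tasks.

```python
import sqlite3

def make_splits(lo, hi, num_mappers):
    """Divide the inclusive key range [lo, hi] into contiguous sub-ranges."""
    size = -(-(hi - lo + 1) // num_mappers)  # ceiling division
    return [(start, min(start + size - 1, hi)) for start in range(lo, hi + 1, size)]

def import_table(conn, table, key, num_mappers=2):
    # Step 1: examine the database for metadata (column names, key range).
    # Identifiers are interpolated directly, so they are assumed to be trusted.
    columns = [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]
    lo, hi = conn.execute(f"SELECT MIN({key}), MAX({key}) FROM {table}").fetchone()
    if lo is None:  # empty table: nothing to transfer
        return columns, []
    # Step 2: one simulated map task per split reads its range and writes
    # comma-delimited records, one record per line.
    part_files = []
    for start, end in make_splits(lo, hi, num_mappers):
        rows = conn.execute(
            f"SELECT * FROM {table} WHERE {key} BETWEEN ? AND ? ORDER BY {key}",
            (start, end),
        )
        part_files.append("".join(",".join(str(v) for v in row) + "\n" for row in rows))
    return columns, part_files

source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT)")
source.executemany(
    "INSERT INTO students VALUES (?, ?)",
    [(1, "Asha"), (2, "Ben"), (3, "Chen"), (4, "Dev"), (5, "Eva")],
)
columns, part_files = import_table(source, "students", "id", num_mappers=2)
```

Each entry in `part_files` plays the role of one output file written by one map task.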

Sqoop export method:

Data export from the cluster works in a similar fashion. The export is done in two steps:

1) Sqoop examines the database for metadata.

2) A map-only Hadoop job writes the data to the database. Sqoop divides the input data set into
splits, then uses individual map tasks to push the splits to the database.
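
The export direction can be simulated the same way. In this sketch, sqlite3 again stands in for the target RDBMS, the input is a list of comma-delimited lines such as an import would produce, and `export_records` and the `students` table are hypothetical names; real Sqoop runs the per-split inserts as parallel map tasks rather than in a loop.

```python
import sqlite3

def export_records(conn, table, lines, num_mappers=2):
    # Step 1: examine the target table for metadata (its column list).
    # The table name is interpolated directly, so it is assumed to be trusted.
    columns = [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]
    placeholders = ",".join("?" for _ in columns)
    # Step 2: divide the input into splits; each simulated map task pushes
    # its own split to the database and commits it separately.
    records = [line.rstrip("\n").split(",") for line in lines if line.strip()]
    size = max(1, -(-len(records) // num_mappers))  # ceiling division
    splits = [records[i:i + size] for i in range(0, len(records), size)]
    for split in splits:
        conn.executemany(f"INSERT INTO {table} VALUES ({placeholders})", split)
        conn.commit()
    return len(splits)

target = sqlite3.connect(":memory:")
target.execute("CREATE TABLE students (id INTEGER, name TEXT)")
num_tasks = export_records(target, "students", ["1,Asha\n", "2,Ben\n", "3,Chen\n"])
exported = target.execute("SELECT id, name FROM students ORDER BY id").fetchall()
```

Because each split is committed on its own, a failure part-way through can leave the target table partially loaded, which is worth keeping in mind when reasoning about exports.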
Illustrate the Key-Value Based Data Architecture Pattern with Example & Mention Its
Limitations
The simplest way to implement a schema-less data store is to use key-value pairs. The data
store characteristics are high performance, scalability and flexibility. Data retrieval is fast in
a key-value pairs data store. A simple string, called the key, maps to a large data string or BLOB
(Binary Large Object). Key-value store accesses use a primary key for accessing the values.
Therefore, the store can be easily scaled up for very large data. The concept is similar to a
hash table where a unique key points to a particular item(s) of data. Figure 3.4 shows the
key-value pairs architectural pattern and an example of a students' database as key-value pairs.

Limitations of key-value store architectural pattern are:


1. No indexes are maintained on values, thus a subset of values is not searchable.
2. A key-value store does not provide traditional database capabilities, such as atomicity of
transactions, or consistency when multiple transactions are executed simultaneously. The
application needs to implement such capabilities.
3. Maintaining unique values as keys may become more difficult when the volume of data
increases. One cannot retrieve a single result when a key-value pair is not uniquely
identified.
4. Queries cannot be performed on individual values. There is no clause like 'where' in a
relational database that can be used to filter a result set.
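
A minimal sketch of the pattern and of limitations 1 and 4, using a Python dictionary as the hash table. The `KeyValueStore` class, the `student:<id>` key format and the sample records are assumptions for illustration; values are stored as opaque bytes to mimic BLOBs.

```python
import json

class KeyValueStore:
    """In-memory key-value store: a unique key maps to an opaque value (BLOB)."""

    def __init__(self):
        self._data = {}

    def put(self, key, blob):
        self._data[key] = blob  # overwrites any existing value for the key

    def get(self, key):
        return self._data.get(key)  # direct lookup by primary key

    def scan(self, predicate):
        # No index exists on values: the application must visit every pair
        # and interpret each BLOB itself to emulate a 'where' clause.
        return [key for key, blob in self._data.items() if predicate(blob)]

store = KeyValueStore()
store.put("student:101", json.dumps({"name": "Asha", "grade": "A"}).encode())
store.put("student:102", json.dumps({"name": "Ben", "grade": "B"}).encode())
store.put("student:103", json.dumps({"name": "Chen", "grade": "A"}).encode())

asha = json.loads(store.get("student:101"))
grade_a = store.scan(lambda blob: json.loads(blob)["grade"] == "A")
```

The lookup by key touches one entry, while the search by grade decodes every stored value, which is why filtering on values does not scale in this pattern.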
Explain the Shared Nothing Architecture
In an RDBMS, the columns of two tables are related through a relationship, which a
relational-algebra expression specifies, and keys are shared between two or more SQL tables.
Shared nothing (SN), by contrast, is a cluster architecture in which a node does not share
data with any other node. The data of different data stores is partitioned among the nodes
(for example, by assigning different computers to deal with different users or queries).
Processing may require every node to maintain its own copy of the application's data, using
a coordination protocol. Examples of systems that use this partitioning and processing are
Hadoop, Flink and Spark.
The features of SN architecture are as follows:
1. Independence: Each node shares no memory with other nodes and is thus computationally
self-sufficient
2. Self-healing: A link failure causes creation of another link
3. Each node functions as a shard: Each node stores a shard (a partition of a large database)
4. No network contention
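
The partitioning idea can be sketched as hash-based sharding, where each key belongs to exactly one node and every node keeps only its own shard. The `Node` and `Cluster` classes and the node names are hypothetical; real systems add replication, rebalancing and failure handling that this sketch leaves out.

```python
import hashlib

class Node:
    """A node owns its shard exclusively; no other node reads or writes it."""

    def __init__(self, name):
        self.name = name
        self.shard = {}

class Cluster:
    def __init__(self, node_names):
        self.nodes = [Node(name) for name in node_names]

    def node_for(self, key):
        # A stable hash of the key selects exactly one owning node.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return self.nodes[int.from_bytes(digest[:8], "big") % len(self.nodes)]

    def put(self, key, value):
        self.node_for(key).shard[key] = value

    def get(self, key):
        return self.node_for(key).shard.get(key)

cluster = Cluster(["node-a", "node-b", "node-c"])
keys = [f"user:{i}" for i in range(20)]
for k in keys:
    cluster.put(k, {"id": k})
```

A request for a key is routed to one node only, so nodes do not contend for shared memory or disk. The simple modulo placement used here would move most keys if a node were added, which is one reason production systems tend to prefer schemes such as consistent hashing.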
Explain MongoDB Database with Features

MongoDB Features

1. Open Source and Non-relational: MongoDB is an open-source, NoSQL database
management system, which means it is freely available and does not require the
relational structure of traditional databases.
2. Document-based Data Store: Data in MongoDB is stored as JSON-like documents
encoded in BSON (Binary JSON), which supports a flexible and dynamic schema that
allows storing various types of information together in the same collection.
3. Schema-less Structure: MongoDB does not require a predefined schema. This
allows documents in the same collection to have different structures, making it easier
to handle unstructured data and make schema changes as needed.
4. Collections and Documents: Collections in MongoDB are analogous to tables in
relational databases, but they store documents (analogous to rows) that can have
different fields. This allows schema-less, flexible storage and eliminates the need for
rigid table definitions.
5. BSON Serialization: MongoDB stores documents in BSON format, a binary
representation of JSON. BSON enables faster data retrieval and is optimized for
efficient storage, especially with hierarchical data.
6. Cross-platform Compatibility: MongoDB is compatible with multiple platforms,
including Windows, Linux, and macOS, making it suitable for a variety of
development environments.
7. Indexing and Querying: MongoDB allows indexing on any field within a document,
enhancing query performance. It supports a powerful query language with dynamic
queries and real-time aggregation capabilities.
8. High Availability through Replication: MongoDB supports replication, where
multiple copies of data are maintained across different servers. This enhances data
availability and provides fault tolerance, reducing downtime.
9. Horizontal Scalability: MongoDB supports horizontal scaling, which allows you to
distribute data across multiple servers. This helps in handling high-traffic and large
datasets effectively.
10. Atomic Operations on Single Documents: MongoDB supports atomic operations on
individual documents, ensuring that single-document changes are safe and consistent.
Multi-document ACID transactions are available from version 4.0 onwards; earlier
versions did not support them.
11. No Complex Joins: MongoDB avoids complex joins and instead uses embedded
documents and linking, which improves performance and simplifies queries. When a
join is needed, the aggregation pipeline's $lookup stage provides a left outer join.
12. Real-time Updates and Fast In-place Updates: MongoDB supports in-place
updates, meaning updates can be made without moving data or creating new storage
locations. This improves performance for frequently updated data.
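13. Memory Management: MongoDB caches frequently accessed data in RAM to improve
performance. The older MMAPv1 storage engine did this through memory-mapped
files; the WiredTiger engine, the default since version 3.2, uses its own internal cache.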
14. Application-friendly: MongoDB does not require data mapping or transformation
between the application layer and the database. This minimizes complexity and
enables faster development and prototyping.
15. Use Cases: MongoDB is commonly used for content management systems,
e-commerce applications, analytics, real-time data processing, logging, and mobile
applications.
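
Features 3, 4 and 7 can be illustrated without a running server. The sketch below imitates a collection with a Python list of dictionaries; `find` and `build_index` are hypothetical helpers that only mimic the idea of MongoDB's filter documents and field indexes, and they are not the MongoDB or PyMongo API.

```python
# Documents in the same collection may have different fields (schema-less),
# and related data can be embedded instead of joined.
collection = [
    {"_id": 1, "name": "Asha", "email": "asha@example.com"},
    {"_id": 2, "name": "Ben", "address": {"city": "Pune"}, "tags": ["mobile"]},
]

def find(docs, query):
    """Return documents whose top-level fields equal every value in `query`."""
    return [d for d in docs if all(d.get(f) == v for f, v in query.items())]

def build_index(docs, field):
    """Map each value of `field` to the documents holding it, skipping
    documents that lack the field."""
    index = {}
    for d in docs:
        if field in d:
            index.setdefault(d[field], []).append(d)
    return index

ben = find(collection, {"name": "Ben"})
name_index = build_index(collection, "name")
```

Here `find` scans every document, whereas a lookup in `name_index` goes straight to the matching documents, which is the effect an index on a field is meant to achieve.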
