1.5 Module-1
Session  Topic
1.1   Types of Digital Data – Characteristics of Data – Evolution of Big Data – Definition of Big Data – Challenges with Big Data.
1.2   3Vs of Big Data – Non-Definitional traits of Big Data – BI vs. Big Data – Data warehouse and Hadoop environment – Coexistence.
1.3   Big Data Analytics: Classification of analytics – Data Science.
1.4   Terminologies in Big Data – CAP Theorem – BASE Concept.
1.5   NoSQL: Types of Databases – Advantages – NewSQL – SQL vs. NoSQL vs. NewSQL.
1.6   Introduction to Hadoop: Features – Advantages – Versions – Overview of Hadoop Ecosystems.
1.7   Hadoop distributions – Hadoop vs. SQL – RDBMS vs. Hadoop.
1.8   Hadoop Components – Architecture – HDFS.
1.9   Map Reduce: Mapper – Reducer – Combiner – Partitioner – Searching – Sorting – Compression.
1.10  Hadoop 2 (YARN): Architecture – Interacting with Hadoop Ecosystems.
MODULE 1 Introduction to Big Data
1.5 NoSQL
Course Outcome:
Upon completion of the session, students shall have ability to
NoSQL
NoSQL stands for "Not Only SQL". NoSQL databases are:
• non-relational
• open source
• distributed

Features of NoSQL
1. NoSQL databases are non-relational.
2. Distributed.
3. No support for ACID properties; they adhere to the CAP theorem.
4. No fixed table schema.
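
To make "no fixed table schema" concrete, here is a minimal sketch in Python. It assumes a toy in-memory key-document store; the names `collection`, `put`, and `fields_in_use` are hypothetical and do not belong to any real NoSQL product. The point is only that two records in the same collection can carry different fields.

```python
# Hypothetical in-memory "collection"; this is NOT the API of a real NoSQL database.
collection = {}


def put(key, document):
    """Store a document (any dict) under a key; no schema is checked."""
    collection[key] = document


def fields_in_use():
    """Return the set of every field name seen across all stored documents."""
    return {field for doc in collection.values() for field in doc}


# Two records with different shapes live side by side.
put("u1", {"name": "Asha", "email": "asha@example.com"})
put("u2", {"name": "Ravi", "phones": ["555-0100", "555-0101"], "city": "Pune"})
```

In a relational table, adding `phones` or `city` would usually require altering the table definition first; here the second document simply brings its own fields.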
Disadvantages:
• Limited query capabilities.
• RDBMS databases and tools are comparatively more mature.
• It does not offer traditional database capabilities such as consistency when multiple transactions are performed simultaneously.
• As the volume of data increases, it becomes difficult to keep keys unique.
• Doesn't work as well with relational data.
• Many options are open source, which makes them less popular with enterprises.
• No support for join and group-by operations.
NewSQL Characteristics:
• SQL interface for application interaction.
• ACID support for transactions.
• An architecture that provides higher per-node performance vis-à-vis a traditional RDBMS solution.
• Scale-out, shared-nothing architecture.
• Non-locking concurrency control mechanism, so that real-time reads will not conflict with writes.
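
One common way to let reads proceed without blocking on writes is to keep multiple versions of each value (multi-version concurrency control). The slides do not say which mechanism a given NewSQL system uses, so the following is only a toy sketch under that assumption; `VersionedStore` and its methods are hypothetical names.

```python
class VersionedStore:
    """Toy multi-version store: writers append versions, readers use snapshots."""

    def __init__(self):
        self._versions = {}  # key -> list of (commit_ts, value), oldest first
        self._clock = 0      # logical commit timestamp

    def write(self, key, value):
        """Append a new version instead of overwriting the old one."""
        self._clock += 1
        self._versions.setdefault(key, []).append((self._clock, value))
        return self._clock

    def snapshot(self):
        """Return a timestamp that fixes the reader's view of the data."""
        return self._clock

    def read(self, key, snapshot_ts):
        """Return the newest value committed at or before snapshot_ts."""
        for ts, value in reversed(self._versions.get(key, [])):
            if ts <= snapshot_ts:
                return value
        return None


store = VersionedStore()
store.write("balance", 100)
snap = store.snapshot()      # a reader starts here
store.write("balance", 80)   # a later write does not disturb that reader
```

A reader holding `snap` still sees the value 100, while a reader that takes a fresh snapshot sees 80. A real engine would also need garbage collection of old versions and conflict detection between writers, which this sketch omits.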
SQL vs. NoSQL vs. NewSQL

                              SQL                            NoSQL                           NewSQL
Adherence to ACID properties  Yes                            No                              Yes
OLTP/OLAP                     Yes                            No                              Yes
Schema rigidity               Yes                            No                              Maybe
Adherence to data model       Adherence to relational model
Scalability                   Scale up (vertical scaling)    Scale out (horizontal scaling)  Scale out
Community support             Huge                           Growing                         Slowly growing
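
The "Adherence to ACID properties" row can be illustrated on the SQL side with Python's built-in `sqlite3` module. This is a minimal sketch, assuming a made-up `accounts` table and a hypothetical `transfer` helper: if the second statement of a transaction fails, the first one is rolled back as well (atomicity).

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst; return False if the transaction fails."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            # This violates the CHECK constraint when src would go negative.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balance(conn, account_id):
    """Return the current balance of one account."""
    row = conn.execute(
        "SELECT balance FROM accounts WHERE id = ?", (account_id,)
    ).fetchone()
    return row[0]


# Made-up schema and data for the example.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (id TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("a", 100), ("b", 0)])
conn.commit()
```

Calling `transfer(conn, "a", "b", 500)` credits `b` first and then fails on the debit, so the whole transaction is undone and both balances stay as they were. A store without ACID transactions would leave it to the application to detect and repair such a half-applied update.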
Test your Knowledge
Next Session…