Big Data Analytics - Unit1
Unit - 1
What is data?
Dictionary Definition:
The quantities, characters, or symbols on which
operations are performed by a computer, which may
be stored and transmitted in the form of electrical
signals and recorded on magnetic (audio tape), optical
(CD), or mechanical recording media (Phonographic
disc)
What is big data?
It is a collection of data that is huge in volume and yet
growing exponentially with time.
Unstructured (80%)
Structured (10%)
Semi-structured (10%)
Un-structured Data
10 Characteristics of Big Data
The 10 V's of Big Data
Volume: Data size
Velocity: Speed at which data is generated
Variety: Different types of data
Variability: Dynamic, evolving behavior in data science
Veracity: Confidence or trust in data; data accuracy
Validity: Data quality check
Volatility: Data governance
Vulnerability: Security concerns
Visualization: Data clustering, sunbursts
Value: Getting value out of data
Unstructured data for Analytics
Business Documents
Emails
Social Media
Customer feedback
Webpages
Open-ended survey responses
Images, Audio and Video
Clean Data
Reduce noise
Eliminate unwanted information
Implement Technology
NoSQL databases
Data visualization using Tableau or Google Data Studio
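The "Clean Data" step above (reduce noise, eliminate unwanted information) can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed pipeline: the stop-word list, the sample sentence, and the function name clean_text are all assumptions made for the example.

```python
import html
import re

# Hypothetical stop-word list for illustration; real projects use a fuller one.
STOP_WORDS = {"the", "a", "an", "is", "and", "to", "of"}


def clean_text(raw: str) -> list[str]:
    """Return lowercase word tokens with markup, URLs, and stop words removed."""
    text = html.unescape(raw)                  # decode entities such as &amp;
    text = re.sub(r"<[^>]+>", " ", text)       # drop HTML tags (noise)
    text = re.sub(r"https?://\S+", " ", text)  # drop URLs (unwanted information)
    tokens = re.findall(r"[a-z']+", text.lower())  # keep alphabetic words only
    return [t for t in tokens if t not in STOP_WORDS]


# Hypothetical piece of customer feedback scraped from a web page.
sample = "<p>The delivery is LATE &amp; the box is damaged! See http://example.com</p>"
print(clean_text(sample))
```

The cleaned token list is what later steps such as text mining or word counting would typically consume.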
Unstructured Data Analytics
Dealing with unstructured data:
Data Mining
• Association rule mining
• Regression analysis
• Collaborative filtering
Text Mining
NLP
Noisy Text Analysis
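Association rule mining, listed under data mining above, scores a rule such as "bread -> milk" by its support and confidence. The sketch below computes both measures with plain Python sets. The baskets and the function names support and confidence are hypothetical, chosen only to illustrate the two measures; full algorithms such as Apriori add a search over candidate itemsets on top of this.

```python
# Hypothetical shopping baskets; each transaction is a set of items.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def support(itemset: set, baskets: list) -> float:
    """Fraction of baskets that contain every item in itemset."""
    hits = sum(1 for basket in baskets if itemset <= basket)
    return hits / len(baskets)


def confidence(antecedent: set, consequent: set, baskets: list) -> float:
    """Estimate of P(consequent | antecedent) from the baskets."""
    return support(antecedent | consequent, baskets) / support(antecedent, baskets)


# Score the rule "bread -> milk".
rule_support = support({"bread", "milk"}, transactions)
rule_confidence = confidence({"bread"}, {"milk"}, transactions)
print(f"support={rule_support:.2f} confidence={rule_confidence:.2f}")
```

A rule is usually kept only when both values exceed thresholds chosen by the analyst.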
Unstructured Data Analytics Tools
MonkeyLearn – Used for Text Analytics.
This tool makes it simple to clean, label and visualize customer
feedback
Word Clouds - a visualization of text data that shows at a single glance
which words occur most frequently within a given body of text
Listen to the customer's voice – open-ended surveys and emails.
Aspect-based sentiment analysis
Amazon AWS
Microsoft Azure
IBM Cloud
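The word frequencies that a word cloud displays can be computed with Python's standard library. This is a rough sketch of the counting step only, not how any of the tools above work internally; the feedback string and the function name word_frequencies are assumptions for illustration.

```python
import re
from collections import Counter


def word_frequencies(text: str, top_n: int = 3) -> list[tuple[str, int]]:
    """Return the top_n most frequent words with their counts."""
    words = re.findall(r"[a-z']+", text.lower())  # split into lowercase words
    return Counter(words).most_common(top_n)


# Hypothetical customer feedback.
feedback = "Great phone. Battery is great, camera is great, battery lasts."
print(word_frequencies(feedback))
```

A word cloud would then draw each word at a size proportional to its count.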
The Advantages of Deploying Big Data
Better Decision making
Cost Reduction
Risk Analysis
Collection of Data
Industries using Big Data
CA Technologies conducted a global study which found that the benefits of
big data clearly outweigh the obstacles to implementation
84% of organizations plan to implement, or have already implemented,
a big data project
Acquisition has increased to 54%; revenue has improved by 88%.
Optimize Price
Forecast demand
Manufacturing
Quality Assurance
Less downtime