11.2 Everything Generates Data-MIQ
© 2016 Cisco and/or its affiliates. All rights reserved. Cisco Confidential 2
Big Data
What is Big Data?
▪ Data is information that comes from a variety
of sources, such as people, pictures, text,
sensors, web sites, and technology devices.
Large Datasets
▪ Companies do not necessarily have to
generate their own Big Data.
▪ There are sources of free data sets
available, ready to be used and
analyzed.
Lab – Database Search
Where is Big Data Stored?
What are the Challenges of Big Data?
▪ IBM’s Big Data estimates conclude that
“each day we create 2.5 quintillion bytes
of data”.
▪ Five major storage problems with Big
Data:
• Management
• Security
• Redundancy
• Analytics
• Access
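To get a feel for the scale of the IBM estimate quoted above, the figure can be converted into more familiar units. This is a rough sketch only: it assumes "quintillion" means 10**18 (short scale) and uses decimal units (1 TB = 10**12 bytes, 1 EB = 10**18 bytes). The function names are illustrative.

```python
# Assumption: "quintillion" is the short-scale value 10**18.
BYTES_PER_DAY = 2.5 * 10**18
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def to_exabytes(num_bytes):
    """Convert bytes to exabytes (decimal: 1 EB = 10**18 bytes)."""
    return num_bytes / 10**18


def to_terabytes(num_bytes):
    """Convert bytes to terabytes (decimal: 1 TB = 10**12 bytes)."""
    return num_bytes / 10**12


bytes_per_second = BYTES_PER_DAY / SECONDS_PER_DAY

print(f"Per day:    {to_exabytes(BYTES_PER_DAY):.1f} EB")
print(f"Per second: {to_terabytes(bytes_per_second):.1f} TB")
```

Under these assumptions the estimate works out to 2.5 EB per day, or roughly 29 TB every second, which is why management, redundancy, and access become storage problems in their own right.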
Where Can We Store Big Data?
▪ Big data is typically stored on multiple
servers, in data centers.
▪ Fog computing utilizes end-user clients or
“edge” devices to do a substantial amount of
the pre-processing and storage.
• Data from that pre-processed analysis can be
fed back into the companies’ systems to modify
processes if required.
• Communications to and from the servers and
devices are quicker and require less bandwidth
than constantly going out to the cloud.
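The pre-processing idea above can be sketched in a few lines. In this hypothetical example, an edge device summarizes its raw sensor readings locally and would forward only the small summary, not every reading. The function name `summarize_readings`, the alert threshold, and the readings themselves are assumptions made for illustration, not part of any particular fog platform.

```python
from statistics import mean


def summarize_readings(readings, alert_threshold):
    """Pre-process raw readings on the edge device.

    Returns a small summary dict instead of the full list of readings.
    """
    return {
        "count": len(readings),
        "min": min(readings),
        "max": max(readings),
        "mean": round(mean(readings), 2),
        # Number of readings above the (hypothetical) alert threshold.
        "alerts": sum(1 for r in readings if r > alert_threshold),
    }


# Hypothetical temperature readings (degrees Celsius) collected locally.
raw_readings = [21.4, 21.6, 21.5, 35.2, 21.7, 21.5]

summary = summarize_readings(raw_readings, alert_threshold=30.0)
print(summary)
```

Sending one summary record in place of many raw readings is one way the bandwidth saving described above could be realized; a real deployment would also decide when raw data still needs to be kept.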
The Cloud and Cloud Computing
▪ The cloud is a collection of data centers or groups of connected
servers.
▪ Cloud services for individuals include:
• Storage of data, such as pictures, music, movies, and emails.
• Access to many applications without downloading them onto a local device.
• Access to data and applications anywhere, anytime, and on any device.
▪ Cloud services for an enterprise include:
• Access to organizational data anywhere and at any time.
• Streamlined IT operations.
• Less or no need for onsite IT equipment, maintenance,
and management.
• Lower costs for equipment, energy, physical plant requirements, and
personnel training.
Distributed Processing
▪ Distributed data processing takes the large
volume of data and breaks it into smaller pieces.
▪ These smaller pieces are distributed in many
locations to be processed by many computers.
▪ Each computer in the distributed architecture
analyzes its part of the Big Data picture (horizontal
scaling).
▪ Hadoop was created to deal with these Big Data
volumes. It has two main features that have made it
the industry standard:
• Scalability - Larger cluster sizes improve
performance and provide higher data processing
capabilities.
• Fault tolerance - Hadoop automatically replicates
data across the nodes of a cluster, so a copy
survives if one node fails.
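The split, process, and combine pattern described above can be illustrated with a small word count. This is a minimal sketch, not Hadoop itself: threads on one machine stand in for the many computers of a real cluster, and the function names and sample log lines are made up for the example.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(items, num_chunks):
    """Break a large list into roughly equal smaller pieces."""
    if not items:
        return []
    chunk_size = -(-len(items) // num_chunks)  # ceiling division
    return [items[i:i + chunk_size] for i in range(0, len(items), chunk_size)]


def count_words(lines):
    """Work done by one worker: count words in its own piece only."""
    counts = Counter()
    for line in lines:
        counts.update(line.lower().split())
    return counts


def distributed_word_count(lines, num_workers=3):
    """Split the data, process each piece in parallel, then combine."""
    chunks = split_into_chunks(lines, num_workers)
    # Threads stand in for separate computers in this sketch.
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partial_counts = list(pool.map(count_words, chunks))
    total = Counter()
    for partial in partial_counts:
        total.update(partial)
    return total


# Hypothetical log lines standing in for a much larger data set.
log_lines = [
    "sensor online",
    "sensor reading received",
    "reading stored",
    "sensor offline",
]
totals = distributed_word_count(log_lines, num_workers=2)
print(totals.most_common(2))
```

Because each worker only sees its own piece, adding workers (horizontal scaling) lets the same code handle more data; the final combine step is what turns the partial results back into one answer.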
Supporting Business with Big Data
Why Do Businesses Analyze Data?
▪ Data analytics allows businesses to better
understand the impact of their products and
services, adjust their methods and goals, and
provide their customers with better products faster.
Sources of Information
▪ Data originates from sensors and anything that has
been scanned, entered, and released to the Internet.
Chart Types
Analyzing Big Data for Effective Use in Business
▪ Data analysis is the process of inspecting,
cleaning, transforming, and modeling data to
uncover useful information.
▪ Having a strategy helps a business determine
the type of analysis required and the best tool to
do the analysis.
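The four steps named above (inspecting, cleaning, transforming, modeling) can be walked through on a tiny data set. The sketch below is illustrative only: the monthly sales figures are invented, and a straight-line least-squares fit is just one simple choice of model.

```python
from statistics import mean

# Hypothetical raw monthly sales; some values are missing or malformed.
raw_sales = [
    ("Jan", "100"), ("Feb", "110"), ("Mar", None),
    ("Apr", "130"), ("May", "n/a"), ("Jun", "150"),
]


def clean_and_transform(records):
    """Drop unusable records and convert the rest to (month_number, value)."""
    points = []
    for month_number, (_, value) in enumerate(records, start=1):
        try:
            points.append((month_number, float(value)))
        except (TypeError, ValueError):
            continue  # missing (None) or non-numeric ("n/a") value
    return points


def fit_line(points):
    """Least-squares fit of y = slope * x + intercept."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_bar, y_bar = mean(xs), mean(ys)
    slope = (
        sum((x - x_bar) * (y - y_bar) for x, y in points)
        / sum((x - x_bar) ** 2 for x in xs)
    )
    return slope, y_bar - slope * x_bar


# Inspecting: how many records are usable?
points = clean_and_transform(raw_sales)
print(f"{len(points)} of {len(raw_sales)} records usable")

# Modeling: fit a trend line and project the next month (month 7).
slope, intercept = fit_line(points)
forecast_july = slope * 7 + intercept
print(f"Trend: {slope:.1f} per month, July forecast: {forecast_july:.1f}")
```

A different business question would call for a different model or tool, which is the point of having a strategy before starting the analysis.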
Excel lab: Forecasting
Chapter Summary
Summary
▪ Three characteristics of Big Data:
• large amount of data that increasingly requires more storage space (volume)
• growing exponentially fast (velocity)
• generated in different formats (variety)
▪ Fog computing utilizes end-user clients or “edge” devices to do pre-processing and storage.
• Designed to keep the data closer to the source for pre-processing.
▪ The cloud is a collection of data centers or groups of connected servers giving anywhere,
anytime access to software, storage, and services using a browser interface.
• Provides increased data storage and reduces the need for onsite IT equipment, maintenance, and
management.
▪ Distributed data processing takes large volumes of data from a source, breaks them into
smaller pieces, and distributes the pieces to many locations to be processed.
• Each computer in the distributed architecture analyzes its part of the Big Data picture.
Summary (Cont.)
▪ Businesses gain value by collecting and analyzing data to understand the impact of their
products and services, adjust their methods and goals, and provide their customers with
better products faster.
▪ Structured data is created by applications that use “fixed” format input such as spreadsheets or
medical forms.
▪ Unstructured data is generated in a “freeform” style such as audio, video, web pages, and tweets.
▪ Data mining is the process of turning raw data into meaningful information by discovering patterns
and relationships in large data sets.
▪ Data visualization is the process of taking the analyzed data and using charts such as line, column,
bar, pie, or scatter to present meaningful information.
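The difference between structured and unstructured data in the summary above shows up in how each is read by a program. The sketch below is a small illustration with invented sample data: the structured records can be read by field name, while the freeform text has to be searched for patterns.

```python
import csv
import io
import re

# Structured data: fixed-format input, every record has the same named fields.
# (Hypothetical form data, invented for this example.)
structured_text = (
    "patient_id,age,blood_pressure\n"
    "101,54,120/80\n"
    "102,61,135/85\n"
)
rows = list(csv.DictReader(io.StringIO(structured_text)))
ages = [int(row["age"]) for row in rows]

# Unstructured data: freeform text with no predefined fields.
# (Hypothetical tweet, invented for this example.)
tweet = "Loving the new #sensor dashboard, setup took 5 minutes! #IoT"
hashtags = re.findall(r"#(\w+)", tweet)

print("Ages from structured records:", ages)
print("Hashtags found in freeform text:", hashtags)
```

With structured data the field names do most of the work; with unstructured data the analysis has to discover what is in the text first, which is where data mining techniques come in.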