
11.2 Everything Generates Data-MIQ

Big data comes from a variety of sources and is characterized by its large volume, velocity, and variety. It is challenging to store and analyze due to its size. Businesses use big data analytics to better understand their customers and operations. Data is stored in data centers, the cloud, and edge devices. Businesses analyze structured and unstructured data to gain insights, make decisions, and improve products and services.

Uploaded by

Haidar Azzafran

Everything Generates Data

Introduction to the Internet of Things v2.0


Sections & Objectives
▪ Big Data
• Explain the concept of Big Data.
• Describe the sources of Big Data.
• Explain the challenges and solutions to Big Data storage.
• Explain how Big Data analytics are used to support Business.

© 2016 Cisco and/or its affiliates. All rights reserved. Cisco Confidential 2
Big Data

What is Big Data?
What is Big Data?
▪ Data is information that comes from a variety of sources, such as people, pictures, text, sensors, web sites, and technology devices.
▪ Three characteristics that indicate an organization may be dealing with Big Data:
• A large amount of data that increasingly requires more storage space (volume).
• An amount of data that is growing exponentially fast (velocity).
• Data that is generated in different formats (variety).
▪ Examples of data amounts collected by sensors:
• One autonomous car can generate 4,000 gigabits (Gb) of data per day.
• One smart connected home can produce as much as 1 gigabyte (GB) of information a week.
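
The two sensor figures above use different units (gigabits versus gigabytes), so a quick conversion helps when comparing them. The following is a minimal sketch, assuming 8 bits per byte and a 7-day week; the function and variable names are illustrative and not part of the course material.

```python
BITS_PER_BYTE = 8
DAYS_PER_WEEK = 7


def gigabits_to_gigabytes(gigabits: float) -> float:
    """Convert gigabits (Gb) to gigabytes (GB)."""
    return gigabits / BITS_PER_BYTE


# Autonomous car: 4,000 Gb per day, expressed in GB per day.
car_gb_per_day = gigabits_to_gigabytes(4000)

# Smart home: up to 1 GB per week, expressed in GB per day.
home_gb_per_day = 1 / DAYS_PER_WEEK

# How many times more data the car produces per day than the home.
ratio = car_gb_per_day / home_gb_per_day

print(f"Car:  {car_gb_per_day:.0f} GB/day")
print(f"Home: {home_gb_per_day:.2f} GB/day")
print(f"Ratio: about {ratio:.0f} to 1")
```

Under these assumptions the car figure works out to 500 GB per day, roughly 3,500 times the daily output of the smart home.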
What is Big Data?
Does the Business Generate Big Data?

What is Big Data?
Large Datasets
▪ Companies do not necessarily have to generate their own Big Data.
▪ Free data sets are available, ready to be used and analyzed.

What is Big Data?
Lab – Database Search

Where is Big Data Stored?
What are the Challenges of Big Data?
▪ IBM’s Big Data estimates conclude that “each day we create 2.5 quintillion bytes of data”.
▪ Five major storage problems with Big Data:
• Management
• Security
• Redundancy
• Analytics
• Access

Where is Big Data Stored?
Where Can We Store Big Data?
▪ Big Data is typically stored on multiple servers in data centers.
▪ Fog computing utilizes end-user clients or “edge” devices to do a substantial amount of the pre-processing and storage.
• Data from that pre-processed analysis can be fed back into the company’s systems to modify processes if required.
• Communication between the servers and the devices is quicker and requires less bandwidth than constantly going out to the cloud.
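
The bandwidth saving described above comes from sending a small summary upstream instead of every raw reading. The following is a minimal sketch of that idea; the function name `summarize_at_edge`, the alert threshold, and the temperature samples are all hypothetical and chosen only for illustration.

```python
from statistics import mean


def summarize_at_edge(readings: list[float], alert_above: float) -> dict:
    """Reduce a batch of raw sensor readings to one small summary record.

    Only this summary, not the raw batch, would be sent on to the servers.
    """
    return {
        "count": len(readings),
        "min": min(readings),
        "max": max(readings),
        "mean": round(mean(readings), 2),
        # Keep the individual readings that need attention upstream.
        "alerts": [r for r in readings if r > alert_above],
    }


# Made-up temperature samples collected by an edge device.
raw = [21.0, 21.5, 22.0, 35.0, 21.5, 21.0]
summary = summarize_at_edge(raw, alert_above=30.0)
print(summary)
```

In a real deployment the batch would hold far more readings than this, so a fixed-size summary is much smaller than the raw data it replaces.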

Where is Big Data Stored?
The Cloud and Cloud Computing
▪ The cloud is a collection of data centers or groups of connected servers.
▪ Cloud services for individuals include:
• Storage of data, such as pictures, music, movies, and emails.
• Access to many applications without downloading them onto a local device.
• Access to data and applications anywhere, anytime, and on any device.
▪ Cloud services for an enterprise include:
• Access to organizational data anywhere and at any time.
• Streamlined IT operations for the organization.
• Less or no need for onsite IT equipment, maintenance, and management.
• Reduced cost for equipment, energy, physical plant requirements, and personnel training needs.
Where is Big Data Stored?
Distributed Processing
▪ Distributed data processing takes the large volume of data and breaks it into smaller pieces.
▪ These smaller pieces are distributed to many locations to be processed by many computers.
▪ Each computer in the distributed architecture analyzes its part of the Big Data picture (horizontal scaling).
▪ Hadoop was created to deal with these Big Data volumes. It has two main features that have made it the industry standard:
• Scalability - Larger cluster sizes improve performance and provide higher data processing capabilities.
• Fault tolerance - Hadoop automatically replicates data across the nodes of a cluster.
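
The split, process, and combine pattern described above can be sketched on a single machine. This is not Hadoop and does not use its API: the example below assumes a thread pool as a stand-in for separate computers, and the function names and sample lines are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(items, n_chunks):
    """Break the input into at most n_chunks pieces of similar size."""
    if not items:
        return []
    size = -(-len(items) // n_chunks)  # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]


def count_words(lines):
    """Work done by one worker: count the words in its own piece only."""
    counts = Counter()
    for line in lines:
        counts.update(line.lower().split())
    return counts


def distributed_word_count(lines, n_workers=3):
    """Split the data, process each piece in parallel, then merge the results."""
    chunks = split_into_chunks(lines, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(count_words, chunks))
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


# Made-up input standing in for a large data set.
lines = ["big data big volume", "data velocity", "data variety", "big clusters"]
totals = distributed_word_count(lines)
print(totals)
```

Each worker sees only its own chunk, which is what allows more workers to be added as the data grows (horizontal scaling). Replication for fault tolerance is not modeled here.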
Supporting Business with Big Data
Why Do Businesses Analyze Data?
▪ Data analytics allows businesses to better understand the impact of their products and services, adjust their methods and goals, and provide their customers with better products faster.
▪ Value comes from two primary types of processed data: transactional and analytical.
▪ Transactional information is captured and processed as events happen.
• Used to analyze daily sales reports and production schedules to determine how much inventory to carry.
▪ Analytical information supports managerial analysis tasks like determining whether the organization should build a new manufacturing plant.

Supporting Business with Big Data
Sources of Information
▪ Data originates from sensors and anything that has been scanned, entered, and released to the Internet.
▪ Collected data can be categorized as structured or unstructured.
▪ Structured data is created by applications that use “fixed” format input such as spreadsheets. It may need to be manipulated into a common format such as CSV.
▪ Unstructured data is generated in a “freeform” style such as audio, video, web pages, and tweets.
▪ Examples of tools to prepare unstructured data for processing are:
• “Web scraping” tools, which automatically extract data from HTML pages.
• RESTful application programming interfaces (APIs).
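
The web scraping step and the conversion to a common format such as CSV can be illustrated together. The following is a minimal sketch using only Python's standard library (`html.parser` and `csv`); the class name `TableRowParser` and the HTML snippet are hypothetical, and real pages usually need more robust handling than this.

```python
import csv
import io
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect the text of each <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None       # cells of the row currently being read
        self._in_cell = False  # True while inside a <td> or <th>

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell:
            self._row[-1] += data.strip()


# Made-up HTML standing in for a scraped page.
html = """
<table>
  <tr><th>sensor</th><th>reading</th></tr>
  <tr><td>thermostat</td><td>21.5</td></tr>
  <tr><td>door</td><td>open</td></tr>
</table>
"""

parser = TableRowParser()
parser.feed(html)

# Write the extracted rows out in CSV form (to a string buffer here).
buffer = io.StringIO()
csv.writer(buffer, lineterminator="\n").writerows(parser.rows)
csv_text = buffer.getvalue()
print(csv_text)
```

Once the rows are in CSV form, they can be loaded by a spreadsheet or an analysis tool alongside data that was structured from the start.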
Supporting Business with Big Data
Data Visualization
▪ Data mining is the process of turning raw data into meaningful information.
▪ The mined data must be analyzed and presented to managers and decision makers.
▪ The best visualization tool to use depends on the following:
• The number of variables
• The number of data points in each variable
• Whether the data represents a timeline
• Whether items require comparison
▪ Popular charts include line, column, bar, pie, and scatter.

Supporting Business with Big Data
Chart Types

Supporting Business with Big Data
Analyzing Big Data for Effective Use in Business
▪ Data analysis is the process of inspecting, cleaning, transforming, and modeling data to uncover useful information.
▪ Having a strategy helps a business determine the type of analysis required and the best tool to do the analysis.
▪ Tools and applications range from an Excel spreadsheet or Google Analytics for small to medium data samples, to applications dedicated to manipulating and analyzing very large datasets.
▪ Examples of such dedicated applications include KNIME, OpenRefine, Orange, and RapidMiner.
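
The inspect, clean, transform, and model steps named above can be shown on a tiny data set. The following is a minimal sketch in plain Python, assuming a made-up list of monthly sales with some unusable entries and an ordinary least-squares trend line as the model; it does not reproduce any of the tools listed above.

```python
# Made-up monthly sales as they might arrive from a spreadsheet export.
raw_sales = ["100", "110", None, "130", "n/a", "150"]

# Inspect and clean: drop entries that are missing or not numeric.
# Transform: convert the remaining text values to (month, number) pairs.
points = []
for month, value in enumerate(raw_sales, start=1):
    try:
        points.append((month, float(value)))
    except (TypeError, ValueError):
        continue  # skip None and unparseable text such as "n/a"

# Model: fit a straight line y = intercept + slope * x by least squares.
n = len(points)
mean_x = sum(x for x, _ in points) / n
mean_y = sum(y for _, y in points) / n
slope = (
    sum((x - mean_x) * (y - mean_y) for x, y in points)
    / sum((x - mean_x) ** 2 for x, _ in points)
)
intercept = mean_y - slope * mean_x

# Use the model to forecast the next month.
forecast_month_7 = intercept + slope * 7
print(f"slope={slope:.1f}, intercept={intercept:.1f}, month 7 forecast={forecast_month_7:.1f}")
```

A straight-line fit is only one possible model; the point of the sketch is the order of the steps, with cleaning done before any modeling.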

Supporting Business with Big Data
Excel lab: Forecasting

Chapter Summary

Chapter Summary
Summary
▪ Three characteristics of Big Data:
• large amount of data that increasingly requires more storage space (volume)
• growing exponentially fast (velocity)
• generated in different formats (variety)
▪ Fog computing utilizes end-user clients or “edge” devices to do pre-processing and storage.
• Designed to keep the data closer to the source for pre-processing.
▪ The cloud is a collection of data centers or groups of connected servers giving anywhere, anytime access to software, storage, and services using a browser interface.
• Provides increased data storage and reduces the need for onsite IT equipment, maintenance, and management.
▪ Distributed data processing takes large volumes of data from a source, breaks them into smaller pieces, and distributes the pieces to many locations to be processed.
• Each computer in the distributed architecture analyzes its part of the Big Data picture.
Chapter Summary
Summary (Cont.)
▪ Businesses gain value by collecting and analyzing data to understand the impact of their products and services, adjust their methods and goals, and provide their customers with better products faster.
▪ Structured data is created by applications that use “fixed” format input such as spreadsheets or medical forms.
▪ Unstructured data is generated in a “freeform” style such as audio, video, web pages, and tweets.
▪ Both forms of data need to be manipulated into a common format to be analyzed.
▪ Data mining is the process of turning raw data into meaningful information by discovering patterns and relationships in large data sets.
▪ Data visualization is the process of taking the analyzed data and using charts such as line, column, bar, pie, or scatter to present meaningful information.

