Big Data IBM 2014
Agenda
What is Big Data?
– Concepts
– Characteristics
Business Motivation
– Big Data Challenges
– How Big Data Impacts Every Aspect of Your Business
– A Big Data Journey
Get Started
“Big data technologies describe a new generation of technologies and architectures, designed to
economically extract value from very large volumes of a wide variety of data, by enabling
high velocity capture, discovery and/or analysis.”
Source: Matt Eastwood, IDC
© 2013 IBM Corporation
IBM Big Data & Analytics
• 50x, 2010 to 2020: 35 ZB
• 30 billion RFID sensors and counting
• 80% of the world's data is unstructured
• 44x as much data and content over the coming decade: from 800,000 petabytes in 2009 to 35 zettabytes in 2020
• Business leaders frequently make 1 in 3 decisions based on information they don't trust, or don't have
• ... of their visionary plans to enhance competitiveness
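The 44x growth figure can be checked against the two endpoints quoted on the slide, 800,000 petabytes in 2009 and 35 zettabytes in 2020. The short sketch below does that arithmetic; it assumes decimal (SI) units, where one zettabyte is one million petabytes, which the slide does not state explicitly.

```python
# Assumption: decimal (SI) units, so 1 zettabyte = 1,000,000 petabytes.
PETABYTES_PER_ZETTABYTE = 10**6

data_2009_pb = 800_000                        # 2009 volume quoted on the slide
data_2020_pb = 35 * PETABYTES_PER_ZETTABYTE   # 2020 projection quoted on the slide

growth_factor = data_2020_pb / data_2009_pb
print(f"Growth 2009 -> 2020: {growth_factor:.2f}x")  # prints "Growth 2009 -> 2020: 43.75x"
```

Rounded to the nearest whole number, 43.75 matches the 44x shown on the slide.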
Business Users: Determine what question to ask
IT: Structures the data to answer that question

IT: Delivers a platform to enable creative discovery
Business: Explores what questions could be asked
Accelerators
• Hadoop System: Cost-effectively analyze petabytes of structured and unstructured information
• Stream Computing
• Data Warehouse: Deliver deep insight with advanced in-database analytics and operational analytics
• Contextual Discovery: Index and federated discovery for contextual collaborative insights
Information Integration & Governance: Govern data quality and manage information lifecycle
Warehousing Zone
BI & Reporting
Enterprise Warehouse
Connectors
Predictive Analytics
Hadoop
Documents in a variety of formats
ETL, MDM, Data Governance
TECHNOLOGY
• SPSS Modeler: Predictive Analytics
• Cognos RTM: Real-time Dashboards
• Cognos BI: Reporting / Analysis
• Cognos Insight: Export and Explore
• Cognos Consumer Insight: Social Media Analysis
• PureData Systems
• InfoSphere BigInsights
Raw data → Schema to filter → Storage (pre-filtered data)
Raw data → Storage (unfiltered, raw data) → Schema to filter → Output
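The difference between the two flows on this slide can be sketched in a few lines of Python: filtering with a schema before storage versus storing the raw data and applying a schema only when reading. This is an illustrative sketch, not IBM code. The function names, the sample records, and the idea of modelling a schema as a simple predicate are all assumptions made for the example.

```python
from typing import Callable, Dict, Iterable, List

Record = Dict[str, object]
Schema = Callable[[Record], bool]  # assumption: a "schema" is modelled as a predicate


def store_filtered(raw: Iterable[Record], schema: Schema) -> List[Record]:
    """Filter first, then store: records the schema rejects are never kept."""
    return [record for record in raw if schema(record)]


def store_raw(raw: Iterable[Record]) -> List[Record]:
    """Store everything unfiltered; no schema is applied at write time."""
    return list(raw)


def read_with_schema(storage: Iterable[Record], schema: Schema) -> List[Record]:
    """Apply a schema at read time to whatever is in storage."""
    return [record for record in storage if schema(record)]


def is_sensor(record: Record) -> bool:
    return record["source"] == "sensor"


def is_tweet(record: Record) -> bool:
    return record["source"] == "tweet"


# Hypothetical sample data for illustration only.
raw_data = [
    {"source": "sensor", "temp": 21.5},
    {"source": "tweet", "text": "big data"},
    {"source": "sensor", "temp": 19.0},
]

prefiltered = store_filtered(raw_data, is_sensor)  # storage holds sensor rows only
unfiltered = store_raw(raw_data)                   # storage holds every row

# A new question ("what about tweets?") can only be answered from the raw store.
tweets_from_prefiltered = read_with_schema(prefiltered, is_tweet)
tweets_from_unfiltered = read_with_schema(unfiltered, is_tweet)
print(len(tweets_from_prefiltered), len(tweets_from_unfiltered))
```

The design point is that the pre-filtered store can only answer the question the schema was built for, while the raw store leaves later questions open at the cost of storing more data.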
Optional IBM and partner offerings

Analytics and discovery
• Accelerator for machine data analysis
• Accelerator for social data analysis
• Text processing engine and library
• BigSheets

"Apps"
• Web Crawler
• Boardreader
• DB import
• DB export
• Ad hoc query
• Distrib file copy
• Machine learning
• Data processing
• ...

Administrative and development tools
Web console
• Monitor cluster health, jobs, etc.
• Add / remove nodes
• Start / stop services
• Inspect job status
• Inspect workflow status
• Deploy applications
• Launch apps / jobs
• Work with distrib file system
• Work with spreadsheet interface
• Support REST-based API
• ...

Infrastructure
• ZooKeeper
• Jaql
• Pig
• Oozie
• HBase
• Hive
• Integrated installer
• Enhanced security
• Adaptive MapReduce
• Text compression
• Indexing
• Lucene
• MapReduce
• Find and analyze information stored on disk. Query-driven: submits queries to static data.
• Analyze data in motion, before it is stored. Data-driven: bring data to the analytics.
Real-time Analytics
Analyze
• Filter / Sample
• Modify
• Annotate
• Fuse
• Classify
• Score
• Windowed Aggregates
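Several of the operators named on this slide (filter, windowed aggregate, classify) can be illustrated with plain Python generators that process one value at a time without storing the whole stream. This is only a sketch of the idea. It is not InfoSphere Streams code, and the function names, value range, window size, and threshold are assumptions chosen for the example.

```python
from collections import deque
from typing import Iterable, Iterator, Tuple


def filter_valid(readings: Iterable[float]) -> Iterator[float]:
    """Filter: drop readings outside an illustrative plausible range."""
    for value in readings:
        if 0.0 <= value <= 100.0:
            yield value


def windowed_mean(values: Iterable[float], size: int) -> Iterator[float]:
    """Windowed aggregate: mean of the last `size` values, once the window is full."""
    window = deque(maxlen=size)  # oldest value is discarded automatically
    for value in values:
        window.append(value)
        if len(window) == size:
            yield sum(window) / size


def classify(means: Iterable[float], threshold: float) -> Iterator[Tuple[float, str]]:
    """Classify: label each aggregate as "high" or "normal"."""
    for mean in means:
        yield mean, "high" if mean > threshold else "normal"


# Hypothetical input; in practice this would be an unbounded source.
stream = iter([10.0, 20.0, -5.0, 30.0, 40.0, 500.0, 50.0])

pipeline = classify(windowed_mean(filter_valid(stream), size=3), threshold=35.0)
results = list(pipeline)
print(results)  # [(20.0, 'normal'), (30.0, 'normal'), (40.0, 'high')]
```

Because each stage is a generator, values flow through the chain as they arrive and only the current window is held in memory, which mirrors the data-in-motion model described in the deck.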
Statistics
Predictive
(IBM Research)
$\sum_{\text{population}} R(s_t, a_t)$
(included with Streams)
Image & Video
Geospatial (Open Source)
(IBM Research)
Discover
Netezza Appliance
Model
Visualize & Publish
InfoSphere Warehouse
IBM SPSS
InfoSphere BigInsights
InfoSphere Streams
IBM Cognos
Measure
Score
Streaming Data Sources
Application (Map-Reduce)
Storage (HBase, HDFS)
Faster response times due to increased opportunity for query
Faster DB Query*
Cognos BI + Dynamic Cubes
Compatible Query
Dynamic Query
Dynamic Cubes
C1 C2 C3 C4 C5 C6 C7 C8
• Exploit Instrumented Assets (PureData for Analytics): Increased network availability by identifying and fixing holes
• Instant Awareness of Risk and Fraud (PureData for Analytics): Analysis time on 2 PB of data cut from 26 hours to 2 minutes
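To put the quoted improvement in perspective, the reduction from 26 hours to 2 minutes on 2 PB of data can be turned into a speedup factor and an implied throughput. The sketch below uses only the figures from the slide. The variable names are illustrative, and the throughput is a simple average, not a measured rate.

```python
# Figures quoted on the slide.
data_pb = 2                    # petabytes analysed
before_minutes = 26 * 60       # 26 hours expressed in minutes
after_minutes = 2

speedup = before_minutes / after_minutes          # 1560 / 2
throughput_after = data_pb / after_minutes        # average PB per minute after

print(f"Speedup: {speedup:.0f}x")                              # prints "Speedup: 780x"
print(f"Average throughput: {throughput_after:.0f} PB/minute") # prints "Average throughput: 1 PB/minute"
```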
Get started!

Think Big
• New insights and new possibilities
• Process and performance improvement
• New revenue opportunities

Pick your Spot
• Identify and prioritize business use cases
• Ensure that the business is engaged
• Agree on the key measures for success

Execute and Deliver Value
• Evolve your existing analytics capabilities
• Build or acquire new skills required
• Measure and communicate success
Thank You