Big Data and Analytics Challenges and Issues


Big Data And Analytics

Challenges and Issues


Big Data and Analytics is thought to be about:

Business Intelligence and Analytics
Computational Science

But it is much more than that!

Demographic Analysis
Geointelligence: Spatial Analysis
The Grand Challenges in Science
Medicine: Processing 3-D hyperspectral high-resolution images for
diagnostics, genomic research, proteomics, etc.

Media Analysis: Processing text, audio, video, imagery


And much, much more ...

So, we want to introduce you to the issues and challenges at the frontiers of
Advanced Analytics!
The Business Analytics
Strategies
Social, Email, Blogs, Video, Mobile
Marketing, Sales - Product Listing, Promotions
Applications
ERP, CRM, Databases, Internal Applications,
Customer/Consumer facing applications
Context
Web, Customers, Products, Business Systems,
Processes and Services
Support Systems
CRM, Recommendation Systems
Data warehouses, Business Intelligence
Emerging Analytics
• Extending diagnostic analytics to different domains
• Developing new predictive and prescriptive analytics based on
advanced analytic techniques
• Prediction based on scenario development rather than just probabilities
• Prescription based on advanced simulation and visualization capabilities
• Development of Analytic Scientist curricula and degree programs
• Expansion beyond the traditional business intelligence applications
and scientific applications based on descriptive analytics.
Advanced Analytics
• Advanced analytics:
• the application of multiple analytic methods that address the diversity of big data – structured or
unstructured –
• to provide descriptive results, and
• to yield actionable predictive and prescriptive results that facilitate decision-making.
• Beyond data mining and statistical processing methods to encompass logic-based methods,
qualitative analytics, and non-statistical quantitative methods.
• A diverse set of techniques that require new software architectures and application frameworks
to solve complex problems.
• New metrics, focused on the value that the analysis contributes as a holistic result, are
required to assess and evaluate the outcomes of advanced analytics.
Types of Analytics
• Descriptive: A set of techniques for reviewing and examining the data set(s) to
understand the data and analyze business performance.
• Diagnostic: A set of techniques for determining what has happened and why
• Predictive: A set of techniques that analyze current and historical data to
determine what is most likely (or unlikely) to happen
• Prescriptive: A set of techniques for computationally developing and analyzing
alternatives that can become courses of action – either tactical or strategic –
that may discover the unexpected
• Decisive: A set of techniques for visualizing information and recommending
courses of action to facilitate human decision-making when presented with a
set of alternatives.
Steps to be followed to deploy a Big Data solution

Data Ingestion
• The first step in deploying a big data solution is data ingestion, i.e. the extraction of data from various sources. The data
source may be a CRM such as Salesforce, an Enterprise Resource Planning system such as SAP, an RDBMS such as MySQL, or log files,
documents, social media feeds, etc. The data can be ingested either through batch jobs or real-time streaming. The
extracted data is then stored in HDFS.

Data Storage
• After data ingestion, the next step is to store the extracted data. The data can be stored either in HDFS or in a NoSQL database (e.g.
HBase). HDFS storage works well for sequential access, whereas HBase suits random read/write access.

Data Processing
• The final step in deploying a big data solution is data processing. The data is processed through one of the processing
frameworks such as Spark, MapReduce, or Pig.
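The three steps above can be sketched end to end in plain Python. This is only an illustration under stated assumptions: the CSV text, the field names, and the functions `ingest`, `store`, and `process` are hypothetical, and in-memory structures stand in for HDFS, HBase, and a processing framework.

```python
import csv
import io
from collections import defaultdict

# Hypothetical raw input standing in for a log file or CRM export.
RAW_CSV = """customer,amount
alice,10.0
bob,5.5
alice,4.5
"""

def ingest(raw_text):
    """Ingestion: extract records from a source (here, a batch read of CSV text)."""
    return list(csv.DictReader(io.StringIO(raw_text)))

def store(records):
    """Storage: keep an append-only list (sequential scans, HDFS-like stand-in)
    and a key-indexed dict (random access by key, HBase-like stand-in)."""
    sequential = list(records)
    by_key = defaultdict(list)
    for rec in records:
        by_key[rec["customer"]].append(rec)
    return sequential, by_key

def process(sequential):
    """Processing: a map/reduce-style aggregation of amount per customer."""
    totals = defaultdict(float)
    for rec in sequential:
        totals[rec["customer"]] += float(rec["amount"])
    return dict(totals)

records = ingest(RAW_CSV)
sequential, by_key = store(records)
totals = process(sequential)
```

The two storage shapes mirror the trade-off named above: the list is read front to back, while the dict answers lookups for a single key directly.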
Descriptive Analytics
• Process:
– Identify the attributes, then assess/evaluate the attributes
– Estimate the magnitude to correlate the relative contribution of each attribute to the final solution
– Accumulate more instances of data from the data sources
– If possible, perform the steps of evaluation, classification and categorization quickly
– Yield a measure of adaptability within the OODA (observe, orient, decide, act) loop
• At some threshold, crossover into diagnostic and predictive analytics
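As a minimal sketch of the descriptive step, the snippet below summarizes each attribute of a small data set with the standard `statistics` module. The attribute names, the sample values, and the helper `describe` are assumptions made for illustration.

```python
import statistics

def describe(values):
    """Return basic descriptive statistics for one attribute."""
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        # Sample standard deviation needs at least two observations.
        "stdev": statistics.stdev(values) if len(values) > 1 else 0.0,
        "min": min(values),
        "max": max(values),
    }

# Hypothetical data set: one list of observations per attribute.
data = {
    "order_value": [20.0, 35.0, 50.0, 35.0],
    "items": [1, 2, 3, 2],
}

summary = {name: describe(values) for name, values in data.items()}
```

Re-running `describe` as more instances accumulate keeps the summary current, which is one simple way to repeat the evaluation quickly.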
Diagnostic Analytics
• Process:
• Begin with descriptive analytics
• Extract patterns from large data quantities via data mining
• Correlate data types for explanation of near-term behavior – past and present
• Estimate linear/non-linear behavior not easily identifiable through other approaches.
• Example: by classifying past insurance claims, estimate the number of future claims that have a
high probability of being fraudulent and should be flagged for investigation.
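The insurance example can be sketched as follows, assuming past claims are already labeled as fraudulent or not. The claim types, the labels, the expected volumes, and the helpers `fraud_rates` and `expected_flags` are all hypothetical; a real system would use a far richer classifier than per-type frequencies.

```python
from collections import Counter

# Hypothetical past claims: (claim_type, was_fraudulent)
past_claims = [
    ("auto", True), ("auto", False), ("auto", False), ("auto", False),
    ("home", False), ("home", False),
    ("injury", True), ("injury", True), ("injury", False), ("injury", False),
]

def fraud_rates(claims):
    """Classify past claims by type and compute the observed fraud rate per type."""
    totals, frauds = Counter(), Counter()
    for claim_type, is_fraud in claims:
        totals[claim_type] += 1
        frauds[claim_type] += int(is_fraud)
    return {t: frauds[t] / totals[t] for t in totals}

def expected_flags(rates, expected_volume):
    """Estimate how many future claims per type to flag for investigation."""
    return {t: rates.get(t, 0.0) * n for t, n in expected_volume.items()}

rates = fraud_rates(past_claims)
# Hypothetical forecast of next period's claim volume per type.
flags = expected_flags(rates, {"auto": 200, "home": 100, "injury": 50})
```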
Predictive Analytics
• Process:
– Begin with descriptive AND diagnostic analytics
– Choose the right data based on domain knowledge and relationships among variables
– Choose the right techniques to yield insight into possible outcomes
– Determine the likelihood of possible outcomes given initial boundary conditions
– Remember! Data-driven analytics is non-linear; do NOT treat it like an engineering project
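One simple way to illustrate "the likelihood of possible outcomes given initial boundary conditions" is a conditional frequency estimate over historical data. The sketch below assumes a hypothetical history of (condition, outcome) pairs; the function name `outcome_likelihoods` is likewise made up, and this is not the only or the best predictive technique.

```python
from collections import Counter, defaultdict

# Hypothetical history: (initial_condition, observed_outcome)
history = [
    ("promo", "purchase"), ("promo", "purchase"),
    ("promo", "no_purchase"), ("promo", "purchase"),
    ("no_promo", "no_purchase"), ("no_promo", "purchase"),
    ("no_promo", "no_purchase"), ("no_promo", "no_purchase"),
]

def outcome_likelihoods(pairs):
    """Estimate P(outcome | condition) from observed frequencies."""
    counts = defaultdict(Counter)
    for condition, outcome in pairs:
        counts[condition][outcome] += 1
    return {
        condition: {o: n / sum(c.values()) for o, n in c.items()}
        for condition, c in counts.items()
    }

likelihoods = outcome_likelihoods(history)
```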
Prescriptive Analytics
• Process:
– Begin with predictive analytics
– Determine what should occur and how to make it so
– Determine the mitigating factors that lead to desirable/undesirable outcomes
– “What-if” analysis with local or global optimization
– Ex: Find the best set of prices and advertising frequency to maximize revenue
– Ex: And, the right set of business moves to make to achieve that goal
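The pricing example can be illustrated with a small what-if grid search. Everything numeric here is an assumption: the `demand` model (linear in price, with diminishing returns on advertising), the candidate prices, and the advertising frequencies are invented for the sketch, and an exhaustive grid is only one simple form of global optimization.

```python
from itertools import product

def demand(price, ads_per_week):
    """Hypothetical demand model: falls with price, rises with ads up to a point."""
    units = 100 - 10 * price + 20 * ads_per_week - 2 * ads_per_week ** 2
    return max(0.0, units)

def revenue(price, ads_per_week):
    """Revenue under the hypothetical demand model."""
    return price * demand(price, ads_per_week)

# Candidate decisions to evaluate ("what-if" scenarios).
prices = [step * 0.5 for step in range(2, 25)]  # 1.0, 1.5, ..., 12.0
ad_frequencies = range(0, 9)                    # 0 to 8 ads per week

# Exhaustive search over every (price, ads) combination.
best_price, best_ads = max(
    product(prices, ad_frequencies),
    key=lambda choice: revenue(*choice),
)
best_revenue = revenue(best_price, best_ads)
```

With a finer grid or more decision variables, the exhaustive search would typically give way to a numerical optimizer; the structure of the what-if question stays the same.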
Decisive Analytics
• Process:
• Given a set of decision alternatives, choose one course of action from
possibly many
• The chosen course may not be the optimal one.
• Visualize alternatives – whole or partial subset
• Perform exploratory analysis – what-if and why
• How do I get to there from here?
• How did I get here from there?
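A minimal sketch of presenting alternatives to a human decision-maker is a weighted-score ranking with a text bar chart. The alternatives, criteria, scores, and weights below are all hypothetical (higher scores are assumed better on every criterion), and the top-ranked row is a recommendation, not necessarily the optimal choice.

```python
# Hypothetical alternatives scored 0-10 on hypothetical criteria.
alternatives = {
    "raise_price": {"revenue": 8, "risk": 3, "effort": 9},
    "more_ads": {"revenue": 6, "risk": 7, "effort": 5},
    "new_market": {"revenue": 9, "risk": 2, "effort": 2},
}

# Hypothetical importance of each criterion to the decision-maker.
weights = {"revenue": 5, "risk": 3, "effort": 2}

def rank(options, criterion_weights):
    """Return (score, name) pairs sorted from highest to lowest weighted score."""
    scored = [
        (sum(criterion_weights[c] * scores[c] for c in criterion_weights), name)
        for name, scores in options.items()
    ]
    return sorted(scored, reverse=True)

def render(ranking):
    """Render the ranking as a plain-text bar chart for human review."""
    return "\n".join(
        f"{name:<12} {'#' * (score // 5)} {score}" for score, name in ranking
    )

ranking = rank(alternatives, weights)
chart = render(ranking)
```

Changing the weights and re-ranking is a quick way to explore the what-if questions listed above.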
