Eslsca business school logo
Big Data & Business Analytics
Module (02) – Business Analytics & Descriptive Statistics
Big Data & Business Analytics
Course Name Module
Module 2: Name
Business Analytics & Descriptive Statistics
Module 02
Module 2 Learning Objectives
Module Objectives:
Understand what business analytics is
Business analytics specialties & software tools
Distinguish between the different data types
Descriptive Statistics (Position, Spread, Shape) with examples
What to Study for Exam:
Module 2 Lecture Notes (with emphasis on above topics)
© 2020 Eslsca. All Rights Reserved
Module 2 1st What is Business Analytics
James R. Evans, “Business Analytics: Methods, Models and Decisions” 3rd edition; 2019, Pearson
Analytics is the use of:
data,
information technology,
statistical analysis,
quantitative methods, and
mathematical or computer-based models
to help managers gain improved insight about their business operations and make better, fact-based decisions.
Module 2 2nd Examples of Business Analytics
Pricing
setting prices for consumer and industrial goods, government contracts, and
maintenance contracts
Customer segmentation
identifying and targeting key customer groups for example in retail, insurance, and
credit card industries
Merchandising
determining brands to buy, quantities, and allocations
Location
finding the best location for bank branches and ATMs, or where to service
industrial equipment
Social Media
understanding trends and customer perceptions; assisting marketing managers and
product designers
Module 2 3rd Benefits & Challenges
Benefits
◦ reduced costs
◦ better risk management
◦ faster decisions
◦ better productivity and enhanced bottom-line performance such as profitability and customer
satisfaction.
Challenges
◦ lack of understanding of how to use analytics
◦ insufficient analytical skills
◦ difficulty in getting good data and sharing information
◦ not understanding the benefits versus perceived costs
Module 2 4th Business Analytics Tools
Database queries and analysis
Spreadsheets
Data visualization
Dashboards to report key performance measures
Mathematical Techniques:
◦ Data and Statistical methods
◦ Machine Learning (supervised versus unsupervised)
SQL: various databases
Excel: spreadsheets
Tableau Software: simple drag-and-drop tools for visualizing data from spreadsheets and other databases
SAS / SPSS / Rapid Miner: predictive modeling, data mining, machine learning and visualization using visual workflows (not free)
R / Python: advanced programming-based tools for predictive modeling, data mining, machine learning and visualization (free, open source)
R Studio https://www.youtube.com/watch?v=_V8eKsto3Ug
SAS https://www.youtube.com/watch?v=PJOqwQJT_NA
Module 2 5th Data Types
Internal
o Annual reports
o Accounting audits
o Financial profitability analysis
o Operations management performance
o Human resource measurements
External
o Economic trends
o Marketing research
New developments: Web behavior – Social Media – Mobile – IoT
o page views
o visitor’s country
o visitor’s demographics
o time of view & duration of view
o products they searched for and viewed
o products purchased
o what reviews they read
o and others …
The effective use of big data has high potential to transform economies in the new era.
Discrete – derived from counting discrete incidences
For example:
o Quantity of order items
o Number of errors / faulty products
Some discrete metrics are expressed as a proportion or a rate, for example the number of incomplete orders each day and the number of errors per invoice.
Continuous – based on a continuous scale of measurement
For example:
o Cost, length, time, weight
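Categorical (nominal) data –
sorted into categories / conceptual criteria; the categories have no natural order
Ordinal data –
can be ordered or ranked, but the differences between ranks are not necessarily equal
Interval data –
ordinal, with constant differences between observations but an arbitrary zero point
Ratio data –
like interval data, but with a natural zero on the scale, so ratios between values are meaningful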
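The four measurement scales can be illustrated with a short Python sketch. All variable names and values below are hypothetical and invented for illustration; the comments about which operations are meaningful on each scale reflect common statistical convention rather than anything stated in these notes.

```python
from statistics import mode

# Hypothetical customer records, invented for illustration only.
region = ["North", "South", "North", "East"]        # categorical (nominal)
satisfaction = ["low", "high", "medium", "high"]    # ordinal
temperature_c = [18.0, 22.0, 20.0, 24.0]            # interval (0 degrees C is arbitrary)
order_value = [120.0, 80.0, 200.0, 100.0]           # ratio (0 means no spend)

# Categorical: counting and the mode are meaningful, ordering is not.
most_common_region = mode(region)

# Ordinal: labels can be ranked, here through an explicit rank mapping.
rank = {"low": 1, "medium": 2, "high": 3}
sorted_satisfaction = sorted(satisfaction, key=rank.get)

# Interval: differences are meaningful, ratios are not.
temperature_rise = temperature_c[1] - temperature_c[0]

# Ratio: a natural zero makes "twice as much" a meaningful statement.
value_ratio = order_value[2] / order_value[3]

print(most_common_region, sorted_satisfaction, temperature_rise, value_ratio)
```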
Module 2 6th Descriptive Statistics
Methods of describing the characteristics of a data set.
Useful because they allow you to make sense of the data.
They help in exploring the data and drawing conclusions from it in order to make
rational decisions.
The following measures are used in descriptive statistics:
o Measures of position (also referred to as central tendency or location
measures)
o Measures of spread (also referred to as variability or dispersion
measures)
o Measures of shape
Measures of Position
o Position statistics measure the central tendency of the data.
o Central tendency refers to where the data is centered.
o You may have calculated an average of some kind.
Despite the common use of average, there are different statistics by
which we can describe the average of a data set:
• Mean
• Median
• Mode
Mean
o The total of all the values divided by the size of the data set
o It is the most commonly used statistic of position
o The mean of a sample is denoted by ‘x-bar’
Median
o The middle value where exactly half of the data values are above it and half
are below it.
o Less widely used than the mean.
o It can reduce the effect of outliers (occurrence of extreme values in the
sample).
o Often used when the data is nonsymmetrical.
o Ensure that the values are ordered before calculation.
o With an even number of values, the median is the mean of the two middle
values.
Mode
The value that occurs the most often in a data set
It is rarely used as a central tendency measure
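The three measures of position can be computed with Python's standard `statistics` module. This is a minimal sketch; the `daily_orders` values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from statistics import mean, median, mode

# Hypothetical number of orders received on eight days.
daily_orders = [12, 15, 11, 15, 18, 14, 15, 13]

x_bar = mean(daily_orders)          # total of all values divided by the count
middle = median(daily_orders)       # values are sorted internally; with an even
                                    # count, the two middle values are averaged
most_frequent = mode(daily_orders)  # the value that occurs most often

print("mean:", x_bar, "median:", middle, "mode:", most_frequent)
```

With an even number of values, as here, the median is the mean of the two middle values once the data is sorted.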
Measures of Spread
o The Spread refers to how the data deviates from the position
measure.
o It gives an indication of the amount of variation/dispersion.
There are different statistics by which we can describe the spread of a
data set:
• Range
• Standard deviation
Range
o The difference between the highest and the lowest values
o The simplest measure of variability
o It can be misleading when the data has outliers; just one outlier will increase
the range dramatically
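A short sketch of how a single outlier changes the range. The delivery times are hypothetical, and `value_range` is a helper name introduced here for illustration only.

```python
# Hypothetical delivery times in days.
delivery_days = [3, 4, 4, 5, 6]
with_outlier = delivery_days + [30]   # one unusually late delivery

def value_range(values):
    """Return the difference between the highest and the lowest value."""
    return max(values) - min(values)

range_clean = value_range(delivery_days)
range_outlier = value_range(with_outlier)

print("range without outlier:", range_clean)
print("range with outlier:", range_outlier)
```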
Standard Deviation
o Roughly, the average distance of the data points from their own mean (formally, the square root of the average squared deviation)
o A low standard deviation indicates that the data points are clustered around
the mean
o A large standard deviation indicates that they are widely scattered around the
mean
o It is a more robust measure of variability than Range
o The standard deviation of a sample is denoted by ‘s’; the standard deviation of a population is denoted by sigma (σ)
Standard Deviation Calculation
o Standard deviation is computed as follows:
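The formula on the original slide did not survive extraction. The sketch below assumes the usual sample formula, s = sqrt( sum of (x_i - x-bar)^2 / (n - 1) ), because these notes refer to the standard deviation of a sample. The data values are hypothetical, and the result is checked against `statistics.stdev` from the Python standard library.

```python
import math
from statistics import stdev

# Hypothetical sample of five measurements.
sample = [4, 8, 6, 5, 7]

n = len(sample)
x_bar = sum(sample) / n                                   # step 1: sample mean
squared_deviations = [(x - x_bar) ** 2 for x in sample]   # step 2: squared distances from the mean
sample_variance = sum(squared_deviations) / (n - 1)       # step 3: divide by n - 1 (sample, not population)
s = math.sqrt(sample_variance)                            # step 4: square root

print("manual s:", s)
print("statistics.stdev:", stdev(sample))
```

Dividing by n instead of n - 1 would give the population standard deviation (sigma), which `statistics.pstdev` computes.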
Outlier
o A data point that is significantly greater or smaller than other
data points in a data set.
o It is useful when analyzing data to identify outliers
o They may affect the calculation of descriptive statistics
o You need to decide whether to exclude them before carrying
out your analysis
o An outlier should be excluded if it is due to measurement or
human error
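The effect of an outlier on descriptive statistics can be seen by computing them with and without the suspect value. The salary figures below are hypothetical; the outlier stands in for a data-entry error (50 typed as 500).

```python
from statistics import mean, median, stdev

# Hypothetical salaries in thousands.
salaries = [40, 42, 45, 47, 50]
salaries_with_error = salaries + [500]   # e.g. 50 mistyped as 500

summary = {}
for label, data in [("clean", salaries), ("with outlier", salaries_with_error)]:
    summary[label] = {
        "mean": mean(data),
        "median": median(data),
        "stdev": stdev(data),
    }
    print(label, summary[label])
```

In this sketch the mean and the standard deviation move sharply once the outlier is included, while the median barely changes, which is why the decision to keep or exclude an outlier should be made before the analysis.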
Types of Shape
o Data can be plotted into a histogram to have a general idea of its shape, or
distribution
o The shape can reveal a lot of information about the data
o If the data is symmetrical, then we may use the mean or median to
measure the central tendency as they are almost equal
o If the data is skewed, then the median will be a more appropriate
measure of the central tendency
Very Common data distributions/shapes:
• Normal Distribution
• Uniform Distribution
• Camel-back Distribution
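The link between shape and the choice of position measure can be sketched as follows. Both data sets are hypothetical, and `bin_counts` is a helper written for this example; it only counts values per bin and does not draw a plot.

```python
from statistics import mean, median

# Hypothetical data sets.
symmetric = [2, 3, 4, 5, 6, 7, 8]
right_skewed = [2, 3, 3, 4, 4, 5, 30]

def bin_counts(values, bin_width):
    """Group values into bins of equal width and return {bin_start: count}."""
    counts = {}
    for v in values:
        start = (v // bin_width) * bin_width
        counts[start] = counts.get(start, 0) + 1
    return dict(sorted(counts.items()))

for label, data in [("symmetric", symmetric), ("right-skewed", right_skewed)]:
    print(label, "mean:", round(mean(data), 2), "median:", median(data))

skewed_bins = bin_counts(right_skewed, 5)
print(skewed_bins)
```

For the symmetric set the mean and the median coincide; for the skewed set the single large value pulls the mean well above the median.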
Examples of Normal Distribution Shape
o Scores of students in a test
o IQ (intelligence quotient) test scores in a population
o Birthweight of newborn babies, which is normally distributed with a mean of 7.5 pounds
Examples of Uniform Distribution Shape
Source: https://corporatefinanceinstitute.com/resources/knowledge/other/uniform-distribution/
Examples of Camel-back Distribution Shape
o Also known as a bimodal or double-peaked distribution
o The bimodal distribution looks like the back of a two-humped camel.
o This indicates that two processes with different peaks are combined in one set of data. Thus, the data needs to be split into two subsets (filtered by the variable that causes the variation).
o For example, a distribution of production data from a two-shift operation might be bimodal if each shift produces a different distribution of results.
o Another example is a histogram of heights in a class with both male and female students: one peak occurs near the average female height and another near the average male height.
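Splitting bimodal data by the variable that causes the variation can be sketched as below, using the two-shift production example. The records and the shift labels are hypothetical.

```python
from statistics import mean

# Hypothetical (shift, units produced per hour) records.
output = [
    ("day", 50), ("day", 52), ("day", 51),
    ("night", 40), ("night", 41), ("night", 39),
]

# Split the combined data into one subset per shift.
by_shift = {}
for shift, units in output:
    by_shift.setdefault(shift, []).append(units)

shift_means = {shift: mean(units) for shift, units in by_shift.items()}
overall_mean = mean([units for _, units in output])

print("per-shift means:", shift_means)
print("overall mean:", overall_mean)
```

In this sketch the overall mean falls between the two peaks and describes neither shift well, which is the reason for analyzing each subset separately.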
Module 2 Questions
Module Completed