Introduction to Data Analysis
Descriptive Analysis
Descriptive analysis is an important first step in any statistical analysis. It gives you an idea of the
distribution of your data, helps you detect outliers and typos, and enables you to identify associations among variables,
thus preparing you for further statistical analysis.
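As a minimal sketch of what this looks like in practice, the Python snippet below summarizes a small sample and flags suspicious values. The `ages` list, the function names, and the 1.5 x IQR cutoff are assumptions chosen for illustration (the cutoff is a common rule of thumb, not something prescribed here).

```python
import statistics

def describe(values):
    """Return basic descriptive statistics for a list of numbers."""
    q1, median, q3 = statistics.quantiles(values, n=4)
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": median,
        "min": min(values),
        "max": max(values),
        "q1": q1,
        "q3": q3,
    }

def iqr_outliers(values, k=1.5):
    """Flag values farther than k * IQR below Q1 or above Q3."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

# Hypothetical ages; 230 is plausibly a typo for 23.
ages = [21, 22, 23, 24, 25, 26, 27, 230]
summary = describe(ages)
outliers = iqr_outliers(ages)
print(summary)
print("possible outliers or typos:", outliers)
```

Note how the single bad value pulls the mean far above the median, which is one reason to look at both before any further analysis.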
Predictive Analysis
Predictive analytics uses historical data to predict future events. Typically, historical data is used to build a mathematical
model that captures important trends. That predictive model is then used on current data to predict what will happen next,
or to suggest actions to take for optimal outcomes.
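The idea can be sketched with the simplest possible model: a straight line fitted to past observations by ordinary least squares and then extended one step forward. The monthly sales figures and the function names are hypothetical, and real predictive models are usually richer than a single trend line.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def predict(model, x):
    """Apply a fitted (intercept, slope) model to a new x value."""
    intercept, slope = model
    return intercept + slope * x

# Hypothetical historical data: sales for months 1 to 5.
months = [1, 2, 3, 4, 5]
sales = [110, 120, 130, 140, 150]

model = fit_line(months, sales)   # build the model from historical data
forecast = predict(model, 6)      # use it to predict the next month
print(forecast)
```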
Prescriptive Analytics
Prescriptive analytics is the area of data analytics that focuses on finding the best course of action in a scenario, given the
available data.
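One simple way to picture "finding the best course of action" is to compare candidate actions by their expected payoff and pick the largest. The sketch below assumes hypothetical actions, probabilities, and payoffs; expected-value comparison is only one of several possible decision rules, and real prescriptive systems often use optimization or simulation instead.

```python
def expected_payoff(outcomes):
    """Expected value of a list of (probability, payoff) pairs."""
    return sum(p * payoff for p, payoff in outcomes)

def best_action(actions):
    """Return the name of the action with the highest expected payoff."""
    return max(actions, key=lambda name: expected_payoff(actions[name]))

# Hypothetical scenario: each action maps to its possible outcomes.
actions = {
    "discount": [(0.6, 500), (0.4, -100)],
    "advertise": [(0.5, 800), (0.5, -200)],
    "do_nothing": [(1.0, 100)],
}

choice = best_action(actions)
print(choice)
```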
Analysis Life Cycle
1. Problem identification
2. Hypothesis formulation
3. Data collection
4. Data exploration/preparation
5. Model building
6. Model validation and evaluation
1. Problem identification
- A problem is a situation that is judged to need correcting or solving.
- A problem can be identified through:
i) comparative/benchmarking studies
ii) performance reporting
iii) asking some basic questions:
a) Who is affected by the problem?
b) What will happen if the problem is not solved?
c) When and where does the problem occur?
d) Why is the problem occurring?
e) How are people currently handling the problem?
2. Hypothesis formulation
i) Frame the questions that need to be answered.
ii) Develop a comprehensive list of all possible issues related to the problem.
iii) Reduce the list by eliminating duplicates and combining overlapping issues.
iv) Use consensus building to narrow the list down to the major issues.
3. Data collection
i) Using data that has already been collected by others
ii) Systematically selecting and observing characteristics of people, objects, and events
iii) Orally questioning respondents, either individually or as a group
iv) Collecting data based on answers provided by the respondents in written form
4. Data Exploration
i) Importing data
ii) Variable Identification
iii) Data Cleaning
iv) Summarizing data
v) Selecting subset of data
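The five exploration steps above can be sketched with the Python standard library alone. The inline `RAW` table, its column names, and the cleaning rules (drop exact duplicates and rows with a missing age) are assumptions made for illustration; in practice the data would come from a file or database, and the cleaning rules depend on the problem.

```python
import csv
import io
import statistics

# Hypothetical raw data standing in for a CSV file.
RAW = """name,age,city
Asha,34,Pune
Ben,,Delhi
Chen,29,Pune
Asha,34,Pune
"""

# i) Importing data
rows = list(csv.DictReader(io.StringIO(RAW)))

# ii) Variable identification: which columns do we have?
variables = list(rows[0].keys())

# iii) Data cleaning: drop exact duplicates and rows with a missing age,
#      then convert age from text to an integer.
seen = set()
clean = []
for row in rows:
    key = tuple(row.values())
    if key in seen or row["age"] == "":
        continue
    seen.add(key)
    clean.append({**row, "age": int(row["age"])})

# iv) Summarizing data
mean_age = statistics.mean([r["age"] for r in clean])

# v) Selecting a subset of data
pune = [r for r in clean if r["city"] == "Pune"]

print(variables, len(clean), mean_age, [r["name"] for r in pune])
```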
5. Model Building
Building a model is a highly iterative process because there is no such thing as a final,
perfect solution.
Many machine learning and statistical techniques are available on traditional technology
platforms.
Population
Researcher Elisabeth Kvaavik and others studied factors that affect the eating habits of
adults in their mid-thirties. (Source: Kvaavik E, et al. Psychological explanatorys
of eating habits among adults in their mid-30’s (2005) International Journal of
Behavioral Nutrition and Physical Activity (2)9.) Classify each of the following
variables considered in the study as qualitative or quantitative.
a. Nationality: Qualitative
b. Number of children: Quantitative
c. Household income in the previous year: Quantitative
d. Level of education: Qualitative
e. Daily intake of whole grains (measured in grams per day): Quantitative
A discrete variable is a quantitative variable that either has a finite number
of possible values or a countable number of possible values. The term
“countable” means the values result from counting such as 0, 1, 2, 3, and so
on.
Researcher Elisabeth Kvaavik and others studied factors that affect the eating habits of
adults in their mid-thirties. (Source: Kvaavik E, et al. Psychological explanatorys
of eating habits among adults in their mid-30’s (2005) International Journal of
Behavioral Nutrition and Physical Activity (2)9.) Classify each of the following
quantitative variables considered in the study as discrete or continuous.
a. Number of children: Discrete
b. Household income in the previous year: Continuous
c. Daily intake of whole grains (measured in grams per day): Continuous
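The classifications from the two exercises can be recorded as a small lookup structure, which is roughly how variable types are tracked when preparing a dataset. This is only a sketch: the `Variable` class, the field names, and the snake_case variable names are hypothetical, while the classifications themselves are the answers given in the exercises.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Variable:
    name: str
    kind: str                      # "qualitative" or "quantitative"
    subtype: Optional[str] = None  # "discrete" or "continuous"; None if qualitative

# Answers from the Kvaavik study exercises.
STUDY_VARIABLES = [
    Variable("nationality", "qualitative"),
    Variable("number_of_children", "quantitative", "discrete"),
    Variable("household_income", "quantitative", "continuous"),
    Variable("level_of_education", "qualitative"),
    Variable("whole_grain_intake_g_per_day", "quantitative", "continuous"),
]

def names_of(variables, kind=None, subtype=None):
    """Return the names of variables matching the given kind and/or subtype."""
    return [
        v.name
        for v in variables
        if (kind is None or v.kind == kind)
        and (subtype is None or v.subtype == subtype)
    ]

qualitative = names_of(STUDY_VARIABLES, kind="qualitative")
discrete = names_of(STUDY_VARIABLES, subtype="discrete")
continuous = names_of(STUDY_VARIABLES, subtype="continuous")
print(qualitative, discrete, continuous)
```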
Measuring Variables
To establish relationships between variables, researchers must observe the variables
and record their observations. This requires that the variables be measured.
The process of measuring a variable requires a set of categories called a scale of
measurement and a process that classifies each individual into one category.
Scales of Measurement