DSUR Notes-1
Types of Data:
“Data is the new oil.” Today data is everywhere, in every field. Whether you are a
data scientist, marketer, businessperson, data analyst, researcher, or in any other
profession, you need to work with raw or structured data. Because this data is so
valuable, it must be handled and stored properly, without errors. To process data
and get correct results, you first need to know what type of data you are working
with. There are two broad types of data, qualitative and quantitative, which are
further classified into:
• Nominal data.
• Ordinal data.
• Discrete data.
• Continuous data.
Businesses now run on data. Most companies use data-driven insights to create and
launch campaigns, design strategies, launch products and services, or try out new
ideas. According to one report, at least 2.5 quintillion bytes of data are produced
per day.
Qualitative or Categorical Data
Qualitative data describes qualities and characteristics, such as people's
perceptions. This data helps market researchers understand customers’ tastes and
then design their ideas and strategies accordingly.
Nominal Data
Nominal data is used to label variables without any order or quantitative value. The
colour of hair can be considered nominal data, as one colour can’t be ranked above
or below another colour.
The name “nominal” comes from the Latin word “nomen,” which means “name.”
With nominal data, we can’t perform arithmetic and we can’t sort the values into a
meaningful order. The values simply fall into distinct categories.
Examples of Nominal Data:
• Colour of hair (Blonde, Red, Brown, Black, etc.)
• Marital status (Single, Widowed, Married)
• Nationality (Indian, German, American)
• Gender (Male, Female, Others)
• Eye colour (Black, Brown, etc.)
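As a minimal sketch of what can be done with nominal data, the Python snippet below counts how often each label occurs and picks the most frequent one (the mode). The survey responses are made up for illustration.

```python
from collections import Counter

# Hypothetical survey responses; the values are labels with no order.
hair_colours = ["Black", "Brown", "Black", "Blonde", "Red", "Brown", "Black"]

# Counting how often each label occurs is a valid operation on nominal data.
counts = Counter(hair_colours)

# The mode (most frequent label) is the only meaningful "average" here.
mode_label, mode_count = counts.most_common(1)[0]

print(counts)
print(mode_label, mode_count)
```

Sorting these labels or averaging them would not mean anything, which is why only counts and the mode are computed.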
Ordinal Data
Ordinal data have a natural ordering: each value occupies a position on a scale.
These data are used for observations such as customer satisfaction or happiness,
but we can’t perform arithmetic on them.
Ordinal data is qualitative data whose values have a relative position. It can be
considered “in-between” qualitative and quantitative data. Ordinal data only shows
sequence: the gaps between positions are not measurable, so it supports order-based
statistics (such as the median or ranks) but not arithmetic ones (such as the mean).
Compared to nominal data, ordinal data have an order that nominal data lack.
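The idea of relative position can be sketched in Python as follows. The satisfaction scale and the responses are hypothetical; the point is that sorting and the median only need the order of the categories, not arithmetic on them.

```python
import statistics

# Hypothetical satisfaction scale, listed from lowest to highest.
SCALE = ["Very unhappy", "Unhappy", "Neutral", "Happy", "Very happy"]
rank = {label: position for position, label in enumerate(SCALE)}

responses = ["Happy", "Neutral", "Very happy", "Unhappy", "Happy"]

# Sorting by rank is meaningful because the categories are ordered.
ordered = sorted(responses, key=rank.get)

# The median only needs ordering; median_low returns an actual data point.
median_rank = statistics.median_low([rank[r] for r in responses])
median_label = SCALE[median_rank]

print(ordered)
print(median_label)
```

The numeric ranks are used only as positions. Adding or averaging them would assume equal gaps between categories, which ordinal data does not guarantee.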
Quantitative Data
1) Discrete Data
The term discrete means distinct or separate. Discrete data contain values that are
integers or whole numbers. The total number of students in a class is an example of
discrete data. These data can’t be broken into decimal or fractional values.
Discrete data are countable and have finite values; they cannot be subdivided.
These data are represented mainly by a bar graph, number line, or frequency table.
Examples of Discrete Data:
• Total number of students present in a class
• Cost of a cell phone
• Number of employees in a company
• The total number of players who participated in a competition
• Days in a week
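The frequency-table representation mentioned above can be sketched in Python as follows. The attendance figures are invented for illustration, and the text bars are only a rough stand-in for a bar graph.

```python
from collections import Counter

# Hypothetical counts of students present in a class over ten days.
students_present = [28, 30, 29, 30, 28, 30, 27, 29, 30, 28]

# Every value is a whole number, as expected for counted data.
all_whole = all(isinstance(n, int) for n in students_present)

# A frequency table maps each distinct value to how often it occurs.
frequency_table = dict(sorted(Counter(students_present).items()))

for value, freq in frequency_table.items():
    print(f"{value:>3} | {'#' * freq}")  # text version of a bar graph
```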
2) Continuous Data
Continuous data are measurements that can take fractional values, such as the
height of a person or the length of an object. Continuous data represents
information that can be divided into ever smaller units. A continuous variable can
take any value within a range.
The key difference between discrete and continuous data is that discrete data
contain only integers or whole numbers, whereas continuous data can hold fractional
numbers and are used to record quantities such as temperature, height, width, time,
speed, etc.
Examples of Continuous Data:
• Height of a person
• Speed of a vehicle
• Time taken to finish the work
• Wi-Fi Frequency
• Market share price
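As a rough sketch, continuous measurements are usually summarised with a mean or grouped into class intervals, because individual values rarely repeat. The heights and the 10 cm interval width below are assumptions chosen for illustration.

```python
import math
import statistics

# Hypothetical heights in centimetres; any value in the range is possible.
heights_cm = [151.2, 158.7, 162.5, 163.0, 167.4, 171.9, 175.3, 180.1]

# Arithmetic such as the mean is meaningful for continuous data.
mean_height = statistics.fmean(heights_cm)

# Group the measurements into 10 cm class intervals, e.g. [150, 160).
BIN_WIDTH = 10
bins = {}
for h in heights_cm:
    lower = math.floor(h / BIN_WIDTH) * BIN_WIDTH
    bins[lower] = bins.get(lower, 0) + 1

print(round(mean_height, 2))
for lower, count in sorted(bins.items()):
    print(f"[{lower}, {lower + BIN_WIDTH}): {count}")
```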
Analytics
What does Analytics Mean?
Analytics helps us find hidden patterns in the world around us, from consumer
behaviors and athlete and team performance to connections between activities and
diseases. This can change how we look at the world, usually for the better. We may
think that a process is already working at its best, but the data sometimes tells us
otherwise, so analytics helps us improve our world. Areas where analytics is applied
include:
• Web analytics
• Fraud analysis
• Risk analysis
• Advertisement and marketing
• Enterprise decision management
• Market optimization
• Market modelling
Types of Analytics
As organizations collect more data, what they use it for and how they analyze and
interpret that data becomes more nuanced. Data without analytics doesn’t make
much sense, but analytics is a broad term that can mean a lot of different things
depending on where you sit on the data analytics maturity model.
Understanding the what, why, when, where, and how of your data analytics helps
to drive better decision making and enables your organization to meet its business
objectives.
Descriptive analytics answers the question, “What happened?” This type of analytics
is by far the most commonly used by customers, providing reporting and analysis
centered on past events. It helps companies understand past performance and builds
on:
• Data modeling fundamentals and the adoption of basic star schema best
practices,
• Communicating data with the right visualizations, and
• Basic dashboard design skills.
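A minimal Python sketch of a descriptive, "what happened" summary is shown below. The sales records, field layout, and figures are hypothetical; a real report would read them from a data warehouse or dashboard tool.

```python
from collections import defaultdict

# Hypothetical past sales records: (month, region, revenue).
sales = [
    ("Jan", "North", 1200.0),
    ("Jan", "South", 800.0),
    ("Feb", "North", 1500.0),
    ("Feb", "South", 700.0),
]

# Overall figure: how much was sold in total?
total_revenue = sum(revenue for _, _, revenue in sales)

# Breakdown by month: how did revenue develop over time?
revenue_by_month = defaultdict(float)
for month, _, revenue in sales:
    revenue_by_month[month] += revenue

print(total_revenue)
print(dict(revenue_by_month))
```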
Diagnostic analytics, just like descriptive analytics, uses historical data to answer a
question. But instead of focusing on “the what”, diagnostic analytics addresses the
critical question of why an occurrence or anomaly occurred within your data.
Diagnostic analytics also happens to be the most overlooked and skipped step
within the analytics maturity model. Anecdotally, I see most customers attempting
to go from “what happened” to “what will happen” without ever taking the time to
address the “why did it happen” step. This type of analytics helps companies
answer questions such as:
• Why is a specific basket of products vastly outperforming its prior-year
sales figures?
Diagnostic analytics tends to be more accessible and to fit a wider range of use
cases than machine learning/predictive analytics. You might even find that it solves
some business problems you had earmarked for predictive analytics use cases.
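One simple way to approach a "why did it happen" question is to break an overall change down by component. The Python sketch below does this for a hypothetical basket of products with invented sales figures; it is an illustration of the idea, not a full diagnostic method.

```python
# Hypothetical yearly sales per product in a basket.
prior_year = {"tea": 100.0, "coffee": 200.0, "sugar": 50.0}
this_year = {"tea": 110.0, "coffee": 320.0, "sugar": 55.0}

# Descriptive view: the basket as a whole grew by this much.
total_change = sum(this_year.values()) - sum(prior_year.values())

# Diagnostic view: each product's share of the overall change
# points at the likely driver.
contribution = {
    product: (this_year[product] - prior_year[product]) / total_change
    for product in prior_year
}
main_driver = max(contribution, key=contribution.get)

print(total_change)
print(main_driver, round(contribution[main_driver], 2))
```

Here most of the growth traces back to a single product, which is the kind of finding that can then be investigated further.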
Prescriptive analytics is the fourth and final pillar of modern analytics. It is
true guided analytics, where your analytics prescribes or guides you toward a
specific action to take. It is effectively the merging of descriptive, diagnostic,
and predictive analytics to drive decision making. Existing scenarios or conditions
(think of your current fleet of freight trains) and the ramifications of a decision
or occurrence (parts breaking down on the freight trains) are combined to create a
guided decision or action for the user to take (proactively buying more parts for
preventative maintenance).
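The freight-train example can be sketched as a small decision rule in Python. The function name, its parameters, and the rule itself are hypothetical; the failure probability is assumed to come from a separate predictive model.

```python
import math


def recommend_parts_order(failure_probability, trains_in_fleet, parts_in_stock):
    """Return how many spare parts to order (hypothetical decision rule)."""
    # Expected number of breakdowns, rounded up to be on the safe side.
    expected_failures = math.ceil(failure_probability * trains_in_fleet)
    # Order only the shortfall; never recommend a negative quantity.
    return max(0, expected_failures - parts_in_stock)


# Current condition: 40 trains, 4 parts in stock, 25% predicted failure rate.
order_quantity = recommend_parts_order(0.25, 40, 4)
print(f"Recommended action: order {order_quantity} parts")
```

The output is an action rather than a report or a forecast, which is what separates prescriptive analytics from the other types.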
Data Scientists collect data and explore, analyze, and visualize it. They apply
mathematical and statistical models to find patterns and solutions in the data.
• A teacher assumes that 60% of his college's students come from
lower-middle-class families.
• A doctor believes that 3D (Diet, Dose, and Discipline) is 90% effective for
diabetic patients.
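A claim like the teacher's can be checked against a sample. The Python sketch below uses a one-proportion z-test, which is one common choice and an assumption here since the notes do not name a method. The sample of 200 students and the 5% significance level are also invented for illustration.

```python
import math


def one_proportion_z_test(successes, n, p0):
    """Two-sided z-test of H0: the true proportion equals p0."""
    p_hat = successes / n
    standard_error = math.sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / standard_error
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical sample: 108 of 200 surveyed students are lower-middle-class.
z, p_value = one_proportion_z_test(successes=108, n=200, p0=0.60)

# Reject the teacher's claim only if the p-value is below 0.05.
reject_null = p_value < 0.05
print(round(z, 3), round(p_value, 3), reject_null)
```

With this sample the observed proportion is 54%, but the difference from 60% is not large enough to reject the claim at the 5% level.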
Now that you know about hypothesis testing, look at the two types of hypothesis
testing in statistics.