Stats Assignment 1
261972884
Section: I
Types of statistics:
Statistics is mainly divided into the following two categories.
1. Descriptive Statistics
2. Inferential Statistics
Descriptive Statistics
In descriptive statistics, the data is described in a summarized way. The
summary is computed from a sample of the population using measures such as
the mean or the standard deviation. Descriptive statistics use charts, graphs,
and summary measures to organize, represent, and explain a set of data. Data is
typically arranged and displayed in tables or in graphs such as histograms, pie
charts, bar charts, or scatter plots. Descriptive statistics only describe, and
thus do not require generalization beyond the data collected.
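As a minimal sketch of the summary measures mentioned above, the following Python snippet computes a few descriptive statistics with the standard library. The marks are hypothetical values invented for illustration; they are not data from this assignment.

```python
import statistics

# Hypothetical sample: marks (out of 100) for eight students.
marks = [72, 85, 90, 66, 85, 78, 94, 70]

summary = {
    "n": len(marks),
    "mean": statistics.mean(marks),
    "median": statistics.median(marks),
    "stdev": statistics.stdev(marks),  # sample standard deviation (divides by n - 1)
    "min": min(marks),
    "max": max(marks),
}

# Print each summary measure on its own line.
for name, value in summary.items():
    print(f"{name}: {value}")
```

Every number printed here describes only the eight marks in the list; nothing is claimed about students outside the sample.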
Inferential Statistics
In inferential statistics, we try to interpret the meaning of descriptive
statistics. After the data has been collected, analyzed, and summarized, we use
inferential statistics to describe what the collected data means. Inferential
statistics use probability theory to assess whether trends found in the
research sample can be generalized to the larger population from which the
sample comes. They are intended to test hypotheses and investigate
relationships between variables, and can be used to make predictions about the
population. In short, inferential statistics are used to draw conclusions and
inferences, i.e., to make valid generalizations from samples.
Example
In a class, the data is the set of marks obtained by 50 students. When we
compute the average of this data, the result is the average of the 50 students’
marks, which is a descriptive statistic. If the average is 88 out of 100, we
can then draw a conclusion from that outcome, for example about a larger group
of students that these 50 represent, which is an inferential step.
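One possible way to make the inferential step concrete is a confidence interval for the population mean. The sketch below assumes the 50 students are a random sample from a larger population and uses a normal approximation. The marks themselves, and the helper name mean_confidence_interval, are hypothetical; they were chosen only so that the sample average is 88.

```python
import math
import statistics

def mean_confidence_interval(sample, confidence=0.95):
    """Normal-approximation confidence interval for the population mean."""
    n = len(sample)
    mean = statistics.mean(sample)
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    # z is the standard normal quantile for a two-sided interval.
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean - z * standard_error, mean + z * standard_error

# Hypothetical marks for a sample of 50 students (not data from the assignment).
marks = [88 + offset for offset in (-6, -3, 0, 3, 6)] * 10

low, high = mean_confidence_interval(marks)
print(f"sample mean = {statistics.mean(marks)}")
print(f"95% interval for the population mean: {low:.2f} to {high:.2f}")
```

The sample mean is the descriptive part; the interval is the inferential part, because it is a statement about students who were not measured.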
Types of variables:
1. Qualitative variable
2. Quantitative variable
Quantitative:
A quantitative variable is a variable that reflects a notion of magnitude, that
is, the values it can take are numbers. A quantitative variable thus represents
a measure and is numerical.
Quantitative variables are divided into two types: discrete and continuous. The
difference is explained in the following two sections.
Discrete:
Quantitative discrete variables are variables whose possible values are
countable and finite in number. The values are often (but not always) integers.
Here are some examples of discrete variables:
Number of children per family
Number of students in a class
Number of citizens of a country
Even if it would take a long time to count the citizens of a large country, it
is still technically doable. Moreover, for all of these examples, the number of
possibilities is finite. Whatever the number of children in a family, it will
never be 3.58 or 7.912, so the number of possibilities is finite and thus
countable.
Continuous:
On the other hand, quantitative continuous variables are variables for which the
values are not countable and have an infinite number of possibilities. For
example:
Age
Weight
Height
For simplicity, we usually refer to years, kilograms (or pounds) and
centimeters (or feet and inches) for age, weight and height respectively. However,
a 28-year-old man could be 28 years, 7 months, 16 days, 3 hours, 4 minutes, 5
seconds, 31 milliseconds, 9 nanoseconds old.
For all measurements, we usually stop at a standard level of granularity, but
nothing (except our measurement tools) prevents us from going deeper, leading
to an infinite number of potential values. Because the variable can take an
infinite number of values, these values cannot be counted.
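The difference between the two types can be illustrated with a short Python sketch. All values are hypothetical, and exact fractions are used only to avoid rounding: a discrete variable's values can be listed one by one, whereas between any two values of a continuous variable there is always another one.

```python
from fractions import Fraction

# Discrete: the possible values can be listed one by one.
children_per_family = [0, 1, 2, 3, 4]  # hypothetical survey values
print(all(isinstance(v, int) for v in children_per_family))

# Continuous: between any two heights there is always another height.
low, high = Fraction(170), Fraction(171)  # heights in cm (hypothetical)
for _ in range(5):
    midpoint = (low + high) / 2
    print(float(midpoint))
    high = midpoint  # zoom in on the lower half and repeat
```

The loop could run indefinitely without the interval ever emptying, which is the sense in which the values of a continuous variable cannot be counted.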
Qualitative
In contrast to quantitative variables, qualitative variables (also referred to
as categorical variables, or factors in R) are variables that are not numerical
and whose values fit into categories.
In other words, a qualitative variable is a variable which takes as its values
modalities, categories or even levels, in contrast to quantitative variables which
measure a quantity on each individual.
Qualitative variables are divided into two types: nominal and ordinal.
Nominal
A qualitative nominal variable is a qualitative variable where no ordering is
possible or implied in the levels.
For example, the variable gender is nominal because there is no order in the
levels (no matter how many levels you consider for gender, whether only two
with female/male or more than two with female/male/ungendered/others, the
levels are unordered). Eye color is another example of a nominal variable
because there is no order among blue, brown or green eyes.
A nominal variable can have:
two levels (e.g., do you smoke? Yes/No, or are you pregnant? Yes/No), or
a large number of levels (what is your college major? Each major is a level
in that case).
Note that a qualitative variable with exactly 2 levels is also referred to as
a binary or dichotomous variable.
Ordinal
On the other hand, a qualitative ordinal variable is a qualitative variable with
an order implied in the levels. For instance, if the severity of road accidents has
been measured on a scale such as light, moderate and fatal accidents, this
variable is a qualitative ordinal variable because there is a clear order in the
levels.
Another good example is health, which can take values such as poor, reasonable,
good, or excellent. Again, there is a clear order in these levels so health is in this
case a qualitative ordinal variable.
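The nominal/ordinal distinction maps naturally onto Python's enum module. In this minimal sketch the class names EyeColor and Severity are hypothetical, and the integer codes for severity are an assumption used only to encode rank.

```python
from enum import Enum, IntEnum

class EyeColor(Enum):      # nominal: labels only, no order
    BLUE = "blue"
    BROWN = "brown"
    GREEN = "green"

class Severity(IntEnum):   # ordinal: the integer encodes rank only
    LIGHT = 1
    MODERATE = 2
    FATAL = 3

print(EyeColor.BLUE == EyeColor.BROWN)   # equality is meaningful for both types
print(Severity.LIGHT < Severity.FATAL)   # ordering is meaningful for ordinal levels
print(max([Severity.MODERATE, Severity.LIGHT]).name)

try:
    EyeColor.BLUE < EyeColor.BROWN
except TypeError:
    print("nominal levels cannot be ranked")
```

Note that IntEnum also permits arithmetic such as Severity.FATAL - Severity.LIGHT, but that difference has no statistical meaning: an ordinal variable says which level is higher, not by how much.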
Data collection:
In statistics, data collection is the process of gathering information from all
relevant sources to find a solution to the research problem. Depending on the
type of data, data collection methods are divided into two categories, namely:
Primary Data Collection methods
Secondary Data Collection methods
Primary data collection methods are further divided into quantitative and
qualitative methods.
Quantitative Data Collection Methods
Quantitative methods are based on mathematical calculations and use formats
such as closed-ended questions, correlation and regression methods, and measures
such as the mean, median, or mode. They are cheaper than qualitative data
collection methods and can be applied in a short period of time.
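As an illustration of the measures named above, the sketch below summarizes the answers to a single closed-ended question. The question ("How many books did you read last month?") and the responses are hypothetical.

```python
import statistics
from collections import Counter

# Hypothetical answers to the closed-ended question
# "How many books did you read last month?"
responses = [2, 0, 1, 3, 2, 5, 2, 1, 0, 4]

print("counts:", sorted(Counter(responses).items()))
print("mean:", statistics.mean(responses))
print("median:", statistics.median(responses))
print("mode:", statistics.mode(responses))
```

Because every respondent answers with a number, the whole summary reduces to a few calculations, which is part of why such methods are quick to apply.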
Qualitative Data Collection Methods
Qualitative methods do not involve any mathematical calculations. They are
closely associated with elements that are not quantifiable. Qualitative data
collection methods include interviews, questionnaires, observations, case
studies, etc. There are several methods to collect this type of data. They are:
Observation Method
The observation method is used when the study relates to behavioural science.
This method is planned systematically and is subject to many controls and
checks. The different types of observation are:
Structured and unstructured observation
Controlled and uncontrolled observation
Participant, non-participant and disguised observation
Interview Method
The interview method collects data in the form of verbal responses. It is
carried out in two ways:
Personal Interview – In this method, a person known as an interviewer asks
the other person questions face to face. A personal interview can be
structured or unstructured, and can take the form of a direct
investigation, a focused conversation, etc.
Telephonic Interview – In this method, an interviewer obtains information
by contacting people by telephone and asking for their answers or views
verbally.
Questionnaire Method
In this method, a set of questions is mailed to the respondent, who should
read, answer, and then return the questionnaire. The questions are printed
in a definite order on the form. A good questionnaire should have the following
features:
Short and simple
Should follow a logical sequence
Provide adequate space for answers
Avoid technical terms
Should have a good physical appearance, such as colour and paper quality,
to attract the attention of the respondent
Measurement scale:
Different measurement scales allow for different levels of exactness, depending
upon the characteristics of the variables being measured. The four types of scales
available in statistical analysis are:
Nominal: A scale that measures data by name only. For example, religious
affiliation (measured as Christian, Jewish, Muslim, and so forth), political
affiliation (measured as Democratic, Republican, Libertarian, and so forth),
or style of automobile (measured as sedan, sports car, SUV, and so forth).
Ordinal: A scale that measures by rank order only. Other than rough order,
no precise measurement is possible. For example, medical condition
(measured as satisfactory, fair, poor, guarded, serious, and critical);
socioeconomic status (measured as lower class, lower‐middle class, middle
class, upper‐middle class, upper class); or military officer rank (measured as
lieutenant, captain, major, lieutenant colonel, colonel, general). Such
rankings are not absolute but rather relative to each other: Major is higher
than captain, but we cannot measure the exact difference in numerical
terms. Is the difference between major and captain equal to the difference
between colonel and general? We cannot say.
Interval: A scale that measures by using equal intervals. Here you can
compare differences between pairs of values. The Fahrenheit temperature
scale, measured in degrees, is an interval scale, as is the centigrade scale.
The temperature difference between 50°C and 60°C (10 degrees) equals the
temperature difference between 80°C and 90°C (10 degrees). Note that the
0 in each of these scales is arbitrarily placed, which makes the interval scale
different from the ratio scale.
Ratio: A scale that has equal intervals, like an interval scale, but that
also includes a 0 measurement that signifies the point at which the
characteristic being measured vanishes (absolute zero). For example, income
(measured in dollars, with 0 equal to no income at all), years of formal
education, items sold, and so forth, are all ratio scales.
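The practical difference between interval and ratio scales can be checked with a small calculation. This is a sketch under stated assumptions: the helper celsius_to_fahrenheit and the income figures are hypothetical illustrations, not part of the original notes.

```python
def celsius_to_fahrenheit(c):
    """Convert a Celsius temperature to Fahrenheit."""
    return c * 9 / 5 + 32

# Interval scale: equal differences stay equal after a change of units.
diff_a = celsius_to_fahrenheit(60) - celsius_to_fahrenheit(50)
diff_b = celsius_to_fahrenheit(90) - celsius_to_fahrenheit(80)
print(diff_a, diff_b)

# Ratios do not survive the change of units, because the 0 is arbitrary.
ratio_celsius = 60 / 30
ratio_fahrenheit = celsius_to_fahrenheit(60) / celsius_to_fahrenheit(30)
print(ratio_celsius, ratio_fahrenheit)

# Ratio scale: a true zero makes "twice as much" meaningful in any unit.
income_a, income_b = 40_000, 20_000  # hypothetical incomes in dollars
ratio_dollars = income_a / income_b
ratio_cents = (income_a * 100) / (income_b * 100)
print(ratio_dollars, ratio_cents)
```

On the temperature scales only the differences agree across units, so "60°C is twice as hot as 30°C" is not a meaningful statement. For income the ratio is the same in dollars and in cents.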