Ace Reviewer Lbolytc


PROPERTY OF BUSINESS MANAGEMENT SOCIETY ACER REVIEWER

INTRODUCTION AND TERMINOLOGIES


Purpose of Statistics
· To provide information
· To provide comparisons
· To help discern relationships
· To aid in decision making
· To justify claims or assertions
· To estimate unknown quantities
· To predict future outcomes

STATISTICS
A science that deals with the collection, organization, presentation, analysis and interpretation of data

Branches of Statistics

DESCRIPTIVE STATISTICS
Consists of methods concerned with collection, organization, summarization and presentation of a set of data

INFERENTIAL STATISTICS
Comprised of those methods concerned with making predictions or inferences about an entire population based on information provided by the sample

Population and sample
Population
Consists of the totality of all the elements or entities from which you want to obtain information
Sample
A subset of the population

Census and survey
Census
the process of collecting information from the population
Survey
the process of collecting information from the sample

Parameter and statistic
Parameter
a summary or numerical measure used to describe a population
Statistic
a summary or numerical measure used to describe a sample

Constant
a characteristic or property of a population or sample which makes the members similar to each other.

Variables
any characteristic or information measurable or observable on every element of the population or sample

Qualitative (Categorical) Variables
variables that indicate what kind of a given characteristic an individual, object, or event possesses.

Quantitative (Numerical) Variables
variables that indicate how much of a given characteristic an individual, object, or event possesses


Types of quantitative variables

Discrete Variables
variables whose values are obtained through the process of counting
Continuous Variables
variables whose values are obtained through the process of measuring

Dependent
a variable which is affected by another variable
Ex. “test scores” is dependent on number of hours spent in studying, IQ, attitude towards studying, etc…
Independent
a variable which affects the dependent variable
Ex. “number of hours spent in studying” affects test scores

Scales of measurement of variables

Nominal
● Variables whose values are simply labels or names or categories without any explicit or implicit ordering of the labels;
● Lowest level of measurement, known as categorical scale.

Ordinal
● Variables whose values are simply labels or names or categories with an implied ordering in these labels;
● Ranking can be done on the data
● Distance between two labels cannot be determined.

Interval
● Variables whose values can be ordered and distance between any two labels is of known size;
● Always numeric and have no true zero point.

Ratio
● Variables whose values have all the properties of the interval scale and the ratio of two values is meaningful
● Has a true zero point;
● Highest level of measurement


DATA PRESENTATION
Presentation of Data

Numerical quantities focus on expected values, and graphical summaries on unexpected values. (John Tukey).

Textual
Data are presented in paragraph form. It involves enumeration of important characteristics, emphasizing significant figures, and identifying the important features of the data.

Tabular
Sometimes we could hardly grasp information from a textual presentation of data. Thus, we may present data using tables.

Frequency Distribution Table
It is a tabular summary of data showing the frequency (or number) of items in each of several non-overlapping classes.

Steps in Constructing Frequency Distribution Table

Step 1: Determine the range, denoted by R.
R – the difference between the highest value and the lowest value
R = HV - LV

Step 2: Decide on the number of classes, denoted by k.
k – no. of non-overlapping intervals

Step 3: Compute for the class size, denoted by c.
c – the quotient of steps 1 and 2.
c = R/k
- if the quotient is decimal always round up

Step 4: Identify the class intervals, CI.

Step 5: Identify the frequency in each CI or tallying.

Class Size / Class Width – The difference between the upper (or lower) class limits of consecutive classes. All classes should have the same class width.

Lower Class Limit – The least value that can belong to a class.

Upper Class Limit – The greatest value that can belong to a class.

Additional Info about FDT

Class Boundaries (CB) – the numbers that separate classes without forming gaps between them.


Class Mark / Midpoint (CM) – the middle value of each data class. To find the class midpoint, average the upper and lower class limits.

Relative Frequency (RF) – obtained by dividing the frequency of the given class by the total number of observations.

FDT

Less than CF (<CF) – total number of observations within a class whose values do not exceed the upper limit of the class

Greater than CF (>CF) – total number of observations within a class whose values are not less than the lower limit of the class

Cumulative frequency of a data class – the number of data elements in that class and all previous classes. (may be ascending or descending.)

Graphical

Types of Graphs:
1] Pie chart / circle graph – any data

MOST POPULAR

2] Bar graph
- Bar chart [with gaps between bars] – discrete data
- Histogram [no gaps between bars] – continuous data

3] Line graph
- Frequency polygon – continuous data

Rules to remember in constructing graphs:

1] Labels:
- Figure number [below the graph]
- Figure title [below the graph]
- for Pie chart, % should be indicated
- for Bar graph, the axis should be labeled

2] Textual explanation should also follow any graph
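
Going back to the frequency distribution table, the five construction steps and the derived columns (class mark, relative frequency, <CF and >CF) can be sketched in Python. This is only an illustration under stated assumptions: the function name frequency_table and the sample scores are made up, the data are assumed to be whole numbers (so class boundaries sit 0.5 below and above the class limits), and an extra class is added when rounding would otherwise leave the highest value outside the last class.

```python
import math

def frequency_table(data, k):
    """Build an FDT for whole-number data, aiming for k classes (hypothetical helper)."""
    lv, hv = min(data), max(data)
    r = hv - lv                        # Step 1: R = HV - LV
    c = max(1, math.ceil(r / k))       # Step 3: c = R/k, always rounded up
    n = len(data)
    rows = []
    lower = lv
    less_than_cf = 0
    while lower <= hv:                 # Step 4: class intervals (may need one more than k)
        upper = lower + c - 1          # whole-number class limits, no overlap
        freq = sum(lower <= x <= upper for x in data)   # Step 5: tally
        less_than_cf += freq
        rows.append({
            "limits": (lower, upper),
            "boundaries": (lower - 0.5, upper + 0.5),   # assumes whole-number data
            "class_mark": (lower + upper) / 2,
            "frequency": freq,
            "relative_frequency": freq / n,
            "less_than_cf": less_than_cf,
            "greater_than_cf": n - (less_than_cf - freq),
        })
        lower += c
    return rows

scores = [12, 15, 21, 22, 22, 28, 30, 31, 35, 39]   # made-up data
table = frequency_table(scores, k=4)
for row in table:
    print(row)
```

Here R = 27 and k = 4, so c = 27/4 = 6.75 is rounded up to 7, giving the classes 12-18, 19-25, 26-32 and 33-39.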


NUMERICAL DESCRIPTIVE MEASURE

A. Measures of Central Tendency

It describes the “center” of a given data set. It is a single value about which the observations tend to cluster.

1. The Arithmetic Mean (or simply Mean) - the sum of all observations divided by the total number of observations, denoted by x̄

$\bar{x} = \frac{\sum x}{n}$

Properties:
● It always exists for quantitative variables.
● It is unique.
● It takes into account every item of the data.
- Thus, it is easily affected by extreme values.

2. The Median - the middle value of an array, denoted by Md

Properties:
● Not easily affected by extreme values.
● It always exists and is unique.

3. The Mode – the observation(s) that occur most frequently in the data set, denoted by Mo

Properties:
● No calculations are required (for the ungrouped mode).
● It may not exist.
● It may not be unique.

Considerations for Choosing a Measure of Central Tendency
● For a nominal variable, the mode is the only measure that can be used.
● For ordinal variables, the mode and the median may be used. The median provides more information (taking into account the ranking of categories.)
● For interval-ratio variables, the mode, median, and mean may all be calculated. The mean provides the most information about the distribution, but the median is preferred if the distribution is skewed.

B. Measures of Position

They are measures that discriminate a group of scores from another group in the same data set.

Defn: Quantiles – divide data into an equal number of parts.

The Quartiles – are values that divide a set of observations into four equal parts, denoted by Qi, i = 1, …, 4.


The Deciles – are values that divide a set of observations into 10 equal parts, denoted by Di, i = 1, 2, …, 10.

The Percentiles – are values that divide a set of observations into 100 equal parts, denoted by Pi, i = 1, 2, …, 100.

UNGROUPED MEASURES OF POSITION

To locate the desired quantile:

1. Use $P_k = \frac{k(n+1)}{100}$ to locate the position.

2. If $P_k = \frac{k(n+1)}{100}$ is not exact, then do interpolation.

C. Measures of Variability
● It describes the extent to which the data are dispersed.
● Measures of variability are descriptive statistics that describe how similar a set of scores are to each other
o The more similar the scores are to each other, the lower the measure of dispersion will be
o The less similar the scores are to each other, the higher the measure of dispersion will be
o In general, the more spread out a distribution is, the larger the measure of dispersion will be

Range (R)
the difference between the highest and lowest value in the data set.

R = HV - LV

The range is rarely used in scientific work as it is fairly insensitive
● It depends on only two scores in the set of data, HV and LV
● Two very different sets of data can have the same range.

Variance (s² or σ²)
- the mean of the squared differences of the observations from their mean.

● This difference is called a deviate or a deviation score
● The deviate tells us how far a given score is from the typical, or average, score
● Thus, the deviate is a measure of dispersion for a given score

$\sigma^2 = \frac{\sum (x - \bar{x})^2}{N}$

Standard Deviation (s or σ)
- the positive square root of the variance.

● Since squared units of measure are often awkward to deal with, the square root of variance is often used instead


● The standard deviation is the square root of variance

$\text{standard deviation} = \sqrt{\text{variance}}$

$\text{variance} = (\text{standard deviation})^2$

Coefficient of Variation (CV)
the ratio of the standard deviation to its mean expressed in percent.
● compare variability of two populations that are expressed in different units of measurement
● expressed as a percentage rather than in terms of the units of the particular data

$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$

$CV = \frac{s}{\bar{x}} \times 100\%$

D. Measures of Skewness

Skew is a measure of the lack of symmetry in the distribution of scores

Measures of Skewness
A frequency curve that is not symmetrical about the mean is said to be skewed. If it tails off to the right, we describe it as positively skewed, but if it tails off to the left, we say it is negatively skewed. The relationship between the mean and the median is related to the direction of skewness.

Pearsonian coefficient of skewness (Sk) formula

$Sk = \frac{3(\text{mean} - \text{median})}{s}$

If the mean is greater than the median, we have a positively skewed curve, but if the mean is less than the median, we have a negatively skewed curve. Now, with the use of the standard deviation, it is possible to obtain a measure of skewness which indicates both the direction and the magnitude (or the extent) of skewness of a frequency data.

Skewness

If Sk < 0, then the distribution has a negative skew

If Sk > 0, then the distribution has a positive skew

If Sk = 0, then the distribution is symmetrical

E. Measures of Kurtosis

Kurtosis measures whether the scores are spread out more or less than they would be in a normal (Gaussian) distribution

● Leptokurtic (K > 3)
● Mesokurtic (K = 3)
● Platykurtic (K < 3)

Kurtosis

$K = \frac{\sum (x - \bar{x})^4}{n s^4}$
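
The measures in parts A to E can be tried out together on a small data set. The sketch below is an illustration only: the helper names percentile and describe and the scores are made up, the mode uses Python's statistics.multimode (which returns every most-frequent value), and the percentile helper assumes straight-line interpolation between neighbouring values when the position k(n+1)/100 is not a whole number.

```python
import math
from statistics import mean, median, multimode

def percentile(data, k):
    """Locate P_k at position k(n+1)/100 in the sorted array, interpolating if needed."""
    ordered = sorted(data)
    n = len(ordered)
    position = k * (n + 1) / 100       # 1-based position in the array
    whole = math.floor(position)
    fraction = position - whole
    if whole < 1:                      # position falls before the first value
        return ordered[0]
    if whole >= n:                     # position falls at or after the last value
        return ordered[-1]
    return ordered[whole - 1] + fraction * (ordered[whole] - ordered[whole - 1])

def describe(data):
    """Collect the measures of central tendency, variability, skewness and kurtosis."""
    n = len(data)
    x_bar = mean(data)
    sum_sq = sum((x - x_bar) ** 2 for x in data)
    s = math.sqrt(sum_sq / (n - 1))    # sample standard deviation (n - 1 divisor)
    return {
        "mean": x_bar,
        "median": median(data),
        "mode": multimode(data),
        "range": max(data) - min(data),
        "population_variance": sum_sq / n,            # sigma^2 with divisor N
        "s": s,
        "cv_percent": s / x_bar * 100,
        "skewness": 3 * (x_bar - median(data)) / s,   # Pearsonian Sk
        "kurtosis": sum((x - x_bar) ** 4 for x in data) / (n * s ** 4),
    }

scores = [2, 4, 4, 4, 5, 5, 7, 9]      # made-up data
summary = describe(scores)
print(summary)
print(percentile(scores, 50))          # position 50(8+1)/100 = 4.5, halfway between 4 and 5
```

For these scores the mean (5) is greater than the median (4.5), so Sk comes out positive, in line with the rule for a positively skewed curve.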

SAMPLING TECHNIQUES

Population:
a set which includes all measurements of interest to the researcher (The collection of all responses, measurements, or counts that are of interest)

Sample:
A subset of the population

Why sampling?
● Impossible to study the whole population.
● Manageability of data
● Economic Reasons
● Time and effort

Types of Sampling

Probability sampling
Each member of the population is given equal chance or opportunity of being included in the sample.

Non-probability sampling
Each member of the population does not have equal chance or opportunity of being included in the sample.

Probability v/s Non-Probability

Probability sampling
1. You have a complete sampling frame
2. You can select a random sample from your population
3. You can generalize your results from a random sample
4. Can be more expensive and time-consuming

Non-Probability sampling
1. Used when there isn’t an exhaustive population list available.
2. Not random
3. Can be effective when trying to generate ideas and getting feedback
4. More convenient and less costly

Biased and Unbiased

Non-Probability Sampling

Convenience Sampling
The researcher uses subjects that are readily available or includes only people who are easy to reach.
example
Using student volunteers as subjects for the research.

Purposive sampling
The researcher looks for predefined groups that will serve as samples
example
The researcher wants to know what it takes to graduate summa cum laude in college; the only people who can give the researcher first hand advice are the individuals who graduated summa cum laude.

Probability Sampling

● Simple Random Sampling
● Stratified Sampling
● Cluster Sampling
● Systematic Sampling
● Multi-Stage Sampling

Simple Random Sampling (SRS)
All members of the population have an equal chance of being included in the sample.
Stratified Sampling
This technique is used when the population can be subdivided into several smaller groups (or strata) and then SRS is applied to get samples from each stratum

Cluster Sampling
This technique employs the use of clusters (groups) instead of individuals that are randomly chosen

Systematic Sampling
It selects every kth member of the population with the starting point determined at random

Sample Size (n)

A common rule of thumb is that the minimum sample size to get any kind of meaningful result is 100. If your population is less than 100 then you really need to survey all of them.

IN RESEARCH: THE MORE SAMPLES WE GET THE BETTER!

Determining Sample Size

1. Using A Census for Small Population (N ≤ 100)
2. Using Sample Size which is 10% of N
3. Using Published Tables
4. Using Formulas to Determine Sample Size: Slovin’s Formula

Slovin’s Formula

$n = \frac{N}{1 + N e^2}$

● n is the sample size
● N is the population size
● e is the margin of error

● To use the formula, first figure out what you want your margin of error to be.
● For example, you may be happy with a margin of error of 0.05 (often loosely described as a confidence level of 95 percent), or you may require a tighter margin of error of 0.02 (loosely, a 98 percent confidence level). Plug your population size and required margin of error into the formula. The result will be the number of samples you need to take.
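
Slovin's formula can be checked with a few lines of Python. This is a minimal sketch: the function name slovin_sample_size and the population size of 1000 are made up for illustration, and rounding the result up to the next whole respondent is an assumption (the reviewer does not say how to round).

```python
import math

def slovin_sample_size(population_size, margin_of_error):
    """n = N / (1 + N e^2), rounded up to a whole number of respondents."""
    n = population_size / (1 + population_size * margin_of_error ** 2)
    return math.ceil(n)   # rounding up is an assumption, not stated in the source

# Made-up example: N = 1000 and e = 0.05 give 1000 / 3.5 = 285.71..., rounded up.
sample_size = slovin_sample_size(1000, 0.05)
print(sample_size)
```

A smaller margin of error makes the denominator smaller, so the required sample size goes up.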

HYPOTHESIS TESTING

What is a Hypothesis?
● an assumption about the population parameter
● an educated guess about the population parameter
● deciding between what is reality and what is coincidence!

Hypotheses Testing:
This is the process of making an inference or generalization on population parameters based on the results of the study on samples.

Statistical Hypotheses:
It is a guess or prediction made by the researcher regarding the possible outcome of the study.

Steps in Hypothesis Testing
Step 1. Formulate Ho and Ha
Step 2. Set the level of significance, α; usually it is given in the problem.
Step 3. Formulate the decision rule (when to reject Ho); find the critical value/P-value.
Step 4. Test statistic; do the computation.
Step 5. Make your decision
Step 6. Write a conclusion.

Types of Statistical Hypotheses

Null Hypothesis (Ho): is always hoped to be rejected
Always contains “=” sign

Ho: The average GPA of this class is 3.5
H0: μ = 3.5

Alternative Hypothesis (Ha):
● Challenges Ho
● Never contains “=” sign
● Uses “< or > or ≠”
● It generally represents the idea which the researcher wants to prove.

Ha: The average GPA of this class is
a) higher than 3.5 (Ha: μ > 3.5)
b) lower than 3.5 (Ha: μ < 3.5)
c) not equal to 3.5 (Ha: μ ≠ 3.5)

Level of Significance, α, and the Rejection Region

α = 0.05 means the researcher accepts a 5% risk of being wrong (rejecting a true Ho) and has a 95% chance of being right.
α = 0.01 means the researcher is taking a 1% risk of being wrong and has a 99% chance of being right.

Types of Hypotheses Tests
1. One-tailed left directional test
– this is used if Ha uses < symbol
2. One-tailed right directional test
– this is used if Ha uses > symbol
3. Two-tailed test: Non-directional
– this is used if Ha uses ≠ symbol

CRITERION:

One-tailed test (right directional)
“Reject H0 if Zc ≥ Zt”

One-tailed test (left directional)
“Reject H0 if Zc ≤ Zt”

Two-tailed test (both sides)
“Reject H0 if Zc ≥ Zt” (upper critical value)
or
“Reject H0 if Zc ≤ −Zt” (lower critical value)

Decisions made regarding Ho
(Reject Ho/Do not reject Ho)

If we reject Ho, it means the sample evidence says it is wrong!

If we do not reject Ho, it doesn’t mean it is correct, we just don’t have enough evidence to reject it!

Errors in Hypothesis Testing
Type I (α error)
Rejecting a true Ho!
Type II (β error)
Accepting (failing to reject) a false Ho!
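
The rejection criteria for the three kinds of tests can be sketched as a small Python function. This is an illustration under assumptions: the function name z_test_decision and the tail labels "right", "left" and "two" are made up, the test statistic is assumed to follow the standard normal distribution (a z-test), and the tabular value Zt is taken from Python's statistics.NormalDist rather than from a printed table.

```python
from statistics import NormalDist

def z_test_decision(z_computed, alpha, tail):
    """Return True when H0 is rejected under the criterion for the given tail."""
    std_normal = NormalDist()                     # mean 0, standard deviation 1
    if tail == "right":                           # Ha uses >
        return z_computed >= std_normal.inv_cdf(1 - alpha)
    if tail == "left":                            # Ha uses <
        return z_computed <= std_normal.inv_cdf(alpha)
    if tail == "two":                             # Ha uses "not equal"
        critical = std_normal.inv_cdf(1 - alpha / 2)
        return abs(z_computed) >= critical        # either tail leads to rejection
    raise ValueError("tail must be 'right', 'left', or 'two'")

# Made-up computed value: Zc = 1.80 at alpha = 0.05
print(z_test_decision(1.80, 0.05, "right"))   # critical value is about 1.645
print(z_test_decision(1.80, 0.05, "two"))     # critical values are about -1.96 and +1.96
```

The same Zc can lead to different decisions depending on the tail, because a two-tailed test splits α between both ends of the curve.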


F-TEST - ANALYSIS OF VARIANCE

F-Test

· The F-test is a parametric test used to compare the means of two or more groups of independent samples. It is also known as the Analysis of Variance (ANOVA).

Three kinds of analysis of variance:

1. One-way analysis of variance – only 1 variable involved
2. Two-way analysis of variance – 2 variables involved, the column and the row variables.
- used to know if there are significant differences between and among columns and rows
3. Three-way analysis of variance – 3 variables involved

F-Test

Why use?

To find if there is a significant difference between and among the means of the two or more independent groups.

When to use?

If there is normal distribution and when the level of measurement is expressed in interval or ratio data (like t-test & z-test)

How to use?

To get the F computed value, use formula

$CF = \frac{(GT)^2}{N}$

Compute the following to construct the ANOVA table

1. TSS – the total sum of squares minus CF, the correction factor
2. BSS – the between sum of squares minus the CF
3. WSS – within sum of squares, or the difference between the TSS and the BSS (TSS minus BSS)

ANOVA Table

● The Mean Squares Between (MSB) is equal to BSS/df
● The Mean Squares Within (MSW) is equal to WSS/df
● To get the F-computed value, divide MSB by MSW
● F-computed value must be compared with the F-tabular value at a given level of significance with the corresponding dfs of BSS and WSS

Note:

● If F-computed value > F-tabular value, disconfirm the null hypothesis in favor of the research hypothesis. This means there is a significant difference between and among the means of the different groups
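
The one-way ANOVA computations can be sketched in Python as follows. Several things here are assumptions rather than statements from the reviewer: GT is read as the grand total of all observations, the total sum of squares is taken as the sum of the squared observations, the between sum of squares as the sum of each group total squared divided by its group size, and the degrees of freedom as the usual k − 1 (between) and N − k (within) for k groups. The function name one_way_anova and the data are made up.

```python
def one_way_anova(groups):
    """Compute CF, TSS, BSS, WSS, the mean squares and the F-computed value."""
    all_values = [x for group in groups for x in group]
    n_total = len(all_values)
    grand_total = sum(all_values)                     # GT, assumed to be the grand total
    cf = grand_total ** 2 / n_total                   # correction factor
    tss = sum(x * x for x in all_values) - cf
    bss = sum(sum(g) ** 2 / len(g) for g in groups) - cf
    wss = tss - bss
    df_between = len(groups) - 1                      # assumed: k - 1
    df_within = n_total - len(groups)                 # assumed: N - k
    msb = bss / df_between
    msw = wss / df_within
    return {"CF": cf, "TSS": tss, "BSS": bss, "WSS": wss,
            "MSB": msb, "MSW": msw, "F": msb / msw,
            "df": (df_between, df_within)}

# Made-up scores for three independent groups
result = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(result)
```

The F value returned here still has to be compared with the F-tabular value for the same pair of degrees of freedom, which this sketch does not look up.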

CORRELATION AND REGRESSION

CORRELATION
● Finding the relationship between two quantitative variables without being able to infer causal relationships
● Correlation is a statistical technique used to determine the degree to which two variables are related

SCATTER DIAGRAM
● Rectangular coordinate
● Two quantitative variables
● One variable is called (X) and the second is (Y)
● Points are not joined
● No frequency table

SCATTER PLOTS

● The pattern of data is indicative of the type of relationship between your two variables:
o positive relationship
o negative relationship
o no relationship

CORRELATION COEFFICIENT
● Statistic showing the degree of relation between two variables

SIMPLE CORRELATION COEFFICIENT (r)
● It is also called Pearson's correlation or product moment correlation coefficient.
● It measures the nature and strength of the association between two variables of the quantitative type.

● The sign of r denotes the nature of association, while the value of r denotes the strength of association
● If the sign is positive (+), this means the relation is direct (an increase in one variable is associated with an increase in the other variable and a decrease in one variable is associated with a decrease in the other variable).
● While if the sign is negative, this means an inverse or indirect relationship (which means an increase in one variable is associated with a decrease in the other).
● The value of r ranges between (-1) and (+1)
● The value of r denotes the strength of the association.

If r = 0, this means no association or correlation between the two variables.

If 0 < |r| < 0.25 - weak correlation.

If 0.25 ≤ |r| < 0.75 - intermediate correlation.

If 0.75 ≤ |r| < 1 - strong correlation.

If |r| = 1 - perfect correlation.

REGRESSION ANALYSES

● Regression: technique concerned with predicting some variables by knowing others

● The process of predicting variable Y using variable X

Regression
● Uses a variable (x) to predict some outcome variable (y)
● Tells you how values in y change as a function of changes in values of x

CORRELATION AND REGRESSION
● Correlation describes the strength of a linear relationship between two variables
● Linear means “straight line”
● Regression tells us how to draw the straight line described by the correlation

REGRESSION
● Calculates the “best-fit” line for a certain set of data
● The regression line makes the sum of the squares of the residuals smaller than for any other line
● Regression minimizes residuals
● By using the least squares method (a procedure that minimizes the vertical deviations of plotted points surrounding a straight line) we are able to construct a best fitting straight line to the scatter diagram points and then formulate a regression equation.

Regression Equation
Regression equation describes the regression line mathematically
● Intercept
● Slope
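
Pearson's r and the least squares line can be computed by hand in a few lines of Python. The reviewer does not print the computing formulas, so the deviation-from-the-mean forms used below are the standard textbook ones, brought in as an assumption; the function names pearson_r and least_squares_line and the hours/scores data are made up.

```python
import math

def _sums_of_deviations(xs, ys):
    """Return (Sxy, Sxx, Syy, mean_x, mean_y) from deviations about the means."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy, sxx, syy, mean_x, mean_y

def pearson_r(xs, ys):
    """Simple correlation coefficient: sign gives direction, size gives strength."""
    sxy, sxx, syy, _, _ = _sums_of_deviations(xs, ys)
    return sxy / math.sqrt(sxx * syy)

def least_squares_line(xs, ys):
    """Intercept and slope of the line that minimizes the squared residuals."""
    sxy, sxx, _, mean_x, mean_y = _sums_of_deviations(xs, ys)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

hours = [1, 2, 3, 4, 5]            # made-up independent variable (x)
scores = [52, 55, 61, 64, 68]      # made-up dependent variable (y)
r = pearson_r(hours, scores)
intercept, slope = least_squares_line(hours, scores)
print(r, intercept, slope)
```

For these data r is positive and close to 1, so the relation is direct and strong, and the fitted line predicts about 4.1 more points for each extra hour of study.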


MULTIPLE LINEAR REGRESSION

What is MRA?

● Multiple Regression Analysis is used to predict the dependent variable y given the independent variables xs.
● Aside from predictions we can also see the relationship between the dependent variable and the different independent variables.

Example:

We can make better predictions of the performance of a newly hired employee if we consider not only their education but also their years of experience x1, personality x2, attitude x3, and other variables that may influence performance.

When do we use the MRA?

● We use the MRA when predicting the dependent variable y with 2 or more independent variables xs.
● We want to know if there is a relationship that exists between the dependent variable and the independent variables.

Why do we use MRA?

The MRA is used because we want to know the extent of influence that the independent variables have on the dependent variable: the coefficient of determination r² × 100%, and whether the correlation is positive or negative (+/-).

How do we use the MRA?

Many mathematical formulas can serve to express relationships among more than two variables, but the most commonly used in statistics are linear equations.

$y = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_n x_n$

Where:
y = dependent variable to be predicted
x1, x2, …, xn = known independent variables that may influence y
b0, b1, b2, …, bn = numerical constants which must be determined from observed data

When there are two independent variables x1 and x2 and we want to fit the equation

$y = b_0 + b_1 x_1 + b_2 x_2$

We must solve the three normal equations:

$\sum y = n b_0 + b_1 \sum x_1 + b_2 \sum x_2$

$\sum x_1 y = b_0 \sum x_1 + b_1 \sum x_1^2 + b_2 \sum x_1 x_2$

$\sum x_2 y = b_0 \sum x_2 + b_1 \sum x_1 x_2 + b_2 \sum x_2^2$
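
The three normal equations form a 3-by-3 linear system in b0, b1 and b2, so they can be solved directly. The sketch below uses Cramer's rule, which is one convenient choice for a system this small and not something the reviewer prescribes; the function names det3 and fit_two_predictors are made up, and the data were constructed to follow y = 1 + 2x1 + 3x2 exactly so the answer is easy to check.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def fit_two_predictors(x1, x2, y):
    """Solve the three normal equations for b0, b1, b2 using Cramer's rule."""
    n = len(y)
    s1, s2, sy = sum(x1), sum(x2), sum(y)
    s11 = sum(a * a for a in x1)
    s22 = sum(b * b for b in x2)
    s12 = sum(a * b for a, b in zip(x1, x2))
    s1y = sum(a * c for a, c in zip(x1, y))
    s2y = sum(b * c for b, c in zip(x2, y))
    coefficients = [[n, s1, s2],        # sum y    = n b0     + b1 sum x1   + b2 sum x2
                    [s1, s11, s12],     # sum x1 y = b0 sum x1 + b1 sum x1^2 + b2 sum x1 x2
                    [s2, s12, s22]]     # sum x2 y = b0 sum x2 + b1 sum x1 x2 + b2 sum x2^2
    right_side = [sy, s1y, s2y]
    d = det3(coefficients)              # zero if x1 and x2 are perfectly collinear
    solution = []
    for col in range(3):
        replaced = [row[:] for row in coefficients]
        for r in range(3):
            replaced[r][col] = right_side[r]
        solution.append(det3(replaced) / d)
    return tuple(solution)

x1 = [1, 2, 3, 4, 5]                    # made-up predictor values
x2 = [2, 1, 4, 3, 6]
y = [9, 8, 19, 18, 29]                  # built from y = 1 + 2*x1 + 3*x2
b0, b1, b2 = fit_two_predictors(x1, x2, y)
print(b0, b1, b2)
```

With real data the points will not fall exactly on a plane, and the same three equations then give the least squares estimates of b0, b1 and b2.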
