Lecture of BIOSTATISTICS 12.2022 RMDC
• Median
• Mode
Mean
MEAN = (5 + 7 + 11 + 20 + 10) / 5
= 53 / 5
= 10.6
Median
= 11
or
= (11 + 3) / 2 = 7
Mode
MODE = 1, 7, 5, 7, 1, 6, 7, 1, 4, 1
= 1
or
MODE = 1, 7, 5, 7, 1, 6, 7, 1, 4, 1, 7
= 1 and 7
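The three measures above can be checked with a short Python sketch that uses only the standard library. The five values of the Mean example add up to 53, so their mean is 10.6. The data behind the Median example did not survive on the slide, so reusing the same five values for the median is an assumption made only for illustration.

```python
from statistics import mean, median, multimode

values = [5, 7, 11, 20, 10]            # data from the Mean example
mean_value = mean(values)              # sum of values / number of values
median_value = median(values)          # middle value after sorting (assumed reuse of the same data)

# multimode returns every value that shares the highest frequency
mode_one = multimode([1, 7, 5, 7, 1, 6, 7, 1, 4, 1])       # 1 occurs four times
mode_two = multimode([1, 7, 5, 7, 1, 6, 7, 1, 4, 1, 7])    # 1 and 7 both occur four times

print(mean_value, median_value, mode_one, mode_two)
```

multimode is used instead of mode so that the second data set reports both of its modes.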
Measures of Dispersion
Algebraic Graphical
• The Range is the Difference Between the Lowest and Highest Values.
• To Find the Range, Simply Subtract the Lowest Value From the
Greatest Value, Ignoring the Others.
• The Sample Range is Used to Compare Variability Between Different
Distributions of Data; it Depends Only on the Maximum and the Minimum.
• When Writing a Range, Use an en Dash (Not a Hyphen or em Dash) With No Spaces on Either Side
– e.g. 2–10
Find The Range
= 16-4 = 12
or
= 21-9 = 12
Mean Deviation
• Formula: M.D. = Ʃ | x - x̄ | / ƞ
Where:
ƞ = No. of Observations
x = Each Point Value
x̄ = Sample Mean
| x - x̄ | = Deviation From the Mean, Ignoring the Sign
Example
• The Weights of 10 Individuals are 83, 75, 81, 79, 71, 95, 75,
77, 84, 90. Calculate the Mean Deviation.
x      Arithmetic Mean (x̄)     (x - x̄)
83     81                       2
75     81                       -6
81     81                       0
79     81                       -2
71     81                       -10
95     81                       14
75     81                       -6
77     81                       -4
84     81                       3
90     81                       9
Total = 810, Mean = 810/10 = 81 (ƞ = 10), Total of Deviations (Ignoring Signs) = 56
Mean Deviation = Sum of Deviations From the Mean (Ignoring Signs) / No. of Sample Observations
Mean Deviation = 56 / 10
Mean Deviation = 5.6
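The same working can be sketched in Python. The total of 56 is only reached when the signs of the deviations are ignored, so the sketch uses absolute deviations; the variable names are illustrative.

```python
weights = [83, 75, 81, 79, 71, 95, 75, 77, 84, 90]   # weights of the 10 individuals

n = len(weights)
mean_weight = sum(weights) / n                        # 810 / 10
abs_deviations = [abs(x - mean_weight) for x in weights]
total_deviation = sum(abs_deviations)                 # deviations added without their signs
mean_deviation = total_deviation / n

print(mean_weight, total_deviation, mean_deviation)
```

If the signs were kept, the deviations from the mean would add up to zero, which is why they are dropped.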
Standard Deviation
• Standard Deviation is a Measure of How Much the Data is Dispersed
From its Mean.
– A High Standard Deviation Implies that the Data Values are More
Spread Out From the Mean.
– A Low Standard Deviation Means that the Data Values are More Clustered
Around the Mean.
– Standard Deviation can be Zero (if all the values in the variable are the
same)
• Standard Deviation is Denoted by:
– “SD” and Greek symbol “σ” is used for Population Standard
Deviation.
– Latin letter “s” is used for Sample Standard Deviation.
• For Population Standard Deviation, or if the Sample Size is
More than 30, Use the Following Formula:
σ = √( Ʃ ( x - x̄ )² / ƞ )
• For Sample Standard Deviation, if the Sample Size is Less
than 30, Use the Following Formula:
s = √( Ʃ ( x - x̄ )² / ( ƞ - 1 ) )
Where:
s = Symbol for Sample Standard Deviation
Ʃ = Symbol of Sum
x = Individual Raw Point/Score
x̄ = Sample Mean
( )² = Square of Result
• Find the Deviation of Each Value From the Mean.
( x - x̄ )
• Square Each Deviation.
( x - x̄ )²
• Add the Squared Deviations.
Ʃ ( x - x̄ )²
• Divide the Result by the Number of Observations.
For More than 30 = ƞ and For Less than 30 = ƞ - 1
• Last, Take the Square Root, Which Gives the Standard Deviation.
Example
• The Weights of 10 Individuals are 83, 75, 81, 79, 71, 95, 75,
77, 84, 90. Calculate the Standard Deviation.
x      (x - x̄)    (x - x̄)²
83     2          4
75     -6         36
81     0          0
79     -2         4
71     -10        100
95     14         196
75     -6         36
77     -4         16
84     3          9
90     9          81
Total = 810, Mean = 810/10 = 81 (ƞ = 10), Total of (x - x̄)² = 482
Example
s = √( 482 / (10 - 1) )
= √( 482 / 9 )
= √53.56
= 7.32
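The steps of the worked example can be followed in a short Python sketch. It computes the sum of squared deviations by hand and then compares the result with statistics.stdev, which also divides by ƞ - 1; the variable names are illustrative.

```python
from math import sqrt
from statistics import stdev

weights = [83, 75, 81, 79, 71, 95, 75, 77, 84, 90]

n = len(weights)
mean_weight = sum(weights) / n                                  # 81
squared_deviations = [(x - mean_weight) ** 2 for x in weights]  # square each deviation
sum_of_squares = sum(squared_deviations)                        # add the squared deviations

sample_sd = sqrt(sum_of_squares / (n - 1))      # divisor n - 1, sample of fewer than 30
population_sd = sqrt(sum_of_squares / n)        # divisor n, shown only for comparison
library_sd = stdev(weights)                     # standard library sample SD

print(sum_of_squares, round(sample_sd, 2), round(population_sd, 2))
```

With only 10 observations the two divisors give visibly different answers, which is why the choice between ƞ and ƞ - 1 matters for small samples.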
Quartile Deviation
• The Quartiles are Defined as the 25th Percentile and the 75th Percentile.
• For the Normal Distribution, these Define a Narrower Interval Than Does
One Standard Deviation on Each Side of the Mean.
• Quartile Deviation Measures the Deviation in the Middle of the Data.
• Q2 is the Median of the given data set, Q1 is the Median of the lower
half of the data set and Q3 is the Median of the upper half of the data set.
• Position of Lower Quartile (25th Percentile) (Q1) = (N+1) * 1 / 4 th Observation.
• Position of Middle Quartile (50th Percentile) (Q2) = (N+1) / 2 th Observation.
• Position of Upper Quartile (75th Percentile) (Q3) = (N+1) * 3 / 4 th Observation.
• Where N represents the total number of observations in the given data set.
• Interquartile Range = Q3 – Q1.
• Quartile Deviation = (Q3 – Q1) / 2.
Example of Quartile
• Find the Quartiles and Quartile Deviation of the following data:
• 17, 2, 7, 27, 15, 5, 14, 8, 10, 24, 48, 10, 8, 7, 18, 28
Solution:
• Ascending order of the given data is:
• 2, 5, 7, 7, 8, 8, 10, 10, 14, 15, 17, 18, 24, 27, 28, 48 (n = 16)
• Q2 = Median of the given data set
• n is even, median = (1/2) [(n/2)th observation + (n/2 + 1)th
observation]
• = (1/2) [8th observation + 9th observation]
• = (10 + 14)/2 = 24/2 = 12
• Q2 = 12
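• Q1 = Median of the Lower Half (2, 5, 7, 7, 8, 8, 10, 10)
• = (7 + 8)/2 = 7.5
• Q3 = Median of the Upper Half (14, 15, 17, 18, 24, 27, 28, 48)
• = (18 + 24)/2 = 21
• Quartile Deviation = (Q3 – Q1)/2
= (21 – 7.5)/2
= 13.5/2
= 6.75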
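The quartile example can be reproduced with a small Python sketch. It uses the median-of-halves rule (Q1 is the median of the lower half, Q3 the median of the upper half), which is the rule that gives 7.5 and 21 here; other quartile conventions can give slightly different values.

```python
from statistics import median

data = sorted([17, 2, 7, 27, 15, 5, 14, 8, 10, 24, 48, 10, 8, 7, 18, 28])

n = len(data)
half = n // 2
lower_half = data[:half]
# when n is odd the middle value is left out of both halves
upper_half = data[half:] if n % 2 == 0 else data[half + 1:]

q1 = median(lower_half)
q2 = median(data)
q3 = median(upper_half)
interquartile_range = q3 - q1
quartile_deviation = interquartile_range / 2

print(q1, q2, q3, interquartile_range, quartile_deviation)
```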
Normal Distribution
• Normal distribution, also known as the Gaussian distribution, is a
probability distribution that is symmetric about the mean.
• It shows that data near the mean are more frequent in occurrence
than data far from the mean.
• In graphical form, the normal distribution appears as a "bell
curve".
• In a Normal Curve:
– The Area between One Standard Deviation on either side of the Mean will
include 68% of the Values in the Distribution.
– The Area between Two Standard Deviations on either side of the Mean will
cover 95% of the Values in the Distribution.
– The Area between Three Standard Deviations on either side of the Mean will
include 99.7% of the Values in the Distribution.
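These areas can be checked against the standard normal curve. The sketch below uses statistics.NormalDist from the Python standard library; the helper name area_within is illustrative.

```python
from statistics import NormalDist

standard_normal = NormalDist()      # mean 0, standard deviation 1

def area_within(k):
    """Area under the normal curve between k SD below and k SD above the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, round(area_within(k) * 100, 1))   # percentage of values within k SD
```

The computed areas are about 68.3%, 95.4% and 99.7%, which are usually rounded to 68%, 95% and 99.7%.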
Characteristic of Normal Distribution
• The z Score is the Distance of a Value (x) from the Mean (µ) of the Curve in Units of
Standard Deviation.
Negative z Score
• Use the Negative z Score Table to Find Values on the Left of the Mean.
• Represents the Area Under the Bell Curve to the Left of “z”.
Negative z Score Graph
Negative Z Table
Positive z Score
• Represents the Area Under the Bell Curve to the Left of “z”.
Positive z Score Graph
Positive Z Table
Example:
Mean Pulse of a Group of Normal Healthy Males was 72.
A Randomly Chosen Male has a Pulse of 80.
Standard Deviation is 2.
z = ( x - x̄ ) / σ
= ( 80 - 72 ) / 2
= 8 / 2
= 4
Example
• Mean Hb of Selected Group is 12 gm
• Standard Deviation of 2 gm
• Probability that a Person Picked has Hb of 16 or More.
z = ( x - x̄ ) / σ
z = ( 16 - 12 ) / 2 = 4 / 2
z = 2, Area Between the Mean and z = 0.4772
We are dealing with half of the curve (area 0.5), so the area beyond z = 2 would be:
0.5 - 0.4772 = 0.0228 = 228/10000
About 228 persons out of 10000 are expected to have Hb of 16 or more.
Example
• Mean Hb of Selected Group is 12 gm
• Standard Deviation of 2 gm
• Cut-off Value for Anaemia is 10 gm.
z = ( x - x̄ ) / σ
z = ( 10 - 12 ) / 2 = -2 / 2
z = -1, Area Between the Mean and z = 0.3413
We are dealing with half of the curve (area 0.5), so the area beyond z = -1 would be:
0.5 - 0.3413 = 0.1587 = 1587/10000
About 1587 persons out of 10000 are expected to have Hb of 10 or less.
Table 01
• Interpretation:
– For the Pulse Example (z = 4), only about 3 out of 100,000 would be Likely to
have a Pulse Rate of 80 or Higher.
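The three z score examples can be checked without a printed table. The sketch below is one possible way, using statistics.NormalDist for the area under the curve; the helper name z_score is illustrative.

```python
from statistics import NormalDist

standard_normal = NormalDist()

def z_score(x, mean, sd):
    """Distance of x from the mean in units of standard deviation."""
    return (x - mean) / sd

# Pulse example: mean 72, SD 2, observed value 80
z_pulse = z_score(80, 72, 2)
p_pulse_80_or_more = 1 - standard_normal.cdf(z_pulse)   # area to the right of z

# Hb example: mean 12, SD 2, value 16 or more
z_hb_16 = z_score(16, 12, 2)
p_hb_16_or_more = 1 - standard_normal.cdf(z_hb_16)      # area to the right of z

# Anaemia example: mean 12, SD 2, cut-off 10
z_hb_10 = z_score(10, 12, 2)
p_hb_10_or_less = standard_normal.cdf(z_hb_10)          # area to the left of z

print(z_pulse, z_hb_16, z_hb_10)
print(round(p_pulse_80_or_more, 5), round(p_hb_16_or_more, 4), round(p_hb_10_or_less, 4))
```

For a positive z the tail of interest lies to the right of z, and for a negative z it lies to the left, which is why the last example reads the cumulative area directly.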
Sampling
• Results of Repeated Samples from the Same Population may Differ from Each
Other to Some Extent; this Variation is called Sampling Error.
• This difference is because Results are Drawn from Samples not from
Entire Population.
• Factors Influencing Sampling Error:
– Sample Size.
– Natural Variability of Individuals
Hypothesis
• Any Statement about the Population in a Research or Study is
Called Hypothesis.
• Two types of Hypothesis:
– Null Hypothesis Ho Means No Relationship or Difference Between the Two Groups being
Compared. Researchers Generally Aim to Reject the Null Hypothesis.
• Null Hypothesis is either rejected or failed to be rejected during study.
• e.g. Comparison between Abdominal Hysterectomy (AH) and Vaginal Hysterectomy
(VH).
• No difference between the two procedures AH and VH.
– Alternative Hypothesis HA Means either:
• Difference in complications between AH and VH (Two tail) or Non-Directional.
• VH has fewer complications than AH (One tail) or Directional.
• AH has fewer complications than VH (One tail) or Directional.
Situation: Failure to Reject the Ho
• Two Possibilities:
– In the True Situation Ho should not have been Rejected, hence we are Right in
doing so and our Decision is Correct.
– In the True Situation Ho should have been Rejected, so we are Wrong in doing
so and our Decision is Incorrect (a Type II Error).
Standard Error
• If we take Random Samples of Size ƞ from the Population and repeat this over
and over again, we find that every Sample will have a Different Mean (x̄).
• The Distribution of the Sample Means is almost a Normal Distribution and
its Mean is the same as the Population Mean (μ).
• The Standard Deviation of the Sample Means is a Measure of Sampling Error;
its Formula is SE = σ / √ƞ and it is called the Standard Error of the Mean.
• Nearly 95% of the Sample Means will lie within the limits of Two Standard
Errors [μ ± 2 (σ / √ƞ)].
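The behaviour of sample means can be illustrated with a small simulation. This is only a sketch: the population mean of 12, the SD of 2, the sample size of 25 and the 2000 repeats are all hypothetical values chosen for illustration, not figures from the lecture.

```python
import random
from math import sqrt
from statistics import mean, pstdev

random.seed(1)                      # fixed seed so the run is repeatable
mu, sigma, n = 12.0, 2.0, 25        # hypothetical population mean, SD and sample size
repeats = 2000                      # hypothetical number of repeated samples

# draw many random samples and keep the mean of each one
sample_means = [
    mean([random.gauss(mu, sigma) for _ in range(n)])
    for _ in range(repeats)
]

theoretical_se = sigma / sqrt(n)            # standard error of the mean
observed_se = pstdev(sample_means)          # SD of the simulated sample means

low, high = mu - 2 * theoretical_se, mu + 2 * theoretical_se
share_within = sum(low <= m <= high for m in sample_means) / repeats

print(round(theoretical_se, 3), round(observed_se, 3), round(share_within, 3))
```

The SD of the simulated means should come out close to σ / √ƞ, and roughly 95% of the means should fall inside μ ± 2 SE.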
Confidence Interval
• We can Construct an Interval within which the Population Mean (μ)
may Lie, but we cannot Determine the Population Mean (μ) Exactly with
the help of a Sample.
• Still we cannot be 100% Sure; rather we can be about 95% Confident when Two Standard Errors are Used.
• Formula for the Confidence Interval of the Population Mean (μ) is:
μCI = x̄ ± 2 SE or x̄ ± 2 (SD/√ƞ)
Example: Sample Mean Hb of 100 Women is 12 gm with a SD of 2.
μCI = 12 ± 2 (2/√100) = 12 ± 4/10
= 12 ± 0.4 = 11.6 to 12.4
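The interval in the example can be wrapped in a small helper. The function name confidence_interval_2se is illustrative; it applies the two standard error rule used in the formula above.

```python
from math import sqrt

def confidence_interval_2se(sample_mean, sd, n):
    """Approximate 95% interval: sample mean plus or minus two standard errors."""
    se = sd / sqrt(n)
    return sample_mean - 2 * se, sample_mean + 2 * se

# Example: 100 women, sample mean 12 gm, SD 2
lower_limit, upper_limit = confidence_interval_2se(12, 2, 100)
print(round(lower_limit, 2), round(upper_limit, 2))

# The same SD with a smaller sample gives a wider interval
small_lower, small_upper = confidence_interval_2se(12, 2, 25)
print(round(small_lower, 2), round(small_upper, 2))
```

The second call shows the effect of sample size: with 25 instead of 100 observations the standard error doubles, and so does the width of the interval.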
Chance of Error or p Value
• A Chance of Error is always present in Research.
• Chances of Error in research are kept at the minimum (Lower than
5% or 0.05 in terms of Probability).
• Chance of Error (5%) that is Set prior to the Start of the Study is known as the
Level of Significance or Alpha (α).
• Definition of p Value.
– Probability of Getting the Observed Result (or a More Extreme One) By Chance
when Ho is True.
– If Ho is Rejected whenever p is Below α, the Probability of Falsely Rejecting
Ho (Type I Error) is α.
• Level of Significance or Alpha (α) is Set Before the Study, while the p Value is
Calculated From the Data and Compared with α.
Test of Significance
• Standard Error of Mean:
– To Answer How Accurate the Sample Mean is and What can be Said about the True
Mean of the Universe, we Calculate the Standard Error.
– We Take an Example:
• Random Sample of 25 Males, Age 20–24 Years, Mean Temp. is 98.14
deg. F, Standard Deviation is 0.6. Identify True Mean of Universe.
• SE (x̄) = s / √ƞ
= 0.6 / √25
= 0.12
Confidence Interval = 98.14 ± (2 x 0.12) = 97.90 to 98.38 deg. F
The Chances would be only 1 in 20 (P = 0.05) that the Population
Mean would be Outside these Limits.
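The temperature example uses 2 as the multiplier for the 95% limits. The sketch below repeats the calculation and, as a comparison that is not part of the lecture, also uses the more precise normal multiplier of about 1.96 obtained from NormalDist.inv_cdf.

```python
from math import sqrt
from statistics import NormalDist

sample_mean, s, n = 98.14, 0.6, 25

se = s / sqrt(n)                                   # 0.6 / 5
rounded_limits = (sample_mean - 2 * se, sample_mean + 2 * se)

z_95 = NormalDist().inv_cdf(0.975)                 # leaves 2.5% in each tail
precise_limits = (sample_mean - z_95 * se, sample_mean + z_95 * se)

print(round(se, 2))
print([round(v, 2) for v in rounded_limits])
print(round(z_95, 2), [round(v, 2) for v in precise_limits])
```

Rounding 1.96 up to 2 makes the interval only very slightly wider, which is why the simpler rule is commonly used for hand calculation.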
Standard Error of Proportion
• Proportion of Males in a Village is 52%. A Random Sample of 100 People was Taken and the Male
Proportion in the Sample was Found to be 40%. Draw a Conclusion from the Sample and Give the
Possible Range of Males in a Sample with 95% Confidence Limits.
– SE of Proportion = √( p x q / ƞ ) (Where “p” is the Percentage of Males and “q” is the Percentage of Females)
= √( 52 x 48 / 100 )
= √24.96
= 5 (approximately)
We Take Two Standard Errors on Either Side of 52 as our Criterion.
Value Range in a True Representative Sample = 52 + 2(5) = 62
= 52 - 2(5) = 42
The Observed Proportion of Males was only 40%, which is Outside these Confidence
Limits (42 to 62).
Relative Deviate = ( 52 - 40 ) / 5 = 2.4
The Relative Deviate Exceeds Two, therefore the Deviation is Significant.
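The proportion example can be followed step by step in Python. Percentages are used directly, as in the working above, and the variable names are illustrative.

```python
from math import sqrt

p = 52              # percentage of males in the village
q = 100 - p         # percentage of females
n = 100             # sample size
observed = 40       # percentage of males found in the sample

se_proportion = sqrt(p * q / n)                  # square root of 24.96, close to 5
lower_limit = p - 2 * se_proportion
upper_limit = p + 2 * se_proportion

relative_deviate = (p - observed) / se_proportion
is_significant = relative_deviate > 2            # criterion of two standard errors

print(round(se_proportion, 2), round(lower_limit, 1), round(upper_limit, 1))
print(round(relative_deviate, 1), is_significant)
```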
Standard Error of Differences Between Two Means
= √( 8.67 + 48.4 )
= √57.07
= 7.55
Standard Error of Differences Between Two Proportions
= √( (24.4 x 75.6) / 90 + (16.2 x 83.8) / 86 )
= 6.02
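Both standard errors of a difference can be checked with a short sketch. For the two means, the slide gives only the two terms under the square root (8.67 and 48.4); treating each as SD² / ƞ for one group is an assumption, since the group data did not survive. The function names are illustrative.

```python
from math import sqrt

def se_difference_means(term_1, term_2):
    """SE of the difference between two means, given each group's SD squared / n term."""
    return sqrt(term_1 + term_2)

def se_difference_proportions(p1, n1, p2, n2):
    """SE of the difference between two percentages p1 and p2 from samples n1 and n2."""
    return sqrt(p1 * (100 - p1) / n1 + p2 * (100 - p2) / n2)

se_means = se_difference_means(8.67, 48.4)                      # terms as given on the slide
se_proportions = se_difference_proportions(24.4, 90, 16.2, 86)  # values from the example

print(round(se_means, 2), round(se_proportions, 2))
```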
Chi Square Test
df = (2 - 1) x (2 - 1) = 1 x 1 = 1
Hence the Degree of Freedom of a 2 x 2 Table is 1.
The Table Value for One (1) Degree of Freedom at the 5% Level of Significance is 3.84.
We Compare the Calculated Chi Square Value 0.8 with the Cut-off Value 3.84.
Since 0.8 is Less than 3.84, the Difference is too Small to be Significant; there is no Evidence
that Vaccine A and B Differ in Effectiveness, and the Difference between the two Vaccines may be due to Chance.
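The comparison with the table value can be sketched in Python. The chi square value of 0.8 and the cut-off of 3.84 are taken from the text; the p value helper is an addition for illustration and works only for 1 degree of freedom, where the chi square variable is the square of a standard normal variable.

```python
from math import erf, sqrt

def degrees_of_freedom(rows, columns):
    """Degrees of freedom of a contingency table: (rows - 1) x (columns - 1)."""
    return (rows - 1) * (columns - 1)

def p_value_one_df(chi_square):
    """Upper tail area of the chi square distribution with 1 degree of freedom."""
    return 1 - erf(sqrt(chi_square / 2))

df = degrees_of_freedom(2, 2)
critical_value = 3.84            # table value for df = 1 at the 5% level
chi_square = 0.8                 # calculated value from the vaccine example

is_significant = chi_square > critical_value
p_value = p_value_one_df(chi_square)

print(df, is_significant, round(p_value, 2))
```

A p value well above 0.05 agrees with the table comparison: the calculated value does not reach the cut-off.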
Students t Test
• It is used to compare Two Means (Continuous or Numerical Data).
• Formula of t Test: t = ( x̄ - μ ) / ( SD / √ƞ )
Example:
A Population has a Mean Hb of μ = 12 gm.
Sample Size ƞ = 46
Sample Mean x̄ = 11.6 gm
Standard Deviation = 0.2 gm
Confidence Level 95%
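The t value for this example is not worked out on the slide. A minimal sketch of the calculation, using the formula above with an illustrative function name, is:

```python
from math import sqrt

def one_sample_t(sample_mean, population_mean, sd, n):
    """t = (sample mean - population mean) / (SD / square root of n)."""
    return (sample_mean - population_mean) / (sd / sqrt(n))

t_value = one_sample_t(11.6, 12, 0.2, 46)
df = 46 - 1                      # degrees of freedom for a one sample t test

print(round(t_value, 2), df)
```

The magnitude of t is far larger than the two-tailed 5% table value for 45 degrees of freedom (about 2.01), so under these figures the sample mean differs significantly from 12 gm.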
Continue
• Correlation Analysis:
– Correlation is a single statistic, or data point.
– Measures the Strength of the Relationship between two Study Variables.
• Regression Analysis:
– It is the entire equation with all of the data points that are
represented with a line.
– Derives a Prediction equation for estimating the value of one
variable given the value of the Second.
– Regression allows us to see how one affects the other.
• Data Representation:
– Pairs of Measurements made on the same study subject.
– Notation:
• Independent Variable on X axis.
• Dependent or Response Variables on Y axis
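The difference between the two analyses can be seen in a small sketch. The paired values below are hypothetical and serve only as an illustration; the function computes the Pearson correlation coefficient (a single number) and the least squares line y = intercept + slope × x (a prediction equation).

```python
from math import sqrt

def correlation_and_regression(xs, ys):
    """Return Pearson r, slope and intercept for paired measurements."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    r = sxy / sqrt(sxx * syy)              # strength of the linear relationship
    slope = sxy / sxx                      # change in y per unit change in x
    intercept = mean_y - slope * mean_x
    return r, slope, intercept

# hypothetical pairs: x is the independent variable, y the dependent variable
x_values = [1, 2, 3, 4, 5]
y_values = [2, 4, 5, 4, 5]

r, slope, intercept = correlation_and_regression(x_values, y_values)
predicted_y_at_6 = intercept + slope * 6   # prediction from the regression line

print(round(r, 3), round(slope, 2), round(intercept, 2), round(predicted_y_at_6, 2))
```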
Continue: Study Types
• In Epidemiological studies:
• In Experimental studies:
– Independent Variables (X) are Fixed by Investigators.