BIOSTATISTICS


ARITHMETIC MEAN

The arithmetic mean, often simply called the "mean," is a fundamental concept in
statistics and mathematics. It is a measure of central tendency used to find the
average or typical value of a set of numbers. To calculate the arithmetic mean, you
add up all the values in a data set and then divide by the number of values. Here's
the formula:

Arithmetic Mean (μ) = (Sum of all values) / (Number of values)

Let's illustrate this with a couple of examples:

Example 1 - Simple Arithmetic Mean:

Suppose you have a set of exam scores for a class of students:

Scores: 85, 90, 78, 92, 88

To find the mean, add up all the scores and then divide by the number of scores:

Mean (μ) = (85 + 90 + 78 + 92 + 88) / 5
Mean (μ) = 433 / 5
Mean (μ) = 86.6

So, the arithmetic mean of these exam scores is 86.6.
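
As a quick check of the calculation above, here is a minimal Python sketch. The variable names are illustrative only; `fmean` comes from Python's standard `statistics` module.

```python
from statistics import fmean

# Exam scores from Example 1.
scores = [85, 90, 78, 92, 88]

# Direct use of the formula: sum of all values / number of values.
mean_by_formula = sum(scores) / len(scores)

# The standard library computes the same quantity.
mean_by_library = fmean(scores)

print(mean_by_formula)
print(mean_by_library)
```

Both lines should print the same value, since `fmean` implements the same definition.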

Example 2 - Weighted Arithmetic Mean:

In some cases, you might want to calculate a weighted mean when different values
have different levels of importance or influence. For instance, let's say you have a set
of grades for a course, and the final grade is based on two components: midterm
exam (worth 40%) and final exam (worth 60%). The scores are:

Midterm Exam: 85
Final Exam: 78

To find the weighted mean, you multiply each value by its respective weight and then
add them up:

Weighted Mean (μ) = (0.4 * 85) + (0.6 * 78)
Weighted Mean (μ) = 34 + 46.8
Weighted Mean (μ) = 80.8

So, the weighted arithmetic mean for these grades is 80.8.
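
The weighted calculation can be sketched in Python as follows. The helper name `weighted_mean` is hypothetical, not a library function. It divides by the sum of the weights, which changes nothing here because 0.4 + 0.6 = 1, but keeps the function correct for weights that do not sum to 1.

```python
def weighted_mean(values, weights):
    """Return sum(weight * value) / sum(weight)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for x, w in zip(values, weights)) / total_weight


# Midterm (40%) and final exam (60%) scores from Example 2.
final_grade = weighted_mean([85, 78], [0.4, 0.6])
print(round(final_grade, 2))
```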

The arithmetic mean is a useful statistical tool for summarizing data and finding a
representative value.
GEOMETRIC MEAN
The geometric mean is another measure of central tendency used in statistics and
mathematics. Unlike the arithmetic mean, which is calculated by summing up all
values and dividing by the number of values, the geometric mean is calculated by
multiplying all values and taking the nth root of the result, where n is the number of
values. Here's the formula:

Geometric Mean (G) = (Product of all values)^(1/n)

Let's illustrate this with a couple of examples:

Example 1 - Simple Geometric Mean:

Suppose you have a set of numbers representing the annual growth rates of a
company's profits over the last five years:

Growth Rates: 5%, 8%, 10%, 12%, 15%

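To find the geometric mean, first convert each growth rate to a growth factor by
adding 1 (5% becomes 1.05), multiply the factors together, and then take the fifth
root (since there are five values):

Geometric Mean (G) = (1.05 * 1.08 * 1.10 * 1.12 * 1.15)^(1/5)

Geometric Mean (G) = (1.60665)^(1/5) ≈ 1.0995

So, the geometric mean of these growth factors is approximately 1.0995, which
corresponds to an average annual growth rate of about 9.95%.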

Example 2 - Average Rate of Return:

The geometric mean is also the appropriate average for investment returns, because
returns compound from one year to the next. For example, suppose you want to
calculate the average annual rate of return for an investment over three years:

Year 1: 10%
Year 2: 8%
Year 3: 12%

To find the geometric mean in this case, you first convert the percentages to growth
factors by adding 1 to their decimal equivalents (e.g., 10% becomes 0.10, giving a
factor of 1.10), then calculate the geometric mean as follows:

Geometric Mean (G) = (1.10 * 1.08 * 1.12)^(1/3)

Geometric Mean (G) = (1.33056)^(1/3) ≈ 1.0999

So, the geometric mean of these annual returns is approximately 1.0999, which
corresponds to a 9.99% average annual return.
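
Both geometric mean examples can be reproduced with a short Python sketch. The helper `average_growth_rate` is a hypothetical name introduced here for illustration. It relies on `statistics.geometric_mean`, which is available in Python 3.8 and later.

```python
from statistics import geometric_mean


def average_growth_rate(percent_rates):
    """Average compound growth rate (as a fraction) for rates given in percent."""
    # Convert each rate to a growth factor, e.g. 5% -> 1.05.
    factors = [1 + rate / 100 for rate in percent_rates]
    # The geometric mean of the factors is the constant factor that would
    # produce the same overall growth; subtract 1 to get back to a rate.
    return geometric_mean(factors) - 1


profit_growth = average_growth_rate([5, 8, 10, 12, 15])
investment_return = average_growth_rate([10, 8, 12])

print(f"{profit_growth:.2%}")
print(f"{investment_return:.2%}")
```

A useful sanity check is that the result always lies between the smallest and largest input rate, and that compounding the average rate over n years reproduces the total growth.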

The geometric mean is often used when dealing with quantities that are inherently
multiplicative in nature, such as growth rates, investment returns, and geometric
sequences. It has the property of giving less weight to extreme values compared to
the arithmetic mean, making it suitable for situations where relative proportions are
more important than absolute values.
HARMONIC MEAN
The harmonic mean is another measure of central tendency used in statistics and
mathematics. Unlike the arithmetic mean and geometric mean, which involve
addition and multiplication, respectively, the harmonic mean is based on reciprocals.
It is calculated by taking the reciprocal of the arithmetic mean of the reciprocals of a
set of values. Here's the formula:

Harmonic Mean (H) = n / [(1/value₁) + (1/value₂) + ... + (1/valueₙ)]

Where:

• n is the number of values.
• value₁, value₂, ..., valueₙ are the individual values in the dataset.

Let's illustrate this with an example:

Example - Harmonic Mean:

Suppose you have a dataset of three numbers representing the speeds of three
vehicles (in miles per hour) on a highway:

Vehicle 1 Speed: 60 mph
Vehicle 2 Speed: 70 mph
Vehicle 3 Speed: 80 mph

To find the harmonic mean of these speeds, you calculate the reciprocals of the
speeds, find their arithmetic mean, and then take the reciprocal of that result:

Harmonic Mean (H) = 3 / [(1/60) + (1/70) + (1/80)]

First, calculate the reciprocals:

1/60 ≈ 0.01667
1/70 ≈ 0.01429
1/80 = 0.01250

Now, find the arithmetic mean of these reciprocals:

Arithmetic Mean of Reciprocals = (0.01667 + 0.01429 + 0.01250) / 3

Arithmetic Mean of Reciprocals ≈ 0.01449

Finally, take the reciprocal of the arithmetic mean of the reciprocals to find the
harmonic mean:

Harmonic Mean (H) = 1 / 0.01449 ≈ 69.0 mph

So, the harmonic mean of the speeds of these three vehicles is approximately 69.0
mph.
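
The same calculation can be sketched in Python, once from the formula and once with `statistics.harmonic_mean` from the standard library. The variable names are illustrative.

```python
from statistics import harmonic_mean

# Vehicle speeds from the example, in miles per hour.
speeds_mph = [60, 70, 80]

# H = n / (1/value_1 + 1/value_2 + ... + 1/value_n)
h_by_formula = len(speeds_mph) / sum(1 / s for s in speeds_mph)

# The standard library implements the same definition.
h_by_library = harmonic_mean(speeds_mph)

print(round(h_by_formula, 2))
print(round(h_by_library, 2))
```

Like any mean of positive numbers, the result must fall between the smallest and largest value (here, between 60 and 80 mph), which is a quick way to catch arithmetic slips.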

The harmonic mean is particularly useful when dealing with rates or ratios, such as
speed, time, or efficiency, because it places more weight on smaller values in the
dataset. It is often used in situations where you want to find a balanced average that
reflects the "slower" values more significantly.
HISTOGRAM
A histogram is a graphical representation of a grouped frequency distribution with continuous
classes. It is an area diagram and can be defined as a set of rectangles whose bases lie along the
intervals between class boundaries and whose areas are proportional to the frequencies in the
corresponding classes. In such representations, all the rectangles are adjacent, since each base
covers the interval between class boundaries. When all classes have equal width, the heights of the
rectangles are proportional to the corresponding frequencies; when the class widths differ, the
heights are proportional to the corresponding frequency densities (frequency divided by class width).

Question: The following table gives the lifetimes of 400 neon lamps. Draw the histogram for
the data below.

Lifetime (in hours)    Number of lamps
300 – 400              14
400 – 500              56
500 – 600              60
600 – 700              86
700 – 800              74
800 – 900              62
900 – 1000             48

Solution:

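The histogram for the given data (figure not reproduced here) consists of seven adjacent
rectangles, one per class. Because every class is 100 hours wide, the heights of the rectangles
are proportional to the class frequencies (14, 56, 60, 86, 74, 62, and 48).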
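
As a rough stand-in for a drawn chart, the sketch below prints a text histogram of the neon lamp data using only the Python standard library. The bars are horizontal rather than vertical, and the function name `text_histogram` and the scale of one `#` per two lamps are choices made for this illustration.

```python
# Class intervals (lifetime in hours) and frequencies from the question.
classes = [
    ((300, 400), 14),
    ((400, 500), 56),
    ((500, 600), 60),
    ((600, 700), 86),
    ((700, 800), 74),
    ((800, 900), 62),
    ((900, 1000), 48),
]


def text_histogram(table, lamps_per_symbol=2):
    """Return one text line per class; bar length is proportional to frequency."""
    lines = []
    for (lower, upper), frequency in table:
        bar = "#" * round(frequency / lamps_per_symbol)
        lines.append(f"{lower:>4}-{upper:<4} | {bar} {frequency}")
    return lines


total_lamps = sum(frequency for _, frequency in classes)
for line in text_histogram(classes):
    print(line)
print("Total lamps:", total_lamps)
```

Because all classes have the same width, bar length alone is a fair comparison between classes. With unequal widths, the bar would need to represent frequency density instead.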


PIE CHART
A pie chart is a type of graph that represents data as a circle divided into slices. The slices show
the relative sizes of the data, making the chart a pictorial representation of data. A pie chart
requires a list of categories and a numerical value for each category. Here, the term “pie”
represents the whole, and the “slices” represent the parts of the whole.

Question: The percentages of various crops cultivated in a village of a particular district
are given in the following table.

Items                  Wheat    Pulses    Jowar    Groundnuts    Vegetables    Total
Percentage of crops    125/3    125/6     25/2     50/3          25/3          100

Represent this information using a pie-chart.

Solution:

The central angle = (component value/100) × 360°

Items         Percentage of crops    Central angle
Wheat         125/3                  [(125/3)/100] × 360° = 150°
Pulses        125/6                  [(125/6)/100] × 360° = 75°
Jowar         25/2                   [(25/2)/100] × 360° = 45°
Groundnuts    50/3                   [(50/3)/100] × 360° = 60°
Vegetables    25/3                   [(25/3)/100] × 360° = 30°
Total         100                    360°
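
The central angles can be checked with a short Python sketch. It uses `fractions.Fraction` so that percentages such as 125/3 are handled exactly; the names `crop_percentages` and `central_angle` are illustrative.

```python
from fractions import Fraction

# Percentage of each crop, kept as exact fractions.
crop_percentages = {
    "Wheat": Fraction(125, 3),
    "Pulses": Fraction(125, 6),
    "Jowar": Fraction(25, 2),
    "Groundnuts": Fraction(50, 3),
    "Vegetables": Fraction(25, 3),
}


def central_angle(percentage):
    """Central angle in degrees = (component value / 100) x 360."""
    return percentage / 100 * 360


angles = {crop: central_angle(p) for crop, p in crop_percentages.items()}

for crop, angle in angles.items():
    print(f"{crop}: {angle} degrees")
print("Total:", sum(angles.values()), "degrees")
```

Checking that the percentages sum to 100 and the angles sum to 360 is a simple way to confirm the table before drawing the chart.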


Mean:
• The mean, often called the "average," is the sum of all values in a
dataset divided by the number of values.
• Formula: Mean (μ) = (Sum of all values) / (Number of values)
• Example: If you have the dataset [2, 4, 6, 8, 10], the mean is (2 + 4 + 6 +
8 + 10) / 5 = 6.

Median:
• The median is the middle value in a dataset when the values are
arranged in ascending or descending order.
• If there's an even number of values, the median is the average of the
two middle values.
• Example: In the dataset [3, 5, 1, 6, 2], the values when sorted are [1, 2, 3,
5, 6], so the median is 3.
• Example with an even number of values: In the dataset [4, 2, 8, 6], the
values when sorted are [2, 4, 6, 8], so the median is (4 + 6) / 2 = 5.

Mode:
• The mode is the value that appears most frequently in a dataset.
• A dataset can have no mode (if all values are unique), one mode
(unimodal), or more than one mode (multimodal).
• Example: In the dataset [2, 4, 4, 6, 8, 8, 8], the mode is 8 because it
appears more frequently (three times) than any other value.

These measures help summarize and understand the central characteristics of a
dataset:

• The mean provides the "average" value and is sensitive to extreme values
(outliers).
• The median is the "middle" value and is less affected by extreme values,
making it a good measure for skewed datasets.
• The mode represents the most frequently occurring value(s) and is useful for
identifying common values in categorical or discrete datasets.
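
The mean, median, and mode examples above can be reproduced with Python's standard `statistics` module. This is a minimal sketch with illustrative variable names; `multimode` (Python 3.8+) returns a list, so it also covers datasets with more than one mode. The last dataset is an extra example added here to show the multimodal case.

```python
from statistics import mean, median, multimode

mean_value = mean([2, 4, 6, 8, 10])

# median() sorts the data internally, so the input order does not matter.
median_odd = median([3, 5, 1, 6, 2])
median_even = median([4, 2, 8, 6])

# multimode() returns every value that ties for the highest frequency.
single_mode = multimode([2, 4, 4, 6, 8, 8, 8])
two_modes = multimode([1, 1, 2, 2, 3])  # illustrative bimodal dataset

print(mean_value, median_odd, median_even, single_mode, two_modes)
```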

Skewness:
Skewness measures the asymmetry of the probability distribution or dataset. It tells
us whether the data is skewed to the left (negatively skewed), roughly symmetric (no
skew), or skewed to the right (positively skewed).

1. Negatively Skewed (Left-skewed): If the tail on the left side of the
distribution is longer or fatter than the right side, the data is negatively
skewed. In a negatively skewed distribution, the mean is typically less than the
median.
2. Symmetric: If the data distribution is roughly symmetric, the skewness is close
to zero, meaning the tail lengths on both sides are approximately equal. In a
perfectly symmetric distribution, the mean and median are the same.
3. Positively Skewed (Right-skewed): If the tail on the right side of the
distribution is longer or fatter than the left side, the data is positively skewed.
In a positively skewed distribution, the mean is typically greater than the
median.
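
To make the sign of skewness concrete, the sketch below computes a moment-based skewness coefficient, g1 = m3 / m2^(3/2), where m2 and m3 are the second and third central moments. This is one common definition among several, and it is an assumption of this example rather than something the text specifies. The function name and the small datasets are illustrative.

```python
from statistics import fmean, median


def sample_skewness(data):
    """Moment-based skewness: third central moment / (second central moment)^1.5."""
    n = len(data)
    center = fmean(data)
    m2 = sum((x - center) ** 2 for x in data) / n
    m3 = sum((x - center) ** 3 for x in data) / n
    if m2 == 0:
        raise ValueError("skewness is undefined when all values are equal")
    return m3 / m2 ** 1.5


right_skewed = [1, 2, 2, 3, 3, 3, 4, 20]      # one large value stretches the right tail
left_skewed = [-x for x in right_skewed]      # mirror image: long left tail
symmetric = [1, 2, 3, 4, 5]

for name, data in [("right", right_skewed), ("left", left_skewed), ("symmetric", symmetric)]:
    print(name, round(sample_skewness(data), 3), "mean:", fmean(data), "median:", median(data))
```

In the right-skewed dataset the mean exceeds the median, and in its mirror image the mean is below the median, matching the pattern described in the list above.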

Kurtosis:
Kurtosis measures the "tailedness" or the degree of outliers in the probability
distribution or dataset. It tells us whether the data has heavy tails (outliers are more
extreme) or light tails (outliers are less extreme) compared to a normal distribution.

1. Leptokurtic: If the data distribution has high kurtosis, it is said to be
leptokurtic. This means that the data has heavy tails, and there are more
extreme values (outliers) than in a normal distribution. Leptokurtic
distributions have positive excess kurtosis (kurtosis greater than 3).
2. Mesokurtic: A mesokurtic distribution has kurtosis similar to that of a normal
distribution. It has moderate tails and an excess kurtosis of around zero
(kurtosis of about 3).
3. Platykurtic: If the data distribution has low kurtosis, it is said to be platykurtic.
This means that the data has light tails, and there are fewer extreme values
(outliers) than in a normal distribution. Platykurtic distributions have negative
excess kurtosis (kurtosis less than 3).
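
The three cases can be illustrated with a small Python sketch that computes excess kurtosis as m4 / m2^2 - 3, where m2 and m4 are the second and fourth central moments. Subtracting 3 makes a normal distribution score zero. This moment-based definition is an assumption of the example, and the function name and datasets are illustrative.

```python
from statistics import fmean


def excess_kurtosis(data):
    """Moment-based excess kurtosis: m4 / m2^2 - 3 (zero for a normal distribution)."""
    n = len(data)
    center = fmean(data)
    m2 = sum((x - center) ** 2 for x in data) / n
    m4 = sum((x - center) ** 4 for x in data) / n
    if m2 == 0:
        raise ValueError("kurtosis is undefined when all values are equal")
    return m4 / m2 ** 2 - 3


# Evenly spread values: light tails, so the result is negative (platykurtic).
flat = list(range(1, 11))

# Mostly central values with two far outliers: heavy tails (leptokurtic).
heavy_tailed = [-10, -1, 0, 0, 0, 0, 0, 0, 1, 10]

print("flat:", round(excess_kurtosis(flat), 3))
print("heavy-tailed:", round(excess_kurtosis(heavy_tailed), 3))
```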

BIOSTATISTICS
Biostatistics is a branch of statistics that focuses on the application of statistical
methods and techniques to biological and health-related data. It plays a crucial role
in research, analysis, and decision-making in various fields of biology, medicine, and
public health. Biostatistics helps researchers and practitioners draw meaningful
conclusions from data, make informed decisions, and address questions related to
health and life sciences. Here's an explanation of biostatistics and its key
components:

1. Data Collection: In biostatistics, data is collected from various sources, such
as clinical trials, surveys, laboratory experiments, epidemiological studies, and
healthcare records. These data may include information on patient outcomes,
disease prevalence, treatment efficacy, genetic variations, and more.
2. Data Analysis: Biostatisticians use statistical methods to analyze and interpret
complex biological and health-related data. This includes techniques for data
cleaning, summarization, and hypothesis testing. They apply methods like
regression analysis, analysis of variance, and survival analysis to extract
meaningful insights from the data.
3. Experimental Design: Biostatisticians play a critical role in designing
experiments and clinical trials. They help determine sample sizes,
randomization procedures, and data collection protocols to ensure that the
results are statistically valid and clinically meaningful.
4. Descriptive Statistics: Biostatistics involves using descriptive statistics to
summarize data. Measures like means, medians, standard deviations, and
percentiles are used to describe central tendencies, variations, and the
distribution of data.
5. Inferential Statistics: Inferential statistics are used to draw conclusions about
a population based on a sample of data. Hypothesis testing, confidence
intervals, and p-values are commonly employed to make inferences about
parameters, treatment effects, and associations between variables.
6. Epidemiology: Epidemiology, a field closely related to biostatistics, focuses
on the study of disease patterns, causes, and risk factors in populations.
Biostatistics is essential in epidemiological research for data analysis and
modeling, especially in the study of disease outbreaks and public health
interventions.
7. Genomics and Genetics: Biostatistics plays a crucial role in genetic research,
helping scientists analyze and interpret genetic data, study inheritance
patterns, and identify genetic markers associated with diseases and traits.
8. Clinical Trials: Biostatisticians are integral to the design, conduct, and analysis
of clinical trials, ensuring that the results are statistically valid and reliable.
They help assess treatment efficacy, safety, and potential side effects.
9. Public Health Policy: Biostatistics informs public health policy decisions by
providing data-driven insights into disease prevalence, risk factors, and the
impact of public health interventions. This helps policymakers make evidence-
based decisions to protect and improve public health.
10. Bioinformatics: In the era of big data, biostatistics also intersects with
bioinformatics, which involves the analysis of large-scale biological and
genomic data using computational methods.
SANSKRITI UNIVERSITY
Mathura, Uttar Pradesh, India.

ASSIGNMENT: BIOSTATISTICS

SUHAIL KHAN
BSC. BIOTECHNOLOGY
1ST YEAR (SEMESTER 1)
