Biostatistics Book
Engineering statistics
Asst.Prof.dr.sufian M.salih
2018
PART 1
1.1. Introduction
Statistics (common usage): Interpretable figures obtained by monitoring the results of specific events, such as production, consumption, population, health, education, traffic, or the economy (its size, assets, distribution, and so on), are called statistics. This is the definition most frequently encountered, and the one usually meant in the visual and written media.
Statistics (scientific usage): Statistics is the art of describing data. It allows decisions about the future to be predicted from existing information. It is the scientific method that covers the planning and implementation of research, the collection of data, the summarizing and evaluation of the data obtained, and the analyses and forecasts based on them. This definition is of interest mainly to researchers, so universities and research institutions work largely in the sense of this definition.
Ratio (rate): The relationship between two values expressed in the same unit. Examples: income-expenditure ratio, birth-death rate, export-import ratio, etc.
Per thousand: If a value is very small, it is multiplied by 1000 and expressed per thousand.
Velocity: A rate that relates two variables measured in different units. Price = Money/Goods; Velocity = Distance/Time, etc.
Sample: A subset drawn by chance from the population that represents the members of the community in quality and quantity is called a sample. The basis of a sample is random selection. Research is usually carried out on samples because of shortages of manpower, funding, instruments, and equipment.
Parametric: Tests and estimates that use the mean, variance, and ratio.
Non-parametric: Tests and estimates made using ranks and signs.
Variable: A characteristic whose values are obtained as data through observation, counting, measurement, and evaluation. Variables are generally denoted by the last letters of the alphabet, such as x, y, z, or by abbreviated words. Variables are divided into two types.
Example
Health Condition : Sick – Healthy
Gender : Female – Male
Quality : First Class, Second Class, Third Class
Number of pens in pocket : 5, 7, 12
Number of harmful Petrie : 50, 75, 67
Number of children in the family : 2, 3, 4
2. Ordinal (rank) scale: Ranking usually follows grouping. Objects are put in order according to the degree to which they have a particular property. Among objects with similar characteristics, the most outstanding is ranked 1st, followed by 2nd, 3rd, 4th, down to the least outstanding. Once the objects are ranked, the size of the difference between them loses its importance; what matters is which has more or less, or which is smaller or bigger. The data are classified in the form of rankings. Example: product quality as Quality I, Quality II, and so on.
3. Interval scale: An interval scale indicates the amount of the difference between objects, so addition and subtraction can be carried out. Statistical procedures of many kinds can be applied. The data are based on an arbitrary starting point, or on the interval between two fixed points divided into equal parts (such as the Celsius and Fahrenheit thermometers for temperature measurement). Thermometers are examples of interval scales.
4. Ratio scale: This is the highest level of measurement. Its only difference from the interval scale is that it has a true starting point (zero point) indicating absolute absence. Data on a scale with a true zero point are expressed as exact quantities, and the measure used is a true ratio measure. Variables measured on this kind of scale are quantitative.
The ratio scale is the most common type of scale; all arithmetic operations and statistical techniques can be applied to data obtained on it. Examples: length, area, time, weight, volume, density measurements, etc.
Research carried out within a planned framework should observe the following points.
The sample size (number of replications) in the study should be sufficient.
Impartiality and objectivity should be maintained at all stages of the research.
Tools and equipment should be appropriate for the research and should measure accurately.
The research staff should be trained, impartial, and know what to do.
Q3. Enter the variable type and measurement scale for the following variables.
Variable | Variable Type | Scale Type
Health condition | |
Success condition | |
Hematocrit value | |
Height | |
Blood pressure values | |
Body temperature | |
Pulse value | |
PART 2
Descriptive statistics deals with summarizing and interpreting the raw data obtained from research by appropriate methods, so that the data become clear and easy to understand. These methods can be divided into two main groups: tables and figures (graphs).
2.1. Tables
a) Special tables
b) Frequency tables
Researchers can use appropriate special tables to present their research results. These tables are generally built on specific statistics such as the mean and standard deviation. However, some features of the data cannot be expressed in such tables; frequency tables and graphs, which give information about the frequency distribution of the data, are then more appropriate. Frequency is the number of times a value is repeated.
The number of classes can be determined according to Sturges' rule:
SS = 1 + 3.32*log10(n)
where n is the number of data and SS is the number of classes. If the result is not an integer, it is rounded to a whole number.
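The rule above can be sketched in a few lines of Python. This is only an illustration: the function name `sturges_classes` is made up for this example, and rounding to the nearest integer is an assumption about how the rounding is meant.

```python
import math


def sturges_classes(n: int) -> int:
    """Number of classes from SS = 1 + 3.32*log10(n), rounded to the nearest integer."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return round(1 + 3.32 * math.log10(n))


# 70 observations, as in the children's height example
print(sturges_classes(70))
```

For n = 70 the unrounded value is about 7.13, so the data would be grouped into 7 classes.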
Sample 1: The heights of 70 children were measured and found as below. Summarize them in a frequency table.
Raw data are difficult to interpret, and interpretation becomes almost impossible as the number of data grows. Sorting the data gives some opportunity for interpretation with little effort: the shortest and tallest children emerge, and repeated values are immediately visible. Once the data are placed in a frequency table, better interpretations can be made.
Height is given as an integer, so the unit of the data (measurement accuracy) is 1. The class boundaries are found by subtracting half of the unit (0.5) from the lower class limit and adding it to the upper class limit.
The data are scanned against the class limits, and a tally mark is written in the row of the class into which each value falls. The number of tally marks in a class is entered as the frequency of that class. The shape of the distribution of the data is determined by this tallying.
and then by dividing them by 70 and multiplying by 100, the incremental (cumulative) relative frequencies are found; for the first class, (5/70)*100 = 7.1.
When this table is examined, it can be seen immediately that the data are distributed almost symmetrically and are concentrated around a mean of about 102 cm. The number of data in particular intervals can also be interpreted.
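As a rough illustration of these columns, the sketch below builds the incremental (cumulative) and relative frequencies in Python. It uses the class frequencies 5, 8, 15, 19, 13, 8, 2 that the text lists for this example in Part 3; the variable names are chosen here for illustration only.

```python
from itertools import accumulate

# Class frequencies of the 70-children height example (as listed in Part 3).
frequencies = [5, 8, 15, 19, 13, 8, 2]
n = sum(frequencies)

# Incremental (cumulative) frequencies: running total of the class frequencies.
cumulative = list(accumulate(frequencies))

# Relative and incremental relative frequencies, in percent.
relative = [100 * f / n for f in frequencies]
cumulative_relative = [100 * c / n for c in cumulative]

for f, c, rf, crf in zip(frequencies, cumulative, relative, cumulative_relative):
    print(f"{f:3d} {c:3d} {rf:6.1f} {crf:6.1f}")
```

The first row reproduces the (5/70)*100 = 7.1 quoted above, and the last incremental relative frequency is 100.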
e) Other graphs: Data obtained from research can be made more concise and understandable by drawing column, line, or circle graphs, etc. Presenting the results visually lets the reader grasp and interpret them faster. A graph suited to the data should be selected.
Column graph: Column charts are appropriate for presenting more than one property over the same period.
Line graph: Line graphs are used to examine how a feature changes over time. Growth curves are drawn as line graphs; growth generally increases up to a certain time and then levels off.
Circle graph: A circle (pie) chart is more suitable for expressing the parts of a whole. An example of each of these three most common graphs is presented below.
Graph 2. Family planning use (User / Non User) by education level (Illiterate, Primary School, Secondary School, High School, College).
[Circle graph. Slices: Eye, Internal Medicine, Orthopedics, Child, Psychiatry. Percentages shown: 9%, 13%, 28%, 31%, 19%.]
40 5
50 25
60 15
70 18
80 17
Q4. The systolic blood pressure values of 30 people are given below. Summarize these data in an appropriate frequency table. Xi: {56 60 68 90 97 88 80 78 76 69
80 77 59 60 65 86 90 96 64 77 70 75 68 80 95 41 79 91 63 74}
Q5. In a frequency table the total frequency is 200. The frequency of the 4th class is 12 and its incremental (cumulative) frequency is 40; the incremental frequency of the 5th class is 64. Find the incremental relative frequency (IRF) and relative frequency (RF) of the 4th and 5th classes.
Q6. The numbers of patients in the clinics of a hospital are given. Select the most appropriate graph and draw the distribution of the total number of patients among the hospital clinics.
PART 3
$\bar{x} = \frac{x_1 + x_2 + \dots + x_n}{n} = \frac{\sum_{i=1}^{n} x_i}{n}$
Example: The birth weights of five babies are given below. Find the arithmetic mean.
$\bar{x} = \frac{3 + 2 + 4 + 3.5 + 2.5}{5} = 3$ kg
The sum of the deviations from the mean is zero, and the sum of the squared deviations from the mean is a minimum.
$\sum_{i=1}^{n} (x_i - \bar{x}) = 0$: (3-3)+(2-3)+(4-3)+(3.5-3)+(2.5-3) = 0
and
$\sum_{i=1}^{n} (x_i - \bar{x})^2 < \sum_{i=1}^{n} (x_i - A)^2$
where A is any value different from the mean.
Whichever value other than the mean (3) is chosen for A, the sum of squared deviations about A will always be larger than 2.5, the sum of squared deviations about the mean.
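The two properties can be checked numerically for the five birth weights. The sketch below is illustrative only; the helper `sum_sq_dev` and the trial values of A are chosen here and are not part of the original notes.

```python
weights = [3, 2, 4, 3.5, 2.5]  # birth weights in kg
mean = sum(weights) / len(weights)


def sum_sq_dev(data, center):
    """Sum of squared deviations of the data about a chosen center."""
    return sum((x - center) ** 2 for x in data)


sum_dev = sum(x - mean for x in weights)  # deviations about the mean cancel out
ss_mean = sum_sq_dev(weights, mean)       # minimum possible sum of squares

trial_values = [2.5, 2.9, 3.1, 4.0]       # hypothetical values of A, all different from the mean
ss_trials = {a: sum_sq_dev(weights, a) for a in trial_values}

print(mean, sum_dev, ss_mean)
print(ss_trials)
```

Every trial value of A gives a sum of squares above 2.5, in line with the statement above.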
If the data are multiplied by a constant A, the mean is multiplied by A as well.
$y_i = x_i \cdot A$; $\bar{y} = \bar{x} \cdot A$
For x: {3, 2, 4, 3.5, 2.5} and A = 10, the values $y_i = x_i \cdot 10$ are y: {30, 20, 40, 35, 25}
$\bar{y} = 3 \cdot 10 = 30$
If the data are divided by a constant A, the mean is divided by A as well.
$y_i = x_i / A$; $\bar{y} = \bar{x} / A$
For x: {3, 2, 4, 3.5, 2.5} and A = 10, the values $y_i = x_i / 10$ are y: {0.3, 0.2, 0.4, 0.35, 0.25}
$\bar{y} = 3 / 10 = 0.3$
When the data consist of large values, these properties can be used to simplify the calculation of the mean.
The weighted mean is estimated as follows: $\bar{X}_T = \frac{\sum_i t_i x_i}{\sum_i t_i}$
For the mean of a frequency table: $\bar{X}_{FT} = \frac{\sum_i f_i x_i}{\sum_i f_i} = \frac{\sum_i f_i x_i}{n}$, where $n = \sum_i f_i$ is the total number of observations.
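Sample 2: The mean of the frequency table from the example in Part 2:
Frequency (f) | Class value SD (x) | f*x
5 | 91.5 | 457.5
8 | 95.5 | 764.0
15 | 99.5 | 1492.5
19 | 103.5 | 1966.5
13 | 107.5 | 1397.5
8 | 111.5 | 892.0
2 | 115.5 | 231.0
Total: 70 | | 7201
$\bar{X}_{FT} = \frac{7201}{70} = 102.87$ (used as 102.8 in the calculations that follow)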
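The table calculation can be reproduced with a short Python sketch. The frequencies and class values are those of the table; the variable names are chosen for this illustration.

```python
frequencies = [5, 8, 15, 19, 13, 8, 2]
class_values = [91.5, 95.5, 99.5, 103.5, 107.5, 111.5, 115.5]

# f*x column of the table
fx = [f * x for f, x in zip(frequencies, class_values)]

total_f = sum(frequencies)
total_fx = sum(fx)
mean_ft = total_fx / total_f  # mean of the frequency table

print(total_f, total_fx, round(mean_ft, 2))
```

The unrounded mean is about 102.87 cm.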
$GM = \sqrt[n]{x_1 \cdot x_2 \cdots x_n} = \sqrt[n]{\prod x_i}$
According to this, growth is described by the formula $A = B(1+r)^t$, where B is the starting amount, A is the amount after a specific period of time, r is the rate of increase per unit of time, and t is the number of time units.
$3000 = 100(1+r)^5$; $r = 0.97$, so the rate of increase per hour is 97%.
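Solving $A = B(1+r)^t$ for r gives $r = (A/B)^{1/t} - 1$. A minimal sketch of that step, with a function name (`growth_rate`) chosen here for illustration:

```python
def growth_rate(start: float, end: float, periods: float) -> float:
    """Geometric growth rate r per period from end = start * (1 + r) ** periods."""
    return (end / start) ** (1 / periods) - 1


r = growth_rate(100, 3000, 5)
print(round(r, 2))

# Check: applying the rate for 5 periods recovers the final amount.
print(round(100 * (1 + r) ** 5))
```

The same function could be applied to the growth-rate questions in Section 3.3, assuming geometric growth.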
Sample 1: The sample size (n) is an odd number. What is the median of the data of variable x?
xi: {60, 62, 58, 50, 100, 58, 60, 58, 58};
Ordered values; xi: {50, 58, 58, 58, 58, 60, 60, 62, 100}
When the data are examined, the values 50 and 100 stand out as abnormal among data that are mostly close to 58-60. Using the mean can be misleading in this case. The median, however, is not affected by such anomalous observations. The median is the value in the center, at position (9 + 1) / 2 = 5, which is 58.
Sample 2: Let us add one more value (65) to the data and determine the median again. In this case the ordered data series
xi: {50, 58, 58, 58, 58, 60, 60, 62, 65, 100}
has 10 values. The 10/2 = 5th value is 58 and the 10/2 + 1 = 6th value is 60. The median is the mean of these two values: (58+60)/2 = 59.
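Both samples can be checked with Python's standard `statistics` module, which applies the same rule (middle value for odd n, mean of the two middle values for even n). The variable names are illustrative.

```python
from statistics import mean, median

sample_odd = [60, 62, 58, 50, 100, 58, 60, 58, 58]  # n = 9
sample_even = sample_odd + [65]                      # n = 10

median_odd = median(sample_odd)
median_even = median(sample_even)

# The mean is pulled upward by the extreme value 100; the median is not.
print(round(mean(sample_odd), 2), median_odd)
print(median_even)
```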
For classified data taken from frequency tables, the median is found in a similar sense, but it is estimated by a formula. The following formula is used to calculate the median of a frequency table:
$Med = L + \frac{N/2 - F_b}{F_{med}} \cdot c$
Here L is the real lower limit of the median class, N = ∑fi is the total number of observations, Fb is the total frequency of the classes before the median class, Fmed is the frequency of the median class, and c is the class interval.
The median class is the first class whose cumulative frequency reaches half of the total frequency. Let us examine the frequency table example from the application section of Part 2. The columns necessary to calculate the median are given below. Half of the total frequency is 35, and the first class whose cumulative frequency includes this value is the 4th class.
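A sketch of the formula applied to the height table follows. The real lower limits are an assumption derived from the class values (class value minus half of the class interval of 4); the value 101.5 for the 4th class agrees with the mode calculation in the text. The function name is illustrative.

```python
frequencies = [5, 8, 15, 19, 13, 8, 2]
# Assumed real lower limits: class values 91.5, 95.5, ... minus half the interval.
lower_limits = [89.5, 93.5, 97.5, 101.5, 105.5, 109.5, 113.5]
class_interval = 4


def grouped_median(freqs, lowers, width):
    """Med = L + (N/2 - Fb) / Fmed * c for a frequency table."""
    half = sum(freqs) / 2
    cum_before = 0  # Fb: total frequency of the classes before the current one
    for f, lower in zip(freqs, lowers):
        if cum_before + f >= half:  # first class whose cumulative frequency reaches N/2
            return lower + (half - cum_before) / f * width
        cum_before += f
    raise ValueError("empty frequency table")


med = grouped_median(frequencies, lower_limits, class_interval)
print(round(med, 2))
```

With L = 101.5, Fb = 28, Fmed = 19 and c = 4, this gives a median of roughly 102.97 cm.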
Mode / top value: The most repeated value in the data series is called the mode.
In the median example, Xi: {50, 58, 58, 58, 58, 60, 60, 62, 65, 100}, the mode of the series is 58, because it is the most repeated value. To calculate the mode from a frequency table, a formula is used:
$Mod = L + \frac{d_1}{d_1 + d_2} \cdot c$
Here L is the real lower limit of the modal class, d1 is the difference between the frequency of the modal class and that of the previous class, d2 is the difference between the frequency of the modal class and that of the following class, and c is the class interval.
$Mod = 101.5 + \frac{19 - 15}{(19 - 15) + (19 - 13)} \cdot 4 = 103.1$
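The same calculation can be sketched in Python. As before, the lower limits are an assumption derived from the class values, and the function name is made up for this example; the sketch also assumes the modal class is neither the first nor the last class.

```python
from statistics import mode

frequencies = [5, 8, 15, 19, 13, 8, 2]
lower_limits = [89.5, 93.5, 97.5, 101.5, 105.5, 109.5, 113.5]  # assumed real lower limits
class_interval = 4


def grouped_mode(freqs, lowers, width):
    """Mod = L + d1 / (d1 + d2) * c, for a modal class with neighbours on both sides."""
    k = freqs.index(max(freqs))    # position of the modal class
    d1 = freqs[k] - freqs[k - 1]   # difference from the previous class
    d2 = freqs[k] - freqs[k + 1]   # difference from the following class
    return lowers[k] + d1 / (d1 + d2) * width


table_mode = grouped_mode(frequencies, lower_limits, class_interval)
series_mode = mode([50, 58, 58, 58, 58, 60, 60, 62, 65, 100])

print(round(table_mode, 1), series_mode)
```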
3.2.2. Variance
The variance indicates how far the data deviate from the mean; it is a measure of the variability in the data. The closer the data are to each other, the smaller the variance, because the deviations from the mean are smaller. The variance is the sum of the squared deviations from the mean divided by the degrees of freedom. The following formulas are used to calculate the variance:
$\sigma^2 = \frac{\sum (x_i - \mu)^2}{N}$; $S^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$ or $S^2 = \frac{\sum x_i^2 - (\sum x_i)^2 / n}{n - 1}$
In these formulas, N is the number of individuals in the population, n is the number of individuals in the sample, µ is the population mean, and $\bar{x}$ is the sample mean.
Studies are usually carried out on samples, and for that reason only the sample variance is used in all examples. As the formula shows, the unit of the variance is the square of the unit of the data: when the values are squared, their units are squared as well. Because squared units (g², kg²) are not meaningful, no unit is written with the variance. The denominator of the sample variance is called the degrees of freedom. For a sample the degrees of freedom are n-1.
Sample 1: The birth weights of five babies are given below. Calculate the variance.
For the frequency table, first the mean has to be calculated: $\bar{X}_{FT} = \frac{7201}{70} = 102.8$
Then the variance is calculated with the formula given below:
$S^2 = \frac{\sum f_i (x_i - \bar{x})^2}{n - 1}$
$S^2 = \frac{5(91.5 - 102.8)^2 + 8(95.5 - 102.8)^2 + \dots + 2(115.5 - 102.8)^2}{70 - 1} = \frac{2452.7}{69} = 35.54$
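Both variances can be recomputed with a short sketch. It assumes the five birth weights are the same values used in the arithmetic mean example (3, 2, 4, 3.5, 2.5) and uses the unrounded table mean, so the last digits can differ slightly from the hand calculation with 102.8.

```python
from statistics import variance

# Sample 1: five birth weights in kg (assumed to be the values from the mean example)
weights = [3, 2, 4, 3.5, 2.5]
var_weights = variance(weights)  # sample variance, denominator n - 1

# Frequency table of the 70 heights
frequencies = [5, 8, 15, 19, 13, 8, 2]
class_values = [91.5, 95.5, 99.5, 103.5, 107.5, 111.5, 115.5]
n = sum(frequencies)
mean_ft = sum(f * x for f, x in zip(frequencies, class_values)) / n
var_table = sum(f * (x - mean_ft) ** 2 for f, x in zip(frequencies, class_values)) / (n - 1)

print(var_weights, round(var_weights ** 0.5, 2))
print(round(var_table, 2), round(var_table ** 0.5, 2))
```

The square roots, about 0.79 kg and 5.96 cm, are the standard deviations used in the standard error and coefficient of variation calculations.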
Properties of variance
The variance has properties similar to those of the mean.
If a constant A is added to or subtracted from the values, the variance does not change.
$y_i = x_i \pm A$; $S_y^2 = S_x^2$
If the values are multiplied by a constant A, the variance is multiplied by the square of A.
$y_i = x_i \cdot A$; $S_y^2 = A^2 \cdot S_x^2$
If the values are divided by a constant A, the variance is divided by the square of A.
$y_i = \frac{x_i}{A}$; $S_y^2 = \frac{S_x^2}{A^2}$
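$S_{\bar{x}} = \sqrt{\frac{S^2}{n}} = \frac{S}{\sqrt{n}}$
Sample 1's standard error: $S_{\bar{x}} = \frac{0.79}{\sqrt{5}} = 0.35$ kg.
Sample 2's standard error: $S_{\bar{x}} = \frac{5.96}{\sqrt{70}} = 0.71$ cm.
$VK = \frac{S}{\bar{x}} \cdot 100$
Sample 1's coefficient of variation: $VK = \frac{0.79}{3} \cdot 100 = 26\%$.
Sample 2's coefficient of variation: $VK = \frac{5.96}{102.8} \cdot 100 = 6\%$.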
When the variability of two populations with different means, or of characteristics measured in different fields, is compared, the variance or standard deviation can be misleading. In such cases the coefficient of variation should be used.
For example, the following statistics are given for mothers and babies. Because the standard deviation of maternal weight is greater than that of the babies' weight, the variation among mothers appears to be larger. When the coefficients of variation are examined, however, it is seen that the real variation is higher in birth weight: birth weight deviates from its mean by 29%, whereas maternal weight deviates by only 15%.
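The standard error and coefficient of variation for Sample 1 and Sample 2 can be reproduced as follows. This is a small sketch; the function names are chosen here and the inputs are the rounded values quoted in the text.

```python
import math


def standard_error(std_dev: float, n: int) -> float:
    """Standard error of the mean: S / sqrt(n)."""
    return std_dev / math.sqrt(n)


def coefficient_of_variation(std_dev: float, mean: float) -> float:
    """Coefficient of variation (VK) in percent: S / mean * 100."""
    return std_dev / mean * 100


se1 = standard_error(0.79, 5)              # birth weights, kg
se2 = standard_error(5.96, 70)             # heights, cm
vk1 = coefficient_of_variation(0.79, 3)
vk2 = coefficient_of_variation(5.96, 102.8)

print(round(se1, 2), round(se2, 2))
print(round(vk1), round(vk2))
```

Because VK is unit-free, the 26% for birth weight and the 6% for height can be compared directly, which is not true of the two standard deviations.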
$\alpha_3 = \frac{\sum (x - \mu)^3 / n}{\sigma^3}$
Kurtosis coefficient: Kurtosis gives information about the peakedness of the distribution of the data. It is denoted by $\alpha_4$ and estimated by the following formula. The coefficient is 0 for an exactly normal distribution, which is neither peaked nor flat; a positive (+) value means that the distribution is peaked, and a negative (-) value means that the distribution is flattened.
$\alpha_4 = \frac{\sum (x - \mu)^4 / n}{\sigma^4} - 3$
3.3. Sample Questions
Q1. What is the relation between the mean, mode and median?
Q2. In winter, 15 pregnant women per day come to a clinic with hyperemesis gravidarum; 4 days later the number has increased to 150 per day. What is the rate of spread of hyperemesis gravidarum?
Q3. Find the mode and median of the breathing rates given below.
Xi: (12, 12, 12, 14, 14, 13, 13, 16, 16, 20, 22, 24, 24, 24, 18, 18, 18, 18, 18, 18,
18)
Q4. The fetal heart rate values are given below. Find the measures of spread of the series (DG, S², S and VK).
Xi: (120, 126, 134, 136, 140, 144, 148, 150, 154, 158, 162)
Q5. The body temperatures of 8 pregnant women are given below. Find the mean of the series.
Xi: (36.2, 36.5, 36.7, 36.8, 36.9, 36.3, 37, 37.2)
Number of children 6 8 14 7 5
Head circumference (cm) 31 32 33 34 35
Q7. Each year 328 patients with hepatitis are admitted to a regional state hospital. As a result of studies on this disease carried out over 4 years, the number of patients decreased to 30. Find the rate of decrease in cases and interpret it.
Q8. The course credits and grades of a first-year student in a school of midwifery are given for the health courses taken. Calculate the student's GPA.
Course name | Anatomy | Genetics | Microbiology | Biochemistry | Psychology
Credit | 4 | 3 | 2 | 4 | 2
Mark | 94 | 88 | 72 | 96 | 68
Q9. What are the uses of the coefficient of variation?
Q10. Pregnant women attending health centers have a mean height of 162 cm with a variance of 100, and a mean weight of 70 kg with a variance of 49. By calculating the coefficients of variation of height and weight, determine which has the greater variability.
Q11. During a period of population growth, 40 patients were coming to the delivery room; 2 days later the number of patients had increased to 600. Calculate the daily growth rate.
Q12. A colony count is made in a bacterial culture. The count on the first day was 1000 and the count on the fourth day was 8000. Assuming the increase is geometric, calculate the daily growth rate.
Q13. In April, 10 children with diarrhea were admitted to a clinic. Two months later, in June, the number had increased to 50. Calculate the monthly growth rate of diarrheal disease.
Q14. Find the mode and median of the heart rates of 15 people given below.
Xi: {70, 74, 82, 80, 74, 80, 88, 92, 80, 96, 74, 80, 76, 88, 78}
Q15. During a period when bird flu was spreading, 40 patients were admitted to a clinic; 2 days later the number had increased to 400. Calculate the daily rate of increase in the spread of bird flu.
PART 4
4.1. Probability
Probability is one of the most basic subjects in statistics, because every estimate and every decision is expressed at a certain probability level; that is, a certain error or confidence level is involved. Therefore some basic concepts and rules of probability should be known. Only simple probabilities are discussed here, because probability is a subject in its own right.
The probability of an event is the ratio of the number of times the event occurs to the total number of events. It is calculated by the following formula:
$P(x) = \frac{x}{n}$
Here n is the total number of events and x is the number of desired events.
The probability of an event lies in the range 0≤P(x)≤1. P(x)=0 means that the event is impossible and P(x)=1 means that the event is certain.
Sample 1: In a region, 30 of 75 newborns were girls. The proportion of girls in this population is accordingly $P(x) = \frac{30}{75} = 0.40$.
Sample 2: Suppose that, of the patients coming to a health center, 20 are flu patients, 15 are internal medicine patients, and 10 are infectious diseases patients.
The probability that a patient chosen from all of these is an internal medicine patient is $P(x) = \frac{15}{45} = 0.33$.
Sometimes the event is more complicated. For example, a patient waiting for treatment may be either an internal medicine patient or an infectious diseases patient. The probability is then calculated as:
$P(x) = \frac{15}{45} + \frac{10}{45} = 0.33 + 0.22 = 0.55$
As shown, the events in question exclude each other. In such cases the probability that one or the other occurs is the sum of their separate probabilities; this is defined as the addition rule.
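The health center example can be written out with exact fractions. This sketch assumes, as the addition rule requires, that each patient belongs to exactly one group; the dictionary keys are labels chosen here.

```python
from fractions import Fraction

patients = {"flu": 20, "internal medicine": 15, "infectious diseases": 10}
total = sum(patients.values())

# P(x) = x / n for each group
prob = {group: Fraction(count, total) for group, count in patients.items()}

# Addition rule for mutually exclusive events
p_either = prob["internal medicine"] + prob["infectious diseases"]

print(float(prob["internal medicine"]))
print(p_either, float(p_either))
```

The exact sum is 25/45 ≈ 0.556; the 0.55 in the text comes from adding the two probabilities after each has been rounded to two decimals.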
For example, suppose the probabilities of a successful operation for two patients operated on at the same time are 0.70 and 0.80. The probability that both operations are successful is
P(x)=0.70*0.80=0.56
Permutations (order is important): ${}_nP_x = \frac{n!}{(n-x)!}$
Combinations (order is not important): ${}_nC_x = \frac{n!}{(n-x)!\,x!}$
Here n is the total number of events and x is the number of events selected.
Sample: A class president and a vice president are to be chosen from 3 students A, B, C. The permutations formed by taking two at a time are AB, AC, BC, BA, CA, CB, that is 6, because the order matters. As combinations, AB is not different from BA, AC from CA, or BC from CB. Thus there are three combinations: AB, AC, BC.
Number of permutations: ${}_3P_2 = \frac{3!}{(3-2)!} = 6$
Number of combinations: ${}_3C_2 = \frac{3!}{(3-2)!\,2!} = 3$
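The counts can be confirmed by enumerating the pairs with Python's standard library (`math.perm` and `math.comb` require Python 3.8 or later). The variable names are illustrative.

```python
import math
from itertools import combinations, permutations

students = ["A", "B", "C"]

# Ordered selections of two students (president, vice president)
ordered_pairs = ["".join(p) for p in permutations(students, 2)]
# Unordered selections of two students
unordered_pairs = ["".join(c) for c in combinations(students, 2)]

print(ordered_pairs, math.perm(3, 2))
print(unordered_pairs, math.comb(3, 2))
```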
obtaining. Data obtained on interval and ratio scales for quantitative characteristics show a continuous distribution.
Each distribution has a function. This function is called the probability density function; it is usually written f(x) for continuous distributions and P(x) for discrete distributions.
It is $f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} e^{-\frac{1}{2} \frac{(x-\mu)^2}{\sigma^2}}$. However, to make estimates with this formula the integral must be evaluated in each case. This is not easy, and in terms of time it would not be possible at all during an exam. Therefore the standard normal variable (z), whose distribution is symmetric, is used. The mean of the standard normal values is 0 ($\mu_z = 0$), the variance is 1 ($\sigma_z^2 = 1$), and this is summarized as z ~ N(0,1). When $z = \frac{x-\mu}{\sigma}$ is substituted into f(x), the normal distribution function changes into the standard normal distribution function $f(z) = \frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2} z^2}$. The values of the integral are given in the appendix that contains the Z tables.
Example: The mean height of 7-year-old children was determined to be 130 cm and the standard deviation 8 cm. Answer the questions below accordingly.
a) What proportion of these children are taller than 130 cm?
b) What proportion of these children are taller than 135 cm?
c) What proportion of these children are shorter than 120 cm?
d) What proportion of these children are between 120 cm and 135 cm?
Solution:
a) Because the distribution is symmetric, the proportions below and above the mean are equal: P(z<0) = P(z>0) = 0.50, that is 50%.
b) $P(x > 135) = P\left(z > \frac{135 - 130}{8}\right) = P(z > 0.63) = 0.2643$; 26.43%.
c) $P(x < 120) = P\left(z < \frac{120 - 130}{8}\right) = P(z < -1.25) = 0.1056$; 10.56%.
Note that negative values are not included in the z table, so the symmetry has to be used: P(z < -1.25) = P(z > 1.25) = 0.1056.
d) $P(120 < x < 135) = P(-1.25 < z < 0.63) = 1 - \{P(z < -1.25) + P(0.63 < z)\} = 1 - (0.1056 + 0.2643) = 0.6301$; 63.01%.
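The table look-ups can be cross-checked with `statistics.NormalDist` from the Python standard library, which evaluates the normal cumulative distribution directly. The sketch does not round z to two decimals (z = 0.625 rather than 0.63), so the results differ from the table values in the third decimal place.

```python
from statistics import NormalDist

heights = NormalDist(mu=130, sigma=8)  # heights of 7-year-old children, in cm

p_above_130 = 1 - heights.cdf(130)                  # a)
p_above_135 = 1 - heights.cdf(135)                  # b)
p_below_120 = heights.cdf(120)                      # c)
p_between = heights.cdf(135) - heights.cdf(120)     # d)

for label, p in [("a", p_above_130), ("b", p_above_135),
                 ("c", p_below_120), ("d", p_between)]:
    print(label, round(p, 4))
```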
Q3. The grades of the Birth 2 course have a mean of 70 and a variance of 169. If the success rate of the class is 40%, what is the minimum grade of the successful students? (The distribution is normal.)
Q4. The probability of an error in the pregnancy test carried out in a health center is 0.001. What is the probability that at least 2 of 1000 women tested are misdiagnosed?
Q5. A midwife misdiagnoses 20% of the pregnant women who come for medical examination. What is the probability that at most 4 of 5 pregnant women examined are diagnosed correctly?
Q6. In Turkey the death rate of pregnant women due to certain complications is 0.0001. What is the probability that 3 of 1000 women who have these complications die?
Q7. With an experienced midwife, the probability that a newborn dies during birth is 1%. According to this, what is the probability that at least 2 of 200 newborns die during birth?
Q8. It is known that a type of medicine given to epilepsy patients has a probability of 0.02 of causing an adverse effect. Find the probability that this medicine causes an adverse effect in at most 3 of 300 randomly selected epilepsy patients.
Q9. In a pediatrics service, icterus was found in 10 of 90 newborns. In the clinic, 6 of the babies were checked for icterus. What is the probability that 2 of the 6 newborns have icterus?
Q10. In a hospital there are 400 boys and 300 girls, 700 children in total. When 3 children are chosen, what is the probability that at least 1 of the 3 children is a boy?
Q11. The probability of AIDS infection in a hospital is 0.002. What is the yearly infection probability for 600 employees?
Q12. In a health check, joint disease was seen in 3 of 900 midwives. 600 midwives were also checked for edema. What is the probability that 5 of the midwives checked have joint disease?
Q13. In a hospital the probability of a side effect of an antibiotic has been recorded as 0.02. What is the probability that 2 of 100 randomly chosen patients show the side effect?
Q14. The ages of the women admitted to the family planning clinic of a hospital have a mean of 25 and a standard deviation of 5. If the ages are normally distributed:
a) What is the probability that an age is between 25 and 27?
b) What is the probability that an age is below 30?
c) Calculate the probability that an age is below 27.
Q15. In a location the proportion of people with blood group B is 40%. 5 people donate blood to a hospital. What is the probability that all 5 people have blood group B?
PART 5
5.1. Sampling
As mentioned in the first part, studies are carried out on samples for various reasons. Sample formation is a topic large enough to be a stand-alone course, so only an introduction is given here.
Basically, each individual in the population should have an equal chance of entering the sample. Selection must be made in a way that ensures randomness. However, because the material is not uniform or homogeneous, or has differing properties, the reliance on chance may have to be limited. For this, various sampling methods have been developed. Let us examine a few of them.
In any population, the sample size required for the research has to be known. The sample size varies depending on the parameter to be estimated and on the variation in the characteristic to be studied.
2) If the population variance is unknown: In this case a pilot study is carried out first. A pilot study is a preliminary study carried out on a small sample. The sample size of the real study is estimated from the variance obtained in this pilot study. Because the pilot study is conducted on a small sample, the t distribution, the distribution for small samples, is used.
subscript. $S^2$ is the sample variance of the pilot study and d is the maximum deviation accepted by the researcher.
Sample: A pilot study on blood sugar was made with a group of 20 people and the standard deviation of blood sugar was calculated as 30. How many people must the sample contain to estimate the mean with 99% confidence and a deviation of 5 units?
Solution: The sample variance and the deviation value are substituted into the formula. The unknown critical t value for the error level and degrees of freedom is found from the t table. The confidence coefficient is 1-α = 0.99; α = 0.01. The estimated mean ($\bar{x}$) may be smaller or larger than the real population mean (µ), so the error of 0.01 is split between the two tails. The value $t_{0.005,19} = 2.861$ is found in the t table.
According to this result, with a sample of 295 people the confidence level with a deviation of 5 units is 99%.
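The formula itself did not survive in this copy of the notes. The sketch below assumes the usual form $n = t^2 S^2 / d^2$, which reproduces the 295 quoted above; the function name is made up for this example, and the t value is taken from the text rather than computed.

```python
import math


def sample_size_for_mean(t_value: float, std_dev: float, max_deviation: float) -> int:
    """Assumed formula n = t^2 * S^2 / d^2, rounded up to a whole person."""
    return math.ceil((t_value * std_dev / max_deviation) ** 2)


# Pilot study: S = 30, d = 5 units, t(0.005, 19) = 2.861 from the t table
n_required = sample_size_for_mean(t_value=2.861, std_dev=30, max_deviation=5)
print(n_required)
```

Rounding up rather than to the nearest integer is a design choice here: a fractional person cannot be sampled, and rounding down would fall short of the requested precision.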
desired to happen, $z_{\alpha/2}$ is the value of the standard normal distribution in the z table for the chosen probability level, and d is the maximum deviation accepted by the researcher.
$n = \frac{\hat{p}\hat{q}\,z_{\alpha/2}^2}{d^2}$
Sample: In a group of 50 dietitians, 7 were found to eat unhealthily. What sample size is needed to estimate the proportion with unhealthy eating habits with 95% confidence and a deviation of 1%?
Solution: $\hat{p} = 7/50 = 0.14$; $\hat{q} = 1 - 0.14 = 0.86$ and $Z_{0.025} = 1.96$
$n = \frac{\hat{p}\hat{q}\,z_{\alpha/2}^2}{d^2} = \frac{0.14 \cdot 0.86 \cdot 1.96^2}{0.01^2} = 4625$
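The proportion formula can be evaluated directly. In this sketch the function name is illustrative and the z value 1.96 is taken from the text; the function returns the unrounded value so that the rounding choice stays visible.

```python
def sample_size_for_proportion(p_hat: float, z_value: float, max_deviation: float) -> float:
    """n = p * q * z^2 / d^2, returned without rounding."""
    q_hat = 1 - p_hat
    return p_hat * q_hat * z_value ** 2 / max_deviation ** 2


# 7 of 50 dietitians, 95% confidence (z = 1.96), deviation d = 0.01
n_raw = sample_size_for_proportion(p_hat=7 / 50, z_value=1.96, max_deviation=0.01)
print(round(n_raw, 2))
```

The unrounded result is about 4625.3. The text reports 4625; rounding up to be on the safe side would give 4626.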
Q3. In a pilot study with a sample of 25 people, the standard deviation of blood pressure was found to be S=15. Find the sample size needed to estimate the mean blood pressure with 95% confidence and a deviation of 5 units.
Q4. The standard deviation of the weights of 15 dropsical patients was found to be 310 g. Find the sample size needed to estimate the weight of dropsical patients with 90% confidence and a deviation of 120 g from the real value.
Q5. Among 40 midwives who received postpartum education, 10 were seen to have gaps in their knowledge about infections. Find the sample size needed to estimate this proportion in the midwife population with 99% confidence and a maximum deviation of 7 units.
Q6. In a study of 50 doctors, 10 were seen to smoke cigarettes. Find the sample size needed to estimate the smoking rate in the population with 99% confidence and a maximum deviation of 4 units.
Q7. Of 35 students taking part in a study on health education, 7 were determined to be smokers. Find the sample size needed to estimate the smoking rate in this student population with 99% confidence and a maximum deviation of 5%.
PART 6
Hypothesis testing and Confidence Limits
Studies are usually planned to test claims raised on some matter. Hypothesis testing is a method of testing such claims statistically at a certain error level. With this feature it has an important place among statistical methods and is the most widely used of them. Using statistics obtained from the sample, such as the mean, ratio, or variance, a decision is made about the corresponding population parameter. The truth of the hypothesis is tested; the decision is not certain but carries a specific error.
There is an inverse relationship between the two types of error: when the α error increases, the β error decreases. In a scientific study the significance level is denoted by α; acceptance and rejection of the H0 hypothesis are described below:
If α = 0.01, the result is written as (P<0.01) and the H0 hypothesis is rejected at the 0.01 level; the difference is statistically significant and is shown with two asterisks (**).
If α = 0.001, the result is written as (P<0.001) and the H0 hypothesis is rejected at the 0.001 level; the difference is statistically significant and is shown with three asterisks (***).
Summary
P value    Decision                                                  Degree           Symbol
P>0.05     H0 hypothesis accepted: no difference, equal,             Not significant  ns
           did not affect
P<0.05     H1 hypothesis accepted at a 5% error rate                 Significant      *
           (95% confidence): difference, not equal, did affect
P<0.01     H1 hypothesis accepted at a 1% error rate                 Significant      **
           (99% confidence): difference, not equal, did affect
P<0.001    H1 hypothesis accepted at a 0.1% error rate               Significant      ***
           (99.9% confidence): difference, not equal, did affect
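The reporting convention in the summary table can be sketched as a small lookup. This is only an illustration; the function name `significance_symbol` is an assumption and does not come from the text.

```python
def significance_symbol(p_value: float) -> str:
    """Return the symbol used to report a P value (ns, *, ** or ***)."""
    if p_value < 0.001:
        return "***"   # significant at the 0.1% error rate
    if p_value < 0.01:
        return "**"    # significant at the 1% error rate
    if p_value < 0.05:
        return "*"     # significant at the 5% error rate
    return "ns"        # not significant, H0 is accepted


for p in (0.20, 0.03, 0.004, 0.0002):
    print(p, significance_symbol(p))
```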
If H1: µ1 > µ2, the hypothesis is one-sided: the first mean is greater than the
second mean (right-tailed test).
If H1: µ1 < µ2, the hypothesis is one-sided: the first mean is smaller than the
second mean (left-tailed test).
4. The critical value and the test statistic are compared with each other and
the result is interpreted.
2) The distribution is identified and the table (critical) value is found from
this distribution.
For one-sided hypotheses, $z_c = z_{\alpha}$.
For two-sided hypotheses, $z_c = z_{\alpha/2}$.
4) The comparison is made.
When $z_h > z_c$, H0 is rejected and H1 is accepted.
When $z_h < z_c$, H0 is accepted.
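As a rough sketch of the two steps above, the critical z value can be computed instead of read from the table, and then compared with the test statistic. The helper names `z_critical` and `decide` are assumptions made for this illustration; `NormalDist` is part of the Python standard library. Computed values may differ from table values in the third decimal.

```python
from statistics import NormalDist


def z_critical(alpha: float, two_sided: bool = False) -> float:
    """Critical z value: z_alpha (one-sided) or z_alpha/2 (two-sided)."""
    tail = alpha / 2 if two_sided else alpha
    return NormalDist().inv_cdf(1 - tail)


def decide(z_h: float, z_c: float) -> str:
    """Compare the test statistic z_h with the critical value z_c."""
    return "reject H0" if z_h > z_c else "accept H0"


print(round(z_critical(0.05), 3))                  # one-sided, alpha = 0.05
print(round(z_critical(0.01, two_sided=True), 3))  # two-sided, alpha = 0.01
print(decide(4.07, 2.33))
```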
Solution:
H0: µ = 50
H1: µ > 50
Significance level: α = 0.05
Critical value: $z_c = z_{\alpha} = z_{0.05} = 1.64$
Solution:
H0: µ = 2.9
H1: µ ≠ 2.9
Two-sided hypothesis:
$t_c = t_{\alpha/2,(n-1)} = t_{0.01/2,\,19} = 2.861$
$S_{\bar{x}} = \sqrt{\dfrac{S^2}{n}} = \sqrt{\dfrac{0.98}{20}} = 0.22$
$t_h = \dfrac{\bar{x} - \mu}{S_{\bar{x}}} = \dfrac{3.2 - 2.9}{0.22} = 1.36$
Decision: Since -2.861 < 1.36 < 2.861, the H0 hypothesis is accepted. At a 1%
error rate, the Na value in blood is not different from 2.9.
The 99% confidence limits of the sodium value are
$\bar{x} \pm t_{\alpha/2,(n-1)} \cdot S_{\bar{x}} = 3.2 \pm 2.861 \times 0.22$, that is, between 2.57 and 3.83.
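The sodium example can be checked with a few lines of arithmetic. This is a minimal sketch: the variable names are assumptions, and the critical value 2.861 is taken from the text rather than computed.

```python
import math

# Values from the worked sodium example
n = 20            # sample size
mean = 3.2        # sample mean
variance = 0.98   # sample variance S^2
mu0 = 2.9         # hypothesised population mean
t_crit = 2.861    # t(0.01/2, 19), table value quoted in the text

se = math.sqrt(variance / n)   # standard error of the mean
t_h = (mean - mu0) / se        # test statistic
lower = mean - t_crit * se     # 99% confidence limits
upper = mean + t_crit * se
reject_h0 = abs(t_h) > t_crit  # two-sided decision

print(round(se, 2), round(t_h, 2), round(lower, 2), round(upper, 2), reject_h0)
```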
Examples of independent samples whose means are compared (group comparison):
Comparing the hemoglobin values of girls and boys;
Comparing the recovery times of two groups of people of the same age and sex
who suffer from the same disease and receive two different drug treatments;
Comparing the heights of boys and girls in certain age groups;
Comparing the endurance times of products from two factories that make the
same type of product.
Cases like these are analysed with group comparison.
Examples of dependent (paired) samples whose means are compared:
Comparing the sleeping times of twins;
Comparing blood pressure values before and after taking blood pressure
medicine;
Comparing hemoglobin values before and after taking a medicine;
Comparing temperature, blood pressure, blood content, etc. values taken from
patients.
(α=0.01), and find the 99% confidence limits of the difference between the
hemoglobin values of men and women.
4) The comparison is made.
If $z_h > z_c$, H0 is rejected and H1 is accepted.
If $z_h < z_c$, H0 is accepted.
Solution:
1) The hypotheses are established:
H0: µ1 = µ2 or µ1 - µ2 = 0
H1: µ1 - µ2 > 0
2) The critical value is determined: $z_c = z_{\alpha} = z_{0.01} = 2.33$
3) The test statistic is calculated:
$z_h = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{S_{\bar{x}_1 - \bar{x}_2}} = \dfrac{15 - 12 - (0)}{\sqrt{\dfrac{9 + 10}{35}}} = \dfrac{3}{0.74} = 4.07$
4) The comparison is made: since $z_h = 4.07 > z_c = 2.33$, H0 is rejected and
H1 is accepted (P<0.01). At a 1% error rate, the hemoglobin value in blood is
higher in men than in women.
Confidence limits are usually two-sided. For this, the critical value from the
Z table, $z_{\alpha/2} = z_{0.005} = 2.57$, and the other statistics are
substituted into the formula to estimate the confidence limits:
$(\mu_1 - \mu_2)_{lower} = (\bar{x}_1 - \bar{x}_2) - S_{\bar{x}_1 - \bar{x}_2} \cdot z_{\alpha/2} = (15 - 12) - 0.74 \times 2.57 = 1.1$
$(\mu_1 - \mu_2)_{upper} = (\bar{x}_1 - \bar{x}_2) + S_{\bar{x}_1 - \bar{x}_2} \cdot z_{\alpha/2} = (15 - 12) + 0.74 \times 2.57 = 4.9$
With 99% confidence, the difference between the hemoglobin values of men and
women lies between 1.1 and 4.9.
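The hemoglobin comparison can be reproduced numerically as follows. This sketch assumes that the two sample variances are 9 and 10 and that each group has n = 35, as the worked numbers suggest; the variable names are illustrative only.

```python
import math

# Values taken from the worked hemoglobin example (assumed group statistics)
mean_men, mean_women = 15, 12
var_men, var_women = 9, 10
n = 35                 # size of each group
z_one_sided = 2.33     # z(0.01), used for the test
z_two_sided = 2.57     # z(0.01/2), used for the confidence limits

diff = mean_men - mean_women
se = math.sqrt((var_men + var_women) / n)  # standard error of the difference
z_h = (diff - 0) / se                      # test statistic under H0: difference = 0
lower = diff - z_two_sided * se
upper = diff + z_two_sided * se

print(round(se, 2), round(z_h, 2), round(lower, 1), round(upper, 1))
print("reject H0" if z_h > z_one_sided else "accept H0")
```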
Microorganism not included (1):  11.0 13.0 12.8 12.6  9.0 12.0 13.2 12.7 12.8
Microorganism included (2):      12.5 12.0 11.9 12.3 12.6 11.6 11.8 11.9 12.1
4) The comparison is made.
If $t_h < t_c$, H0 is accepted.
If $t_h > t_c$, H0 is rejected and H1 is accepted.
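Here the standard error of the difference between the two means,
$S_{\bar{x}_1 - \bar{x}_2}$, is calculated as follows:
When $n_1 = n_2 = n$: $S_{\bar{x}_1 - \bar{x}_2} = \sqrt{\dfrac{S_1^2 + S_2^2}{n}}$
When $n_1 \neq n_2$: $S_{\bar{x}_1 - \bar{x}_2} = \sqrt{S_0^2 \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}$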
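Solution:
Let us first calculate the descriptive statistics:
$\bar{x} = \dfrac{\sum_{i=1}^{n} x_i}{n}$; $\bar{x}_1 = \dfrac{109.1}{9} = 12.12$; $\bar{x}_2 = \dfrac{108.7}{9} = 12.08$
$S^2 = \dfrac{\sum x_i^2 - \frac{(\sum x_i)^2}{n}}{n - 1}$; $S_1^2 = \dfrac{1336.97 - \frac{109.1^2}{9}}{9 - 1} = 1.80$
$S_2^2 = \dfrac{1313.73 - \frac{108.7^2}{9}}{9 - 1} = 0.11$ and $S_2 = 0.33$
The standard error (when the n's are equal) is $S_{\bar{x}_1 - \bar{x}_2} = \sqrt{\dfrac{1.80 + 0.11}{9}} = 0.46$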
1) The hypotheses are established.
H0: µ1 = µ2 or µ1 - µ2 = 0
H1: µ1 - µ2 ≠ 0
2) The critical value is determined.
For the two-sided test, $t_c = t_{\alpha/2,(n_1 + n_2 - 2)} = t_{0.05/2;\,16} = 2.12$
3) The test statistic is calculated:
$t_h = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{S_{\bar{x}_1 - \bar{x}_2}} = \dfrac{12.12 - 12.08 - 0}{0.46} = 0.09$
4) The comparison is made.
Since the test statistic is lower than the critical value, H0 is accepted: at a
5% error rate, there is no difference between the two sample means in terms of
Ca (P>0.05).
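The confidence limits of the difference between the means are:
$(\mu_1 - \mu_2)_{lower} = (\bar{x}_1 - \bar{x}_2) - S_{\bar{x}_1 - \bar{x}_2} \cdot t_{\alpha/2,(n_1 + n_2 - 2)} = 12.12 - 12.08 - 0.46 \times 2.12 = -0.94$
$(\mu_1 - \mu_2)_{upper} = (\bar{x}_1 - \bar{x}_2) + S_{\bar{x}_1 - \bar{x}_2} \cdot t_{\alpha/2,(n_1 + n_2 - 2)} = 12.12 - 12.08 + 0.46 \times 2.12 = 1.02$
With 95% confidence, the difference between the Ca means of these two samples
lies between -0.94 and 1.02.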
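The two-group comparison can also be recomputed directly from the nine measurements in each group. This is a sketch under the equal-sample-size formula given earlier; the list names are assumptions, and the critical value 2.12 is the table value quoted in the text. Results computed from the raw data can differ slightly from the hand calculation, which rounds intermediate values.

```python
import math
from statistics import mean, variance

# Raw data from the example (microorganism not included / included)
group1 = [11.0, 13.0, 12.8, 12.6, 9.0, 12.0, 13.2, 12.7, 12.8]
group2 = [12.5, 12.0, 11.9, 12.3, 12.6, 11.6, 11.8, 11.9, 12.1]
t_crit = 2.12  # t(0.05/2, 16), table value quoted in the text

n = len(group1)  # both groups have the same size
diff = mean(group1) - mean(group2)
# Standard error of the difference for equal n: sqrt((S1^2 + S2^2) / n)
se = math.sqrt((variance(group1) + variance(group2)) / n)
t_h = diff / se
lower = diff - t_crit * se
upper = diff + t_crit * se

print(round(variance(group1), 2), round(variance(group2), 2), round(se, 2))
print(round(t_h, 2), "reject H0" if abs(t_h) > t_crit else "accept H0")
```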
The test was described at the beginning of this topic. Because the test is
carried out on the differences within each pair, the steps are the same as in
the hypothesis tests above. The result is interpreted in terms of the mean
difference, in the same way as for the comparison of two different means.
Example: A test was given to twins and the results are listed below. Test
whether the difference between the mean scores of the twins is significant
(α=0.05). Estimate the 95% confidence limits for the differences.
                                                       Total
Pair1             82  80  78  80  76  74  84  76  68  84   782
Pair2             84  76  82  84  72  70  82  84  72  80   786
Difference (xi)    2  -4   4   4  -4  -4  -2   8   4  -4     4
xi2                4  16  16  16  16  16   4  64  16  16   184
H0: µf = 0
H1: µf ≠ 0 ; H1: µf > 0 ; H1: µf < 0
or H0: µf = a
H1: µf ≠ a ; H1: µf > a ; H1: µf < a
$S_f^2 = \dfrac{\sum x_i^2 - \frac{(\sum x_i)^2}{n}}{n - 1}$; $S_f^2 = \dfrac{184 - \frac{4^2}{10}}{10 - 1} = 20.27$
1) The hypotheses are established.
H0: µf = 0
H1: µf ≠ 0
2) The critical value is determined.
For the two-sided test, $t_c = t_{\alpha/2,(n-1)} = t_{0.05/2;\,9} = 2.262$
3) The test statistic is calculated:
$t_h = \dfrac{\bar{x}_f - \mu_f}{S_{\bar{x}_f}} = \dfrac{0.4 - 0}{\sqrt{20.27/10}} = \dfrac{0.4}{1.42} = 0.28$
4) Since the test statistic is lower than the critical value, H0 is accepted:
at a 5% error rate, there is no difference between the scores of the twins
(P>0.05).
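The paired test on the twins' scores can be recomputed from the raw scores. This is a minimal sketch; the list names are assumptions, differences are taken as Pair2 minus Pair1 as in the table, and 2.262 is the table value quoted in the text.

```python
import math
from statistics import mean, variance

pair1 = [82, 80, 78, 80, 76, 74, 84, 76, 68, 84]
pair2 = [84, 76, 82, 84, 72, 70, 82, 84, 72, 80]
t_crit = 2.262  # t(0.05/2, 9), table value quoted in the text

# The test works on the within-pair differences
diffs = [b - a for a, b in zip(pair1, pair2)]
n = len(diffs)
mean_diff = mean(diffs)
var_diff = variance(diffs)      # (sum(d^2) - (sum d)^2 / n) / (n - 1)
se = math.sqrt(var_diff / n)    # standard error of the mean difference
t_h = (mean_diff - 0) / se

print(diffs, sum(diffs))
print(round(var_diff, 2), round(se, 2), round(t_h, 2))
print("reject H0" if abs(t_h) > t_crit else "accept H0")
```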
If $z_h < z_c$, H0 is accepted.
Here $\hat{p} = \dfrac{x}{n}$ is the ratio calculated from the sample, $p_0$ is
the population ratio, and $S_{\hat{p}} = \sqrt{\dfrac{\hat{p}\hat{q}}{n}}$ is
the standard error of the ratio.
Solution:
a)
1) Hypotheses: H0: p = 0.15
H1: p > 0.15
2) The critical value is determined.
For the one-sided hypothesis, $z_c = z_{\alpha} = z_{0.05} = 1.64$
3) The test statistic: $\hat{p} = \dfrac{15}{75} = 0.20$ and $\hat{q} = 1 - 0.20 = 0.80$, $S_{\hat{p}} = \sqrt{\dfrac{0.20 \times 0.80}{75}} = 0.0462 \approx 0.05$,
$z_h = \dfrac{\hat{p} - p_0}{S_{\hat{p}}} = \dfrac{0.20 - 0.15}{0.0462} = 1.08$
4) Since the calculated value is lower than the critical value, H0 is accepted:
the malnutrition rate is accepted as not being above 15% (P>0.05).
6.3.1.1. Confidence limits of the rate
For the 90% confidence limits of the malnutrition rate, the critical value is
$z_{0.10/2} = z_{0.05} = 1.64$.
$p_{upper} = \hat{p} + z_{\alpha/2} \sqrt{\dfrac{\hat{p}\hat{q}}{n}} = 0.20 + 1.64 \times 0.05 = 0.28$
$p_{lower} = \hat{p} - z_{\alpha/2} \sqrt{\dfrac{\hat{p}\hat{q}}{n}} = 0.20 - 1.64 \times 0.05 = 0.12$
The 90% confidence limits of the malnutrition rate are 0.12 (12%) and
0.28 (28%).
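The malnutrition example (test and confidence limits of a rate) can be verified with the following sketch. Variable names are assumptions, and the critical value 1.64 is taken from the text.

```python
import math

x, n = 15, 75   # children with malnutrition, children examined
p0 = 0.15       # hypothesised population rate
z_crit = 1.64   # z(0.05): one-sided test at 5%, also z(0.10/2) for the 90% limits

p_hat = x / n
q_hat = 1 - p_hat
se = math.sqrt(p_hat * q_hat / n)  # standard error of the rate
z_h = (p_hat - p0) / se            # test statistic
lower = p_hat - z_crit * se        # 90% confidence limits
upper = p_hat + z_crit * se

print(round(se, 3), round(z_h, 2), round(lower, 2), round(upper, 2))
print("reject H0" if z_h > z_crit else "accept H0")
```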
The number of individuals in both samples has to be large enough
($n_1$ and $n_2$ > 30).
Sample 1: In two different provinces, 450 of 500 families and 200 of 350
families, respectively, were found to have a certain level of knowledge about
hepatitis B. Do the families of the first province know more about hepatitis B
than the families of the second province (α = 1%)? Estimate the 99% confidence
limits for the difference between the rates.
$S_{\hat{p}_1 - \hat{p}_2} = \sqrt{p_0 q_0 \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}$, where $p_0 = \dfrac{x_1 + x_2}{n_1 + n_2}$ and $q_0 = 1 - p_0$.
Solution:
1) Hypotheses: H0: p1 - p2 = 0
$S_{\hat{p}_1 - \hat{p}_2} = \sqrt{0.77 \times 0.23 \left( \dfrac{1}{500} + \dfrac{1}{350} \right)} = 0.03$
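With $\hat{p}_1 = 450/500 = 0.90$ and $\hat{p}_2 = 200/350 = 0.57$:
$(p_1 - p_2)_{lower} = (\hat{p}_1 - \hat{p}_2) - z_{\alpha/2} \sqrt{\dfrac{\hat{p}_1 \hat{q}_1}{n_1} + \dfrac{\hat{p}_2 \hat{q}_2}{n_2}} = (0.90 - 0.57) - 2.57 \times 0.03 = 0.25$
$(p_1 - p_2)_{upper} = (\hat{p}_1 - \hat{p}_2) + z_{\alpha/2} \sqrt{\dfrac{\hat{p}_1 \hat{q}_1}{n_1} + \dfrac{\hat{p}_2 \hat{q}_2}{n_2}} = (0.90 - 0.57) + 2.57 \times 0.03 = 0.41$
In this case, the difference in knowledge between the provinces is estimated to
lie between 25% and 41%.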
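The pooled standard error for the hepatitis B example can be checked as follows. This is only a sketch of the test step: variable names are assumptions, and the one-sided critical value 2.33 for α = 0.01 is the table value used elsewhere in this part.

```python
import math

x1, n1 = 450, 500   # first province: families with knowledge, families sampled
x2, n2 = 200, 350   # second province
z_crit = 2.33       # z(0.01), one-sided

p1_hat = x1 / n1
p2_hat = x2 / n2
p0 = (x1 + x2) / (n1 + n2)   # pooled rate under H0: p1 = p2
q0 = 1 - p0
se_pooled = math.sqrt(p0 * q0 * (1 / n1 + 1 / n2))
z_h = (p1_hat - p2_hat) / se_pooled

print(round(p0, 3), round(se_pooled, 3), round(z_h, 2))
print("reject H0" if z_h > z_crit else "accept H0")
```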
In the formula $\chi^2 = \sum \dfrac{(O_i - E_i)^2}{E_i}$, $O_i$ is the observed
value and $E_i$ is the expected value. The critical value of the $\chi^2$ test
is $\chi^2_{\alpha,(df)}$, where the degrees of freedom are
df = number of categories - 1.
The observed values in the test statistic are the numbers of patients admitted
to each department. The expected values are found by asking: "What would the
values be if admissions were distributed equally across the departments?" The
expected ratio is then (1:1:1:1:1), so 100/5 = 20 admissions are expected in
each department.
                      Internal Medicine   ENT   Eye   Child   Orthopedics   Total
Observed value (Oi)          24            21    15     28        12         100
Expected value (Ei)          20            20    20     20        20         100
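The goodness-of-fit statistic for the department table can be computed as in this sketch. The variable names are assumptions; the statistic still has to be compared with the table value of the chi-square distribution for the chosen α and the degrees of freedom.

```python
# Observed admissions per department, in the order of the table
observed = [24, 21, 15, 28, 12]

# Equal distribution (1:1:1:1:1): every department expects total / k admissions
total = sum(observed)
k = len(observed)
expected = [total / k] * k

# Chi-square statistic: sum of (O - E)^2 / E over the categories
chi2_h = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = k - 1  # degrees of freedom = number of categories - 1

print(expected, round(chi2_h, 2), df)
```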
Classified data obtained from surveys are mostly summarized in two-dimensional
tables. In these tables, the relationship between two qualitative factors is
examined with the $\chi^2$ test. Examples include the relationship between hair
color or eye color and gender, the relationship between success and sex, and
the relationship between type of birth and neighborhood across a large number
of provinces. The relationship between such events is assessed with the
independence test.
Tables to which an independence test is applied have size r x c, where r is
the number of rows and c is the number of columns. The smallest such table is
2 x 2, with 4 cells. The numbers of rows and columns depend on the number of
categories and do not have to be equal. There are some rules concerning the
frequencies written in the cells.
If the cell frequencies in a 2 x 2 table are between 5 and 20, the $\chi^2$
test statistic is calculated with Yates' correction; if the cell frequencies
are lower than 5, Fisher's exact test is used. In tables larger than 2 x 2, if
more than 20% of the categories have a frequency lower than 5, those
categories (rows or columns) are combined with neighbouring categories before
the $\chi^2$ test is carried out.
Sample: The distribution of boy and girl births in two locations is given
below. Is the girl/boy birth distribution related to location? Test it
(α=0.05).
Solution:
The hypotheses are established.
H0: There is no relationship between gender and location.
In the formula $\chi^2_h = \sum_i \sum_j \dfrac{(O_{ij} - E_{ij})^2}{E_{ij}}$,
$O_{ij}$ is the observed value in row i and column j, and $E_{ij}$ is the
expected value in row i and column j. The observed and expected values are
written with two indices because the value in each cell belongs to two factors.
The indices denote the rows (i) and the columns (j), respectively. The expected
values are calculated with the formula $E_{ij} = \dfrac{r_i \times c_j}{T}$,
where $r_i$ is the total of row i, $c_j$ is the total of column j, and T is the
grand total.
Once as many cells as the degrees of freedom have been found with this formula,
the remaining expected values can be obtained by subtraction from the row and
column totals. So,
$E_{11} = \dfrac{255 \times 250}{480} = 132.8$; $E_{12} = 255 - 132.8 = 122.2$; $E_{21} = 250 - 132.8 = 117.2$; $E_{22} = 230 - 122.2 = 107.8$.
The test statistic is calculated using these values.
             Observed values                       Expected values
         Location I   Location II   Total    Location I   Location II   Total
Boy         150          105         255        132.8        122.2       255
Girl        100          125         225        117.2        107.8       225
Total       250          230         480        250          230         480
$\chi^2_h = \dfrac{(150 - 132.8)^2}{132.8} + \dfrac{(105 - 122.2)^2}{122.2} + \dfrac{(100 - 117.2)^2}{117.2} + \dfrac{(125 - 107.8)^2}{107.8}$
$\chi^2_h = 2.23 + 2.42 + 2.52 + 2.74 = 9.92$
Each term of the test statistic is the contribution of one cell to the
statistic. For example, the contribution of the boy and Location I cell is
2.23.
Commentary: Since $\chi^2_h = 9.92 > \chi^2_c = 3.84$, H0 is rejected and H1 is
accepted. So there is a significant relationship between location and gender
(P<0.05).
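The independence test for the birth table can be reproduced with plain Python. This sketch computes every expected value as row total times column total divided by the grand total; because it does not round the expected values, the statistic comes out marginally lower than the hand-calculated 9.92. The names are assumptions, and 3.84 is the critical value used in the commentary.

```python
# Observed births: rows = boy, girl; columns = Location I, Location II
observed = [[150, 105],
            [100, 125]]
chi2_crit = 3.84  # chi-square(0.05, 1), critical value used in the commentary

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected value of each cell: r_i * c_j / T
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Contribution of each cell, then the test statistic
contributions = [[(o - e) ** 2 / e for o, e in zip(o_row, e_row)]
                 for o_row, e_row in zip(observed, expected)]
chi2_h = sum(sum(row) for row in contributions)

print([[round(e, 1) for e in row] for row in expected])
print(round(chi2_h, 2), "reject H0" if chi2_h > chi2_crit else "accept H0")
```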
Q1. At a dentist, 56 of 300 women and 72 of 480 men had a tooth extracted. Is
there a gender difference in the tooth extraction ratio? (α=0.05)
Q2. A medicine that increases the heart pulse rate was given to 6 people and
the results below were observed. Is there an increase in heart pulse? (α=0.01)
                     1    2    3    4    5    6
Before medicine     65   70   68   80   73   69
After medicine      69   73   70   85   70   70
Difference (A-B)     4    3    2    5   -3    1
Q3. The values below are the breath rates of patients in the chest treatment
service of a hospital. Calculate the mean and the variance of the values. Test
whether the mean breath rate is different from 20. (α=0.05)
xi : {16,15,17,20,15,19,19,16,22,21,19,20}
Q4. In a hospital, 15 of 75 two-year-old children who were admitted were found
to have a malnutrition problem. Estimate the 90% confidence limits of the
malnutrition rate in the child population.
Q5. The standard deviation of the birth weight of newborn babies is known to be
1.6 kg. In a newborn service, 12 babies were chosen at random and their mean
birth weight was found to be 3.2 kg.
a) Test whether the mean birth weight is greater than 3 kg. (α=0.05)
b) Estimate the population mean with 95% confidence.
Q7. In the surgery service of a hospital, the mean blood sugar of 20 patients
was calculated as 168 and the variance as 64. Estimate the true mean blood
sugar of the patients with 95% confidence.
Q8. The data below were collected to assess dyspnea in a state hospital. Test
whether the breath values are different from 15. (α=0.01)
S.D={ 14,12,12,15,19,19,18,16,16,17,16}
Q10. In a health center, the mean height of 25 boys aged 1.5 years was found to
be 80 cm and the variance 64. Estimate the true mean height of the children
with 99% confidence limits.
Q11. The variance of the birth weight of newborns is known to be 2.4 kg. The
mean birth weight of 16 babies chosen at random was found to be X = 3.2.
According to these values, estimate the population mean with 95% confidence
limits.
Q13. In a study, the mean blood sodium value of 20 patients was found to be 1.4
and the variance 0.81. Estimate the patients' blood sodium value with 99%
confidence limits.
Q14. In a study, the mean blood potassium value of 20 pregnant women was found
to be 2.6 and the variance 0.64. Estimate the patients' blood potassium value
with 95% confidence limits.
Q15. In two different provinces, 450 of 500 families and 200 of 350 families,
respectively, were found to have a certain level of knowledge about
hepatitis B. Do the families of the first province know more about hepatitis B
than the families of the second province (α = 1%)? Estimate the 99% confidence
limits for the difference between the rates.
Q16. The mean height of 30 baby boys was calculated as 150 and the standard
deviation as 81.
a) Is the mean height of the babies smaller than 160 cm? (α=0.01)
b) Estimate the true mean height of the babies with 95% confidence.
Q17. The distribution of the number of boys in 60 families with 3 children is
given below. Are families with 0, 1 and 2 boys equally frequent? Test it at an
error rate of α=0.05.
Number of children (boy)     0    1    2
Number of families          25   12   23
Q18. In a factory, 120 of 150 randomly chosen products were observed to be of
fine quality. Estimate the proportion of fine-quality products in this factory
with 95% confidence.
Q19. A researcher wants to investigate the effect of cigarettes on developing
cancer. In a group of 300 smokers, 45 were observed to have developed cancer;
in a group of 400 non-smokers, 30 had developed cancer. According to these
data, test the difference in cancer incidence between smokers and non-smokers.
(α=0.05)
Q20. The distribution over the days of the week of the internships of students
at a health school is given below. Are the students distributed equally across
the internship days? (α=0.05)
Monday   Tuesday   Wednesday   Thursday   Friday
  15        25         25         15        20
Q24. A random sample of 200 people from the Mediterranean region was found to
have a mean life span of 65 years. The population variance of life span is 12;
find the confidence limits of the population mean with 99% confidence.
Q25. Two wild-type Drosophila were crossed and the following results were
obtained. Assuming that the wild type is dominant over the mutant type, can the
Mendelian ratio (3:1) be assumed to hold for this cross? (α=0.05)
                         Wild Type   Mutant Type   Total
Number of individuals       80           10          90
Q26. The hair and eye colors of 1000 workers in a workplace are classified
below. Test whether eye color is independent of hair color. Calculate and
interpret the independence coefficient.
(Table: Eye Color by Hair Color)
Q27. The death rate caused by a pesticide in a pest population is claimed to be
80%. A researcher studying this claim carried out 5 different field trials and
found the results below. Is the efficacy of the pesticide really 80%? (α=0.05)
PART 7
Sample: The ages (months) and weights (kg) of 6 babies are given below.
Estimate the relationship between age and weight, and test whether the
correlation coefficient is significantly different from zero. (α=0.05)
         Age (x)   Weight (y)    x*y       x2       y2
            0         3.5        0.0       0.0     12.3
            5         7.0       35.0      25.0     49.0
           10        10.0      100.0     100.0    100.0
           15        12.0      180.0     225.0    144.0
           20        13.0      260.0     400.0    169.0
           25        14.0      350.0     625.0    196.0
Total      75        59.5      925.0    1375.0    670.3
$r = \dfrac{925 - \frac{75 \times 59.5}{6}}{\sqrt{\left( 1375 - \frac{75^2}{6} \right) \left( 670.3 - \frac{59.5^2}{6} \right)}} = 0.967$
Because the correlation coefficient is close to +1, we can say that there is a
linear relationship between age and weight. However, a significance test is
needed to state this more precisely:
Hypotheses: H0: ρ = 0
H1: ρ ≠ 0
Critical value for the two-sided test: $t_c = t_{\alpha/2,(n-2)} = t_{0.05/2;\,4} = 2.776$
The test statistic is found as
$S_r = \sqrt{\dfrac{1 - r^2}{n - 2}} = \sqrt{\dfrac{1 - 0.967^2}{6 - 2}} = 0.127$; $t_h = \dfrac{r - \rho}{S_r} = \dfrac{0.967 - 0}{0.127} = 7.59$
Since the test statistic is greater than the critical value, the H0 hypothesis
is rejected and the H1 hypothesis is accepted: there is a significant linear
relationship between weight and age (P<0.05).
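The correlation coefficient can be recomputed from the six (age, weight) pairs with the sums-of-products formula. This is a minimal sketch with assumed variable names. It uses the unrounded sum of squares, so the result can differ from the hand calculation in the third decimal.

```python
import math

ages = [0, 5, 10, 15, 20, 25]                  # x, months
weights = [3.5, 7.0, 10.0, 12.0, 13.0, 14.0]   # y, kg
n = len(ages)

sum_x = sum(ages)
sum_y = sum(weights)
sum_xy = sum(x * y for x, y in zip(ages, weights))
sum_x2 = sum(x * x for x in ages)
sum_y2 = sum(y * y for y in weights)

# r = S_xy / sqrt(S_xx * S_yy), written with the raw sums
s_xy = sum_xy - sum_x * sum_y / n
s_xx = sum_x2 - sum_x ** 2 / n
s_yy = sum_y2 - sum_y ** 2 / n
r = s_xy / math.sqrt(s_xx * s_yy)

print(sum_x, sum_y, sum_xy, sum_x2, round(sum_y2, 2))
print(round(r, 3), round(r ** 2, 3))  # correlation and determination coefficients
```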
$y_i = a + b x_i$
This graph shows that the data follow a linear equation. The following formulas
are used to estimate the parameters of the equation:
$b = \dfrac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}$ or $b = \dfrac{\sum x_i y_i - \frac{\sum x_i \sum y_i}{n}}{\sum x_i^2 - \frac{(\sum x_i)^2}{n}}$
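The values calculated from the data table are substituted into the second
formula and the parameters are estimated as follows:
$b = \dfrac{925 - \frac{75 \times 59.5}{6}}{1375 - \frac{75^2}{6}} = 0.414$; $\bar{y} = \dfrac{59.5}{6} = 9.92$; $\bar{x} = \dfrac{75}{6} = 12.5$
$a = \bar{y} - b\bar{x} = 9.92 - 0.414 \times 12.5 = 4.74$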
The resulting equation is $\hat{y} = 4.74 + 0.414x$. With this equation, y can
be estimated for given values of x within certain limits (the confidence limits
of the estimate). The equation is obtained with the method of least squares,
which minimizes the deviations. For example, the weight of a 23-month-old baby
is estimated as $\hat{y} = 4.74 + 0.414 \times 23 = 14.26$. However, using the
same equation to predict the weight of a 50-month-old baby would be wrong,
because growth does not continue linearly; the rate of increase may eventually
slow down with age.
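The least-squares estimates can be checked with the following sketch, which applies the second formula for b to the age and weight data. The function name `predict` and the variable names are assumptions. Because the code does not round a and b before predicting, the prediction can differ from the hand calculation in the second decimal.

```python
ages = [0, 5, 10, 15, 20, 25]                  # x, months
weights = [3.5, 7.0, 10.0, 12.0, 13.0, 14.0]   # y, kg
n = len(ages)

sum_x = sum(ages)
sum_y = sum(weights)
sum_xy = sum(x * y for x, y in zip(ages, weights))
sum_x2 = sum(x * x for x in ages)

# Slope and intercept of y = a + b*x by the method of least squares
b = (sum_xy - sum_x * sum_y / n) / (sum_x2 - sum_x ** 2 / n)
a = sum_y / n - b * sum_x / n


def predict(age_months: float) -> float:
    """Estimated weight (kg) for an age inside the observed range (0 to 25 months)."""
    return a + b * age_months


print(round(b, 3), round(a, 2), round(predict(23), 2))
```

Predictions should stay within the range of ages used to fit the line, since the equation says nothing reliable about ages far outside it.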
Age 1 5 10 15 20 25
Pulse 130 110 100 95 86 84
Q3. The regression coefficients between weight and chest circumference are
bxy=1.20 and byx=0.60. What is the correlation coefficient between the two
variables?
Q4. The weight (x) and height (y) values of 10 persons are as follows.
X    45   55   70   60   50   65   68   70   75   78
Y   160  164  170  165  165  170  175  168  175  177
a) Estimate the correlation coefficient (r) that describes the linear
relationship between height and weight, and determine its direction.
b) Calculate the determination coefficient from your estimates.
c) Set up a simple linear regression equation, treating weight as a function
of height.
d) Predict the weight of someone whose height is 150 cm.
Q5. The ages of patients and their total numbers of dental fillings are given
below. Find the regression equation that gives the relationship between age
and the number of dental fillings.
Patient   Age (xi)   Dental fillings (yi)    xi2    yi2   xi*yi
1             9              1                81      1      9
2            13              2               169      4     26
3            15              2               225      4     30
4            16              3               256      9     48
5            19              4               361     16     76
Total        72             12              1092     34    189
Q7. In a bread experiment, the increase in fermentation over time was found as
below.
Q8. The monthly weight loss of a person on a diet is given below. Accordingly:
a) Find the regression equation relating weight to month.
b) Find the starting weight and the weight lost after three months.
c) Find the correlation and determination coefficients from your estimates.