Math M Assignment Presentation
Introduction
A) Data description
Data are the observed values or measurements of a variable.
Example of a variable: the heights of students in a class
Data are divided into two types:
1. Qualitative data
2. Quantitative data
Quantitative data are divided into two types:
1. Discrete data
2. Continuous data
Discrete data
• can take only exact values
• Example: the number of students, because there cannot be half a student
Continuous data
• can take any value within a range
• Example: time measured to fractions of a second
Ungrouped data
• raw data
• data first gathered from an experiment or study
• not sorted into categories, classified, or otherwise grouped
• a list of numbers
Grouped data
• data bundled together in categories
• histograms and frequency tables are used to show the data
• Measures of central tendency
used to determine the central values of a data set
Mean (arithmetic mean):
the sum of the values divided by the total number of observations
can be used with both discrete and continuous data
Median:
the value at the centre of a data set after the data set is arranged in ascending or descending order
Mode:
the value that occurs most often in a data set
• Measures of dispersion
used to determine the spread of a data set
Standard deviation:
the square root of the mean of the squared deviations from the mean
Variance:
the square of the standard deviation
Range:
the difference between the largest value and the smallest value
• Interquartile range (IQR)
the third quartile minus the first quartile
First quartile:
also known as the lower quartile; the value below which 25% of the data lie
Third quartile:
also known as the upper quartile; the value below which 75% of the data lie
Second quartile:
also known as the median
Spread of distribution
shown by boxplots
Boxplots
drawn using the smallest value, the largest value, the first quartile, the third quartile and the median
non-parametric
Outliers
may be plotted as individual points
B) Correlation
determines the degree of association between two variables
when two variables are observed simultaneously, bivariate data are obtained
Linear relation between two variables:
one where a change in one variable corresponds to a constant change in the other variable
Scatter diagram
each pair of the bivariate data is plotted to see whether there is a linear correlation between the two variables
Pearson correlation coefficient, r
also known as the product-moment correlation coefficient
a numerical value that indicates the degree of linear correlation between the variables
Properties of the Pearson correlation coefficient
the correlation coefficient takes values between -1 and +1, inclusive of both values
A correlation coefficient of +1
indicates a perfect positive linear correlation between the variables
A correlation coefficient of -1
indicates a perfect negative linear correlation between the variables
Other correlation coefficient values
indicate a lesser degree of linear correlation
A correlation coefficient of zero (0)
indicates no linear correlation between the two variables; they are uncorrelated
the correlation coefficient is independent of changes of origin and of the scale of measurement
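A short derivation, added here only as an illustration, shows why the last property holds. It assumes transformed variables u = a + bx and v = c + dy with b > 0 and d > 0; the symbols a, b, c, d, u and v are introduced for this sketch and do not appear in the original slides.

```latex
\begin{align*}
u_i - \bar{u} &= b\,(x_i - \bar{x}), \qquad v_i - \bar{v} = d\,(y_i - \bar{y}) \\[4pt]
r_{uv} &= \frac{\sum (u_i - \bar{u})(v_i - \bar{v})}
               {\sqrt{\sum (u_i - \bar{u})^2 \, \sum (v_i - \bar{v})^2}}
        = \frac{bd \sum (x_i - \bar{x})(y_i - \bar{y})}
               {|b|\,|d| \sqrt{\sum (x_i - \bar{x})^2 \, \sum (y_i - \bar{y})^2}}
        = r_{xy}
\end{align*}
```

The shifts a and c cancel in the deviations, and the scale factors b and d cancel between numerator and denominator. If b and d had opposite signs, the magnitude of r would stay the same but its sign would flip.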
• Spearman correlation coefficient, r_s
defined as the Pearson correlation coefficient between the rank variables
calculated by first assigning ranks to the data
takes values between -1 and +1
Values of r_s close to -1
indicate a very strong negative linear correlation between the ranks
Values of r_s close to +1
indicate a very strong positive linear correlation between the ranks
Values of r_s near 0
indicate a very weak or no linear correlation between the ranks
Methodology
• To get the mean (x̄):
all the values of x were added
the sum of x was divided by the total number of observations, n
To get the mode:
the value that occurred the most times in the data was identified
To get the median:
the data set was arranged in ascending order
the two values at the centre of the data set were added
the sum of the two values was divided by two
• To get the range:
the smallest observed value was subtracted from the largest observed value
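The steps for the mean, mode, median and range can be sketched in Python as follows. This is a minimal illustration, not the code used for the assignment: the function names and the list `marks` are hypothetical, and the odd-sized case of the median (taking the single middle value) is added here because the steps above only describe an even-sized data set.

```python
from collections import Counter

def mean(values):
    # Sum of all values divided by the number of observations n.
    return sum(values) / len(values)

def median(values):
    # Arrange in ascending order, then take the centre.
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 0:
        # Even n: add the two central values and divide by two.
        return (ordered[mid - 1] + ordered[mid]) / 2
    # Odd n (assumption, not described in the slides): single middle value.
    return ordered[mid]

def mode(values):
    # Value(s) that occur the most times; a list, since ties are possible.
    counts = Counter(values)
    highest = max(counts.values())
    return sorted(v for v, c in counts.items() if c == highest)

def value_range(values):
    # Largest value minus smallest value.
    return max(values) - min(values)

# Hypothetical marks, for illustration only.
marks = [32, 70, 45, 70, 18, 90]
print(mean(marks), median(marks), mode(marks), value_range(marks))
```

Returning the mode as a list is a design choice for this sketch, so that data sets with more than one most-frequent value are handled without an error.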
To get the interquartile range (IQR):
the data set was arranged in ascending order
the value of pn was found, where p is the proportion and n is the size of the data set
if pn was an integer, say r, the average of the rth and (r+1)th ordered values was determined
• To get the first quartile, second quartile and third quartile (p = 0.25, 0.5 and 0.75):
p and n were multiplied
after pn was calculated, the (r+1)th ordered value was determined and multiplied with
the first quartile was subtracted from the third quartile
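The pn rule for quartiles can be sketched as below. The integer case follows the steps above (average of the rth and (r+1)th ordered values). The non-integer case is not spelled out in the slides; this sketch assumes the common convention of rounding pn up and taking that ordered value. The function name `percentile` and the data in `marks` are hypothetical.

```python
import math

def percentile(values, p):
    """Value below which a proportion p (0 < p < 1) of the data lie."""
    ordered = sorted(values)          # arrange in ascending order
    n = len(ordered)
    pn = p * n
    r = round(pn)
    if math.isclose(pn, r):
        # pn is an integer r: average the r-th and (r+1)-th ordered values
        # (1-based positions, hence indices r-1 and r).
        return (ordered[r - 1] + ordered[r]) / 2
    # Assumption: pn is not an integer, so round up and take that value.
    return ordered[math.ceil(pn) - 1]

# Hypothetical marks, for illustration only.
marks = [10, 20, 30, 40, 50, 60, 70, 80]
q1 = percentile(marks, 0.25)
q2 = percentile(marks, 0.50)
q3 = percentile(marks, 0.75)
iqr = q3 - q1                         # third quartile minus first quartile
print(q1, q2, q3, iqr)
```

Under the same assumptions, calling the function with p = 0.95 would give a cut-off of the kind needed for a top-5% prize.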
• To get the standard deviation (σ):
each value of x was squared and the squares were added (Σx²)
the sum of the squared values was divided by the total number of observations, n
the mean (x̄) was squared
the squared mean was subtracted from the sum of the squared values divided by n
the square root of the answer was taken
• To get the variance (σ²):
each value of x was squared and the squares were added (Σx²)
the sum of the squared values was divided by the total number of observations, n
the mean (x̄) was squared
the squared mean was subtracted from the sum of the squared values divided by n
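The variance and standard deviation steps amount to the formula "mean of the squares minus the square of the mean". A minimal sketch follows, assuming the divisor is n (population form), as the steps describe; the function names and the sample list are hypothetical.

```python
import math

def variance(values):
    n = len(values)
    mean = sum(values) / n
    # Sum of the squared values divided by n ...
    mean_of_squares = sum(x * x for x in values) / n
    # ... minus the squared mean.
    return mean_of_squares - mean ** 2

def standard_deviation(values):
    # Square root of the variance.
    return math.sqrt(variance(values))

# Hypothetical data, for illustration only.
data = [2, 4, 4, 4, 5, 5, 7, 9]
print(variance(data), standard_deviation(data))
```

For this list the mean is 5 and the mean of the squares is 29, so the variance is 4 and the standard deviation is 2.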
• To draw the boxplot:
the smallest value, the largest value, the first quartile, the third quartile and the median were used
To determine outliers:
the lower boundary and the upper boundary were calculated
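The slides do not state how the boundaries were calculated. One widely used convention, assumed here, places them 1.5 × IQR below the first quartile and above the third quartile. The function names, the multiplier `k` and the sample values are all illustrative assumptions.

```python
def outlier_bounds(q1, q3, k=1.5):
    # Assumed convention: boundaries lie k * IQR beyond the quartiles.
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def find_outliers(values, q1, q3, k=1.5):
    # Any value outside the two boundaries is treated as an outlier.
    lower, upper = outlier_bounds(q1, q3, k)
    return [v for v in values if v < lower or v > upper]

# Hypothetical quartiles and values, for illustration only.
lower, upper = outlier_bounds(25, 65)
print(lower, upper)
print(find_outliers([10, 50, 130], 25, 65))
```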
• To get the scatter diagram:
each pair of bivariate data (x, y) was plotted
the raw data for the two variables were used
To get the Pearson correlation coefficient (r):
the values of x were added = Σx
the values of y were added = Σy
each value of x was squared and the squares were added = Σx²
each value of y was squared and the squares were added = Σy²
each value of x was multiplied by the corresponding value of y and the products were added = Σxy
• Σx was multiplied by Σy and the product was divided by n, the total number of observations
the answer was subtracted from Σxy
the result was divided by the product of √(Σx² - (Σx)²/n) and √(Σy² - (Σy)²/n)
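The Pearson steps can be sketched directly from the five sums. This is an illustrative implementation, not the assignment's own working; the function name `pearson_r` and the sample lists are hypothetical.

```python
import math

def pearson_r(xs, ys):
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))

    # Numerator: sum of products minus (sum of x)(sum of y) / n.
    s_xy = sum_xy - sum_x * sum_y / n
    # Denominator parts: sum of squares minus (sum)^2 / n, for x and for y.
    s_xx = sum_x2 - sum_x ** 2 / n
    s_yy = sum_y2 - sum_y ** 2 / n
    return s_xy / (math.sqrt(s_xx) * math.sqrt(s_yy))

# Hypothetical paired data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 5]
print(pearson_r(xs, ys))
```

For these lists Σxy - ΣxΣy/n = 8 and both square-root terms equal √10, so r = 0.8. The function divides by zero if all x values or all y values are identical, which a fuller version would need to guard against.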
• To get the Spearman rank correlation coefficient (r_s):
the rank of each x was found by arranging the data set in ascending order, while the position of each value in the original list was not changed
if two or three values of x were the same, they were given the same rank
the rank of each y was found by arranging the data set in ascending order, while the position of each value in the original list was not changed
if two or three values of y were the same, they were given the same rank
the difference of ranks (D) was calculated by subtracting the rank of x from the rank of y
• each difference of ranks (D) was squared, giving D², and the squares were added together, giving ΣD²
ΣD² was multiplied by six
the answer was divided by n(n² - 1), where n is the total number of observations
the result was subtracted from one, giving r_s = 1 - 6ΣD²/(n(n² - 1))
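The Spearman steps can be sketched as below. The slides say tied values receive the same rank without saying which; this sketch assumes the usual choice of the average of the positions they occupy. With ties, the 6ΣD² formula is only an approximation. All names and data are hypothetical.

```python
def average_ranks(values):
    """Rank values in ascending order (1 = smallest), keeping original positions."""
    ordered = sorted(values)
    rank_of = {}
    i = 0
    while i < len(ordered):
        j = i
        # Find the run of equal values ordered[i:j].
        while j < len(ordered) and ordered[j] == ordered[i]:
            j += 1
        # Assumption: tied values share the average of positions i+1 .. j.
        rank_of[ordered[i]] = (i + 1 + j) / 2
        i = j
    return [rank_of[v] for v in values]

def spearman_rs(xs, ys):
    n = len(xs)
    rank_x = average_ranks(xs)
    rank_y = average_ranks(ys)
    # D = rank of y minus rank of x; square each D and add them up.
    sum_d2 = sum((ry - rx) ** 2 for rx, ry in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

# Hypothetical paired marks, for illustration only.
xs = [40, 70, 55, 90, 20]
ys = [35, 60, 65, 80, 30]
print(average_ranks(xs), average_ranks(ys), spearman_rs(xs, ys))
```

Here the ranks are [2, 4, 3, 5, 1] and [2, 3, 4, 5, 1], so ΣD² = 2 and r_s = 1 - 12/120 = 0.9.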
Result & Conclusion
• Boxplot:
o the distribution of the Mathematics marks is skewed to the right since Q3 - Q2 > Q2 - Q1, and it had the largest range of the three subjects
o the distribution of the Biology marks is skewed to the right since Q3 - Q2 > Q2 - Q1, and it had the second largest range, between those of Mathematics and Malay
o the distribution of the Malay marks is skewed to the right since Q3 - Q2 > Q2 - Q1, and it had the smallest range of the three subjects
o there were no outliers
The school plans to give prizes to the top 5% of students in each subject:
o For Mathematics, the minimum mark to win a prize:
• 95 marks
o For Biology, the minimum mark to win a prize:
• 92 marks
o For Malay, the minimum mark to win a prize:
• 91 marks
Scatter diagram:
o the variable x (Mathematics) and the variable y (Biology) are said to have a positive correlation
o the variable x (Mathematics) and the variable y (Malay) are said to have a positive correlation
o the variable x (Biology) and the variable y (Malay) are said to have a positive correlation
o this means that the variables x and y increase or decrease together
• Comparison between the Pearson correlation coefficient, r, and the Spearman rank coefficient, r_s
o the Pearson correlation coefficient is very sensitive to extreme values (outliers), but the Spearman rank coefficient is not
o the Spearman rank coefficient has one advantage over the Pearson correlation coefficient: it can be used for non-numerical quantities, since it assigns ranks to the quantities
o the values of the Pearson correlation coefficient were low, r = 0.28, 0.21 and 0.18; so, the Pearson correlation coefficient was sensitive to extreme values
o the Spearman rank coefficient was rather strong, r_s = 0.97, 0.96 and 0.97, indicating a strong correlation between the ranks of x and y
o the Spearman rank coefficient is not sensitive to extreme values
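The difference in sensitivity can be illustrated with a small made-up data set in which one y value is extreme. This sketch computes the Spearman coefficient as the Pearson coefficient of the ranks, as in the definition given earlier; the data are invented for illustration and are not the examination marks, and the simple `ranks` helper assumes there are no tied values.

```python
import math

def pearson(xs, ys):
    # Deviation form of the product-moment correlation coefficient.
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)

def ranks(values):
    # Assumes no ties: rank = 1-based position in ascending order.
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]

# Invented data: y rises with x, but the last y value is extreme.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]

r = pearson(x, y)                    # pulled well below 1 by the extreme value
r_s = pearson(ranks(x), ranks(y))    # ranks are unaffected by how extreme it is
print(r, r_s)
```

In this example r is about 0.59, while r_s equals 1 because the ordering of the values is unchanged by the size of the extreme value.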
• Conclusion on the performance of students in the examination and the relationship between performance in the subjects:
Performance of students in Mathematics:
o the worst compared with the other two subjects
o mean (x̄) = 40.7 was the lowest compared with the other two subjects
o mode = 70 was the second highest compared with the other two subjects
o median = 32 was the lowest compared with the other two subjects
o standard deviation (σ) = 29.15 was the highest compared with the other two subjects
o variance (σ²) = 849.56 was the highest compared with the other two subjects
o range = 94 was the highest compared with the other two subjects
o interquartile range (IQR) = 54.5 was the highest compared with the other two subjects
• Performance of students in Biology:
o better than Mathematics but worse than Malay
o mean (x̄) = 44.4 was the second highest compared with the other two subjects
o mode = 50 was the lowest compared with the other two subjects
o median = 39 was the second highest compared with the other two subjects
o standard deviation (σ) = 24.73 was the second highest compared with the other two subjects
o variance (σ²) = 611.69 was the second highest compared with the other two subjects
o range = 84 was the second highest compared with the other two subjects
o interquartile range (IQR) = 40.5 was the second highest compared with the other two subjects
• Performance of students in Malay:
o the best compared with the other two subjects
o mean (x̄) = 56.13 was the highest compared with the other two subjects
o mode = 80 was the highest compared with the other two subjects
o median = 55 was the highest compared with the other two subjects
o standard deviation (σ) = 22.80 was the lowest compared with the other two subjects
o variance (σ²) = 519.55 was the lowest compared with the other two subjects
o range = 79 was the lowest compared with the other two subjects
o interquartile range (IQR) = 39.5 was the lowest compared with the other two subjects
• Pearson correlation coefficient (r):
o the relationship between Mathematics and Biology was weak positive because r = 0.28
o the relationship between Mathematics and Malay was weak positive because r = 0.21
o the relationship between Biology and Malay was weak positive because r = 0.18
Spearman rank coefficient (r_s):
o the relationship between Mathematics and Biology was strong positive because r_s = 0.97
o the relationship between Mathematics and Malay was strong positive because r_s = 0.96
o the relationship between Biology and Malay was strong positive because r_s = 0.97