Non Parametric Testing - Sec A


Presented by: Mayank Gaur (32021), Prateek Parimal (32029), Raj Kamal Rawat (32035), Sunnu Setu (32042)

The term "non-parametric" was first used by Wolfowitz in 1942.

A non-parametric statistic is a statistic whose interpretation does not depend on the population fitting any parameterized distribution.

Non-Parametric Test:

- also referred to as a distribution-free test
- used when data have a ranking but no clear numerical interpretation
- used to structure non-linear data
- does not require the assumption of normality
- relies on fewer assumptions

Comparative Study

                          Parametric                  Non-parametric
Assumed distribution      Normal                      Any
Assumed variance          Homogeneous                 Any
Typical data              Ratio or Interval           Ordinal or Nominal
Data set relationships    Independent                 Any
Usual central measure     Mean                        Median
Benefits                  Can draw more conclusions   Simplicity; less affected by outliers

Advantages of non-parametric tests:

- They can treat samples made up of observations from several different populations.
- They can treat data that are inherently in ranks, as well as data whose seemingly numerical scores have only the strength of ranks.
- They are more robust because they rely on fewer assumptions.
- They are easier to learn and apply than parametric tests.

Disadvantages of non-parametric tests:

- They lose precision and can be wasteful of data.
- They have low power when used with numerical (scale) data: a larger sample size is required to draw a conclusion with the same degree of confidence.
- They are often not as efficient or accurate as parametric methods.

The runs test is used to test for serial randomness: whether observations occur in random order in a sequence over time or space. In the example below, participants were sampled in front of the IRMA mess at dinner time, resulting in the data set:

MMFFMMFFFFMMMFFFFMMFFF

where M denotes male and F denotes female. We are interested in determining whether the order of the two genders is random, as opposed to the genders forming groups, such as:

MMMMMMMFFFFFFF

or members of one gender always accompanying members of the other gender, such as:

MFMFMFMFMFMFMF

Unlike other tests, there is no equation for the runs test unless the sample size of either group is greater than 30. One only needs to count the number of runs (u), a run being an unbroken series of the same nominal value when counting from left to right. For example, MMMFFMM contains three runs: MMM, FF and MM.
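Counting runs by hand is error-prone for longer sequences, so the counting rule described above can be sketched in Python. This is a minimal illustration, not part of the original presentation; the helper name `count_runs` is an assumption made for the example.

```python
from itertools import groupby

def count_runs(sequence):
    """Return the number of runs, i.e. maximal blocks of identical symbols."""
    # groupby yields one group per unbroken block of equal consecutive items.
    return sum(1 for _ in groupby(sequence))

arrivals = "MMFFMMFFFFMMMFFFFMMFFF"  # data set from the mess example
n1 = arrivals.count("M")  # occurrences of M
n2 = arrivals.count("F")  # occurrences of F
u = count_runs(arrivals)  # number of runs

print(n1, n2, u)  # 9 13 8
```

The same function gives 3 for the short sequence MMMFFMM.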

Two-Tailed Runs Test

H0: The distribution of genders coming to the mess for dinner is random.
Ha: The distribution of genders coming to the mess for dinner is not random.

n1 = 9 (there are 9 occurrences of the value M)
n2 = 13 (there are 13 occurrences of the value F)
u = 8 (there are 8 runs)
α = 0.05
uCritical = 6, 17 (there are 2 critical values of u; if the calculated value falls between them, the null hypothesis is not rejected)

Since 6 < 8 < 17, the null hypothesis is not rejected: the data are consistent with the genders coming to the mess in random order.

If a one-tailed runs test is used, we can determine whether the data are random, non-random due to clustering, or non-random due to uniformity.

Again, there are 2 critical values:

- If u < the lower uCritical, the data are non-random due to clustering.
- If u > the upper uCritical, the data are non-random due to uniformity.
- If u falls between the lower and upper uCritical, the data are random.

H0: The distribution of genders coming to the mess is random.
Ha: The distribution of genders coming to the mess is not random due to clustering.

MMMMMFFFFMMMMMFFFMMMMMM

n1 = 16 (there are 16 occurrences of the value M)
n2 = 7 (there are 7 occurrences of the value F)
u = 5 (there are 5 runs)
α = 0.05
uCritical = 6, 15 (there are 2 critical values of u; if the calculated value falls between them, H0 is not rejected)

Since u = 5 is below the lower critical value of 6, H0 is rejected: the distribution of genders coming to the mess is non-random due to clustering.
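The decision rule for the runs test can be sketched as a small Python function. This is an illustrative sketch only: the name `classify_runs` is hypothetical, the critical values must still be looked up in a runs-test table (the ones used here are those quoted in the two examples), and a value of u equal to a critical value is treated as random, following the strict inequalities stated in the rule.

```python
from itertools import groupby

def classify_runs(sequence, lower, upper):
    """Count the runs in `sequence` and compare u with the tabulated critical values."""
    u = sum(1 for _ in groupby(sequence))  # one group per run
    if u < lower:
        return u, "non-random due to clustering"
    if u > upper:
        return u, "non-random due to uniformity"
    return u, "random"

# Two-tailed example: critical values 6 and 17.
print(classify_runs("MMFFMMFFFFMMMFFFFMMFFF", 6, 17))
# (8, 'random')

# Clustering example: critical values 6 and 15.
print(classify_runs("MMMMMFFFFMMMMMFFFMMMMMM", 6, 15))
# (5, 'non-random due to clustering')
```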

The Mann-Whitney U test is an alternative to the t-test for testing two independent samples when the data are ordinal. The t-test tests whether the two samples have been drawn from identical normal populations; the Mann-Whitney U test is its generalization. An advantage of this test is that the two samples under consideration need not have the same number of observations. The data from both samples are pooled and ranked from smallest to largest.
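The pooling and ranking step can be sketched in Python as follows. The marks used here are hypothetical, chosen only so that the pooled data contain a tie, and `average_ranks` is an illustrative helper name. Tied values are given the mean of the positions they occupy, which is the convention that produces fractional ranks such as 3.5.

```python
def average_ranks(values):
    """Rank values from smallest to largest; ties share the mean of their positions."""
    ordered = sorted(values)
    rank_of = {}
    i = 0
    while i < len(ordered):
        j = i
        # Advance j past every copy of the current value.
        while j < len(ordered) and ordered[j] == ordered[i]:
            j += 1
        # The copies occupy 1-based positions i+1 .. j; assign their mean.
        rank_of[ordered[i]] = (i + 1 + j) / 2
        i = j
    return [rank_of[v] for v in values]

# Hypothetical marks, for illustration only.
sample_1 = [27, 28, 31]
sample_2 = [24, 28, 30]

ranks = average_ranks(sample_1 + sample_2)  # rank the pooled data
r1 = sum(ranks[:len(sample_1)])             # rank sum of sample 1
r2 = sum(ranks[len(sample_1):])             # rank sum of sample 2

print(ranks)   # [2.0, 3.5, 6.0, 1.0, 3.5, 5.0]
print(r1, r2)  # 11.5 9.5
```

A useful check is that the two rank sums add up to N(N + 1)/2, where N is the pooled sample size (here 6 × 7 / 2 = 21).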

The two samples under consideration are random, and are independent of each other, as are the observations within each sample. The observations are numeric or ordinal (arranged in ranks).

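$$U = n_1 n_2 + \frac{n_1 (n_1 + 1)}{2} - R_1$$

and

$$U' = n_1 n_2 + \frac{n_2 (n_2 + 1)}{2} - R_2$$

where
n1 = number of observations in sample 1
n2 = number of observations in sample 2
R1 = sum of ranks in sample 1
R2 = sum of ranks in sample 2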

MARKS OF TWO SECTIONS

Section A   Rank    Section B   Rank
27          2       24          1
28          3.5     28          3.5
29          5       30          6
31          7       37          14.5
32          8       38          16
33          9       39          17
34          10.5    40          18
34          10.5    41          19
35          12      42          20
36          13      43          21
37          14.5    44          22

R1 = 130 = sum of the ranks for section A
R2 = 123 = sum of the ranks for section B
N1 = 12 = size of group 1
N2 = 10 = size of group 2

Mann-Whitney U statistic for section A:
U = N1N2 + [N1(N1 + 1) / 2] - R1 = 68

Mann-Whitney U statistic for section B:
U' = N1N2 + [N2(N2 + 1) / 2] - R2 = 52
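The arithmetic for the two U statistics can be checked with a short Python sketch. The function name `mann_whitney_u` is an assumption made for this example; the inputs are the rank sums and group sizes stated above.

```python
def mann_whitney_u(n1, n2, r1, r2):
    """Return the two U statistics computed from group sizes and rank sums."""
    u_a = n1 * n2 + n1 * (n1 + 1) / 2 - r1  # statistic based on group 1
    u_b = n1 * n2 + n2 * (n2 + 1) / 2 - r2  # statistic based on group 2
    return u_a, u_b

u_a, u_b = mann_whitney_u(n1=12, n2=10, r1=130, r2=123)
print(u_a, u_b)  # 68.0 52.0

# Sanity check: the two statistics always sum to n1 * n2.
print(u_a + u_b == 12 * 10)  # True
```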

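Mean:

$$\mu_u = \frac{n_1 n_2}{2} = \frac{12 \times 10}{2} = 60$$

Standard deviation:

$$\sigma_u = \sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}} = \sqrt{\frac{12 \times 10 \times 23}{12}} \approx 15.17$$

Standardized statistic:

$$Z = \frac{U_i - \mu_u}{\sigma_u}$$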

We next compare the calculated value of U with the value given in the table of critical values for the Mann-Whitney U test, where critical values are provided for given n1 and n2, and accordingly accept or reject the null hypothesis.

Two-Tailed Test

We want to test the hypothesis that the mean marks of section A and section B are the same.

H0: The performance of the two sections is the same.
H1: The performance of the two sections is not the same.

In a two-tailed test either U or U' can be used. Arbitrarily using U' = 52, we get

$$Z = \frac{52 - 60}{15.17} = -0.53$$

At a significance level of 10%, we check whether the obtained value of Z lies between -1.65 and 1.65. Since the obtained Z value lies within this range, we cannot reject the null hypothesis. Interpretation: at the 10% level of significance, there is no evidence that the performance of the two sections differs.
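The normal approximation used in this example can be reproduced with a few lines of Python. This is a sketch under the figures given in the worked example (U' = 52, n1 = 12, n2 = 10, critical value 1.65); the function name `u_to_z` is hypothetical.

```python
import math

def u_to_z(u, n1, n2):
    """Standardize a Mann-Whitney U statistic using the normal approximation."""
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return (u - mean_u) / sd_u

z = u_to_z(52, n1=12, n2=10)
print(round(z, 2))  # -0.53

# Two-tailed decision at the 10% level, using the critical value from the text.
reject_h0 = abs(z) > 1.65
print(reject_h0)  # False
```

Using U = 68 instead gives the same magnitude with the opposite sign, so the two-tailed decision is unchanged.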

This test is based on the idea that the particular pattern exhibited when m number of X random variables and n number of Y random variables are arranged together in increasing order of magnitude provides information about the relationship between their parent populations.

Non-parametric tests are gaining importance in several fields, such as economics, finance and law. Parametric tests are limited by their distributional assumptions, whereas non-parametric tests remain useful when those assumptions do not hold. Although non-parametric tests have relatively low power, they are better able to capture the central tendency of categorical data.

