Lakshami Through Sarasawati
People work to meet their various needs by earning money (Lakshami). According to research, more educated
(Sarasawati) people get better jobs. Hence we can conclude that it is not “Lakshami vs. Sarasawati”; instead, it is
“Lakshami Through Sarasawati”.
________________________________________________________________________________________
1. Introduction of Data
This report interprets employee data and explains the results. The data has nine
attributes: gender, date of birth, education level (years), job category, current salary,
beginning salary, months since hire, previous experience and minority classification.
Some new attributes are derived from these nine, such as male (a binary value for
gender, 0 or 1) and age (derived from date of birth).
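The two derived attributes can be sketched in a few lines of Python. This is only an illustration: the function names, the accepted gender codes and the reference date used for computing age are assumptions, since the report does not state how the derivation was carried out.

```python
from datetime import date


def male_flag(gender: str) -> int:
    """Return 1 for male and 0 otherwise (the gender codes are assumed)."""
    return 1 if gender.strip().lower() in ("m", "male") else 0


def age_in_years(birth_date: date, as_of: date) -> int:
    """Completed years between birth_date and the reference date as_of."""
    had_birthday = (as_of.month, as_of.day) >= (birth_date.month, birth_date.day)
    return as_of.year - birth_date.year - (0 if had_birthday else 1)


# Hypothetical record, not taken from the employee data.
print(male_flag("Male"), age_in_years(date(1952, 2, 3), date(1995, 1, 1)))
```

The reference date matters: the same date of birth gives a different age depending on when the data is assumed to have been collected.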
2. Analysis
2.1. Test whether females are less educated than males, i.e. is
education level gender-biased? (Refer: Appendix I)
a. Hypothesis:
H0: μMale = μFemale (the null hypothesis is that the mean education level is the same for males and females)
Ha: μMale ≠ μFemale (the alternative hypothesis is that the mean education level is not the same for males and females)
Significance level α = 0.05 (i.e. rejection region: reject the null hypothesis if p-value ≤ 0.05)
In this case, the attributes “education level” and “gender” need to be interpreted
from the employee data. Here “education level” is a continuous variable and
“gender” is a categorical variable. The Q-Q plot (Figure 1, Appendix I) shows
that the continuous variable “education level” is approximately normal, since the
observed values in the Q-Q plot lie approximately on the expected values. We also
Vivek Kumar Enrollment no – 09BS0002756
found that the skewness and kurtosis of “education level” are -0.114 and -0.265,
which are within the acceptable range to treat the data as approximately normal
and to proceed with an independent-samples t-test. We also need to identify any
outliers in education level. A box plot was drawn to detect outliers, but we did
not identify any outliers for education level (number of years of education).
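The normality check and the test statistic described above can be sketched with the Python standard library. The sketch assumes simple moment-based formulas for skewness and excess kurtosis (statistical packages often apply small-sample corrections, so their values can differ slightly) and the pooled-variance form of the independent-samples t statistic. All function names and sample values are illustrative and are not taken from the employee data.

```python
from math import sqrt
from statistics import mean, variance


def skewness(xs):
    """Moment-based skewness m3 / m2**1.5 (no small-sample correction)."""
    n, m = len(xs), mean(xs)
    m2 = sum((x - m) ** 2 for x in xs) / n
    m3 = sum((x - m) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5


def excess_kurtosis(xs):
    """Moment-based excess kurtosis m4 / m2**2 - 3 (0 for a normal curve)."""
    n, m = len(xs), mean(xs)
    m2 = sum((x - m) ** 2 for x in xs) / n
    m4 = sum((x - m) ** 4 for x in xs) / n
    return m4 / m2 ** 2 - 3


def pooled_t(a, b):
    """Pooled-variance t statistic and degrees of freedom for two samples."""
    na, nb = len(a), len(b)
    sp2 = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / sqrt(sp2 * (1 / na + 1 / nb))
    return t, na + nb - 2


# Hypothetical years of education for two small groups.
t_stat, df = pooled_t([12, 15, 16, 12, 19], [12, 8, 15, 12, 16])
print(t_stat, df)
```

The p-value is then read from a t distribution with the returned degrees of freedom and compared with α = 0.05.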
2.2. Test whether more education leads to a better job (Refer: Appendix II)
a. Hypothesis:
H0: μClerical = μCustodial = μManager (the null hypothesis is that the mean education levels of all job categories are equal)
In this case, the attribute “education level” is a continuous variable and “job
category” is a categorical variable with more than two categories (i.e. Custodial,
Clerical and Manager). The normality check for the variable “education level” is
explained in the previous section of this report, where education level was found
to be approximately normal. Since more than two groups are available for the
variable “job category”, we need to apply ANOVA instead of an independent-samples
t-test.
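As a rough illustration of why ANOVA handles more than two groups, the one-way F statistic compares the variation between group means with the variation within groups. The sketch below uses only the standard library. The function name and the sample values are assumptions for illustration and are not the report's data.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric samples."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical years of education for three job categories.
clerical = [12, 12, 15, 16]
custodial = [8, 10, 12, 8]
manager = [16, 17, 19, 18]
print(one_way_anova_f([clerical, custodial, manager]))
```

A large F value relative to the F distribution with (df_between, df_within) degrees of freedom leads to rejecting the null hypothesis of equal means.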
2.3. Test whether job category is gender-biased (Refer: Appendix III)
a. Hypothesis:
H0: Job category is independent of gender
Ha: Job category is NOT independent of gender
Significance level α = 0.05 (i.e. rejection region: reject the null hypothesis if p-value ≤ 0.05)
Group Statistics
                                      Employment Category
                                   Clerical  Custodial  Manager    Total
Gender   Female   Count                 206          0       10      216
                  Expected Count      165.4       12.3     38.3    216.0
                  % within Gender     95.4%       0.0%     4.6%   100.0%
         Male     Count                 157         27       74      258
                  Expected Count      197.6       14.7     45.7    258.0
                  % within Gender     60.9%      10.5%    28.7%   100.0%
Total             Count                 363         27       84      474
                  Expected Count      363.0       27.0     84.0    474.0
                  % within Gender     76.6%       5.7%    17.7%   100.0%
Table 5: Contingency table (from Cross Tab)
Chi-Square Tests
                        Value       df    Asymp. Sig. (2-sided)
Pearson Chi-Square      79.277(a)    2    .000
Likelihood Ratio        95.463       2    .000
N of Valid Cases        474
(a) 0 cells (.0%) have expected count less than 5. The minimum expected count is 12.30.
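The Pearson chi-square statistic can be recomputed from the observed counts in Table 5 as a cross-check. The sketch below uses only the standard library and assumes the usual formula: the expected count is row total × column total / N, and the statistic is the sum of (observed - expected)² / expected over all cells. The function name is illustrative. The closed-form p-value exp(-χ²/2) holds only for 2 degrees of freedom, which is the case here.

```python
from math import exp

# Observed counts from Table 5.
# Rows: Female, Male. Columns: Clerical, Custodial, Manager.
observed = [
    [206, 0, 10],
    [157, 27, 74],
]


def pearson_chi_square(table):
    """Return (chi2, df, expected) for a two-way contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df, expected


chi2, df, expected = pearson_chi_square(observed)
# Upper-tail probability; this shortcut is valid only when df == 2.
p_value = exp(-chi2 / 2) if df == 2 else None
print(round(chi2, 3), df, p_value)
```

The result should agree with the Pearson Chi-Square row above, and a p-value below α = 0.05 means the null hypothesis that job category is independent of gender is rejected.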