Today, I'm excited to share with you the findings from our capstone project on HR data analysis and
modeling at Delta Ltd.
The goal of our project is to develop a model that can automatically determine the salary range for
employees with similar profiles. This aims to bring transparency and fairness to salary determination,
eliminating biases in the process.
This project addresses the need to use historical data to predict salaries from factors such as
experience, education, and industry. By reducing reliance on subjective human judgment, we aim to ensure
fairness and transparency in salary decisions, ultimately benefiting both the company and its employees.
The project offers several benefits: attracting and retaining talented employees with competitive
salaries, enhancing employee satisfaction, and promoting social justice by eliminating salary
discrimination among employees with similar profiles.
Data Report: The dataset comprises 25,000 records of Delta Ltd. applicants across
29 columns, including applicant ID, current and expected CTC, total experience,
education, and industry. The data was gathered through primary collection methods
such as surveys and interviews.
EDA - Univariate: For univariate analysis, histograms and boxplots were used to understand
the distribution of each variable. Notably, total experience is right-skewed, while current
and expected CTC are approximately normal, with outliers at the higher values.
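
To make the skewness and outlier observations concrete, here is a minimal sketch of the two checks that a histogram and a boxplot summarize visually. It uses only the Python standard library. The function names and the sample values are assumptions for illustration and are not taken from the Delta Ltd. dataset.

```python
import statistics


def skewness(values):
    """Population skewness: a positive result means a longer right tail."""
    n = len(values)
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / n


def iqr_outliers(values, k=1.5):
    """Values outside [Q1 - k*IQR, Q3 + k*IQR], the usual boxplot whisker rule."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical sample values, not real applicant records.
total_experience_years = [1, 1, 2, 2, 3, 3, 4, 5, 6, 8, 12, 20]
current_ctc_lakhs = [4.0, 4.5, 5.0, 5.2, 5.5, 5.8, 6.0, 6.3, 6.5, 7.0, 7.4, 25.0]

print("experience skewness:", round(skewness(total_experience_years), 2))
print("CTC outliers:", iqr_outliers(current_ctc_lakhs))
```

A positive skewness matches the right-skewed shape described for total experience, and the whisker rule flags the unusually high CTC value that a boxplot would draw as an isolated point.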
EDA - Bivariate: Bivariate analysis was performed using scatter plots and a correlation
matrix to understand the relationships between different variables. It highlighted
significant positive correlations between current and expected CTC, and between total
experience and current CTC.
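
The correlation matrix mentioned above can be sketched as follows. This is an illustrative, standard-library version of the Pearson correlation, not the project's actual code. The column names and values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def correlation_matrix(columns):
    """Pairwise Pearson correlations for a dict of named numeric columns."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Hypothetical applicant data (experience in years, CTC in lakhs).
applicants = {
    "total_experience": [1, 3, 5, 7, 10, 15],
    "current_ctc": [4.0, 6.5, 9.0, 12.5, 18.0, 27.0],
    "expected_ctc": [5.0, 8.0, 11.5, 15.0, 22.0, 33.0],
}

matrix = correlation_matrix(applicants)
for name, row in matrix.items():
    print(name, {other: round(r, 3) for other, r in row.items()})
```

Values close to 1 between two columns correspond to the strong positive relationships described in the bivariate analysis.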
Business Insights from EDA: Key insights include the need for feature scaling, because
regression models are sensitive to differences in variable magnitude. The marketing
department attracts the highest number of applicants, followed by finance and IT, which
suggests stronger competition for roles in those departments. Applicants bring diverse
roles and skills, and the training, IT, and banking industries are the most relevant.
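
As an illustration of the scaling point, the sketch below standardizes two columns to zero mean and unit variance (z-scores). The choice of z-score scaling, the function name, and the sample values are all assumptions made for this example. The project may have used a different scaler.

```python
import statistics


def standardize(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mean) / sd for v in values]


# Hypothetical columns on very different scales: years versus rupees.
total_experience_years = [1, 3, 5, 7, 10, 15]
current_ctc_rupees = [400_000, 650_000, 900_000, 1_250_000, 1_800_000, 2_700_000]

scaled_experience = standardize(total_experience_years)
scaled_ctc = standardize(current_ctc_rupees)

print([round(v, 2) for v in scaled_experience])
print([round(v, 2) for v in scaled_ctc])
```

After scaling, both columns vary over a comparable range, so a penalized model such as Ridge Regression does not weight one feature more heavily simply because of its units. In a real pipeline, the mean and standard deviation would be computed on the training split only and then reused on the test split.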
Best Performers: The top-performing models, including Ridge Regression, Linear Regression,
and XGBoost Regression, exhibit superior accuracy, minimal errors, and a robust fit to the
data. These models effectively capture intricate data relationships, ensuring reliable and
consistent predictions. Moreover, they provide valuable insights into feature importance,
aiding in the identification of key factors influencing salary outcomes.
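
Claims about accuracy, error, and fit are usually backed by metrics such as MAE, RMSE, and R². The sketch below shows how these could be computed and used to rank models. The metric choice, the model names, and all numbers are hypothetical and are not the project's reported results.

```python
import math


def regression_metrics(actual, predicted):
    """Return MAE, RMSE, and R^2 for paired actual and predicted values."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    r2 = 1 - ss_res / ss_tot
    return {"mae": mae, "rmse": rmse, "r2": r2}


# Hypothetical held-out targets and predictions from two placeholder models.
actual_ctc = [5.0, 8.0, 11.5, 15.0, 22.0]
predictions = {
    "model_a": [5.5, 7.5, 11.0, 16.0, 21.0],
    "model_b": [7.0, 9.0, 10.0, 13.0, 18.0],
}

scores = {name: regression_metrics(actual_ctc, preds) for name, preds in predictions.items()}
best = min(scores, key=lambda name: scores[name]["rmse"])

for name, metrics in scores.items():
    print(name, {k: round(v, 3) for k, v in metrics.items()})
print("lowest RMSE:", best)
```

Ranking by RMSE on held-out data is one reasonable convention. Reporting R² next to it shows how much of the salary variance each model explains.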
Optimal Model and Business Implications: The Random Forest and XGBoost models
emerged as potentially optimal due to high accuracy and strong predictive power. Business
implications include supporting decision-making in salary negotiation, employee retention,
and talent acquisition, leading to improved operational efficiency and customer satisfaction.
Conclusion: The project successfully achieved its objectives, providing valuable insights
and building robust regression models for HR data analysis at Delta Ltd.