Predicting Employees Performance Using Data Mining Techniques


A

PROJECT DESIGN REPORT


ON
“PREDICTING EMPLOYEES PERFORMANCE USING
DATA MINING TECHNIQUES”
For the subject Lab1 Project Phase 1
Submitted in partial fulfillment of the requirement for the award of
Bachelor of Engineering
In
Computer Science and Engineering
Punyashlok Ahilyadevi Holkar Solapur University
By

Name Roll. No. Exam Seat No.


Suraj Ambewale 48
Pratik Bembalkar 49
Sidheshwar Cholle 50
Naitik Doshi 51

Under Guidance Of
MRS. R. DIXIT

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING


WALCHAND INSTITUTE OF TECHNOLOGY
SOLAPUR - 413006
(2019-2020)
CERTIFICATE
This is to certify that the Project entitled

“Predicting employees performance using data mining techniques”


is
Submitted by

Name Roll. No. Exam Seat No.


Suraj Ambewale 48
Pratik Bembalkar 49
Sidheshwar Cholle 50
Naitik Doshi 51

as a part of Project Design Report.

Studying in BE CSE for the subject Lab1 Project Phase 1

Mrs. R. Dixit (Dr.R.V.Argiddi)


Project Guide Head
Dept of Computer Science & Engg

(Dr. S .A. Halkude)


Principal

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING


WALCHAND INSTITUTE OF TECHNOLOGY
SOLAPUR
(2019-2020)
INDEX

Sr No. Topic

1 Abstract
2 Introduction
3 Background
4 Technologies Required
5 Objectives
6 Proposed Work
7 Flow Chart
8 Conclusion
9 References
ABSTRACT

Human Resources Management has become an essential concern of managers and decision makers in
almost all types of businesses, who need plans for correctly identifying highly qualified
employees.

Accordingly, management has become interested in the performance of these employees, especially
in ensuring that the appropriate person is allocated to the right job at the right time. This
has driven growing interest in Data Mining (DM), whose objective is the discovery of knowledge
from huge amounts of data.

In this project, DM techniques will be utilized to build a classification model for predicting
employees’ performance using a real dataset.

Three main DM techniques will be used for building the classification model and identifying the
most effective factors that positively affect the performance. The techniques are the Decision
Tree (DT), Naive Bayes, and Support Vector Machine (SVM).
INTRODUCTION

With the advancement and growth of technologies in business organizations, HR employees no
longer need to handle massive amounts of data manually. These data are very important for
decision makers, but it is a challenge to mine the best and most useful information from such
huge volumes. This is where DM comes in. It is considered a recently emerging analysis and
predictive tool, owing to the existence and multiplicity of massive amounts of data containing
hidden, previously unknown knowledge.

DM techniques provide an approach to different DM tasks, such as classification, association,
and clustering, which are used to extract hidden knowledge from huge amounts of data.

Classification is a predictive DM technique that predicts the values of data using known
results found from other data. It is a supervised learning technique in DM and machine
learning.

Classification techniques commonly build models, which are in turn used to predict future data
trends.
BACKGROUND
Many researchers have used DM classification techniques for generating rules and predicting
certain behaviors in various fields of science. Evaluating and predicting employees’
performance is therefore considered a critical issue, and it requires identifying the full set
of variables and criteria that determine how well a predictive model of employees’ performance
works.

In general, this project is an initial attempt to investigate DM tasks, especially the
classification task, to support decision makers and HR professionals by identifying and
studying the main factors that may positively affect their employees’ performance. In this
project we will apply some classification techniques to build a proposed model for predicting
employees’ performance.
TECHNOLOGIES REQUIRED

1) React
React (also referred to as ReactJS) is an up-and-coming alternative to Angular. It is a
JavaScript library, developed by Facebook and Instagram, for building interactive, reactive user
interfaces. Like Angular, React breaks the front-end application down into components. Each
component can hold its own state, and a parent can pass its state down to its child components
(as properties) and those components can pass changes back to the parent through the use of
callback functions. Components can also include regular data members (which are not state or
properties) for data which isn't rendered. React is most commonly executed within the browser
but it can also be run on the back-end server within Node.js, or as a mobile app using React
Native.

2) Flask
Flask is a micro web framework written in Python, built with a small core and an
easy-to-extend philosophy. It is classified as a microframework because it does not require
particular tools or libraries. It has no database abstraction layer, form validation, or any
other components where pre-existing third-party libraries provide common functions. However,
Flask supports extensions that can add application features as if they were implemented in
Flask itself. Extensions exist for object-relational mappers, form validation, upload handling,
various open authentication technologies, and several common framework-related tools.
Extensions are updated far more frequently than the core Flask program.
OBJECTIVES

• Gathering a dataset of predictive variables.

• Identifying the different factors that affect employees’ behavior and performance.

• Using the proposed DM classification techniques to construct a predictive model and to
identify relationships between the most important factors affecting the overall efficiency
of the model.
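
As an illustration of the last objective, one possible way to rank candidate factors is by information gain, the entropy-based criterion used by several decision tree algorithms. The report does not fix a ranking criterion, so this choice is an assumption, and the records, attribute names, and values below are invented purely for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Reduction in target entropy obtained by splitting the rows on one attribute."""
    base = entropy([row[target] for row in rows])
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder


# Hypothetical toy records; not taken from the project's dataset.
employees = [
    {"degree": "BSc", "experience": "high", "performance": "good"},
    {"degree": "BSc", "experience": "low", "performance": "poor"},
    {"degree": "MSc", "experience": "high", "performance": "good"},
    {"degree": "MSc", "experience": "low", "performance": "poor"},
]

gains = {a: information_gain(employees, a, "performance") for a in ("degree", "experience")}
ranked = sorted(gains, key=gains.get, reverse=True)
print(ranked)
```

In this toy data the experience attribute separates the two classes completely while the degree attribute does not separate them at all, so experience is ranked first.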
PROPOSED WORK
The main objective of the proposed methodology is to build the classification model that tests
certain attributes that may affect job performance.

A. Data Classification Preliminaries


In general, data classification is a two-step process. In the first step, which is called the
learning step, a model that describes a predetermined set of classes or concepts is built by
analyzing a set of training database instances. Each instance is assumed to belong to a
predefined class. In the second step, the model is tested using a different data set, which is
used to estimate the classification accuracy of the model. If the accuracy of the model is
considered acceptable, the model can be used to classify future data instances for which the
class label is not known. In the end, the model acts as a classifier in the decision-making
process. Several techniques can be used for classification, such as decision trees, Bayesian
methods, and support vector machines. Decision tree classifiers are quite popular because
constructing a tree does not require domain expert knowledge or parameter setting, and is
appropriate for exploratory knowledge discovery. A decision tree can produce a model with rules
that are human-readable and interpretable, which makes it easy for decision makers to compare
the rules with their own domain knowledge, validate them, and justify their decisions. Decision
tree classifiers include NBTree, among others.
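
The two-step process described above can be sketched as follows. This is a minimal illustration that assumes the scikit-learn library and a synthetic dataset standing in for the employee data. The report does not name a library, and the 70/30 split, the Gaussian variant of Naive Bayes, and the RBF kernel are illustrative choices only.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for an employee dataset (assumption: 6 numeric attributes, 2 classes).
X, y = make_classification(n_samples=300, n_features=6, n_informative=4, random_state=0)

# Hold out 30% of the instances for the testing step.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

models = {
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Naive Bayes": GaussianNB(),
    "SVM": SVC(kernel="rbf"),
}

accuracies = {}
for name, model in models.items():
    model.fit(X_train, y_train)            # step 1: learn from the training instances
    predictions = model.predict(X_test)    # step 2: classify the held-out instances
    accuracies[name] = accuracy_score(y_test, predictions)

for name, accuracy in accuracies.items():
    print(f"{name}: {accuracy:.2f}")
```

Comparing the three accuracies on the same held-out set is one simple way to decide which classifier is acceptable for classifying future instances.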

B. Data Understanding and Preparation


1) Data Understanding

First, the data understanding phase starts with initial data collection from the available data
sources, which helps us get familiar with the data. Some important activities, including data
loading and data integration, must be performed for the data collection to succeed. Then the
data needs to be explored by tackling the data mining questions, which can be addressed using
querying, reporting, and visualization.
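
A first pass over the collected data might look like the sketch below, which uses only the Python standard library. The column names and sample records are hypothetical and merely show the kind of loading, reporting, and querying meant here.

```python
import csv
import io
from collections import Counter
from statistics import mean

# Hypothetical sample; column names and values are invented for illustration.
RAW = """age,department,years_experience,performance
29,Sales,4,good
41,IT,15,good
35,Sales,,poor
23,IT,1,poor
52,HR,26,good
"""

# Data load: read the records into a list of dictionaries.
rows = list(csv.DictReader(io.StringIO(RAW)))

# Reporting: class balance and number of missing values per column.
class_counts = Counter(row["performance"] for row in rows)
missing = {col: sum(1 for row in rows if row[col] == "") for col in rows[0]}

# Querying: average experience per class, skipping missing entries.
avg_experience = {
    label: mean(
        int(r["years_experience"])
        for r in rows
        if r["performance"] == label and r["years_experience"]
    )
    for label in class_counts
}

print(class_counts)
print(missing)
print(avg_experience)
```

Summaries like these reveal class imbalance and missing values early, before any model is built.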

2) Data Preparation

Data preparation typically consumes the largest share of the project time, often estimated at
about 90%. The outcome of the data preparation phase is the final data set. Once the available
data sources are identified, the data need to be selected, cleaned, constructed, and formatted
into the desired form.
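
The four preparation activities (select, clean, construct, format) can be sketched on a few hypothetical records as shown below. The field names, the median fill for missing values, and the derived attribute are assumptions made for illustration, not the project's actual preparation rules.

```python
from statistics import median

# Hypothetical raw records; field names and values are invented for illustration.
raw_rows = [
    {"id": "E1", "age": "29", "department": "Sales", "years_experience": "4", "performance": "good"},
    {"id": "E2", "age": "41", "department": "IT", "years_experience": "15", "performance": "good"},
    {"id": "E3", "age": "35", "department": "Sales", "years_experience": "", "performance": "poor"},
    {"id": "E4", "age": "23", "department": "IT", "years_experience": "1", "performance": "poor"},
]

# Select: keep only attributes that could plausibly relate to performance.
SELECTED = ("age", "department", "years_experience", "performance")
rows = [{key: record[key] for key in SELECTED} for record in raw_rows]

# Clean: convert numeric fields and fill missing experience with the median known value.
known = [int(r["years_experience"]) for r in rows if r["years_experience"]]
fill_value = median(known)
for r in rows:
    r["years_experience"] = int(r["years_experience"]) if r["years_experience"] else fill_value
    r["age"] = int(r["age"])

# Construct: derive a new attribute from existing ones.
for r in rows:
    r["started_before_25"] = (r["age"] - r["years_experience"]) < 25

# Format: encode categories as integers and split into feature rows and class labels.
departments = sorted({r["department"] for r in rows})
X = [
    [r["age"], departments.index(r["department"]), r["years_experience"], int(r["started_before_25"])]
    for r in rows
]
y = [1 if r["performance"] == "good" else 0 for r in rows]

print(X)
print(y)
```

The resulting numeric feature rows and labels are in the form that most classification libraries expect as input.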
C. Highlights of the Project
1) The front-end is developed in React and includes a single page with a form for
submitting the input values.

2) The back-end is developed in Flask and exposes prediction endpoints that run the trained
classifier and send the result back to the front-end for easy consumption.
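
A prediction endpoint of this kind might be sketched as follows. The route name `/predict`, the JSON field `years_experience`, and the placeholder rule standing in for the trained classifier are all hypothetical. A real back-end would load the trained model and call it instead.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)


def predict_performance(features):
    """Placeholder for the trained classifier (hypothetical rule, not a real model)."""
    return "good" if features.get("years_experience", 0) >= 5 else "poor"


@app.route("/predict", methods=["POST"])
def predict():
    # The front-end form is assumed to send its input values as a JSON object.
    payload = request.get_json(silent=True)
    if not isinstance(payload, dict):
        return jsonify({"error": "expected a JSON object"}), 400
    return jsonify({"prediction": predict_performance(payload)})


# Exercise the endpoint with Flask's built-in test client instead of a running server.
client = app.test_client()
response = client.post("/predict", json={"years_experience": 7})
print(response.get_json())
```

Returning JSON keeps the response easy for the React form to consume, and the explicit 400 response makes malformed requests visible to the front-end.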
FLOWCHART
CONCLUSION
This project has concentrated on the possibility of building a classification model for
predicting employees’ performance. In working on performance, many attributes have been
tested, and some of them were found to be effective for performance prediction.

REFERENCES
1) Predicting Employees performance using DM Techniques,
International Journal of Science and Information Security, January 2019

2) https://towardsdatascience.com/create-a-complete-machine-learning-web-application-using-react-and-flask-859340bddb33
