Aniket Kumar BTECH10710.20 UG Project Endsem 8 Report


Data Entry and Visualization of enrolment and piloting student’s information

A Thesis
Submitted in partial fulfillment of the requirements for the
award of the Degree of

BACHELOR OF TECHNOLOGY
IN
BIOTECHNOLOGY

BY

Aniket Kumar

(BTECH/10710/20)

DEPARTMENT OF BIOENGINEERING AND BIOTECHNOLOGY


BIRLA INSTITUTE OF TECHNOLOGY
MESRA-835215, RANCHI

SP 2024
APPROVAL OF THE GUIDE(S)

Recommended that the thesis entitled “Data Entry and Visualization of
enrolment and piloting student’s information” presented by Aniket
Kumar (BTECH/10710/20) under my supervision and guidance be
accepted as fulfilling this part of the requirements for the award of Degree
of BACHELOR OF TECHNOLOGY in BIOTECHNOLOGY. To the
best of my knowledge, the content of this thesis did not form a basis for
the award of any previous degree to anyone else.

Date: ____________

__________________
(External Guide)
(Miss Ishika Sharma)
Senior Data Analyst and Project Lead
LeapScholar

__________________
(Internal Guide)
(Dr. Soham Chattopadhyay)
Assistant Professor
Dept. of Bioengineering and Biotechnology,
Birla Institute of Technology
Mesra, Ranchi

DECLARATION CERTIFICATE

I certify that
a) The work contained in the thesis is original and has been done by
myself under the general supervision of my supervisor.
b) The work has not been submitted to any other Institute for any other
degree or diploma.
c) I have followed the guidelines provided by the Institute in writing the
thesis.
d) I have conformed to the norms and guidelines given in the Ethical
Code of Conduct of the Institute.
e) Whenever I have used materials (data, theoretical analysis, and text)
from other sources, I have given due credit to them by citing them in
the text of the thesis and giving their details in the references.
f) Whenever I have quoted written materials from other sources, I have
put them under quotation marks and given due credit to the sources
by citing them and giving required details in the references.

Aniket Kumar
(BTECH/10710/20)

CERTIFICATE OF APPROVAL

This is to certify that the work embodied in this thesis entitled “Data
Entry and Visualization of enrolment and piloting student’s
information”, carried out by Aniket Kumar (BTECH/10710/20),
has been approved for the degree of BACHELOR OF
TECHNOLOGY in BIOTECHNOLOGY of Birla Institute of
Technology, Mesra, Ranchi.

Date:

Place:

Internal Examiner External Examiner

(Chairman)
Head of Department

ABSTRACT

This internship report encapsulates the practical experience attained from
two distinct projects, both centered around leveraging data analytics
techniques to drive insightful visualization and support decision-making.
The initial project centered on crafting the "India's Demographic
Analysis Dashboard," which entailed applying data cleaning, validation,
and visualization methods to dynamically depict state-wise population and
literacy rates in India. The resulting dashboard provides an interactive
platform for exploring and analyzing demographic trends effectively.
Following this, attention shifted towards developing the "Student's Data
Analytics Dashboard," employing similar methodologies to process
student enrollment and registration data. This dashboard delivers a
comprehensive overview of student demographics and enrollment patterns,
furnishing stakeholders with data-driven insights into educational
dynamics. These projects facilitated the acquisition of valuable skills in
data manipulation, visualization, and dashboard development, emphasizing
the pivotal role of data analytics in guiding decision-making processes
across diverse domains.

ACKNOWLEDGEMENT
I would like to express my profound gratitude to my project guides, Dr. Soham
Chattopadhyay and Miss Ishika Kumari, for their guidance and support during
my thesis work. I benefited greatly from working under their guidance. It is thanks
to their effort that I was able to develop a detailed insight into this subject and a
special interest in studying it further. Their encouragement, motivation, and support
have been invaluable throughout my studies at BIT, Mesra, Ranchi.
I convey my sincere gratitude to Dr Kunal Mukhopadhyay, Head, Dept. of
Bioengineering and Biotechnology, BIT, Mesra, Ranchi, for providing me various
facilities needed to complete my project work. I would also like to thank all the
faculty members of my department who have directly or indirectly helped during the
course of the study. I would also like to thank all the staff (technical and non-
technical) and my friends at BIT, Mesra, Ranchi who have helped me greatly during
the course.
Finally, I must express my very profound gratitude to my parents for providing me
with unfailing support and continuous encouragement throughout the years of my
study. This accomplishment would not have been possible without them.
My apologies and heartfelt gratitude to all who have assisted me yet have not been
acknowledged by name.

Thank you.

DATE: Aniket Kumar


(BTECH/10710/20)

CONTENTS

ABSTRACT
ACKNOWLEDGEMENT
LIST OF FIGURES
LIST OF TABLES

1 DATA ASSOCIATE ROLE
1.1. INTRODUCTION

2 ORGANISATION
2.1. ABOUT THE ORGANISATION

3 DATA ASSOCIATE WORKING
3.1. DATA ASSOCIATE ROLE
3.2. DATA CLEANING AND VALIDATION
3.3. DATA VISUALIZATION

4 DEMONSTRATION PROJECT
4.1. WORKING ON DEMONSTRATION PROJECT
4.2. DATA CLEANING AND VALIDATION
4.3. DATA VISUALIZATION
4.4. TOOLS USED

5 ACTUAL DATASET
5.1. WORKING ON ACTUAL DATASET
5.2. DATA CLEANING AND VALIDATION
5.3. DATA VISUALIZATION
5.4. TOOLS USED

6 RESULTS

7 CONCLUSION

8 REFERENCES

LIST OF FIGURES

Figure 1 Indian State Population and Literacy Rate Combined Data

Figure 2 Microsoft Power BI Dashboard showing Indian State Population and Literacy Rate

Figure 3 Student Enrolment Data in LeapScholar

Figure 4 Microsoft Power BI Dashboard of Student Information Insights Data in LeapScholar

1. Data Associate

1.1 INTRODUCTION
During my tenure as a Data Associate at LeapScholar, I embraced a multifaceted role
focused on meticulously managing data to facilitate well-informed decision-making
processes. Central to my responsibilities was the collection, processing, and upkeep
of extensive datasets, ensuring their accuracy, comprehensiveness, and timeliness.
Engaging with various data sources and utilizing a range of tools, I adeptly navigated
the complexities of data manipulation, particularly within the extensive domain of
MS Excel, extracting pertinent information from numerous external channels. My
commitment extended beyond mere data entry, encompassing the strategic collection,
storage, and analysis of data, with the aim of empowering students, businesses, and
organizations to navigate their respective landscapes with precision and foresight.
Leveraging visualization and analytical tools, I transformed raw datasets into
insightful narratives, presenting information in a format conducive to understanding
and actionable insights. This internship report encapsulates my journey, detailing the
challenges encountered, the strategies employed, and the invaluable insights gained
during my tenure as a Data Associate at LeapScholar.

2. About the Organisation
2.1 Introduction:
LeapScholar stands out as a beacon of opportunity within the global education arena,
offering a comprehensive range of services tailored to students aiming to pursue
academic paths abroad. Serving as a dynamic platform, LeapScholar provides
essential resources that guide students through every stage of the application process
until they embark on their academic journey overseas. Its dedication to excellence is
evident in its thorough approach, from the initial assessment of student profiles to
the seamless facilitation of college selection and visa applications. With unwavering
commitment, it aims to ensure that each student's aspirations are not just fulfilled, but
surpassed, as they secure coveted positions in their desired universities. Positioned at
the intersection of two burgeoning sectors, edtech and fintech, LeapScholar
embodies innovation and effectiveness, harnessing technology's transformative power
to reshape the global education landscape. This internship report sheds light on my
tenure as a Data Associate at LeapScholar, illustrating the alignment between the
organization's values and my professional contributions in the realm of data
management and analysis.

3 Data Associate
3.1 Data Associate role
A Data Associate assumes a pivotal position in organizations, responsible
for gathering, processing, analyzing, and managing data. Their duties
typically encompass a range of tasks, including data entry, cleaning,
dataset organization, visualization, quality assurance checks to uphold
data accuracy, and the creation of reports or summaries based on analyzed
data. Data Associates collaborate with diverse data sources and utilize
various tools, including databases, spreadsheets, and data visualization
software, to extract meaningful insights that guide decision-making
processes within the organization.

Similarly, professionals in Visualization Roles leverage data visualization
techniques and tools to effectively communicate insights derived from
data analysis. These individuals utilize visualization software such as
Tableau, Power BI, or Python libraries like Matplotlib and Seaborn to
craft visual representations of data, including charts, graphs, dashboards,
and interactive visualizations. By transforming complex data into
accessible visual formats, these visualizations enable stakeholders to
comprehend key insights quickly and make informed decisions based on
the presented data.

3.2 Data Cleaning and Validation

Throughout my internship, I deeply engaged with the essential realm of data
validation, acknowledging its pivotal role in strengthening the bedrock of an
organization's data management framework. This experience granted me an insightful
understanding of the multifaceted significance and intricate methodologies inherent
in data validation practices.

Data validation emerged as a central pillar of our data management strategy,
epitomizing the organization's dedication to maintaining the utmost standards of
accuracy, reliability, and integrity across its datasets. As I delved further into the
nuances of data validation, I came to recognize its extensive implications for various
aspects of organizational functioning, ranging from decision-making processes to
compliance with regulatory requirements.
3.3. Data Visualization
Data visualization encompasses both the artistic and scientific processes
of depicting data visually, often through charts, graphs, and maps, with
the aim of rendering complex datasets more understandable and
meaningful. It serves as a potent instrument in data analysis, empowering
individuals and organizations to unveil patterns, trends, and relationships
within their data.

During my internship, I had the opportunity to delve into the realm of
data visualization, gaining firsthand insight into its significance and
various techniques. I came to recognize data visualization as a pivotal
tool in transforming extensive and intricate datasets into intuitive and
compelling visual representations. By presenting data visually, intricate
information becomes more accessible and easier to interpret, enabling
stakeholders to swiftly grasp key insights with precision and efficacy.

4. Demonstration Project
4.1. Working on demonstration project
In the initial phase of my internship, I received an assignment for a demo
project aimed at acquainting me with the different stages of data analysis.
This project involved utilizing synthetic or sample data to replicate real-
world scenarios. My main objective was to navigate through the entire
data analysis pipeline, encompassing tasks such as data processing,
validation, cleaning, and visualization. This practical engagement
provided me with a sturdy groundwork and enabled me to comprehend
the intricacies of each step in the process. Subsequently, upon
successfully completing the demo project and acquiring proficiency in
every stage of the data analysis pipeline, I transitioned to working with
genuine datasets. Leveraging the knowledge and skills gleaned from the
demo project, I undertook analogous tasks on real-world data, albeit
encountering greater complexity and scale.

4.2. Data Validation and Data Cleaning


Following the conclusion of the data processing phase, I progressed into
the crucial stage of data validation, which held significant importance in
guaranteeing the dependability and integrity of the dataset. This phase
encompassed a thorough assessment of the dataset, during which I
diligently executed a series of checks and measures to evaluate its quality
and coherence.

The process of data validation involved meticulous scrutiny of the dataset
to detect any anomalies or inconsistencies that could compromise the
accuracy of subsequent analyses. This entailed a systematic review of
multiple facets of the dataset, including completeness, consistency,
accuracy, and coherence.
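
The facets named above can be made concrete with a short sketch. The project itself performed these checks in Microsoft Excel; the Python below is only an illustrative equivalent. The record layout (`state`, `population`, `literacy_rate`), the function name `validate`, and the sample rows are assumptions invented for this example and are not taken from the Census data.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class StateRecord:
    # Hypothetical row layout for a state-wise demographic table.
    state: str
    population: Optional[int]
    literacy_rate: Optional[float]  # percentage, expected in 0..100


def validate(records: List[StateRecord]) -> List[Tuple[int, str]]:
    """Return (row_index, problem) pairs for rows that fail a check."""
    problems = []
    seen_states = set()
    for index, record in enumerate(records):
        # Completeness: every field must be present.
        if (not record.state.strip()
                or record.population is None
                or record.literacy_rate is None):
            problems.append((index, "missing value"))
            continue
        # Consistency: the same state must not appear twice,
        # ignoring case and surrounding whitespace.
        key = record.state.strip().lower()
        if key in seen_states:
            problems.append((index, "duplicate state"))
        seen_states.add(key)
        # Accuracy: values must fall inside plausible ranges.
        if record.population <= 0:
            problems.append((index, "non-positive population"))
        if not 0.0 <= record.literacy_rate <= 100.0:
            problems.append((index, "literacy rate outside 0-100"))
    return problems


# Made-up sample rows, one per kind of problem.
sample = [
    StateRecord("State A", 1000, 75.5),
    StateRecord("state a ", 1200, 70.0),   # duplicate of row 0
    StateRecord("State B", None, 80.0),    # missing population
    StateRecord("State C", 500, 104.2),    # impossible percentage
]
issues = validate(sample)
for row_index, problem in issues:
    print(row_index, problem)
```

Returning the row index alongside each problem keeps the check separate from the fix, so a reviewer can decide row by row whether to correct, re-source, or drop a value.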

Fig. 1 Indian State Population and Literacy Rate Combined Data
Source: - CENSUS 2011

4.3. Data Visualization


After completing the stages of data processing and validation, I ventured
into the realm of data visualization, marking the culmination of the data
analysis pipeline. In this phase, I utilized visualization techniques to
illuminate the insights extracted from the processed and validated
dataset. Employing Microsoft Power BI, I developed a dynamic and user-
interactive dashboard named "India’s Demographic Analysis." Through
this immersive experience, I not only enhanced my comprehension of the
importance of data visualization but also refined my proficiency in
utilizing diverse tools and techniques to create compelling visual
narratives.

Fig. 2 MS Power BI Dashboard showing Indian State Population and Literacy Rate
Source: - Screenshot From Personal Computer

4.4. Tools Used


- For Data Processing we used Microsoft Excel.
- For Data Validation and Cleaning we used Microsoft Excel.
- For Data Visualization we used Microsoft Power BI.

5. Actual Dataset
5.1. Working on actual dataset
Upon the successful completion of the demo project and mastering each
stage of the data analysis pipeline, I eagerly transitioned to working with
genuine datasets. My assignment involved analyzing a student
registration and enrollment dataset sourced from various departments
within the organization. Drawing upon the knowledge and skills acquired
from the demo project, I approached this real-world data with
confidence, prepared to confront the challenges it presented.
Engaging with authentic datasets introduced me to a heightened level of
complexity and scale compared to the synthetic data utilized in the demo
project. Nonetheless, the robust foundation established during the demo
project enabled me to effectively navigate these challenges. I applied the
same methodologies and techniques acquired previously, albeit with
necessary adjustments to accommodate the intricacies of the real-world
dataset.

5.2. Data Validation and Data Cleaning

Following the conclusion of the data processing phase, I progressed into
the crucial stage of data validation, which held significant importance in
guaranteeing the dependability and integrity of the dataset. This phase
encompassed a thorough assessment of the dataset, during which I
diligently executed a series of checks and measures to evaluate its quality
and coherence.

The process of data validation involved meticulous scrutiny of the dataset
to detect any anomalies or inconsistencies that could compromise the
accuracy of subsequent analyses. This entailed a systematic review of
multiple facets of the dataset, including completeness, consistency,
accuracy, and coherence.
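
For an enrolment dataset, cleaning typically means normalising text, unifying date formats, and removing repeated registrations. The sketch below shows one possible way to express those steps in Python; the actual work was carried out in Microsoft Excel. The field names (`student_id`, `name`, `enrolled_on`), the accepted date formats, and the sample rows are hypothetical and do not reflect the LeapScholar schema or any real student.

```python
from datetime import datetime

# Assumed input date formats; a real dataset may need others.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%d-%m-%Y")


def parse_date(text):
    """Return a date for the first matching format, or None."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date()
        except ValueError:
            continue
    return None


def clean_enrolments(rows):
    """Split rows into (cleaned, rejected) lists."""
    cleaned, rejected = [], []
    seen_ids = set()
    for row in rows:
        student_id = row.get("student_id", "").strip()
        # Collapse repeated spaces and use title case for names.
        name = " ".join(row.get("name", "").split()).title()
        enrolled_on = parse_date(row.get("enrolled_on", ""))
        # Reject incomplete rows and rows with unreadable dates.
        if not student_id or not name or enrolled_on is None:
            rejected.append(row)
            continue
        # Reject repeated registrations; the first one is kept.
        if student_id in seen_ids:
            rejected.append(row)
            continue
        seen_ids.add(student_id)
        cleaned.append({
            "student_id": student_id,
            "name": name,
            "enrolled_on": enrolled_on.isoformat(),
        })
    return cleaned, rejected


# Invented sample rows for illustration only.
raw_rows = [
    {"student_id": " S001", "name": "  asha   rao ", "enrolled_on": "2024-01-15"},
    {"student_id": "S001", "name": "Asha Rao", "enrolled_on": "15/01/2024"},
    {"student_id": "S002", "name": "Ravi Sen", "enrolled_on": "not a date"},
    {"student_id": "S003", "name": "meera das", "enrolled_on": "03-02-2024"},
]
cleaned_rows, rejected_rows = clean_enrolments(raw_rows)
for row in cleaned_rows:
    print(row)
```

Rejected rows are kept in their own list so they can be reviewed or sent back to the source department instead of being silently lost.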

Fig 3. Student Enrolment Data in LeapScholar
Source: - LeapScholar Database

5.3. Data Visualization

Following the completion of the data processing and validation stages, I
entered the realm of data visualization, marking the apex of the data
analysis pipeline. Here, I utilized visualization techniques to bring out the
insights garnered from the processed and validated dataset. Employing
Microsoft Power BI, I crafted a dynamic, user-interactive dashboard
titled "Student's Data Analytics Dashboard." This immersive endeavor not
only enriched my comprehension of the importance of data visualization
but also refined my abilities in utilizing diverse tools and techniques to
construct captivating visual narratives.

Fig 4. MS Power BI Dashboard of Student Information Insights Data in LeapScholar

Source: - Screenshot from personal computer

5.4. Tools used


- For Data Processing we used Microsoft Excel.
- For Data Validation and Cleaning we used Microsoft Excel.
- For Data Visualization we used Microsoft Power BI.

6. Results

The culmination of these endeavors led to the successful development of
a sophisticated dashboard titled "India’s Demographic Analysis" using
MS Power BI for our demonstration project. This dashboard offers
stakeholders a comprehensive overview of crucial metrics and trends
within the dataset, such as state-wise population and literacy rates,
facilitating informed decision-making. Notable features of the dashboard
include:

• Interactive Visualizations: Users have the flexibility to delve into
specific data points and tailor views to their requirements.

• Real-time Data Updates: The dashboard is engineered to receive real-
time updates, ensuring stakeholders have access to the most recent
information.

• Insightful Analysis: The visualizations provide insightful analysis,
spotlighting trends, outliers, and correlations within the data.
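
To illustrate what "outliers and correlations" mean numerically, the following sketch computes a Pearson correlation coefficient and flags values by z-score. It is not how the Power BI dashboard was built; it is a small Python illustration under stated assumptions. The function names, the threshold of 2.0 standard deviations, and the sample numbers are all invented for this example.

```python
from math import sqrt
from statistics import mean, pstdev


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = mean(xs), mean(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


def zscore_outliers(values, threshold=2.0):
    """Indices of values more than `threshold` std. deviations from the mean."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        return []  # all values identical, nothing can be an outlier
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]


# Made-up figures: the last value is deliberately extreme.
populations = [10, 11, 9, 10, 12, 10, 50]
outlier_indices = zscore_outliers(populations)
print("outlier indices:", outlier_indices)

# Two perfectly linearly related series have coefficient 1 (or -1).
print("correlation:", pearson([1, 2, 3], [2, 4, 6]))
```

A coefficient near +1 or -1 indicates a strong linear relationship and a value near 0 indicates little linear relationship; either way, correlation alone does not establish that one variable causes the other.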

The work on the actual dataset produced the following results:
The project produced a dynamic and user-interactive dashboard built
using MS Power BI, converting raw data into actionable insights.
Significant results and features of the dashboard comprise:

• Comprehensive Overview: Stakeholders are furnished with a thorough
summary of pivotal metrics and trends inherent within the dataset.

• Dynamic Visualizations: The dashboard incorporates dynamic
visualizations enabling users to engage with the data in real-time,
fostering deeper exploration.

• Actionable Insights: Through astute analysis, the dashboard furnishes
stakeholders with actionable insights pivotal for informed
decision-making.

This report encapsulates the process of leveraging data techniques to
create a dynamic and user-interactive dashboard using Microsoft Power
BI.

7 CONCLUSION AND FUTURE SCOPE OF WORK

7.1. CONCLUSION

In conclusion, the project demonstrates the effective application of data
collection, cleansing, and visualization techniques to develop a dynamic
and user-interactive dashboard using MS Power BI. By converting raw
data into actionable insights, the dashboard emerges as a valuable
resource for data-driven decision-making. Looking ahead, ongoing
refinement and enhancement of the dashboard will augment its utility and
impact.

In summary, this internship has been a journey of adeptly employing
tools and methodologies in data analytics. The creation of comprehensive
dashboards on MS Power BI showcased my capability to translate raw
data into actionable insights. Adhering to a rigorous process for both
sample and actual datasets ensured consistency and accuracy throughout.
By seamlessly integrating data collection, cleansing, and visualization, I
established a robust foundation for cohesive and polished analytical
outcomes. Leveraging tools such as MS Excel and Power BI, alongside
dedicated data cleansing techniques, equipped me with the proficiency
required for intricate data tasks. This experience has been invaluable, and
I extend my appreciation to LeapScholar, my supervisor, and colleagues
for their guidance.

8 REFERENCES

1. LeapScholar Database. "Top Dynamic Dashboard creating and
practice topic ideas." LeapScholar, (2021).

2. LeapScholar Database. "Data Entry and Visualization of enrolment
and piloting student’s information." LeapScholar, (2024).

3. Miss Ishika Kumari, Mr. Samagra Dev Sharma, Mr. Agastya Rao.
"Data Cleaning, Validation and Visualization for demonstration
projects." (2024).
