Case Study on Data-Driven Processing for Health
Class: TE-B
What are the major constituents of the Hadoop ecosystem?
The major constituents of the Hadoop ecosystem covered in this case study are the Name Node, Secondary Name Node, Data Node, and Job Tracker, described in detail after the introduction below.
The healthcare industry is one of the world's largest and most extensive sectors. In recent years, healthcare administration around the globe has been shifting from a disease-centered, volume-based model to a patient-centered, value-based one. Improving the quality of healthcare while reducing its cost is the guiding principle behind the growing movement towards a value-based healthcare delivery model and patient-centered care. The volume of, and demand for, big data in healthcare organizations is growing steadily. To provide effective patient-centered care, it is essential to manage and analyze these huge datasets. Traditional methods are obsolete and no longer adequate for analyzing big data, as the variety and volume of data sources have increased at a very high rate over the past two decades. There is a need for new and innovative tools and methods that can meet and exceed the demands of managing the enormous amount of data generated by the healthcare sector.
The healthcare system is collaborative in nature, since it comprises a large number of stakeholders such as doctors specializing in different fields, nurses, laboratory technologists, and other individuals who cooperate to achieve the shared objectives of reducing medical costs and errors while providing a quality healthcare experience. Each of these stakeholders produces data from heterogeneous sources, for example physical examinations, clinical notes, patient interviews and observations, laboratory tests, imaging reports, medications, treatments, surveys, bills, and insurance.
The rate at which data is generated from heterogeneous sources across healthcare departments has increased exponentially on a daily basis. It is therefore becoming hard to store, process, and analyze this interrelated data with traditional data-processing applications. New and efficient methods and systems, together with powerful processing technologies, are needed to store, process, analyze, and extract value from the voluminous and heterogeneous medical data being generated continuously. Hence, the healthcare system is fast becoming a big data industry.
In general, healthcare data has grown enormously in both structured and unstructured form, largely driven by the demands of an ever-expanding, data-hungry population and by the operational characteristics of e-health platforms. This explosive, multidimensional growth has led researchers to add many more keywords to describe Healthcare Big Data (HBD). It is not only the volume but also the variety: the kinds of sources that produce data, and the target groups that request it, are exceptionally diverse and numerous in the healthcare domain. These include the healthcare workforce (doctors, clinical staff, caregivers), service-providing organizations (including insurers), hospitals with their facilities, clinicians, government regulators, pharmacies, pharmaceutical manufacturers (with their research groups), and medical device companies.
Objectives:
In order to process a huge number of health data records at once, we need efficient tools and methodologies. The proposed approach uses the Hadoop framework to handle the data, and the algorithm used is MapReduce.
1. Name Node: The Name Node stores the metadata for HDFS (information about the location and size of files/blocks). The metadata can be held in RAM or on hard disk. There is always exactly one active Name Node in a cluster, which makes it a single point of failure: if the Name Node crashes, the whole Hadoop cluster becomes unavailable. (A small client sketch after this list shows how this metadata is queried.)
2. Secondary Name Node: It serves as a checkpointing backup for the Name Node and holds practically the same metadata. If the Name Node fails, the Secondary Name Node's checkpoint is used to restore it.
3. Data Node: The actual user files and data are stored on Data Nodes. The number of Data Nodes depends on the size of the data and can be increased as needed. Each Data Node reports to the Name Node at fixed intervals via heartbeat messages.
4. Job Tracker: The Name Node and Data Nodes store the metadata and the actual data on HDFS. This data must also be processed according to users' requirements. A developer writes code to process the data, typically using MapReduce. The MapReduce engine ships the code out to the Data Nodes, creating jobs that run in parallel across multiple nodes. These jobs are continuously monitored by the Job Tracker.
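To make the Name Node's role concrete, here is a minimal sketch that queries HDFS file metadata over WebHDFS using the Python hdfs package. The host name, user, and paths are illustrative assumptions, not details from the case study.

```python
# Minimal sketch: querying the metadata that the Name Node serves.
# Assumes the `hdfs` PyPI package (a WebHDFS client) and a reachable
# Name Node; host, user, and paths below are illustrative only.
from hdfs import InsecureClient

# WebHDFS endpoint exposed by the Name Node (port 9870 in Hadoop 3.x).
client = InsecureClient("http://namenode.example.org:9870", user="hadoop")

# Listing a directory is answered purely from Name Node metadata.
for name in client.list("/data/ehr"):
    # status() returns per-file metadata (size, block size, replication),
    # again served by the Name Node without touching any Data Node.
    info = client.status(f"/data/ehr/{name}")
    print(name, info["length"], info["blockSize"], info["replication"])
```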
The MapReduce algorithm contains two important tasks, namely Map and Reduce.
MapReduce uses various techniques to split a task into small parts and assign them to different machines.
The framework sends the Map and Reduce tasks to the appropriate servers in the cluster. The tasks are executed in parallel on the different nodes, and finally the result is returned to the user.
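As a concrete illustration, the sketch below counts patient records per diagnosis code using Hadoop Streaming, which lets plain Python scripts act as the Map and Reduce tasks. The input format (one CSV record per line, diagnosis code in the third field) is an assumption made for this example.

```python
# diag_count.py -- a minimal MapReduce sketch for Hadoop Streaming.
# Run as the mapper with `python3 diag_count.py map` and as the
# reducer with `python3 diag_count.py reduce`. Assumes CSV input
# with the diagnosis code in the third field (illustrative only).
import sys

def mapper():
    # Map task: emit <diagnosis_code, 1> for every patient record.
    for line in sys.stdin:
        fields = line.rstrip("\n").split(",")
        if len(fields) >= 3:
            print(f"{fields[2]}\t1")

def reducer():
    # Reduce task: input arrives sorted by key, so the counts for one
    # diagnosis code are contiguous and can be summed in a single pass.
    current, count = None, 0
    for line in sys.stdin:
        key, value = line.rstrip("\n").split("\t")
        if key != current:
            if current is not None:
                print(f"{current}\t{count}")
            current, count = key, 0
        count += int(value)
    if current is not None:
        print(f"{current}\t{count}")

if __name__ == "__main__":
    mapper() if sys.argv[1] == "map" else reducer()
```

On a real cluster this pair would be submitted via the hadoop-streaming JAR with HDFS input and output paths; locally it can be smoke-tested with `cat records.csv | python3 diag_count.py map | sort | python3 diag_count.py reduce`.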
CGH adopted a phased approach to implementing the Hadoop ecosystem. The initial phase focused on ingesting data from a variety of sources.
CGH utilized Sqoop, a tool specifically designed for transferring data between relational databases and HDFS. Data cleansing and transformation were performed using MapReduce or Spark scripts, along the lines of the sketch below. The processed data was then stored in HDFS in a structured format readable by querying tools such as Hive and Pig.
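The following minimal PySpark sketch shows what such a cleanse-and-store step might look like. The HDFS paths, column names, and table name are assumptions for illustration, not details from CGH's actual pipeline.

```python
# Minimal PySpark sketch of a cleanse-and-store step. Paths and
# column names are illustrative assumptions, not CGH's real schema.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (SparkSession.builder
         .appName("ehr-cleanse")
         .enableHiveSupport()   # lets the result be queried from Hive
         .getOrCreate())

# Raw records as landed by Sqoop (CSV here for simplicity).
raw = spark.read.csv("hdfs:///landing/ehr_extract",
                     header=True, inferSchema=True)

clean = (raw
         .dropDuplicates(["patient_id", "visit_date"])  # remove repeated rows
         .filter(F.col("patient_id").isNotNull())       # drop unusable records
         .withColumn("visit_date", F.to_date("visit_date", "yyyy-MM-dd")))

# Store in a structured, query-friendly format (a Parquet-backed Hive table).
clean.write.mode("overwrite").saveAsTable("warehouse.ehr_visits")
```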
While the Hadoop ecosystem offers significant advantages, CGH encountered certain challenges:
Data Security and Privacy: Implementing robust security measures is crucial to protect sensitive patient
data stored in the data lake. CGH enforces access controls and encrypts data both at rest and in transit.
Data Quality and Standardization: CGH established data governance procedures to ensure data quality
and consistency across different sources. Standardizing data formats facilitates seamless integration and
analysis.
Technical Expertise: Managing a Hadoop cluster requires specialized skills. CGH invested in training its staff and outsourced some functions where needed.
Improved Patient Care: By analyzing patient records and sensor data, CGH can identify potential health risks and initiate proactive interventions. Predictive analytics helps tailor treatment plans to individual patient needs (a small model-training sketch follows this list).
Enhanced Research & Development: CGH can leverage Big Data to analyze research data from various
sources, enabling faster drug discovery and improved treatment methods.
Operational Efficiency: Data analytics helps CGH optimize resource allocation by identifying areas for
cost savings and improving operational workflows.
Personalized Marketing: CGH can analyze patient data to understand their needs and preferences,
allowing for targeted communication and outreach programs.
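As a rough illustration of the predictive-analytics idea mentioned above, the sketch below trains a simple readmission-risk classifier with Spark MLlib on the cleansed table from the ingestion step. The feature and label columns are hypothetical, chosen only to make the example self-contained.

```python
# Minimal sketch: a readmission-risk model with Spark MLlib.
# Feature/label column names are hypothetical, for illustration only.
from pyspark.sql import SparkSession
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.classification import LogisticRegression

spark = (SparkSession.builder
         .appName("risk-model")
         .enableHiveSupport()
         .getOrCreate())

# Reuse the cleansed table produced by the ingestion step.
visits = spark.table("warehouse.ehr_visits")

# Pack numeric risk indicators into the single vector column MLlib expects.
assembler = VectorAssembler(
    inputCols=["age", "num_prior_visits", "systolic_bp"],  # assumed columns
    outputCol="features")
# "readmitted" is assumed to be a 0/1 label column.
train = assembler.transform(visits).withColumnRenamed("readmitted", "label")

# Fit a logistic-regression classifier; each prediction carries a
# probability that can be used to flag high-risk patients early.
model = LogisticRegression(maxIter=20).fit(train)
scored = model.transform(train).select("patient_id", "probability", "prediction")
scored.show(5)
```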
At present, as the healthcare market grows, it has become clear that organizations capable of harnessing the power of analytics demonstrate a measurable advantage in market share over their rivals. Big Data analysis in healthcare has indeed become the main driving force for creating opportunities to develop optimal treatment pathways, improving the medical loss ratio, and better managing clinical decision support systems. With soaring healthcare costs as well as growing regulatory pressure for both affordability and improved clinical outcomes, analytics has emerged as a silver lining for this industry. Analytics in healthcare has been shown to generate insights that not only lower total costs, reduce inefficiencies, and identify high-risk populations, but can also predict a patient's future healthcare needs.
Thus, with analytics, decision support systems are now being enhanced with various statistical and artificial intelligence techniques. This is resulting in the development of significant insights, for example the identification of patient risk factors, the grouping of patients according to different health conditions, the provision of actionable information to physicians at the point of care and, above all, measurable improvement in healthcare outcomes. The introduction of analytics in healthcare has therefore helped overcome many of the sector's typical challenges, creating real value in the process.
However, to overcome the majority of these challenges, it is essential that the data recorded from patients is put to good use. Additionally, all clinical data should be stored in standard data formats. For example, EHRs must be transformed into usable data on which analysis can be performed, so that meaningful insights can be extracted from them and improvements built on top, providing a personalized healthcare experience to the patient.
To conclude, it can be said that analytics today is undoubtedly a crucial process in healthcare, one that will significantly reshape its landscape in the coming years. Moreover, analytics is also the driving force behind the industry's current shift towards solutions that are capable of delivering real value.
The potential of the Hadoop ecosystem in healthcare extends far beyond the applications demonstrated by Health First. Here are some promising future directions:
Genomics and Precision Medicine: Integrating genomic data with traditional clinical data can pave the
way for personalized medicine at a deeper level, tailoring treatments to individual genetic profiles.
Population Health Management: Analyzing large datasets from entire patient populations can identify
trends, predict disease outbreaks, and develop targeted public health interventions.
Wearable Devices and IoT Integration: Data from wearable devices and Internet of Things (IoT)
sensors can provide real-time insights into patient health and behavior, enabling proactive monitoring
and preventive care.
Advanced Analytics and AI: Machine learning and artificial intelligence hold immense potential for
tasks like automating medical image analysis, drug discovery, and even chatbots for patient support.
The adoption of big data technologies like Hadoop marks a transformative journey for the healthcare
industry. By embracing data-driven insights, healthcare providers can empower themselves to deliver
better patient care, improve clinical outcomes, and optimize resource allocation. As technology
continues to evolve and new challenges emerge, continuous innovation and a commitment to data
security and privacy will be paramount in unlocking the full potential of big data for a healthier future.
This case study has explored the implementation of the Hadoop ecosystem at Health First and its impact
on various healthcare initiatives. The concluding sections have highlighted the future potential and
challenges associated with big data in healthcare. It is evident that big data holds immense promise for
transforming healthcare delivery, and the Hadoop ecosystem serves as a powerful tool for unlocking
valuable insights from the ever-growing ocean of healthcare data.