COL362-632 - DBMS - Assignment 1- SQL Queries
COL362-632 - DBMS - Assignment 1- SQL Queries
COL362/632 - DBMS -
Assignment 1: SQL
Queries
Deadline
All submissions should be made on Moodle
by 3rd Feb 2025, 11:59 PM.
General Instructions
Follow all instructions. Submissions not
following these instructions will not be
evaluated and will be given Zero marks.
:
1. Kindly ensure that this assignment is
completed independently. Collaboration
with external entities, including
individuals, AI agents, websites,
discussion forums, etc., is strictly
prohibited. You can discuss and post
questions on the piazza to seek
clarification until 1st Feb.
2. You must use PostgreSQL 15.x for this
assignment.
3. There are a total of 35 queries in this
assignment.
22XXCSXX123/
│
BASIC/
│ BASIC_1.sql
│ BASIC_2.sql
│ ...
│
INTERMEDIATE/
│ INTERMEDIATE_1.sql
│ INTERMEDIATE_2.sql
│ ...
:
│
ADVANCED/
│ ADVANCED_1.sql
│ ADVANCED_2.sql
│ ...
│
GRAPH/
GRAPH_1.sql
GRAPH_2.sql
...
Dataset
This dataset is structured into two primary
modules—hospital ("hosp") and intensive
care unit ("icu")—which together span tens
of thousands of patient encounters and
aggregate millions of individual records. We
highly suggest to also examine the schema
and data before writing your queries.
Hospital Module ("hosp"):
• Data Included:
Description
hosp Module
:
Table Name Description
d_hcpcs Dimension table describing codes used in hcpcsevents (mostly CPT codes)
diagnoses_icd ICD-9/ICD-10 diagnoses billed for each hospital admission. Links to d_icd_d
emar_detail Detailed breakdown of each eMAR event, including dose given and route.
labevents Laboratory measurements and results from hospital-wide labs. Links to d_la
pharmacy Detailed records of filled medications, including dosing information and stat
and emar via pharmacy_id.
poe Provider Order Entry (POE) records. General interface for entering patient or
consults, etc.).
prescriptions Medications ordered for a patient, including route, frequency, and identifyin
:
prescriptions Medications ordered for a patient, including route, frequency, and identifyin
to pharmacy and emar.
procedures_icd ICD-9/ICD-10 procedure codes billed during hospital stays. Links to d_icd_p
provider Lists de-identified provider identifiers used across hosp (e.g., order_provide
services Tracks clinical services assigned to the patient (e.g., “Surgery,” “Cardiology
transfers Tracks physical movement of patients across different wards/units in the hos
icu Module
chartevents The majority of ICU charted data (vital signs, ventilator settings, etc.). Links
itemid.
icustays Defines individual ICU stays, with admission and discharge times.
ingredientevents Ingredients used within the same solution for a drug (e.g., multiple additives
outputevents Output data (urine output, drain volumes) recorded in the ICU.
procedureevents Procedures documented during the ICU stay (e.g., mechanical ventilation, im
includes start/end times and caregiver info.
:
Detailed Tables of Columns
Below is a reference table for each dataset
table. Columns appear in the following
format:
JavaScript
hosp Module
• admissions
• subject_id → patients.subject_id
• d_hcpcs
• d_icd_diagnoses
:
Column Name Data Type Description
long_title text Full descriptive title for the ICD code (e.g., “Choler
• d_icd_procedures
• d_labitems
• diagnoses_icd
• drgcodes
• emar
• emar_detail
dose_due, text The expected dose and units for the administration
dose_due_unit
infusion_rate_unit text The unit for the infusion rate (e.g., mL/hr).
• hcpcsevents
seq_num bigint Sequence number for the code within a single adm
• labevents
valuenum double precision Numeric value of the lab test result if applicable.
ref_range_lower double precision Lower bound of the normal range for this lab test, i
:
ref_range_lower double precision Lower bound of the normal range for this lab test, i
ref_range_upper double precision Upper bound of the normal range for this lab test,
• microbiologyevents
isolate_num double precision The isolate number within the specimen (multiple i
dilution_comparison text Comparison operator for the dilution (e.g. “>”, “<”)
• omr
• patients
:
Column Name Data Type Description
• pharmacy
pharmacy_id bigint Primary key within the pharmacy table, used to lin
infusion_type text Code for type of infusion (e.g., ‘B’, ‘C’, ‘N’).
• poe
:
Column Name Data Type Description
• poe_detail
field_name text The label/name of the attribute for this order det
• prescriptions
• procedures_icd
• Joins to d_icd_procedures on
(icd_code, icd_version).
• provider
• services
• transfers
:
Column Name Data Type Description
careunit text The unit or ward to which the patient was trans
etc.).
icu Module
• caregiver
• chartevents
valuenum double precision Numeric version of the recorded value (if appl
• datetimeevents
• d_items
:
• d_items
linksto text The name of the events table to which this ite
“inputevents”).
• icustays
:
Column Name Data Type Description
first_careunit text The first ICU care unit (e.g., MICU, SICU) the
last_careunit text The last ICU care unit the patient stayed in (c
• ingredientevents
:
Column Name Data Type Description
rate double precision The rate at which the ingredient was adminis
• inputevents
patientweight double precision Patient weight (kg) at the time of the order.
• outputevents
• procedureevents
Queries
Basic Queries
Note (for this section): In case of a tie,
output all the answers following the order.
Use the strings (along with the case) given in
the query for string comparison.
The format "X days, HH:MM:SS" is a
general format for interval differences. If the
duration is 24 hours or more, it is displayed
as "X days, HH:MM:SS". If the duration
is less than 24 hours, it is shown
as "HH:MM:SS"
output: count
6. For each patient, count the number of
times they have been admitted to
ICU(including 0 number of stays).
Intermediate Queries
order by columns:
percentage(descending)
Advanced Queries
Note (for this section): In case of a tie,
output all the answers following the order.
Use the strings (along with the case) given in
:
the query for string comparison.
For this section use the format “YYYY-MM-
DD HH:MM:SS” to display the time gap. You
may use the age() operator and extract year,
month and day from it. Make sure to stick to
the exact format.
ordered by columns:
died_in_hospital(descending),
resistant_antibiotic_count(descending)
,icu_length_of_stay_hours(descending),
:
,icu_length_of_stay_hours(descending),
subject_id(ascending),
hadm_id(ascending).
order by columns:
average_transfers(Descending),
distinct_procedures_count
(Descending), subject_id (Ascending),
hadm_id (Ascending)
:
hadm_id (Ascending)
4. Find the patient having the longest chain
of transfers for an
admission( hadm_id ). For that patient
output all the chains of transfers for each
hospital admission. If more than one
patient has the longest chain output
them all.