A Dash application for visualizing and monitoring healthcare data quality metrics based on the Tuva Data Quality Framework.
The Tuva Data Quality Dashboard provides a comprehensive view of data quality for healthcare data mapped to the Tuva Data Model. It aims to answer the question "Have I mapped my input layer correctly?" It allows users to:
- Monitor overall data quality with a grading system (A-F)
- Track data mart usability status
- Identify and investigate data quality issues
- Generate printable data quality reports
This application consumes data quality test results and exploratory chart data from the Tuva Health dbt package.
- Data Quality Grade: Overall assessment of data quality (A-F)
- Data Mart Status: Usability status for each data mart
- Test Results: Detailed view of passing and failing tests
- Quality Dimensions: Analysis by completeness, validity, consistency, etc.
- Visualizations: Charts showing data patterns and quality metrics
- Report Generation: Printable data quality report cards
- Level 1: Critical issues that prevent dbt from building
- Level 2: Major issues affecting data reliability, specific to marts
- Level 3: Moderate issues requiring caution in data usage, specific to marts
- Level 4: Minor issues with limited impact, specific to marts
- Level 5: Low-priority issues / informational dbt tests
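As a rough illustration of how the severity levels above could be used when triaging failures, the sketch below tallies failing tests per level. The row shape (a `severity_level` integer and a `passed` flag) is hypothetical and not taken from the Tuva tables, so adapt the keys to the actual columns of your export.

```python
from collections import Counter

# Short labels paraphrasing the five severity levels.
SEVERITY_LABELS = {
    1: "Critical: prevents dbt from building",
    2: "Major: affects data reliability",
    3: "Moderate: use data with caution",
    4: "Minor: limited impact",
    5: "Low priority / informational",
}


def failures_by_severity(rows):
    """Count failing tests per severity level.

    `rows` is an iterable of dicts with the hypothetical keys
    `severity_level` (int, 1-5) and `passed` (bool).
    """
    counts = Counter(
        row["severity_level"] for row in rows if not row["passed"]
    )
    # Report every level, including those with zero failures.
    return {level: counts.get(level, 0) for level in SEVERITY_LABELS}


def most_severe_failure(rows):
    """Return the lowest-numbered level with a failure, or None."""
    # SEVERITY_LABELS is defined in ascending order, so the first
    # level with a non-zero count is the most severe one.
    for level, count in failures_by_severity(rows).items():
        if count:
            return level
    return None
```

Lower numbers are treated as more severe, matching the ordering of the list, so a single Level 1 failure outranks any number of Level 5 failures.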
- Python 3.11+
- Docker (optional, for containerized deployment)
- Clone this repository:
git clone https://github.com/tuva-health/tuva_dqi.git
cd tuva_dqi
- Create and activate a virtual environment for this repo, then install the dependencies:
cd tuva_dqi # Second cd intentional
python -m venv .venv # If python does not work, try python3
source .venv/bin/activate # On Windows: .venv\Scripts\activate
python -m pip install -r requirements.txt
- Run the application:
python app.py
- Access the dashboard at localhost:8080
To run the application in a Docker container:
docker build -t tuva_dqi .
docker run -p 8080:8080 tuva_dqi
This dashboard consumes two main CSV files exported from the Tuva dbt package:
- Test Results (data_quality__testing_summary): Contains results of data quality tests
- Chart Data (data_quality__exploratory_charts): Contains data for visualizations
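Before uploading, it can help to confirm that each export is a readable CSV with a header row. The following is a minimal sketch using only the standard library. The function name `read_csv_with_headers` is illustrative, and no particular column names are assumed because they are not listed here.

```python
import csv
from pathlib import Path


def read_csv_with_headers(path):
    """Return (headers, rows) for a CSV file, or raise ValueError.

    Only checks that a header row exists and that every non-empty
    record has the same number of fields as the header.
    """
    with Path(path).open(newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        headers = next(reader, None)
        if not headers or any(not name.strip() for name in headers):
            raise ValueError(f"{path}: missing or blank header row")
        rows = []
        for record_number, record in enumerate(reader, start=1):
            if not record:
                continue  # skip completely blank lines
            if len(record) != len(headers):
                raise ValueError(
                    f"{path}: record {record_number} has {len(record)} "
                    f"fields, expected {len(headers)}"
                )
            rows.append(dict(zip(headers, record)))
    return headers, rows
```

Calling it on each exported file (for example a file named after the table, which is an assumption about how you saved the export) surfaces truncated or header-less files before they reach the dashboard.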
The input CSV files are generated from the Tuva dbt package. The dashboard is designed to work with the dqi-enhancements branch, which contains the necessary data quality components. Among released versions, only those later than 0.14.2 (that is, 0.14.3 or later) are compatible.
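The version requirement can be checked mechanically. The sketch below compares a plain `major.minor.patch` string against 0.14.3 using tuple comparison. The helper names are illustrative, and pre-release or build suffixes are deliberately not handled.

```python
MINIMUM_TUVA_VERSION = (0, 14, 3)


def parse_version(text):
    """Parse 'major.minor.patch' (optionally prefixed with 'v') into ints."""
    parts = text.strip().lstrip("v").split(".")
    if len(parts) != 3 or not all(part.isdigit() for part in parts):
        raise ValueError(f"unsupported version string: {text!r}")
    return tuple(int(part) for part in parts)


def meets_minimum(text, minimum=MINIMUM_TUVA_VERSION):
    """True if the version is at or above the minimum."""
    # Tuples compare element by element, so (0, 14, 10) > (0, 14, 3),
    # which a plain string comparison would get wrong.
    return parse_version(text) >= minimum
```

Comparing integer tuples rather than strings avoids misordering versions such as 0.14.10 and 0.14.3.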
- Create a new dbt project or use an existing one
- Add the Tuva package to your packages.yml file, specifying Tuva version 0.14.3 or later:
- package: tuva-health/the_tuva_project
  version: [">=0.14.3"]
- Install the package and run dbt building commands:
dbt clean
dbt deps
dbt seed --full-refresh
dbt run
dbt test
- Export the required tables as CSV files:
- Export data_quality__testing_summary as CSV with headers
- Export data_quality__exploratory_charts as CSV with headers
- Upload these CSV files to the dashboard using the "Import Test Results" feature
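How you export the tables depends on your warehouse and tooling. As one possible approach, the sketch below writes a table to a CSV file with a header row through any Python DB-API 2.0 connection. The function name `export_table_to_csv` is hypothetical, and the driver you pass in is an assumption about your setup. The table name is interpolated directly into the query, so pass only trusted names.

```python
import csv


def export_table_to_csv(connection, table_name, output_path, batch_size=1000):
    """Write every row of `table_name` to `output_path`, header first.

    `connection` is any DB-API 2.0 connection (an assumption; use your
    warehouse's own driver). `table_name` is inserted into the SQL
    text as-is, so it must come from a trusted source.
    Returns the number of data rows written.
    """
    cursor = connection.cursor()
    try:
        cursor.execute(f"SELECT * FROM {table_name}")
        # cursor.description has one entry per column; the first item
        # of each entry is the column name.
        headers = [column[0] for column in cursor.description]
        row_count = 0
        with open(output_path, "w", newline="", encoding="utf-8") as handle:
            writer = csv.writer(handle)
            writer.writerow(headers)
            while True:
                batch = cursor.fetchmany(batch_size)
                if not batch:
                    break
                writer.writerows(batch)
                row_count += len(batch)
    finally:
        cursor.close()
    return row_count
```

The same function can be called once per table, for data_quality__testing_summary and data_quality__exploratory_charts, with the schema-qualified name your warehouse expects.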
This dashboard is currently in alpha/early development. It is designed to work with Tuva versions 0.14.3 or later, which contain the necessary data quality components.