Get lineage from SQL files. This might be of help if your project is a mess. Oh, my life is better after this one.
As simple as:
poetry install
- Run docker compose
docker compose up -d
- Create a .env file from the .env-template, filling in the value of the SIMPLE_LINEAGE_ROOT_FOLDER variable.
- Do:
poetry run python3 simple_lineage_generator/simple_lineage_generator.py
It might take a while to do everything needed.
- Once it has finished, open Neo4j:
http://localhost:7474
The first time you open it, you might face a login screen. Just choose "No Authentication".
- Kill docker compose
docker compose down
- If you wanna get rid of your neo4j data:
sudo rm -rf .neo4j
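If the generator finds nothing to parse, the usual suspect is the .env file. Here is a minimal sketch of a pre-flight check, not part of the project itself. It assumes the .env file uses plain KEY=VALUE lines and that SQL files are picked up recursively by their .sql suffix; both are assumptions, and the helper names `read_env` and `count_sql_files` are made up for this example.

```python
from pathlib import Path


def read_env(env_path: Path) -> dict:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in env_path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def count_sql_files(env_path: Path) -> int:
    """Return how many .sql files sit under SIMPLE_LINEAGE_ROOT_FOLDER."""
    root = read_env(env_path).get("SIMPLE_LINEAGE_ROOT_FOLDER", "")
    if not root:
        raise ValueError("SIMPLE_LINEAGE_ROOT_FOLDER is empty or missing")
    folder = Path(root).expanduser()
    if not folder.is_dir():
        raise ValueError(f"{folder} is not a directory")
    # Assumption: SQL files are discovered recursively by their .sql suffix.
    return sum(1 for _ in folder.rglob("*.sql"))
```

Calling `count_sql_files(Path(".env"))` from the project root before running the generator gives a quick sanity check: a count of zero or a `ValueError` means the variable needs fixing first.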
- Get all tables that directly source from a table
MATCH (t:Table {name: 'table name'})<-[r:SOURCES_FROM]-(x:Table) RETURN t, x
- Get all columns that inherit from a column, up to 10 connection levels
MATCH (n:Column {name: 'column name'})-[:SOURCES_FROM*1..10]->(m:Column)
RETURN n, m
- Get all columns up to 3 levels from the original column, and also all tables 1 level away
MATCH (x:Column {name: 'column name'})<-[:SOURCES_FROM*1..3]-(y:Column),
(x)-[:HAS_COLUMN*1]->(z:Table)
RETURN y, z
- Get the table in the lineage database with the most connections to it
MATCH (t:Table)-[r]-(x)
WITH t, COUNT(r) AS con_count
ORDER BY con_count DESC
RETURN t
LIMIT 1
Now plot these first-level relationships:
MATCH (t:Table {name: 'table name'})-[r]-(x)
RETURN t, x
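The queries above can also be built from Python instead of being pasted into the browser. Here is a minimal sketch for the column query. It assumes the hop bound has to be written into the pattern text, so it is validated before being interpolated, while the column name stays a `$name` parameter. The function name `column_lineage_query` is hypothetical. The commented-out driver usage assumes the official `neo4j` Python package and a Bolt endpoint on the default port 7687, neither of which is confirmed by this project's docker compose setup.

```python
def column_lineage_query(max_depth: int = 10) -> str:
    """Build the column lineage query with a validated depth bound."""
    is_int = isinstance(max_depth, int) and not isinstance(max_depth, bool)
    if not is_int or max_depth < 1:
        raise ValueError("max_depth must be a positive integer")
    # Only the validated integer is interpolated; the name is a parameter.
    return (
        "MATCH (n:Column {name: $name})"
        f"-[:SOURCES_FROM*1..{max_depth}]->(m:Column) "
        "RETURN n, m"
    )


# Assumed usage with the neo4j driver (not executed here):
# from neo4j import GraphDatabase
# driver = GraphDatabase.driver("bolt://localhost:7687")
# with driver.session() as session:
#     result = session.run(column_lineage_query(3), name="column name")
```

Keeping the name as a parameter avoids quoting problems with odd column names, and the depth check keeps anything but a plain integer out of the query text.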
- Docstrings. This thing was created in just 3 days, so no time to do it properly
- Unit tests. Same reason as above
- Improve parsing to get fewer errors.
- Lineage was built using sqllineage
- Thanks, Caetano Veloso, for providing an incredible soundtrack for coding this.