Master Airflow With This Amazing Document!
GritSetGrow - GSGLearn.com
AIRFLOW
Start Your Data Engineering Journey
DATA ENGINEERING 101 - AIRFLOW
Shwetank Singh
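```python
from datetime import timedelta
from airflow.operators.bash import BashOperator

# `dag` is assumed to be defined earlier in the file.
task = BashOperator(task_id='bash_example',
                    bash_command='exit 1',             # always fails, forcing retries
                    retries=3,                         # up to 3 retries after the first failure
                    retry_delay=timedelta(minutes=5),  # wait 5 minutes between attempts
                    dag=dag)
```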
A custom operator is a subclass of Airflow’s BaseOperator, or of an existing operator, that adds custom logic in its execute() method.
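As a rough illustration of the idea, the sketch below defines a tiny custom operator. `GreetOperator` and its `name` parameter are hypothetical names invented for this example. The `try`/`except` fallback is also an assumption added here so the snippet can be run on a machine without Airflow installed; in a real DAG file only the import would be needed.

```python
try:
    from airflow.models.baseoperator import BaseOperator
except ImportError:
    # Stand-in so the sketch runs without Airflow installed (assumption:
    # it mimics only the small part of BaseOperator used below).
    class BaseOperator:
        def __init__(self, task_id, **kwargs):
            self.task_id = task_id


class GreetOperator(BaseOperator):
    """Hypothetical operator that builds and returns a greeting."""

    def __init__(self, name, **kwargs):
        # Pass task_id and other standard arguments on to the base class.
        super().__init__(**kwargs)
        self.name = name

    def execute(self, context):
        # Airflow calls execute() when the task runs.
        message = f"Hello, {self.name}!"
        print(message)
        return message


# Calling execute() directly is a quick way to try the logic outside a DAG.
greet = GreetOperator(task_id="greet", name="Airflow")
result = greet.execute(context={})
```

Keeping the custom logic inside `execute()` means the operator can be attached to a DAG like any built-in operator, for example by passing `dag=dag` when it is created.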
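```python
from datetime import timedelta
from airflow.operators.bash import BashOperator

# `dag` is assumed to be defined earlier in the file.
# `sla` sets the expected maximum time for the task to complete (30 minutes).
task = BashOperator(task_id='bash_example',
                    bash_command='echo "Hello World"',
                    sla=timedelta(minutes=30), dag=dag)
```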
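```python
from airflow.operators.python import PythonOperator

# `my_function` and `dag` are assumed to be defined earlier in the file.
python_task = PythonOperator(task_id='python_example',
                             python_callable=my_function,
                             op_args=['arg1'],  # passed positionally to my_function
                             dag=dag)
```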
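The PythonOperator example above refers to `my_function` without defining it. A minimal sketch of what such a callable might look like follows; the function body and the `name` parameter are assumptions made for illustration, not part of the original slide. The last lines mimic, in plain Python, how the values in `op_args` reach the callable as positional arguments.

```python
def my_function(name):
    """Hypothetical callable for the PythonOperator example."""
    message = f"Processing {name}"
    print(message)
    return message


# The operator unpacks op_args into positional arguments, roughly like this:
op_args = ['arg1']
result = my_function(*op_args)
```

Because the callable is an ordinary Python function, it can be exercised directly like this before it is wired into a DAG.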
WHAT IS A SUBDAGOPERATOR?
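```python
from datetime import timedelta
from airflow.operators.bash import BashOperator

# `dag` is assumed to be defined earlier in the file.
# The task fails if it runs longer than execution_timeout.
task = BashOperator(task_id='bash_example',
                    bash_command='sleep 300',
                    execution_timeout=timedelta(minutes=5),
                    dag=dag)
```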
WHAT IS A TRIGGERDAGRUNOPERATOR?
The TriggerDagRunOperator triggers another DAG from within a DAG.
```ini
[logging]
base_log_folder = /path/to/logs
remote_logging = True
remote_log_conn_id = my_s3_conn
# remote_base_log_folder must also be set to the remote destination.
```
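```python
from airflow.operators.bash import BashOperator

# `dag` is assumed to be defined earlier in the file.
# Tasks with a higher priority_weight are queued first when slots are scarce.
task = BashOperator(task_id='bash_example',
                    bash_command='echo "Hello World"',
                    priority_weight=10, dag=dag)
```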
The BranchPythonOperator chooses which downstream task or tasks to run based on the task_id (or list of task_ids) returned by a Python function; the other branches are skipped.
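To make the branching idea concrete, here is a minimal sketch of a branch callable. The weekday rule and the task ids `weekday_task` and `weekend_task` are hypothetical, and the `logical_date` context key assumes Airflow 2.2 or later (older versions expose `execution_date` instead). The function is plain Python, so it can be tried without Airflow.

```python
from datetime import datetime


def choose_branch(**context):
    """Return the task_id of the branch that should run."""
    # Hypothetical rule: follow a different path on Saturdays and Sundays.
    weekday = context["logical_date"].weekday()  # Monday is 0, Sunday is 6
    return "weekend_task" if weekday >= 5 else "weekday_task"


# Calling the function directly shows which task_id the operator would get back.
saturday_choice = choose_branch(logical_date=datetime(2024, 1, 6))
monday_choice = choose_branch(logical_date=datetime(2024, 1, 1))
print(saturday_choice, monday_choice)
```

In a DAG, this function would be passed as `python_callable` to a BranchPythonOperator, with tasks named `weekday_task` and `weekend_task` set directly downstream of it.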
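```python
from airflow.operators.bash import BashOperator

# `dag` is assumed to be defined earlier in the file.
# {{ ds }} is a Jinja template rendered as the run's date in YYYY-MM-DD form.
task = BashOperator(task_id='bash_example',
                    bash_command='echo {{ ds }}',
                    dag=dag)
```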
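```python
# Airflow 2.x import path (Airflow 1.10 used airflow.operators.dagrun_operator).
from airflow.operators.trigger_dagrun import TriggerDagRunOperator

# `dag` is assumed to be defined earlier in the file.
trigger_task = TriggerDagRunOperator(task_id='trigger_dag',
                                     trigger_dag_id='example_dag',
                                     dag=dag)
```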
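```python
from airflow.sensors.filesystem import FileSensor

# Waits until /path/to/file exists; `dag` is assumed to be defined earlier.
file_sensor_task = FileSensor(task_id='wait_for_file',
                              filepath='/path/to/file', dag=dag)
```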
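```python
from airflow import DAG
from airflow.utils.dates import days_ago

# Runs once a day, starting from midnight one day ago.
dag = DAG('example_dag',
          schedule_interval='@daily',
          start_date=days_ago(1))
```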
THANK YOU