Post

DAG integrity - unit test your DAG before deploying

Prepare a test to ensure the pipeline is good enough to deploy

DAG integrity - unit test your DAG before deploying

Hi, guess you are not getting bored about Airflow stuff.

This blog, we are going to see how can we make sure our DAG is looking good. With DAG integrity checking method, we can ensure at our DAG is proved to be executable and has no error in a basic level.


The objective

At a unit test step, we just want to guarantee our DAGs are imported correctly. No syntax errors nor library import errors. We don’t do proving our pipeline is perfect at this time, we do that in integration test or end-to-end test.

What should we do now?


DAGBAG

DAGBAG is a module in Airflow DAG. It stores the DAGs and has structured in DAGs’ metadata. For more info, visit this link.

We can use this module to verify our DAGs are imported properly. Like this code stub.

After importing DagBag and initiate the class object as dagbag at line 3, we can print out its attributes .dags and .import_errors to see list of DAGs and list of errors if any.

This is similar to the commands we used in the last blog.


Combine with unittest

We use unittest together with this DagBag to test if we have the target DAG or not.

dag test

Please follow the link below to a complete Dag integrity scripts.

Now we try run this command to validate the DAG and see we found a DAG in DagBag.

1
python tests/dag_integrity.py

If DAGs are good, we shall see this message.

good

Otherwise, it will show an error like this.

err


Further applications

This is great to do unit test before deploying our apps to server, either preproduction or production.

We could add the command into our CI/CD stages. Any in our favor, Github action, Bitbucket pipeline, Google Cloud Build and others.

Have a great day with no bugs.

This post is licensed under CC BY 4.0 by the author.