Let's try: Apache Beam part 2 - draw the graph

We can render a pipeline's DAG as an image in a few steps.


Continue to part 2.

As we know, an Apache Beam pipeline processes data like a waterfall, flowing from top to bottom with no cycles. This is what we call a “DAG”, or “Directed Acyclic Graph”.
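To make the "no cycle" idea concrete, here is a minimal sketch using only Python's standard library (`graphlib`, Python 3.9+), not Beam itself. The step names are made up for illustration.

```python
from graphlib import CycleError, TopologicalSorter

# Each key is a step; its value is the set of steps that must finish before it.
# These step names are hypothetical and only illustrate a linear pipeline.
steps = {
    "read": set(),
    "clean": {"read"},
    "count": {"clean"},
    "write": {"count"},
}


def has_cycle(graph: dict) -> bool:
    """Return True if the dependency graph contains a cycle (so it is not a DAG)."""
    try:
        # static_order() is lazy, so list() forces the cycle check to run.
        list(TopologicalSorter(graph).static_order())
        return False
    except CycleError:
        return True


# A valid DAG can always be laid out "top to bottom".
order = list(TopologicalSorter(steps).static_order())
print(order)

# Making "read" depend on "write" closes a loop, so this is no longer a DAG.
cyclic = {**steps, "read": {"write"}}
print(has_cycle(steps), has_cycle(cyclic))
```

A Beam pipeline is the first kind of graph: every step only depends on steps above it.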

We write Beam code in Python, and we can also render its DAG as an image in a few steps.


1. Install Graphviz

Graphviz is a common package for generating diagrams from the DOT language. We need to install it first, and the installation method depends on your platform. See the full download list at https://graphviz.org/download/

For me, I prefer using brew.

brew install graphviz

Verify that Graphviz has been installed properly with this command.

dot -V # capital `V`

Then we should see its version.

graphviz
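For a feel of what a DOT script looks like, here is a small sketch that builds one from a list of edges in plain Python. The helper `to_dot` and the step names are hypothetical and not part of Graphviz or Beam.

```python
from typing import Iterable, Tuple


def to_dot(edges: Iterable[Tuple[str, str]], name: str = "pipeline") -> str:
    """Build a DOT digraph with one `"source" -> "target";` line per edge."""
    lines = [f"digraph {name} {{"]
    for source, target in edges:
        lines.append(f'    "{source}" -> "{target}";')
    lines.append("}")
    return "\n".join(lines)


# Hypothetical three-step flow, only for illustration.
dot_script = to_dot([("read", "clean"), ("clean", "write")])
print(dot_script)
```

Saving the printed text as `pipeline.dot` and running `dot -Tpng pipeline.dot -o pipeline.png` should turn it into an image.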

Read more about brew at the link below.


2. Apply RenderRunner in Beam

Now we go back to our Beam code and change how the pipeline is created.

We use RenderRunner to generate a DOT script for Graphviz. Read more about this runner in the Apache Beam documentation.

We also pass beam.options.pipeline_options.PipelineOptions() as the options parameter; otherwise it won’t generate a figure.


3. Execute

Let’s say our complete code is saved as main.py.

Next, we run it with the parameter --render_output="<path>". For example:

python3 main.py --render_output="dag.png"

execute

We will then get “dag.png” as follows.

dag

However, we can also give each step a name.

The generated figure then shows the names we gave.

dag 2


Repo

This post is licensed under CC BY 4.0 by the author.