
Data Integration (EP 3 end) - clock-work

Many tools can schedule tasks, but this time we will use the ones built into the operating system.


Hello guys and myself!

We created a simple program with Talend in the last episode, right? Now we will set it up to run on a schedule. Many tools can schedule tasks, but this time we will use the ones built into the operating system:

  • Task Scheduler on Windows, or
  • Crontab on Unix-based systems, e.g. macOS and Linux

First, we need to export our program as executable files.


How can we export our program?

Start by saving the program once it has run without errors in testing, then right-click the job in the Repository and select Build Job.

talend build menu

Choose the export path. The output is a zip file; tick “Extract zip file” to have it unzipped right after the export.

Don’t forget to tick Override parameter's values > Values from selected context if the program uses contexts.

talend override params

Once we confirm, Talend takes some time to build the program, and then we will find it at the destination path. As the figure below shows, both a bat file and an sh file are there: the bat file is for running on Windows, and the sh file is for Unix.

check executable files

Inside those files, we can find the context parameters in the format --context_param fpath="..." and modify them as we like.

executable params
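Editing the file is one option. Here is a minimal sketch of another: supplying the context value when calling the launcher. It assumes the generated launcher forwards extra command-line arguments to the job, which is worth checking in your own file. The stand-in script and the path /tmp/talend_out are made up for illustration; only --context_param and fpath come from the export above.

```shell
#!/bin/sh
# Stand-in launcher so the sketch runs anywhere. It is NOT the file Talend
# generates; a real launcher starts Java instead of echoing its arguments.
workdir=$(mktemp -d)
cat > "$workdir/Talend_program.sh" <<'EOF'
#!/bin/sh
echo "args: $*"
EOF

# Supply the context value when calling the launcher.
# "fpath" comes from the post; /tmp/talend_out is a made-up example path.
result=$(sh "$workdir/Talend_program.sh" --context_param fpath="/tmp/talend_out")
echo "$result"

# Clean up the temporary stand-in.
rm -rf "$workdir"
```

With a real export, only the call in the middle matters: point it at your own sh file and keep the value quoted if the path contains spaces.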

And yeah, we have the program. Next we set up the schedule to run it.


Task Scheduler (Windows OS)

Find it from the Start button as shown below.

task scheduler

The main interface.

task scheduler interface

Click Task Scheduler Library to view all scheduled tasks, and create a new one by clicking Create Task on the right-hand side.

task scheduler create task

We can set the values as shown in the figures below:

Name the task.

Set the schedule time.

Set the program path and context.

Set conditions based on machine status.

Set conditions in case of failure.

This is the run history of a sample scheduled task. I set it to run every hour so that we can see its changes easily.

task scheduler history

The sample output files.

outputs


Crontab (Unix)

Besides Task Scheduler, Crontab is a built-in Unix tool that is quite easy to use. Its concept is to define a schedule in a fixed format, followed by a command. The format is:

[minute] [hour] [day_of_month] [month] [day_of_week] [command]

For example, if the schedule is midnight every night, we can write:

0 0 * * * bash Talend_program.sh

This means Talend_program.sh will run at 00:00. The asterisk (*) means any value; written for [day_of_month] [month] [day_of_week], it means every day of every month.
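To make the field positions concrete, here is a small sketch that splits a crontab entry and labels each part. Standard cron reads the five time fields as minute, hour, day of month, month, day of week. The helper name describe_cron is hypothetical, and the hourly line is an extra example that mirrors the hourly Windows task.

```shell
#!/bin/sh
# describe_cron: label the five time fields and the command of one crontab entry.
describe_cron() {
    set -f          # turn off globbing so '*' stays a literal asterisk
    set -- $1       # split the entry on whitespace into positional parameters
    set +f
    minute=$1 hour=$2 dom=$3 month=$4 dow=$5
    shift 5         # what remains is the command
    printf 'minute=%s hour=%s day_of_month=%s month=%s day_of_week=%s command=%s\n' \
        "$minute" "$hour" "$dom" "$month" "$dow" "$*"
}

# The midnight entry from the post.
describe_cron '0 0 * * * bash Talend_program.sh'
# Minute 0 of every hour.
describe_cron '0 * * * * bash Talend_program.sh'
```

The first call prints minute=0 and hour=0 with asterisks in the other three fields; the second differs only in hour=*.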

You can play around with crontab syntax at https://crontab.guru

We set this up by opening Terminal and typing crontab -e

Press i to enter insert mode (-- INSERT -- should be shown at the bottom-left of the screen), then type the command as below:

crontab vi

After that, press Esc to leave insert mode, then type :wq (write then quit) and press Enter to save. When we no longer want the command to run, just edit this file and either remove the line or put # at the front of it to turn it into a comment, since cron ignores comments.
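If you would rather not go through the editor, the same entry can be added from a script. This is a minimal sketch, assuming your cron implementation accepts a crontab on standard input via crontab - (common, but check yours). The helper add_cron_line is hypothetical; it only transforms text, so it can be tried without touching the real crontab.

```shell
#!/bin/sh
# add_cron_line ENTRY: read an existing crontab on stdin and print it with
# ENTRY appended, unless that exact line is already present.
add_cron_line() {
    entry=$1
    existing=$(cat)
    if [ -n "$existing" ]; then
        printf '%s\n' "$existing"
    fi
    # -F: fixed string (so '*' is literal), -x: whole line, -q: quiet
    if ! printf '%s\n' "$existing" | grep -Fxq -- "$entry"; then
        printf '%s\n' "$entry"
    fi
}

# Dry run against an empty crontab: prints just the new line.
printf '' | add_cron_line '0 0 * * * bash Talend_program.sh'

# To apply it for real (not executed here):
#   crontab -l 2>/dev/null | add_cron_line '0 0 * * * bash Talend_program.sh' | crontab -
```

Cron starts jobs with a minimal environment and not in the folder where the script lives, so in a real entry it is safer to write the full path to the sh file.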

Use crontab -l to list all of our schedules

crontab list
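Once some lines have been commented out with #, the list gets harder to scan. Here is a small sketch of a filter that hides comments and blank lines; the name active_entries and the sample text are made up for illustration.

```shell
#!/bin/sh
# active_entries: read crontab text on stdin and print only the lines that are
# neither blank nor comments (first non-blank character is '#').
active_entries() {
    grep -v -e '^[[:space:]]*#' -e '^[[:space:]]*$'
}

# Sample crontab text: one disabled entry and one active entry.
sample='# nightly Talend job, currently disabled
#0 0 * * * bash Talend_program.sh
0 * * * * bash Talend_program.sh'

printf '%s\n' "$sample" | active_entries
```

On a real machine the filter would be fed from the list command: crontab -l | active_entries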

Here is the result of the cron job.

sample outputs

And … this is just a tiny part of what Talend can do. You can design and apply it to much more than what I’ve shown. Let’s play with it and have fun.



This post is licensed under CC BY 4.0 by the author.