
Set up tasks to run periodically

Tasks are intended to run algorithms repeatedly, similar to cron jobs.

Prerequisites

To complete this tutorial, you need to be familiar with the PEACH Lab environment.

Write and set up a task

Let's create a simple task that runs repeatedly, computes some data, and stores the result so that an endpoint can read it quickly.

Creating code for our task

For this example we wrote some simple TensorFlow code that trains a model and saves the computed weights in Redis, where the API endpoints can use them later.

You are free to spread your code across many cells or keep everything in a single cell (you will specify exactly what should be imported in the configuration file later). Just don't forget to include your dependencies and supporting code as well.

The important part is to have an entry function that will be executed; in our case it is called train_model. You don't have to call this function manually: the task loop executes it automatically.
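As a rough illustration of what such an entry function can look like, here is a minimal sketch. It is not the tutorial's actual notebook code: to stay self-contained it replaces TensorFlow with a few lines of plain-Python gradient descent and replaces Redis with a hypothetical in-memory store. The names InMemoryStore, store, fit_line, and the key "model:weights" are all assumptions made for this example. The only parts taken from the text are the function name train_model and the idea of saving the computed weights for later use.

```python
import json


class InMemoryStore:
    """Hypothetical stand-in for a Redis client (assumption for this sketch).

    It mimics only the set/get calls used below; a real redis-py client
    offers methods with the same names.
    """

    def __init__(self):
        self._data = {}

    def set(self, key, value):
        self._data[key] = value
        return True

    def get(self, key):
        return self._data.get(key)


# Module-level store so the entry function can be called without arguments.
store = InMemoryStore()


def fit_line(xs, ys, epochs=1000, lr=0.05):
    """Fit y = w * x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        grad_w = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2.0 / n * sum(errors)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def train_model():
    """Entry function: takes no arguments, trains, and stores the weights."""
    xs = [0.0, 1.0, 2.0, 3.0, 4.0]
    ys = [1.0, 3.0, 5.0, 7.0, 9.0]  # toy data generated from y = 2x + 1
    w, b = fit_line(xs, ys)
    # Serialize to JSON so any consumer can read the weights back.
    store.set("model:weights", json.dumps({"w": w, "b": b}))
```

Keeping the entry function free of required arguments, with the helper code next to it, matches the advice above to keep supporting code in the imported cell.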

Registering a task

Now that we have the code for our task, it's time to register it in the organizational task loop. There are two ways to register a task:

  • by creating a peach.yaml file in the root of the repository with the definitions of all your tasks

  • by splitting your task definitions across several .yaml files and placing them in the peach.conf directory located in the root of the repository

In this example, let's define our task inside the peach.conf folder. We create a tf.yaml file inside that folder with the following content:

codops: zzebu
tasks:
  tf_task:
    notebook: tensorflow_demo.ipynb
    cell: 0
    method: train_model
dependencies:
  - tensorflow==2.0.0-alpha0


  1. The file has to be inside the peach.conf folder at the root level of the repository
  2. It has to be a YAML file
  3. codops: the organization code, which has to be defined at the root level of the configuration file
  4. tasks: a key that holds an embedded key-value structure of the form task_name: task_definition. Multiple tasks may be defined here. In our case the task is called tf_task, and all the information about it is nested under this key
  5. notebook: relative path to the notebook file with your code
  6. cell: number of the cell with the code to import. To import the full notebook with all its cells, omit this field
  7. method: the entry point function that is called on execution
  8. dependencies: list of dependencies. It can be defined at the global level for all the tasks in this file (as in the example above) or per task, e.g. (pay attention to the nesting level of the dependencies field):
    codops: zzebu
    tasks:
      tf_task:
        notebook: tensorflow_demo.ipynb
        cell: 0
        method: train_model
        dependencies:
          - tensorflow==2.0.0-alpha0
    

If you use a single peach.yaml file, place all task definitions inside that file.
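The rules for the configuration file can be summarized as a small check over the parsed configuration. The sketch below is only an illustration of those rules. It is not how PEACH validates files, and the function name check_task_config is hypothetical. It assumes the YAML has already been loaded into a Python dictionary by a YAML parser of your choice.

```python
def check_task_config(config):
    """Return a list of problems found; an empty list means none were found."""
    problems = []

    # The organization code has to be defined at the root level.
    if not config.get("codops"):
        problems.append("missing top-level 'codops'")

    def check_dependencies(value, where):
        # Dependencies are optional, but if present must be a list of strings.
        if value is None:
            return
        if not isinstance(value, list) or not all(isinstance(d, str) for d in value):
            problems.append(f"{where}: 'dependencies' must be a list of strings")

    check_dependencies(config.get("dependencies"), "global")

    tasks = config.get("tasks")
    if not isinstance(tasks, dict) or not tasks:
        problems.append("'tasks' must map task names to task definitions")
        return problems

    for name, task in tasks.items():
        if not isinstance(task, dict):
            problems.append(f"task '{name}': definition must be a mapping")
            continue
        # notebook and method are needed to locate and call the entry function.
        for field in ("notebook", "method"):
            if not task.get(field):
                problems.append(f"task '{name}': missing '{field}'")
        # cell is optional; when given, treat it as a non-negative integer.
        cell = task.get("cell")
        if cell is not None:
            if isinstance(cell, bool) or not isinstance(cell, int) or cell < 0:
                problems.append(f"task '{name}': 'cell' must be a non-negative integer")
        check_dependencies(task.get("dependencies"), f"task '{name}'")

    return problems


# The tf.yaml example from this tutorial, as it would look once parsed.
example = {
    "codops": "zzebu",
    "tasks": {
        "tf_task": {
            "notebook": "tensorflow_demo.ipynb",
            "cell": 0,
            "method": "train_model",
        }
    },
    "dependencies": ["tensorflow==2.0.0-alpha0"],
}

print(check_task_config(example))  # prints []
```

Running a check like this locally before committing can catch a misplaced or misspelled key without waiting for the next task loop.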

After you save the file and commit the changes to the git repository (don't forget to label it! Read more about the peach-lab label in the Introduction), the task will be added to the next task loop.

You need to wait for the next task loop execution to have your task registered

All the code in the repository should be in the master branch

Important: the task executor will ALWAYS execute the current version in the master branch (not the version that was in the notebook at the moment the task was registered). We strongly recommend working in separate branches during development.

Displaying tasks information

A dedicated notebook is provided to display information about tasks. Execute all the cells to display the information; the view contains the following elements:


  1. You just need to run the cell calling initialize_tasks_info()
  2. Select an organization from the dropdown list
  3. You can see the list of all tasks for the selected organization, their state (PENDING, RUNNING, FINISHED), the status of the last execution, and the start time and duration of the last execution
  4. You can select a particular task to display detailed information about its previous executions
  5. The table contains the version (a shortened form of the head commit of the executed notebook), the start time, the status of the particular execution, and the time taken by the task
  6. By default the logs frame displays all available logs; to show only the logs of a particular execution, press "show below" next to that execution
  7. A graph showing the duration of each execution of this task and its status (a blue dot if it succeeded, a red cross if it failed)
  8. Logs frame