Wok: A workflow management system implemented in Python | all4bioinformatics

Thursday, 5 September 2013

Open source code: https://github.com/chris-zen/wok

Source: https://bitbucket.org/bbglab/wok


Introduction

Wok is a workflow management system implemented in Python that makes it very easy to structure workflows, parallelize their execution and monitor their progress, among other things. It is designed in a modular way so that it can be adapted to different infrastructures.
For the time being it is strongly focused on clusters running any DRMAA-compatible resource manager (e.g. Oracle Grid Engine) whose worker nodes share a common folder. Other, more flexible infrastructures (such as Amazon EC2) are being considered for future implementations.
Workflows in Wok are defined in an XML file with the .flow extension. This definition includes:
  • the different modules (or pieces of processing)
  • the interconnections between modules (e.g. the input of module B is linked to the output of module A)
  • explicit dependencies (e.g. module A cannot be executed until module B has finished)
  • descriptions that can be used to generate documentation automatically or to create web forms
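
To make the relationship between these pieces concrete, here is a minimal Python sketch of how an engine could derive an execution order from connections and explicit dependencies. It does not use Wok's API or its .flow syntax, neither of which is shown in this post; the `Module` class, the `execution_order` helper and the module names `A`, `B` and `C` are all hypothetical.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter  # standard library, Python 3.9+


@dataclass
class Module:
    """Hypothetical stand-in for one module of a workflow definition."""
    name: str
    inputs: dict = field(default_factory=dict)   # input port -> "module.port" it reads from
    depends: list = field(default_factory=list)  # explicit dependencies (module names)
    description: str = ""                        # text usable for generated documentation


def execution_order(modules):
    """Return module names in an order that respects connections and dependencies."""
    graph = {}
    for m in modules:
        # A module must wait for every module that feeds one of its inputs...
        upstream = {source.split(".")[0] for source in m.inputs.values()}
        # ...and for every module it explicitly depends on.
        upstream.update(m.depends)
        graph[m.name] = upstream
    return list(TopologicalSorter(graph).static_order())


flow = [
    Module("A", description="Produce raw records"),
    Module("B", inputs={"records": "A.out"}, description="Filter the records"),
    Module("C", depends=["B"], description="Write a report once B has finished"),
]

print(execution_order(flow))  # ['A', 'B', 'C']
```

Modules that do not depend on each other impose no order on one another, which is what gives an engine room to run them in parallel.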

Each module corresponds to a piece of software that has to be run in order to process some input and generate an output. For now, only Python scripts are allowed, but they can be used to execute software written in other languages.
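
As an illustration of that last point, the sketch below shows one way a Python script can delegate its work to an external program using the standard `subprocess` module. It is not Wok-specific: the `run_external` helper is hypothetical, and the "external tool" is simulated by a child Python interpreter so that the example runs anywhere.

```python
import subprocess
import sys


def run_external(command, input_text):
    """Run an external program, feed input_text on stdin and return its stdout."""
    result = subprocess.run(
        command,
        input=input_text,
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the program exits with an error
    )
    return result.stdout


# Stand-in for a tool written in another language (assumption for this example):
# a child process that upper-cases whatever it receives on stdin.
UPPERCASE_TOOL = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.write(sys.stdin.read().upper())",
]

print(run_external(UPPERCASE_TOOL, "acgt\n"), end="")
```

In a real module the command list would name the actual executable and its arguments, and the script would pass on whatever input it received from the previous module.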
Workflows in Wok can be treated like any other software project and managed with version control tools and the IDE of your choice.
Wok can be used as a terminal script or can be run in server mode.
A workflow is executed from the terminal with the wok-run script, which accepts a few options:
  • An instance name (-n name), which allows the same workflow to be run several times simultaneously and independently
  • Configuration files (-c file.conf); the configuration can be split into as many files as desired
  • Configuration parameters (-D param=value), which override any previous value set in the configuration files
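
The layering described by the last two options can be sketched in Python as follows. This is only an illustration of the precedence rule (later files win, then -D parameters win over everything); it assumes that a configuration can be represented as nested dictionaries and that dotted parameter names address nested keys, neither of which is confirmed here. The helper names and the sample keys are hypothetical.

```python
def merge(base, extra):
    """Recursively merge extra into a copy of base; values from extra win."""
    merged = dict(base)
    for key, value in extra.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def apply_override(config, assignment):
    """Apply one 'param=value' override (dotted names are an assumption)."""
    name, _, value = assignment.partition("=")
    *parents, leaf = name.split(".")
    override = {leaf: value}
    for parent in reversed(parents):
        override = {parent: override}
    return merge(config, override)


def build_config(file_configs, overrides):
    """Merge configurations in order, then apply command-line overrides last."""
    config = {}
    for file_config in file_configs:
        config = merge(config, file_config)
    for assignment in overrides:
        config = apply_override(config, assignment)
    return config


# Hypothetical contents of two configuration files, already parsed.
site = {"cluster": {"queue": "short", "slots": 4}}
project = {"cluster": {"queue": "long"}, "output": "/tmp/results"}

print(build_config([site, project], ["cluster.slots=16"]))
```

Note that in this sketch override values stay strings; a real implementation would need to decide how to parse them.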

The workflow definition file (e.g. myworkflow.flow) is passed as the first argument.
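
Putting the options together, a small Python helper can assemble such a command line, for example when launching runs from another script. The flags (-n, -c, -D) and the position of the workflow file are taken from the description above; the helper name, the file names and the `threads` parameter are hypothetical, and the sketch only builds and prints the command without executing it.

```python
import shlex


def wok_run_command(flow_file, instance=None, conf_files=(), params=None):
    """Build the argument list for a wok-run invocation (does not execute it)."""
    args = ["wok-run", flow_file]  # the workflow definition goes first
    if instance is not None:
        args += ["-n", instance]
    for conf in conf_files:
        args += ["-c", conf]  # one -c per configuration file
    for key, value in (params or {}).items():
        args += ["-D", f"{key}={value}"]  # overrides applied on top of the files
    return args


command = wok_run_command(
    "myworkflow.flow",
    instance="run1",
    conf_files=["site.conf", "project.conf"],
    params={"threads": 8},
)
print(shlex.join(command))
```

Keeping the command as a list of arguments, rather than a single string, avoids quoting problems if it is later passed to `subprocess.run`.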
To monitor the execution of the workflow there are different resources available:
  • The web server, which lets you interact with the engine in a very straightforward way (recommended)
  • The logs emitted by wok-run through the standard output
  • The intermediate files generated by Wok (e.g. the task output files)

It has been designed for workflow developers who feel more comfortable programming than doing hundreds of clicks and drag-and-drop operations, and also for those who want infrastructure flexibility and full control over, and monitoring of, the execution.

Authors


It is being developed by Christian Pérez-Llamas under the Biomedical Genomics Research Group.
