.. _tigresoverview:


Concepts in Tigres
******************

Tigres provides an API for composing, executing, and monitoring workflows. Its central abstraction,
the *template*, captures common execution patterns. This section gives an overview of the following topics:

.. contents::
    :depth: 1
    :local:
    :backlinks: top

Templates
=========
The Tigres template API lets you create `workflows` programmatically, using `templates` as the building
blocks. `Templates` are composed of individual `tasks`, the user-defined units of work to be executed.
A Tigres `workflow` is written as a Python program (e.g. :code:`my_program.py`) that uses the Tigres
template API to build and execute it; such a program is called a `Tigres program`. There are four basic
Tigres template functions: :code:`sequence`, :code:`parallel`, :code:`merge` and :code:`split`.
A Tigres program can contain one or more of these template functions.
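
The four execution patterns can be illustrated in plain Python. The sketch below does not use the
Tigres API: the helper names (``run_sequence``, ``run_parallel``, ``run_split``, ``run_merge``) are
hypothetical stand-ins, and it assumes that a split is one task fanning out to parallel tasks and a
merge is parallel tasks feeding one final task.

.. code-block:: python

    from concurrent.futures import ThreadPoolExecutor


    def run_sequence(tasks, value):
        """Run tasks one after another, feeding each result to the next task."""
        for task in tasks:
            value = task(value)
        return value


    def run_parallel(tasks, values):
        """Run tasks concurrently, one input per task; results keep task order."""
        with ThreadPoolExecutor() as pool:
            futures = [pool.submit(task, v) for task, v in zip(tasks, values)]
            return [future.result() for future in futures]


    def run_split(first_task, parallel_tasks, value):
        """Run one task, then pass its result to every parallel task."""
        result = first_task(value)
        return run_parallel(parallel_tasks, [result] * len(parallel_tasks))


    def run_merge(parallel_tasks, values, last_task):
        """Run tasks in parallel, then hand the list of results to one task."""
        return last_task(run_parallel(parallel_tasks, values))


    def double(x):
        return 2 * x


    def increment(x):
        return x + 1


    seq_result = run_sequence([double, increment], 3)           # (3 * 2) + 1
    par_result = run_parallel([double, increment], [3, 3])      # double(3), increment(3)
    split_result = run_split(double, [increment, double], 3)    # 6 fans out to both tasks
    merge_result = run_merge([double, increment], [3, 3], sum)  # sum of the parallel results

A real Tigres program would call the Tigres template functions instead of these helpers.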

.. centered:: :strong:`Execution behavior of the four Tigres templates`

.. image:: _static/images/templates.png
   :alt:  The four Tigres templates a) Sequence b) Parallel c) Split d) Merge
   :align: center
   :width: 100%


Core API
========
As mentioned previously, a Tigres program is a Python program that uses the Tigres template API to build a workflow. The templates are the core of the Tigres API, and any or all of the four can be used in a single program. A :code:`Task`
is the most basic unit of execution and can be defined as a Python function internal to the Tigres program or
as a separate executable (e.g. :code:`wget`, :code:`my_c_program`). The figure above shows the flow of
execution with arrows. The inputs to each :code:`Task` execution can be statically defined or retrieved from
the results of previously executed *Tasks* or *Templates*.

.. centered:: :strong:`A Glossary of Tigres Concepts`

.. glossary::

   Task
      The atomic unit of execution. (see :class:`tigres.Task`)

   Task Array
      A collection of tasks. (see :class:`tigres.TaskArray`)

   Templates
      Patterns of execution from a combination of tasks. (see :ref:`apireference`)

   Input Types
      The characteristic of the task that defines the types of its inputs. (see :class:`tigres.InputTypes`)

   Input Values
      The values used in a task execution. (see :class:`tigres.InputValues`)

   Input Array
      A collection of input values for a number of tasks. (see :class:`tigres.InputArray`)

Each *template* function takes at least two named collections: a *Task Array* and an *Input Array*.
The *Task Array* is an ordered collection of tasks to be executed together, in sequence or in
parallel depending on the execution flow of the particular template. The *Input Array* is a collection
of *Input Values* that defines the inputs for each *Task* in the corresponding *Task Array*.
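
The pairing of the two collections can be pictured with ordinary Python lists. This is only a
sketch of the idea, not the Tigres classes: ``task_array``, ``input_array`` and ``run_each`` are
hypothetical names, and each entry of ``input_array`` holds the input values for the task at the
same position in ``task_array``.

.. code-block:: python

    def scale(x, factor):
        return x * factor


    def shout(text):
        return text.upper()


    # Ordered collection of tasks (plain functions in this sketch).
    task_array = [scale, shout]

    # One tuple of input values per task, in the same order as task_array.
    input_array = [(2.0, 1.5), ("done",)]


    def run_each(tasks, inputs):
        """Run every task with the input values found at the same index."""
        if len(tasks) != len(inputs):
            raise ValueError("each task needs exactly one entry of input values")
        return [task(*values) for task, values in zip(tasks, inputs)]


    results = run_each(task_array, input_array)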

The *Task*, the atomic unit of execution in Tigres, has a collection of *Input Types* that specifies
the types of input the task may take. A task's *Input Values* are an ordered list of inputs that
are passed to the task during execution. They are not included in the task definition, which allows for
task reuse and late binding of data elements to the Tigres program execution. The data model in Tigres is
described in greater detail in Section :ref:`tigres-data-model-label`.
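
Keeping input values out of the task definition is what makes reuse and late binding possible.
The sketch below illustrates that separation with a hypothetical ``SketchTask`` dataclass; it is not
``tigres.Task``, and the type check is an assumption about how declared input types might be
enforced.

.. code-block:: python

    from dataclasses import dataclass
    from typing import Callable, Sequence, Tuple


    @dataclass(frozen=True)
    class SketchTask:
        """A task definition: a name, an implementation and its input types."""

        name: str
        func: Callable
        input_types: Tuple[type, ...]

        def run(self, input_values: Sequence):
            """Bind input values at execution time and check them against the types."""
            if len(input_values) != len(self.input_types):
                raise TypeError(f"{self.name}: expected {len(self.input_types)} inputs")
            for position, (value, expected) in enumerate(zip(input_values, self.input_types)):
                if not isinstance(value, expected):
                    raise TypeError(f"{self.name}: input {position} is not {expected.__name__}")
            return self.func(*input_values)


    def repeat(text, times):
        return text * times


    # The task is defined once, without any data attached to it.
    repeat_task = SketchTask("repeat", repeat, (str, int))

    # Input values are supplied only when the task runs, so the task is reusable.
    first = repeat_task.run(("ab", 2))
    second = repeat_task.run(("-", 5))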


Monitoring API
==============

The monitoring API has functions to:

*  create monitoring information
*  find and view monitoring information

The monitoring information contains both information about template execution that is automatically generated
by Tigres, and arbitrary user-provided information. All the monitoring information, both automatic and
user-provided, is *semi-structured*, meaning it is broken into name/value pairs but only a few of the names and
values are pre-defined. In general, the monitoring follows the `Logging Best Practices`_ that arose from the
NetLogger_ project.

.. _`Logging Best Practices`: https://docs.google.com/a/lbl.gov/document/d/1oeW_l_YgQbR-C_7R2cKl6eYBT5N4WSMbvz0AT6hYDvA/edit
.. _`NetLogger`: http://netlogger.lbl.gov/
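
To make *semi-structured* concrete, the sketch below writes and searches records made of
``name=value`` pairs. It does not use the Tigres monitoring API: ``format_record``,
``parse_record`` and ``find_records`` are hypothetical helpers, and the pre-defined names
``ts``, ``event`` and ``level`` are an assumption based on the NetLogger style of log line.

.. code-block:: python

    from datetime import datetime, timezone


    def format_record(event, level="INFO", **fields):
        """Build one log line of name=value pairs.

        A few names are pre-defined (ts, event, level); the rest are free-form.
        In this sketch, values must not contain spaces.
        """
        pairs = {
            "ts": datetime.now(timezone.utc).isoformat(),
            "event": event,
            "level": level,
        }
        pairs.update(fields)
        return " ".join(f"{name}={value}" for name, value in pairs.items())


    def parse_record(line):
        """Turn a log line back into a dictionary of strings."""
        return dict(pair.split("=", 1) for pair in line.split())


    def find_records(lines, **criteria):
        """Return the parsed records whose fields match every criterion."""
        records = [parse_record(line) for line in lines]
        return [
            record
            for record in records
            if all(record.get(name) == str(value) for name, value in criteria.items())
        ]


    log = [
        format_record("task.start", task="scale"),
        format_record("task.end", task="scale", status=0),
        format_record("user.note", level="DEBUG", temperature=21.5),
    ]

    finished = find_records(log, event="task.end")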


Execution Environments
======================
Tigres can be executed in several different environments from batch queues to local threads and processes.
By using the appropriate `execution engine`, a Tigres program can be executed on a single node or
deployed without additional infrastructure to department clusters and batch processing queues on
supercomputers. A program is written once and only the execution engine is changed at run time.
This allows users to easily scale from development (desktop) to production (department clusters and HPC centers).

Tigres currently supports five execution engines.

  * Local Threads - Tigres runs tasks as threads on one machine.
  * Local Processes - Tigres runs tasks as processes on one machine.
  * Distributed Processes - Tigres distributes tasks as processes across a cluster of machines.
  * Sun Grid Engine - Tigres submits tasks as Sun Grid Engine jobs. This mode is used on HPC resources (e.g., NERSC) where a private instance of MySGE is run as a glide-in.
  * SLURM - Tigres submits tasks to a SLURM job manager.
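
The idea of writing a program once and choosing the engine at run time can be sketched with the
Python standard library. This is an analogy, not Tigres code: the ``ENGINES`` table, the
``run_tasks`` helper and the ``SKETCH_ENGINE`` environment variable are hypothetical, and only the
two local engines (threads and processes) are modeled.

.. code-block:: python

    import os
    from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor

    # Hypothetical engine table: the workflow code below never changes,
    # only the executor class that runs it.
    ENGINES = {
        "thread": ThreadPoolExecutor,
        "process": ProcessPoolExecutor,
    }


    def square(x):
        return x * x


    def run_tasks(task, inputs, engine="thread"):
        """Run the same task over all inputs on the selected engine."""
        with ENGINES[engine](max_workers=2) as pool:
            return list(pool.map(task, inputs))


    if __name__ == "__main__":
        # The engine is picked at run time; the process engine needs
        # this guard on platforms that spawn new interpreters.
        chosen = os.environ.get("SKETCH_ENGINE", "thread")
        print(run_tasks(square, [1, 2, 3], engine=chosen))

Running the script with ``SKETCH_ENGINE=process`` switches to worker processes without touching
``run_tasks`` or ``square``.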