Library Reference

This is the main package, containing the API to create and run templates, and to perform and query monitoring and provenance. For background information on what Tigres is meant to do, see Concepts in Tigres. The top-level tigres module has tools for initializing and finalizing Tigres programs. This can be done explicitly with tigres.start() and tigres.end(), or implicitly when the first Tigres data object is initialized.

tigres

platform:Unix, Mac
synopsis:The high level Tigres API. Initialization of Tigres templates, tasks and their inputs.
module author:Dan Gunter <dkgunter@lbl.gov>, Gilberto Pastorello <gzpastorello@lbl.gov>, Val Hendrix <vhendrix@lbl.gov>
Example:
>>> import tempfile as temp_file
>>> from tigres import *
>>> f = temp_file.NamedTemporaryFile()
>>> program=start(log_dest=f.name)
>>> set_log_level(Level.INFO)
>>> log_info("Tasks", message="Starting to prepare tasks and inputs")
>>> def adder(a, b): return a + b
>>> task1 = Task("Task 1", FUNCTION, adder)
>>> task2 = Task("Task 2", EXECUTABLE, "/bin/echo")
>>> tasks = TaskArray(None, [task1, task2])
>>> inputs = InputArray(None, [InputValues(None, [2, 3]), InputValues(None, ['world'])])
>>> log_info("Template", message="Before template run")
>>> sequence("My Sequence", tasks, inputs)
'world'
>>> # find() returns a list of records
>>> for o in find(activity="Tasks"):
...    assert o.message
>>> end()

Tasks and Inputs

  • InputArray - Array of one or more InputValues, which will be inputs to a TaskArray in a Template.
  • InputTypes - List of types for inputs of a Task.
  • InputValues - List of values for inputs matched to a Task.
  • Task - Function or program. A task is the atomic unit of execution in Tigres.
  • TaskArray - List of one or more Tasks, which will be executed in a Template
  • EXECUTABLE - identifies that a task implementation is an executable
  • FUNCTION - identifies that a task implementation is a function

Initialization

Users should call start() to initialize the Tigres library; this initializes logging as well. When start() is called, its log_dest keyword is passed as the first argument to the logger initialization, and any other keywords beginning with log_ have the prefix stripped and are passed on as keywords.
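
The keyword handling described above can be pictured with a small stand-alone sketch. It does not call Tigres: split_start_kwargs is a hypothetical helper, log_format is a made-up keyword used only for illustration, and the real start() may treat its keywords differently.

```python
def split_start_kwargs(log_dest=None, **kwargs):
    """Separate log_* keywords from the rest, stripping the log_ prefix.

    Hypothetical helper that only models the behaviour described in the
    text; it is not part of the Tigres API.
    """
    prefix = "log_"
    log_kwargs = {}
    other_kwargs = {}
    for key, value in kwargs.items():
        if key.startswith(prefix):
            log_kwargs[key[len(prefix):]] = value  # "log_format" -> "format"
        else:
            other_kwargs[key] = value
    return log_dest, log_kwargs, other_kwargs


# log_format is an invented keyword, used here only to show the stripping.
dest, log_kwargs, other = split_start_kwargs(
    log_dest="run.log", log_format="json", name="demo")
```

In this model the logger would receive dest as its first argument and log_kwargs as keywords.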

User logging

One of the design goals of the state management is to cleanly combine user-defined logs, e.g. about what is happening in the application, with the automatic logs from the execution system. User-defined logging uses the familiar set of level-named functions (log_debug(), log_info(), log_warn(), log_error() and so on). Information is logged as a Python dict of key/value pairs. It is certainly possible to log English sentences, but breaking up the information into a more structured form will simplify querying for it later.
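
To see why structured entries are easier to query, compare two plain dicts that describe the same event. This is an illustration with ordinary Python data, not Tigres records; the field names are examples only.

```python
# The same event logged as a sentence and as key/value pairs.
sentence_entry = {"activity": "looking_up",
                  "message": "I see 42 Vogon Constructor spaceships"}
structured_entry = {"activity": "looking_up", "vehicle": "spaceship",
                    "make": "Vogon", "model": "Constructor", "count": 42}

log = [sentence_entry, structured_entry]

# Structured fields can be filtered directly. The sentence form would
# need text parsing to recover the count, so it is not matched here.
big_fleets = [entry for entry in log if entry.get("count", 0) > 10]
```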

Analysis

Data dependencies

  • PREVIOUS - a syntax with which the user can specify the output of a previously executed task as input for a later task.

    • Handles dependencies between templates. PREVIOUS can be both implicit and explicit.
    • The semantics of PREVIOUS are affected by the template type.

class tigres.InputTypes(name=None, list_=None)

List of types for inputs of a Task.

Example:
>>> from tigres import *
>>> task1_types = InputTypes('Task1Types', [int, str])
class tigres.InputArray(name=None, values=None)

Array of one or more InputValues, which will be inputs to a TaskArray in a Template.

Example:
>>> from tigres import InputArray,InputValues
>>> input_array = InputArray("input_array1", [InputValues("values",[1,'hello world'])])
class tigres.InputValues(name=None, list_=None)

Values for inputs of a Task.

Example:
>>> from tigres import InputValues
>>> values = InputValues("my_values", [1, 'hello world'])
tigres.PREVIOUS

PREVIOUS is a syntax with which the user can specify the output of a previously executed task as input for a task. PREVIOUS creates dependencies between templates and can be both implicit and explicit.

  • PREVIOUS: Use the entire output of the previous task as the input.
  • PREVIOUS.i: Used to split outputs from the previous task across parallel tasks. It matches the i-th output of the previous task/template to the i-th InputValues of the task to be run. This only works for parallel tasks.
  • PREVIOUS.i[n]: Use the n-th output of the previous task or template as input.
  • PREVIOUS.taskOrTemplateName: Use the entire output of the previous template/task with specified name.
  • PREVIOUS.taskOrTemplateName.i: Used to split outputs from the specified task or template across parallel tasks. It matches the i-th output of the previous task/template to the i-th InputValues of the task to be run. This only works for parallel tasks.
  • PREVIOUS.taskOrTemplateName.i[n]: Use the n-th output of the previous task or template as input.
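
The three unnamed forms can be modelled with plain lists. The sketch below is a mental model only: the resolve_* functions are hypothetical, they do not use the Tigres PREVIOUS object, and zero-based indexing is an assumption.

```python
# Pretend this list is the output of a previous parallel template.
previous_output = ["a.txt", "b.txt", "c.txt"]


def resolve_previous(output):
    """PREVIOUS: the task receives the entire previous output."""
    return output


def resolve_previous_i(output, task_index):
    """PREVIOUS.i: the i-th parallel task receives the i-th output."""
    return output[task_index]


def resolve_previous_i_n(output, n):
    """PREVIOUS.i[n]: a task asks for the n-th output explicitly."""
    return output[n]


# Three parallel tasks, each matched with the output at its own index.
per_task_inputs = [resolve_previous_i(previous_output, i) for i in range(3)]
second_output = resolve_previous_i_n(previous_output, 1)
```
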
class tigres.TaskArray(name=None, tasks=None)

List of one or more Tasks, which will be executed in a Template

Instances act just like a list, except that they have a name.

class tigres.Task(name, task_type, impl_name, input_types=None, env=None)

Function or program. A task is the atomic unit of execution in Tigres.

Example:
>>> def abide(dude): return dude - 1
>>> from tigres import Task, FUNCTION, EXECUTABLE
>>> fn_task = Task("abiding", FUNCTION, abide)
>>> exe_task = Task("more_abiding", EXECUTABLE, "abide.sh")
tigres.sequence(name, task_array, input_array, env=None, redirect=False)

List of tasks placed in series. If no inputs are given for any task, the output of the previous task or template is the input of the next.

Valid configurations of task_array and input_array are:

  • InputArray has zero or more InputValues and TaskArray has one or more tasks: The task array will be run sequentially for each InputValues list given. There will be one sequential set run for each available core. The 2nd through nth tasks will each be fed the output of the task before. In the first implementation, InputValues must be provided.
Parameters:
  • name – an optional name can be assigned to the sequence
  • task_array (TaskArray or list) – an array containing tasks to execute
  • input_array (InputArray or list) – an array of input data for the specified tasks
  • env (dict) – keyword arguments for the template task execution environment
  • redirect (bool) – Turn on redirection. The standard output of the previous task is piped to standard input of the next task. IMPORTANT: PREVIOUS syntax does not work between tasks in this template.
Returns:

Results from the last task executed in the sequence. If the last task is a Python function, the output type is whatever that function returns; otherwise the executable's output is a string.

Return type:

object or str

Example:
>>> from tigres import *
>>> def adder(a, b): return a + b
>>> task1 = Task("Task 1", FUNCTION, adder)
>>> task2 = Task("Task 2", EXECUTABLE, "echo")
>>> tasks = TaskArray(None, [task1, task2])
>>> inputs = InputArray(None, [InputValues(None, [2, 3]), InputValues(None, ['world'])])
>>> sequence("My Sequence", tasks, inputs)
'world'
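
The implicit chaining described for sequence (a task without inputs receives the previous output) can be modelled without Tigres. run_sequence and double below are hypothetical names used only for this sketch; the real template also handles executables, logging and the env and redirect options.

```python
def run_sequence(functions, first_inputs):
    """Call functions in order; each later one receives the previous result."""
    result = functions[0](*first_inputs)
    for function in functions[1:]:
        result = function(result)
    return result


def adder(a, b):
    return a + b


def double(x):
    return x * 2


# adder(2, 3) produces 5, which becomes the only input of double.
result = run_sequence([adder, double], [2, 3])
```
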
tigres.merge(name, task_array, input_array, merge_task, merge_input_values=None, env=None)

Collect output of parallel tasks

A single ‘merge’ task is fed inputs from a set of parallel tasks. The parallel task inputs (input_array) must be explicitly defined with PREVIOUS or explicit values.

Note

If the following conditions are met: a) no inputs are given to the parallel task (input_array); b) the result of the previous template or task is iterable; and c) there is only one parallel task in the task_array, then each item of the iterable result becomes a parallel task.

Parameters:
  • name – The name of the merge
  • task_array (TaskArray) – array of tasks to be run in parallel before last task
  • input_array (InputArray or list or None) – array of input values for task array
  • merge_task (Task) – last task to be run after all tasks on task array finish. This task will receive a list of values, whose length is equal to the length of the task array or the list .
  • merge_input_values (InputValues or None) – List of input values for last task, or the outputs of each task (as a list), if None.
  • env – keyword arguments for the template task execution environment
Returns:

Results from the merge task. If the merge task is a Python function, the output type is whatever that function returns; otherwise the executable's output is a string.

Return type:

object or string
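
As a rough model of the merge pattern, the sketch below runs several functions concurrently and hands the list of their outputs to one final function. It uses only the standard library; run_merge and square are hypothetical names, and the thread pool merely stands in for whichever execution plugin Tigres is configured with.

```python
from concurrent.futures import ThreadPoolExecutor


def run_merge(parallel_functions, inputs, merge_function):
    """Run one function per input concurrently, then merge the results."""
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(function, *args)
                   for function, args in zip(parallel_functions, inputs)]
        outputs = [future.result() for future in futures]  # task order kept
    # The merge step receives one value per parallel task, as a list.
    return merge_function(outputs)


def square(x):
    return x * x


total = run_merge([square, square, square], [(1,), (2,), (3,)], sum)
```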

tigres.split(name, split_task, split_input_values, task_array, input_array, env=None)

A single ‘split’ task feeds inputs to a set of parallel tasks. The parallel task inputs (input_array) must be explicitly defined with PREVIOUS or explicit values.

Note

If the following conditions are met: a) no inputs are given to the parallel task (input_array); b) the result of the split_task is iterable; and c) there is only one parallel task in the task_array, then each item in the iteration becomes a parallel task.

Parameters:
  • name (str) – Name of the split
  • split_task (Task) – first task to be run, followed by tasks on task_array
  • split_input_values (InputValues or list) – input values for first task
  • task_array (TaskArray) – array of tasks to be run in parallel after first
  • input_array (InputArray or list) – array of input values for task array, default PREVIOUS
  • env – keyword arguments for the template task execution environment
Returns:

Output of the parallel operation

Return type:

list(object)
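
The case covered by the note (one parallel task, an iterable split result, no explicit inputs) can be pictured as follows. This is a stand-alone sketch: run_split and chunk are hypothetical names, and a thread pool stands in for the Tigres execution plugin.

```python
from concurrent.futures import ThreadPoolExecutor


def run_split(split_function, split_inputs, parallel_function):
    """Run split_function once, then apply parallel_function to each item."""
    pieces = split_function(*split_inputs)
    with ThreadPoolExecutor() as pool:
        # pool.map returns results in the same order as the pieces.
        return list(pool.map(parallel_function, pieces))


def chunk(text):
    return text.split()


# Each word produced by the split step becomes one parallel len() call.
lengths = run_split(chunk, ["so long and thanks"], len)
```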

tigres.parallel(name, task_array, input_array, env=None)

List of tasks processing their inputs in parallel.

Valid configurations of task_array and input_array are:

  1. TaskArray and InputArray are of equal length: Each InputValues in the InputArray will be matched with the Task at the same index in the TaskArray
  2. InputArray has zero or one InputValues and TaskArray has one or more tasks: If there is one InputValues, it will be reused for all tasks in the TaskArray. If there are no InputValues then the output from the previous task or template will be the input for each Task in the TaskArray.
  3. TaskArray has one task and InputArray has one or more InputValues: The one Task will be executed with each InputValues in the InputArray

Note

If the following conditions are met: a) no inputs are given to the parallel task (input_array); b) the result of the previous template or task is iterable; and c) there is only one parallel task in the task_array, then each item of the iterable result becomes a parallel task.

Parameters:
  • name (str or None) – an optional name can be assigned to the parallel template
  • task_array (TaskArray) – an array containing tasks to execute
  • input_array (InputArray) – an array of input data for the specified tasks
  • env – keyword arguments for the template task execution environment
Returns:

list of task outputs

Return type:

list

Example:
>>> from tigres import *
>>> def adder(a, b): return a + b
>>> task1 = Task(None, FUNCTION, adder)
>>> task2 = Task(None, EXECUTABLE, "/bin/echo")
>>> inputs = InputArray(None,[InputValues("One",[1,2]), InputValues("Two",['world'])])
>>> tasks = TaskArray(None,[task1, task2])
>>> parallel("My Parallel", tasks, inputs)
[3, 'world']
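
The three valid configurations listed for parallel amount to a pairing rule between tasks and inputs. The sketch below expresses that rule with plain lists; pair_tasks_with_inputs is a hypothetical helper, and raising ValueError for other combinations is an assumption, not documented Tigres behaviour.

```python
def pair_tasks_with_inputs(tasks, inputs, previous=None):
    """Return (task, input_values) pairs for the three configurations."""
    if len(tasks) == len(inputs):
        # 1. Equal length: match by index.
        return list(zip(tasks, inputs))
    if len(inputs) <= 1:
        # 2. Zero or one InputValues: share it (or the previous output).
        shared = inputs[0] if inputs else previous
        return [(task, shared) for task in tasks]
    if len(tasks) == 1:
        # 3. One task, many InputValues: run the task once per InputValues.
        return [(tasks[0], values) for values in inputs]
    raise ValueError("unsupported combination of tasks and inputs")


by_index = pair_tasks_with_inputs(["t1", "t2"], [[1, 2], ["world"]])
shared = pair_tasks_with_inputs(["t1", "t2"], [[9]])
from_previous = pair_tasks_with_inputs(["t1", "t2"], [], previous="out")
fan_out = pair_tasks_with_inputs(["t1"], [[1], [2], [3]])
```
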
tigres.parallel_sequential(name, task_array, input_array, env=None, redirect=False)

A list of tasks is given in the task_array. Those tasks will be run sequentially multiple times in parallel lanes depending on the size of input_array.

Valid configurations of task_array and input_array are:

  • InputArray has zero or more InputValues and TaskArray has one or more tasks: If the InputArray has zero entries, then parallel_sequential is the same as sequence with no inputs. [PROVISIONAL] Otherwise, input_array will be formatted as a rectangular matrix (in Python, a list of lists) where each row is the input list for a sequence in a lane. All subsequent tasks in the sequence will use implicit PREVIOUS. This means that any parameters needed in later sequence steps will need to be passed in at the top and passed down. This requirement might be changed in later versions.
Parameters:
  • name – an optional name can be assigned to the sequence
  • task_array (TaskArray or list) – an array containing tasks to execute
  • input_array (InputArray or list) – an array of input data for the specified tasks
  • env (dict) – keyword arguments for the template task execution environment
  • redirect (bool) – Turn on redirection. The standard output of the previous task in a sequence is piped to standard input of the next sequence task. IMPORTANT: PREVIOUS syntax does not work between sequence tasks in this template.
Returns:

Results from the last task executed in the sequence. If the last task is a Python function, the output type is whatever that function returns; otherwise the executable's output is a string.

Return type:

object or str
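
The lane model (each row of the input matrix feeds the first task of its own sequence, and later tasks use the previous output) can be sketched with the standard library. run_lane, run_parallel_sequential and negate are hypothetical names, and returning one result per lane as a list is an assumption of this sketch.

```python
from concurrent.futures import ThreadPoolExecutor


def run_lane(functions, row):
    """Run the functions in series, starting from one row of inputs."""
    result = functions[0](*row)
    for function in functions[1:]:
        result = function(result)  # implicit "previous output" chaining
    return result


def run_parallel_sequential(functions, matrix):
    """Run one lane per row of the input matrix, concurrently."""
    with ThreadPoolExecutor() as pool:
        return list(pool.map(lambda row: run_lane(functions, row), matrix))


def adder(a, b):
    return a + b


def negate(x):
    return -x


lane_results = run_parallel_sequential([adder, negate],
                                       [[1, 2], [3, 4], [5, 6]])
```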

tigres.log_debug(*args, **kwargs)

Write a user log entry at level DEBUG.

This simply calls write() with the level argument set to DEBUG. See documentation of write() for details.

tigres.log_error(*args, **kwargs)

Write a user log entry at level ERROR.

This simply calls write() with the level argument set to ERROR. See documentation of write() for details.

tigres.log_trace(*args, **kwargs)

Write a user log entry at level TRACE.

This simply calls write() with the level argument set to TRACE. See documentation of write() for details.

tigres.log_info(*args, **kwargs)

Write a user log entry at level INFO.

This simply calls write() with the level argument set to INFO. See documentation of write() for details.

tigres.log_warn(*args, **kwargs)

Write a user log entry at level WARN.

This simply calls write() with the level argument set to WARN. See documentation of write() for details.

tigres.set_log_level(level)

Set the level of logging for the user-generated logs.

After this call, all messages at a numerically higher level (e.g. DEBUG if level is INFO) will be skipped.

Parameters:level (int) – One of the constants defined in this module: NONE, FATAL, ERROR, WARN[ING], INFO, DEBUG, TRACE
Returns:None
Raise:ValueError if level is out of range or not an integer
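
The filtering rule can be checked against the numeric values listed for the Level class in this reference. The sketch below reproduces those numbers in a plain dict; is_logged is a hypothetical helper, not a Tigres function.

```python
# Numeric values as listed for tigres.Level in this reference.
LEVELS = {"NONE": 0, "FATAL": 10, "ERROR": 20, "WARNING": 30,
          "INFO": 40, "DEBUG": 50, "TRACE": 60}


def is_logged(message_level, current_level):
    """Keep a message unless its level is numerically above the current one."""
    return message_level <= current_level


# With the level set to INFO, DEBUG and TRACE messages are skipped.
kept = [name for name, number in LEVELS.items()
        if number and is_logged(number, LEVELS["INFO"])]
```
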
tigres.start(name=None, log_dest=None, execution='tigres.core.execution.plugin.local.ExecutionPluginLocalThread', recover=False, recovery_log=False, **kwargs)

Configures and initializes tigres monitoring

WARNING: this will end any previously started programs

Parameters:
  • name – Name of the program (Default: auto-generated name)
  • log_dest – The log file to write to. (Default: auto-generated filename)
  • execution – The execution plugin to use
  • recover – True if you would like to run the program in recovery mode
  • recovery_log – path to the recovery log file
  • kwargs – Additional keywords: log_meta, a dictionary containing extra metadata to log for this workflow; and log_*, additional keywords to pass to monitoring.log.init(), with the log_ prefix stripped
Returns:

Workflow identifier

Return type:

str

Raises:

TigresException

tigres.end()
Clear the Tigres monitoring.
return:None
tigres.write(level, activity, message=None, **kwd)

Write a user log entry.

If the API is not initialized, this does nothing.

Parameters:
  • level – Minimum level of logging at which to show this
  • activity – What you are doing
  • message – Optional message string, to enable simple message-style logging
  • kwd – Contents of log entry. Note that timestamp is automatically added. Do not start keys with tg_, which is reserved for Tigres, unless you know what you are doing.
Returns:

None

Example:

from tigres import Level, write, log_warn
# Write a simple message
write(Level.WARN, "looking_up", "I see a Vogon Constructor Fleet")
# Or, using the more convenient form
log_warn("looking_up", "I see a Vogon Constructor Fleet")
# Use key/value pairs
log_warn("looking_up", vehicle="spaceship", make="Vogon", model="Constructor", count=42)
tigres.check(nodetype, **kwargs)

Get status of a task or template (etc.).

Parameters:
  • nodetype (str, See NodeType for defined values) – Type of thing you are looking for
  • state (str) – Only look at nodes in the given state (default=DONE, None for any)
  • multiple (bool) – If True, return all matching items; otherwise group by nodetype and, if given, name. Return only the last record for each grouping.
  • names (list of str, or str) – List of names (may be regular expressions) of nodes to look for.
  • program_id – Only check the nodes for the specified program
  • template_id – Only check the nodes that belong to the template with the specified template_id.
Returns:

list(LogStatus). See parameter multiple.

tigres.find(name=None, states=None, activity=None, template_id=None, task_id=None, **key_value_pairs)

Find and return log records based on simple criteria of the name of the activity and matching attributes (key/value pairs).

Parameters:
  • name – Name of node, may be empty
  • states (list of str or None) – List of user activities and/or Tigres states to filter on; None means “all”.
  • activity (str) – One specific user activity (may also be listed in ‘states’)
  • template_id (str) – Optional template identifier to filter against
  • task_id (str) – Optional task identifier to filter against
  • key_value_pairs – Key-value pairs to match against. The value may be a number or a string. Exact match is performed. If a key is present in the record, the value must match. If a key is not present in a record, then it is ignored.
Returns:

List of Record instances representing the original user log data
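
The matching rule for key_value_pairs (exact match when the key is present, ignored when it is absent) is easy to misread, so here is a literal model of it using plain dicts. matches is a hypothetical helper and the records are invented; the real find() also filters on name, states and identifiers.

```python
def matches(record, **key_value_pairs):
    """True unless a key that is present in the record has another value."""
    return all(record[key] == value
               for key, value in key_value_pairs.items()
               if key in record)


records = [
    {"activity": "looking_up", "make": "Vogon", "count": 42},
    {"activity": "looking_up", "make": "Golgafrinchan"},
    {"activity": "landing"},
]

# The third record has neither key, so under this rule it is not rejected.
found = [record for record in records
         if matches(record, make="Vogon", count=42)]
```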

tigres.query(spec=None, fields=None, metadata=False)

Find and return log records based on a list of filter expressions.

Supported operators for spec are:

  • >= : Compare two numeric values, A and B, and return whether A >= B.
  • = : Compare two numeric or string values, A and B, and return whether A = B.
  • <= : Compare two numeric values, A and B, and return whether A <= B.
  • > : Compare two numeric values, A and B, and return whether A > B.
  • != : Compare two numeric or string values, A and B, and return whether A != B.
  • < : Compare two numeric values, A and B, and return whether A < B.
  • ~ : Compare the string value, A, with the regular expression B, and return whether B matches A.
Example expressions:
 

foo > 1.5 – find records where field foo is greater than 1.5, ignoring records where foo is not a number.

foo ~ 1\.\d – find records where field foo is a ‘1’ followed by a decimal point followed by some other digit.

Parameters:
  • spec (list of str) – Query specification, which is a list of expressions of the form “name op value”. See documentation for details.
  • fields (None or list of str) – The fields to return, in matched records. If empty, return all
  • metadata – set this to True if you would like meta data for the program to be returned
Returns:

Generator function. Each item represents one record.

Return type:

list of Record (metadata = False), list of tuple (Record,metadata)

Raise:

BuildQueryError, ValueError
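
To make the “name op value” form concrete, the sketch below evaluates one expression against a plain dict. It is not the Tigres query engine: record_matches is a hypothetical helper, the use of re.search for the ~ operator is an assumption, and so is the handling of non-numeric fields.

```python
import operator
import re

COMPARISONS = {">=": operator.ge, "<=": operator.le, ">": operator.gt,
               "<": operator.lt, "=": operator.eq, "!=": operator.ne}


def record_matches(record, expression):
    """Evaluate one "name op value" expression against a dict record."""
    name, op, raw_value = expression.split(None, 2)
    if op != "~" and op not in COMPARISONS:
        raise ValueError("unknown operator: %r" % op)
    if name not in record:
        return False
    field = record[name]
    if op == "~":
        # Regular-expression test; re.search is an assumption of this sketch.
        return re.search(raw_value, str(field)) is not None
    if isinstance(field, (int, float)):
        try:
            return COMPARISONS[op](field, float(raw_value))
        except ValueError:
            return False  # the value is not a number, so nothing to compare
    # Non-numeric fields: only the equality tests apply.
    if op in ("=", "!="):
        return COMPARISONS[op](str(field), raw_value)
    return False
```

For example, record_matches({"foo": 2.0}, "foo > 1.5") is true, while a record whose foo is a string is skipped by the same expression.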

class tigres.Level

Defined levels, and some related functions.

  • NONE (0): Nothing; no logging
  • FATAL (10): Fatal errors
  • ERROR (20): Non-fatal errors
  • WARNING (30): Warnings, i.e. potential errors
  • INFO (40): Informational messages
  • DEBUG (50): Debugging messages
  • TRACE (60): More detailed debugging messages
  • Other levels up to 100 are allowed

Note that the numeric equivalents of these levels are not the same as in the Python logging package. They are in the reverse order: higher levels are less severe. The Level class has static methods for manipulating levels and converting to/from the Python logging levels.

static names()

Return list of all names of levels, in no particular order.

Returns:List of level names
Return type:list of str
static to_logging(level)

Get Python logging module level.

Parameters:level – Tigres logging level
Returns:Equivalent Python logging module level.
Return type:int
static to_name(levelno)

Convert a numeric level to its string equivalent.

Parameters:levelno (int) – Numeric level
Returns:String level, or empty string
Return type:str
static to_number(level_str)

Convert a string level to its numeric equivalent.

Parameters:level_str (str or basestring) – String level (case-insensitive)
Returns:Numeric level, or NONE
Return type:int
tigres.get_results()

Get the results from the latest task or template of the running Tigres program

Returns:
class tigres.LogStatus(rec)

Provide convenient access to status information related to a single log entry.

Instances of this class are returned by the check() function.

to_json()

Return JSON representation of status.

class tigres.Record(rec=None)

Record used for formatting

from_dict(rec)

Add values from dict

Converts the timestamp to a number of seconds since the epoch (1-1-1970) and converts the level name to a number.

intersect_equal(other, ignore=())

Check whether all keys/values in given record, which are present in this one, are the same.

Parameters:
  • other (Record) – The record to compare to
  • ignore (list of str) – Ignore these keys
Return type:

bool

project(fields, copy=True)

Project the record onto just these fields.

Parameters:
  • fields (list of object) – The fields to project onto
  • copy (bool) – Copy to a new record, or in-place
Returns:

New record, or nothing if copy=False

Return type:

Record or None
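
The behaviour described for intersect_equal() and project() can be modelled on plain dicts. The functions below are stand-alone illustrations that follow the descriptions above; they are not the Record methods themselves, and skipping requested fields that are missing from the record is an assumption.

```python
def intersect_equal(this, other, ignore=()):
    """Compare only the keys both dicts share, skipping ignored keys."""
    shared = (set(this) & set(other)) - set(ignore)
    return all(this[key] == other[key] for key in shared)


def project(record, fields):
    """Return a new dict holding only the requested fields that exist."""
    return {key: record[key] for key in fields if key in record}


a = {"make": "Vogon", "count": 42, "ts": 1}
b = {"make": "Vogon", "model": "Constructor", "ts": 2}

same_ignoring_ts = intersect_equal(a, b, ignore=["ts"])
only_make = project(a, ["make", "missing"])
```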

tigres.dot_execution(path='.')

Creates an execution graph of the currently running Tigres program and writes it to a file

Parameters:path – the file path to write the DOT file to
Returns:
Side effect:writes a DOT file of the currently running Tigres program (tigres.core.monitoring.state.Program)

tigres.utils

platform:Unix, Mac
synopsis:Tigres shared classes/methods/constants
module author:Gilberto Pastorello <gzpastorello@lbl.gov>, Val Hendrix <vchendrix@lbl.gov>

Functions

Classes

  • Execution - Execution plugins available for a Tigres program
  • State - States of a Tigres program
  • TigresException - A Tigres Program Exception
  • TaskFailure - Object to be returned in case of failures in task execution
class tigres.utils.Execution

Execution plugins available

  • LOCAL_THREAD
  • LOCAL_PROCESS
  • DISTRIBUTE_PROCESS
  • SGE
  • SLURM
classmethod get(name)

get the execution plugin for the name

Example:
>>> from tigres.utils import Execution
>>> Execution.get('EXECUTION_LOCAL_THREAD')
'tigres.core.execution.plugin.local.ExecutionPluginLocalThread'
Parameters:name – execution plugin string
Returns:plugin module
Return type:str
class tigres.utils.State

Tigres State constants

  • NEW - NEW, CREATED or START
  • READY - WAIT, WAITING or READY
  • RUN - RUN, RUNNING or ACTIVE
  • DONE - DONE, TERMINATED or FINISHED
  • FAIL - FAIL or FAILED execution (e.g., from exception)
  • UNKNOWN - UNKNOWN
classmethod paste(name, state)

Combine a name and a state into a single string.

class tigres.utils.TaskFailure(name='', error=TigresException('Task execution failed', ))

Object to be returned in case of failures in task execution

Parameters:
  • name (str) – Name/description for error
  • error (Exception) – Exception that occurred, if available
exception tigres.utils.TigresException

A Tigres Program Exception

tigres.utils.get_new_output_file(basename='tigres', extension='log')

Get a new unique output file name.

Parameters:
  • basename (str) – Prefix for name
  • extension (str) – File suffix (placed after a ‘.’)
Returns:

Name of file (caller should open)