.. _hpctutorial:

.. currentmodule:: tigres

HPC Tutorial
************

Scaling up a Tigres workflow to an HPC system like those at `NERSC `_ is
straightforward. This tutorial demonstrates how to run a Tigres workflow at
`NERSC `_ and assumes that you have a `NERSC account `_. It walks you through
setting up a Python environment for execution and submitting a Portable Batch
System (PBS) script to each of the three NERSC systems: `Edison `_,
`Hopper `_ and `Carver `_.

.. contents::
    :depth: 1
    :local:
    :backlinks: top

Python Environment
==================

The following steps detail how to set up a Python environment for running
Tigres workflows on NERSC resources. The instructions are for Edison but
should work on other similarly configured NERSC resources.

These instructions assume:

* You have a NERSC account
* You want to submit jobs to your default NERSC repo

1. Get the Latest Tigres Release

   `Download `_ the Tigres |release| source distribution from the
   `Tigres bitbucket repository `_. Then copy it to your NERSC home
   directory:

   .. parsed-literal::

      $ scp tigres-|release|.tar.gz dtn01.nersc.gov:./

2. Get the Tutorials Archive

   `Download `_ the Tigres |release| tutorials from the
   `Tigres bitbucket repository `_. Then copy the archive to your NERSC
   home directory:

   .. parsed-literal::

      $ scp tigres-|release|-tutorials.zip dtn01.nersc.gov:./

   Log in to NERSC and unzip the tutorials in your home directory:

   .. parsed-literal::

      $ ssh edison.nersc.gov
      $ unzip tigres-|release|-tutorials.zip
      $ cd tigres-|release|-tutorials

3. Run the Setup Script

   Change to the tutorial directory and run :code:`setup_env.sh`. The setup
   script prepares the Tigres Python environment. After the script runs,
   there will be a new virtualenv environment, :code:`env`, suffixed with
   the NERSC host name::

      $ ./setup_env.sh

4. Install Tigres

   There is one final step: Tigres must now be installed into the virtualenv
   environment that was set up in the previous step.

   .. parsed-literal::

      $ module load python
      $ source envedison/bin/activate
      (envedison)$ pip install --no-index $HOME/tigres-|release|.tar.gz

Basic Statistics Example
========================

Once the Tigres environment is set up in your NERSC home directory, you are
ready to run the sample program, :download:`basic_statistics_by_column.py `,
which takes a delimited text file and performs basic statistics (total, mean,
median, variance, standard deviation) on the specified columns. The Tigres
program extracts the data and uses a parallel template for the statistical
calculations. This section walks you through the steps to get the sample
data and submit the job to the job manager queue.

--------

1. Get the Data

   This example uses a large dataset, `household_power_consumption.txt `_,
   from the `UC Irvine Machine Learning repository `_::

      $ wget https://archive.ics.uci.edu/ml/machine-learning-databases/00235/household_power_consumption.zip
      $ unzip household_power_consumption.zip

2. Submit the Job

   A helper script, :download:`tigres_submit.sh `, is provided in the
   tutorial archive. It uses :download:`tigres_run.pbs ` to submit the
   Tigres program to the workflow queue. Run the script::

      $ ./tigres_submit.sh
      Running on edison
      1344091.edique02

   The job is now submitted and should run in the `debug queue `_::

      $ qstat -u me

      edique02:
                                                                              Req'd     Req'd      Elap
      Job ID                  Username    Queue    Jobname          SessID NDS   TSK   Memory    Time   S    Time
      ----------------------- ----------- -------- ---------------- ------ ----- ------ ------ --------- - ---------
      1344091.edique02        me          debug    EnergyConsumptio    --      1     24     -- 00:30:00  Q        --

Cluster Management with SGE
===========================

For longer workflows, or workflows where you would like to set up a
glide-in/private cluster, we provide a way to do so using the MySGE
mechanism.
At NERSC, MySGE can be used as the execution mechanism for Tigres workflows:

    MySGE allows users to create a private Sun GridEngine cluster on large
    parallel systems like Hopper or Franklin. Once the cluster is started,
    users can submit serial jobs, array jobs, and other throughput-oriented
    workloads into the personal SGE scheduler. The jobs are then run within
    the user's private cluster.

    -- [1]_

-----

1. Set up MySGE

   Follow the instructions below to set up MySGE in your workspace. These
   instructions can also be found at [1]_::

      $ ssh edison.nersc.gov
      $ module load mysge
      $ mysge_init ( use all defaults )

2. Submit the Job using MySGE

   Once you have set up MySGE, use the following command to submit
   :download:`basic_statistics_by_column.py ` using EXECUTION_SGE with
   MySGE::

      $ ./tigres_submit.sh mysge
      Running on edison with EXECUTION_SGE
      1344091.edique02

3. View the Job Status

   This example uses 2 nodes for a total of 48 cores::

      $ qstat -u me

      edique01:
                                                                              Req'd     Req'd      Elap
      Job ID                  Username    Queue    Jobname          SessID NDS   TSK   Memory    Time   S    Time
      ----------------------- ----------- -------- ---------------- ------ ----- ------ ------ --------- - ---------
      7794778.edique01        vch         debug    EnergyConsumptio      0     2     48     -- 00:30:00  R  00:01:04

Batch Scripts
=============

PBS Script
----------

The PBS script, :download:`tigres_run.pbs `, can be submitted on any NERSC
system. This script:

* starts/stops MySGE if needed (lines 6-10, 25-27)
* loads python (line 14)
* activates the Tigres python environment (line 15)
* runs the Tigres workflow (line 23)

.. literalinclude:: /_static/code/tigres_run.pbs
    :linenos:
    :language: bash

Lines 9-12 set up the environment variables expected by the
:code:`EXECUTION_DISTRIBUTE_PROCESS` execution mechanism. This mechanism is
used on Carver but not on Edison and Hopper.
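If you want to adapt the batch script for your own workflow, a stripped-down
sketch of such a PBS script is shown below. This is *not* the shipped
:code:`tigres_run.pbs`: the queue, core count, virtualenv name, and program
arguments are placeholders taken from the Edison examples above.

```shell
#!/bin/bash
#PBS -q debug                  # NERSC debug queue, as in the examples above
#PBS -l mppwidth=24            # one Edison node
#PBS -l walltime=00:30:00
#PBS -j oe                     # merge stdout and stderr into one file

cd $PBS_O_WORKDIR              # directory the job was submitted from

module load python             # NERSC python module
source envedison/bin/activate  # virtualenv created by setup_env.sh

# Placeholder invocation; the real script passes its own arguments
python basic_statistics_by_column.py household_power_consumption.txt
```

The real :code:`tigres_run.pbs` additionally starts and stops MySGE and
exports the environment variables described above; this sketch covers only
the load/activate/run core.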
Job Submission Script
---------------------

The job submission script, :download:`tigres_submit.sh `:

* sets the number of nodes/cores (lines 3, 5-7)
* chooses the Tigres execution mechanism (line 4)
* if requested, sets the execution mechanism to EXECUTION_SGE and mppwidth
  to 48, i.e. 2 nodes (lines 10-13)
* prepares the command for running the Tigres workflow and submits the job
  (line 16)

.. literalinclude:: /_static/code/tigres_submit.sh
    :linenos:
    :language: bash

.. [1] https://www.nersc.gov/users/software/workflow-software/mysge/
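The mechanism/node-count switch that :code:`tigres_submit.sh` performs can be
sketched as follows. This is a minimal sketch only: the variable names, the
assumed default mechanism, and the :code:`-v` environment hand-off are
illustrative, not the shipped script.

```shell
#!/bin/bash
# Sketch of the submission logic described above. Variable names and the
# default mechanism are hypothetical; see tigres_submit.sh for the real logic.
MECHANISM=${1:-}                        # optional first argument, e.g. "mysge"
EXECUTION=EXECUTION_DISTRIBUTE_PROCESS  # assumed default; the real script picks per system
MPPWIDTH=24                             # one Edison node (24 cores)

if [ "$MECHANISM" = "mysge" ]; then
    EXECUTION=EXECUTION_SGE             # route tasks through the private SGE cluster
    MPPWIDTH=48                         # two nodes, as in the example above
fi

# Build the qsub invocation (printed here instead of submitted)
CMD="qsub -l mppwidth=$MPPWIDTH -v TIGRES_EXECUTION=$EXECUTION tigres_run.pbs"
echo "$CMD"
```

Running it with no argument prints a 24-core submission; passing
:code:`mysge` switches to EXECUTION_SGE on 48 cores, mirroring the
examples earlier in this tutorial.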