HPC Tutorial

Scaling a Tigres workflow up to an HPC system like those at NERSC is straightforward. This tutorial demonstrates how to run a Tigres workflow at NERSC and assumes that you have a NERSC account. It walks you through setting up a Python environment for execution and submitting a Portable Batch System (PBS) script to each of the three NERSC systems: Edison, Hopper, and Carver.

Python Environment

The following steps detail how to set up a Python environment for running Tigres workflows on NERSC resources. The instructions are for Edison but should work on other similarly configured NERSC resources.

The following instructions assume:

  • You have a NERSC account
  • You want to submit jobs to your default NERSC repo
  1. Get Latest Tigres Release

    Download the Tigres 0.1.0 source distribution from the Tigres Bitbucket repository. Then copy it to your NERSC home directory:

    $ scp tigres-0.1.0.tar.gz dtn01.nersc.gov:./
    
  2. Get Tutorials Archive

    Download the Tigres 0.1.0 tutorials from the Tigres Bitbucket repository. Then copy them to your NERSC home directory:

    $ scp tigres-0.1.0-tutorials.zip dtn01.nersc.gov:./
    

    Log in to NERSC and unzip the tutorials in your home directory:

    $ ssh edison.nersc.gov
    $ unzip tigres-0.1.0-tutorials.zip
    $ cd tigres-0.1.0-tutorials
    
  3. Setup Script

    Change to the tutorial directory and run setup_env.sh. The setup script prepares the Tigres Python environment; after it runs, there will be a new virtualenv environment, env<NERSC_HOST>, suffixed with the NERSC host name:

    $ ./setup_env.sh
    
  4. Install Tigres

    There is one final step: Tigres must be installed into the virtualenv environment created in the previous step (a quick verification follows these steps).

    $ module load python
    $ source envedison/bin/activate
    (envedison)$ pip install --no-index $HOME/tigres-0.1.0.tar.gz
    
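Optionally, verify the install with a quick import check from inside the activated environment. This is a generic Python check, not a Tigres-specific command; no output means the package imported cleanly:

    (envedison)$ python -c "import tigres"
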

Basic Statistics Example

Once the Tigres environment is set up in your NERSC home directory, you are ready to run the sample program, basic_statistics_by_column.py, which takes a delimited text file and computes basic statistics (total, mean, median, variance, standard deviation) on the specified columns. The Tigres program extracts the data and uses a parallel template for the statistical calculations.
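
For reference, the program takes the execution mechanism, the input file, the field delimiter, and a comma-separated list of column indices. The batch script shown later invokes it like this:

    $ ./basic_statistics_by_column.py EXECUTION_LOCAL_THREAD household_power_consumption.txt ';' 2,3,4,5,6,7,8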

This section walks you through getting the sample data and submitting the job to the batch queue.


  1. Get the Data

    This example uses a large dataset, household_power_consumption.txt, from the UC Irvine Machine Learning Repository:

    $ wget https://archive.ics.uci.edu/ml/machine-learning-databases/00235/household_power_consumption.zip
    $ unzip household_power_consumption.zip
    
  2. Submit the Job

    A helper script, tigres_submit.sh, is provided in the tutorial archive. It uses tigres_run.pbs to submit the Tigres program to the batch queue.

    Run the script:

    $ ./tigres_submit.sh
    Running on edison with EXECUTION_LOCAL_THREAD
    1344091.edique02
    

    The job is now submitted and should run in the debug queue:

    $ qstat -ume
    
    edique02:
                                                                                      Req'd    Req'd       Elap
    Job ID                  Username    Queue    Jobname          SessID  NDS   TSK   Memory   Time    S   Time
    ----------------------- ----------- -------- ---------------- ------ ----- ------ ------ --------- - ---------
    1344091.edique02        me          debug    EnergyConsumptio    --      1     24    --   00:30:00 Q       --
    

Cluster Management with SGE

For longer workflows, or workflows where you would like to set up a glide-in/private cluster, we support the MySGE mechanism. At NERSC, MySGE can be used as the execution mechanism for Tigres workflows:

MySGE allows users to create a private Sun GridEngine cluster on large parallel systems like Hopper or Franklin. Once the cluster is started, users can submit serial jobs, array jobs, and other throughput-oriented workloads into the personal SGE scheduler. The jobs are then run within the user's private cluster. – [1]

  1. Setup MySGE

    Follow the instructions below to set up MySGE in your workspace. These instructions can be found at [1]:

    $ ssh edison.nersc.gov
    $ module load mysge
    $ mysge_init    (use all defaults)
    
  2. Submit the Job using MySGE

    Once you have set up MySGE, use the following command to submit basic_statistics_by_column.py with EXECUTION_SGE (tigres_run.pbs starts and stops the private cluster for you; a manual sketch of that lifecycle follows this list):

    $ ./tigres_submit.sh mysge
    Running on edison with EXECUTION_SGE
    1344091.edique02
    
  3. View job status

    This example uses 2 nodes for a total of 48 cores:

    $ qstat -ume
    
    edique01:
                                                                                      Req'd    Req'd       Elap
    Job ID                  Username    Queue    Jobname          SessID  NDS   TSK   Memory   Time    S   Time
    ----------------------- ----------- -------- ---------------- ------ ----- ------ ------ --------- - ---------
    7794778.edique01        vch         debug    EnergyConsumptio      0     2     48    --   00:30:00 R  00:01:04
    
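You do not need to drive the private cluster by hand; tigres_run.pbs (shown below) starts it before the workflow runs and stops it afterwards. Done manually on Edison, the same lifecycle would look roughly like this, assuming mysge_init wrote its settings to ~/.vpc.edison.sh (the file the batch script sources):

    $ module load mysge
    $ source ~/.vpc.edison.sh
    $ vpc_start -q ccm_queue -l mppwidth=48 -V
    $ vpc_stop
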

Batch Scripts

PBS Script

The PBS script, tigres_run.pbs, can be submitted on any NERSC system. This script

  • starts/stops MySGE if needed (lines 6-10, 25-27)
  • loads python (line 14)
  • activates the Tigres python environment (line 15)
  • runs the Tigres workflow (line 23):
 1  #!/bin/bash -l
 2  #PBS -V
 3
 4  cd $PBS_O_WORKDIR
 5
 6  if [ "${EXECUTION}" == "EXECUTION_SGE" ]; then
 7      module load mysge
 8      source ~/.vpc.${NERSC_HOST}.sh
 9      vpc_start -q ccm_queue -l mppwidth=48 -V
10      sleep 60
11
12  fi
13
14  module load python
15  source env${NERSC_HOST}/bin/activate
16
17  export TIGRES_HOSTS="`awk -vORS=, '{ print $1 }' $PBS_NODEFILE | sed 's/,$/\n/'`"
18  export OTIGRES_PATH=$PATH
19  export OTIGRES_PYTHONPATH=$PYTHONPATH
20  export OTIGRES_LD_LIBRARY_PATH=$LD_LIBRARY_PATH
21
22  echo "./basic_statistics_by_column.py $EXECUTION household_power_consumption.txt ';' 2,3,4,5,6,7,8"
23  ./basic_statistics_by_column.py $EXECUTION household_power_consumption.txt ';' 2,3,4,5,6,7,8
24
25  if [ "${EXECUTION}" == "EXECUTION_SGE" ]; then
26      vpc_stop
27  fi

Lines 17-20 set up the environment variables expected by the EXECUTION_DISTRIBUTE_PROCESS execution mechanism. This mechanism is used on Carver but not on Edison or Hopper.
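
The TIGRES_HOSTS value on line 17 is built by joining the node names in $PBS_NODEFILE with commas. You can see what the awk/sed pipeline produces by running it on a hand-made node file (nodes.txt and the nid* names here are made up for illustration):

    $ printf 'nid00012\nnid00013\n' > nodes.txt
    $ awk -vORS=, '{ print $1 }' nodes.txt | sed 's/,$/\n/'
    nid00012,nid00013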

Job Submission Script

The job submission script, tigres_submit.sh

  • sets the number of nodes/cores (lines 3, 5-7)
  • chooses the Tigres execution mechanism (line 4)
  • prepares the command for running the Tigres workflow and submits the job (line 16)
  • if requested, sets the execution to EXECUTION_SGE and mppwidth to 48, i.e. two nodes (lines 10-13):
 1  #!/bin/bash
 2
 3  export NUMCORES="mppwidth=24"
 4  export EXECUTION="EXECUTION_LOCAL_THREAD"
 5  if [ "$NERSC_HOST" == "carver" ]; then
 6      export EXECUTION="EXECUTION_DISTRIBUTE_PROCESS"
 7      export NUMCORES="nodes=3:ppn=8"
 8  fi
 9
10  if [ "$1" == "mysge" ]; then
11      export EXECUTION="EXECUTION_SGE"
12      export NUMCORES="mppwidth=48"
13  fi
14
15  echo "Running on ${NERSC_HOST} with ${EXECUTION}"
16  qsub -V -NEnergyConsumption${NERSC_HOST} -l${NUMCORES},walltime=00:30:00 tigres_run.pbs
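
For example, on Edison with no arguments, the variables expand so that line 16 runs the equivalent of the following (this matches the default mppwidth=24 and job name seen in the earlier qstat output):

    qsub -V -NEnergyConsumptionedison -lmppwidth=24,walltime=00:30:00 tigres_run.pbs
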
[1] https://www.nersc.gov/users/software/workflow-software/mysge/