Most HPX applications are executed on parallel computers. These platforms typically provide integrated job management services that facilitate the allocation of computing resources for each parallel program. HPX includes out-of-the-box support for one of the most common job management systems, the Portable Batch System (PBS).
All PBS jobs require a script to specify the resource requirements and
other parameters associated with a parallel job. The PBS script is basically
a shell script with PBS directives placed within commented sections at
the beginning of the file. The remaining (not commented-out) portions of
the file execute just like any other regular shell script. While the description
of all available PBS options is outside the scope of this tutorial (the
interested reader may refer to in-depth documentation
for more information), below is a minimal example to illustrate the approach.
As a test application we will use the multithreaded `hello_world` program, explained in the section Hello World Example.

    #!/bin/bash
    #
    #PBS -l nodes=2:ppn=4

    APP_PATH=~/packages/hpx/bin/hello_world
    APP_OPTIONS=

    pbsdsh -u $APP_PATH $APP_OPTIONS --hpx:nodes=`cat $PBS_NODEFILE`

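Caution: If the first application-specific argument (inside `$APP_OPTIONS`) is a non-option (that is, it does not start with `-` or `--`), it has to be placed before the option `--hpx:nodes`, which in this case should be the last option on the command line. Alternatively, use the option `--hpx:endnodes` to explicitly mark the end of the list of node names:

    pbsdsh -u $APP_PATH --hpx:nodes=`cat $PBS_NODEFILE` --hpx:endnodes $APP_OPTIONS
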
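To see why the position of `--hpx:nodes` matters, it helps to look at how the shell expands the backquoted `cat $PBS_NODEFILE`. The following sketch can be run without PBS. The node names `node01` and `node02`, the temporary node file, and the argument `input.dat` are hypothetical stand-ins, and the one-name-per-line layout of the node file is an assumption made for illustration.

```shell
#!/bin/bash
# Hypothetical stand-in for the node file PBS provides via $PBS_NODEFILE
# (assumed here to hold one node name per line).
PBS_NODEFILE=$(mktemp)
printf '%s\n' node01 node02 > "$PBS_NODEFILE"

# Same backquoted substitution as in the job script. It is left unquoted
# on purpose, so the shell splits the file contents into separate words.
set -- --hpx:nodes=`cat "$PBS_NODEFILE"` --hpx:endnodes input.dat

ARG_COUNT=$#
printf 'argument: %s\n' "$@"

rm -f "$PBS_NODEFILE"
```

Only the first node name is attached to `--hpx:nodes=`; the remaining names arrive as separate words. A trailing non-option argument such as `input.dat` would therefore look like one more node name unless `--hpx:endnodes` precedes it.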
The `#PBS -l nodes=2:ppn=4` directive will cause two compute nodes to be allocated for the application, as specified in the option `nodes`. Each of the nodes will dedicate four cores to the program, as per the option `ppn`, short for "processors per node" (PBS does not distinguish between processors and cores). Note that requesting more cores per node than physically available is pointless and may prevent PBS from accepting the script.
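The arithmetic behind the resource request can be sketched as follows. The variable names are illustrative only, and the parsing assumes the simple `nodes=N:ppn=M` form used in this example rather than the full PBS resource syntax.

```shell
#!/bin/bash
# Resource request from the example directive "#PBS -l nodes=2:ppn=4".
RESOURCES="nodes=2:ppn=4"

# Extract the two numbers with plain parameter expansion.
NODES=${RESOURCES#nodes=}   # strip the leading "nodes=" -> "2:ppn=4"
NODES=${NODES%%:*}          # drop everything from the first ":" -> "2"
PPN=${RESOURCES##*ppn=}     # strip everything up to "ppn=" -> "4"

# Total number of cores requested across all nodes.
TOTAL_CORES=$((NODES * PPN))
echo "nodes=$NODES ppn=$PPN total cores=$TOTAL_CORES"
```

For the example directive this yields eight cores in total, which is consistent with the eight lines of output that `hello_world` produces in this tutorial.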
`APP_PATH` and `APP_OPTIONS` are shell variables that respectively specify the correct path to the executable (`hello_world` in this case) and the command line options. Since the `hello_world` application doesn't need any command line options, `APP_OPTIONS` has been left empty. Unlike in other execution environments, there is no need to use the `--hpx:threads` option to indicate the required number of OS threads per node; the HPX library will derive this parameter automatically from PBS.
Finally, `pbsdsh` is a PBS command that starts tasks on the resources allocated to the current job. It is recommended to leave this line as shown and modify only the PBS options and shell variables as needed for a specific application.
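Important: A script invoked by `pbsdsh` starts in a very basic environment: the user's shell profile is not processed, so settings that would normally be established there are not available, unlike in the environment of the main job script.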
Another choice is for the `pbsdsh` command in your main job script to invoke your program via a shell, like `sh` or `bash`, so that each instance gets an initialized environment. We create a small script `runme.sh` which is used to invoke the program:

    #!/bin/bash

    # Small script which invokes the program based on what was passed on its
    # command line.
    #
    # This script is executed by the bash shell which will initialize all
    # environment variables as usual.
    "$@"

Now, we invoke this script using the `pbsdsh` tool:

    #!/bin/bash
    #
    #PBS -l nodes=2:ppn=4

    APP_PATH=~/packages/hpx/bin/hello_world
    APP_OPTIONS=

    pbsdsh -u runme.sh $APP_PATH $APP_OPTIONS --hpx:nodes=`cat $PBS_NODEFILE`

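The forwarding behavior of such a wrapper can be checked locally, without PBS. In this sketch a temporary file stands in for `runme.sh`, and `echo` together with the node name `node01` are hypothetical placeholders for the real application and its node list. The sketch quotes `"$@"` so that arguments containing spaces stay intact.

```shell
#!/bin/bash
# Temporary stand-in for runme.sh.
WRAPPER=$(mktemp)
cat > "$WRAPPER" <<'EOF'
#!/bin/bash
# Run whatever command line was passed to this script.
"$@"
EOF

# Use "echo" as a placeholder program so the forwarded arguments are visible.
OUTPUT=$(bash "$WRAPPER" echo hello --hpx:nodes=node01)
echo "$OUTPUT"

rm -f "$WRAPPER"
```

The wrapper adds nothing of its own to the command line; its only purpose is to have a shell start first, so that the program runs in an initialized environment.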
All that remains now is submitting the job to the queuing system. Assuming that the contents of the PBS script were saved in the file `pbs_hello_world.sh` in the current directory, this is accomplished by typing:

    qsub ./pbs_hello_world.sh

If the job is accepted, `qsub` will print out the assigned job ID, which may look like:

    42.supercomputer.some.university.edu

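To check the status of your job, issue the following command:

    qstat 42.supercomputer.some.university.edu

and look for a single-letter job status symbol. The common cases include `Q` (the job is queued and awaiting its turn to be executed), `R` (the job is currently running), and `C` (the job has completed).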
The example `qstat` output below shows a job waiting for execution resources to become available:

    Job id                    Name             User            Time Use S Queue
    ------------------------- ---------------- --------------- -------- - -----
    42.supercomputer          ...ello_world.sh joe_user               0 Q batch

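When the status is needed in a script, the single-letter symbol can be pulled out of the `qstat` output. The sketch below parses the sample output shown above instead of calling `qstat`, so it runs anywhere. It assumes the column layout of that sample, where the status is the second-to-last field; the variable names are illustrative.

```shell
#!/bin/bash
# Sample qstat output, copied from the example above.
QSTAT_OUTPUT='Job id                    Name             User            Time Use S Queue
------------------------- ---------------- --------------- -------- - -----
42.supercomputer          ...ello_world.sh joe_user               0 Q batch'

# Skip the two header lines and print the second-to-last field (column "S").
STATUS=$(printf '%s\n' "$QSTAT_OUTPUT" | awk 'NR > 2 { print $(NF - 1) }')
echo "job status: $STATUS"
```

With PBS available, the same `awk` filter could read from `qstat 42.supercomputer.some.university.edu` instead of the canned text.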
After the job completes, PBS will place two files, `pbs_hello_world.sh.o42` and `pbs_hello_world.sh.e42`, in the directory where the job was submitted. The first contains the standard output and the second contains the standard error from all the nodes on which the application executed. In our example, the error output file should be empty and the standard output file should contain something similar to:

    hello world from OS-thread 3 on locality 0
    hello world from OS-thread 2 on locality 0
    hello world from OS-thread 1 on locality 1
    hello world from OS-thread 0 on locality 0
    hello world from OS-thread 3 on locality 1
    hello world from OS-thread 2 on locality 1
    hello world from OS-thread 1 on locality 0
    hello world from OS-thread 0 on locality 1

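The names of the two output files follow from the script name and the numeric part of the job ID. Here is a minimal sketch, assuming the naming pattern described above (script name, then `.o` or `.e`, then the job number); the variable names are illustrative.

```shell
#!/bin/bash
# Job ID as printed by qsub, and the name of the submitted script.
JOB_ID="42.supercomputer.some.university.edu"
SCRIPT_NAME="pbs_hello_world.sh"

# The job number is everything before the first dot.
JOB_NUMBER=${JOB_ID%%.*}

STDOUT_FILE="${SCRIPT_NAME}.o${JOB_NUMBER}"
STDERR_FILE="${SCRIPT_NAME}.e${JOB_NUMBER}"
echo "stdout: $STDOUT_FILE"
echo "stderr: $STDERR_FILE"

# If the job has already finished in this directory, report whether the
# error file is empty, as expected for this example.
if [ -e "$STDERR_FILE" ]; then
    if [ -s "$STDERR_FILE" ]; then
        echo "error output is not empty"
    else
        echo "error output is empty"
    fi
fi
```

The final check only runs when the error file exists in the current directory, so the sketch is harmless to run before any job has been submitted.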
Congratulations! You have just run your first distributed HPX application!