The user creates a file listing the participating machines in the cluster.
shell$ cat lamhosts
# a 2-node LAM
node1.cluster.example.com
node2.cluster.example.com
Each machine will be given a node identifier (nodeid) starting with 0
for the first listed machine, 1 for the second, etc.
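The numbering rule can be previewed with a few lines of shell. This is only a sketch: the helper name list_nodeids is hypothetical (it is not a LAM tool), and it assumes one hostname per line with blank lines and lines starting with # ignored. It does not contact any machine.

```shell
# Hypothetical helper (not part of LAM): print "n<id> <hostname>" for each
# machine listed in a boot schema, numbering from 0 in file order.
list_nodeids() {
    # Skip blank lines and '#' comments; the first field is the hostname.
    awk '!/^[ \t]*(#|$)/ { printf "n%d %s\n", n++, $1 }' "$1"
}
```

For example, `list_nodeids lamhosts` prints one line per machine, in the order the machines appear in the file.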
The recon tool verifies that the cluster is bootable:
shell$ recon -v lamhosts
recon: -- testing n0 (node1.cluster.example.com)
recon: -- testing n1 (node2.cluster.example.com)
The lamboot tool actually starts LAM on the specified cluster.
shell$ lamboot -v lamhosts
LAM 7.1.1 - Indiana University
Executing hboot on n0 (node1.cluster.example.com - 1 CPU)...
Executing hboot on n1 (node2.cluster.example.com - 1 CPU)...
lamboot returns to the UNIX shell prompt. LAM does not force a canned
environment or a “LAM shell”. The tping command builds user
confidence that the cluster and LAM are running.
shell$ tping -c1 N
1 byte from 1 remote node and 1 local node: 0.008 secs
1 message, 1 byte (0.001K), 0.008 secs (0.246K/sec)
roundtrip min/avg/max: 0.008/0.008/0.008
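The three steps above (verify, boot, ping) can be chained so that a failure stops the sequence early. The following is a minimal sketch, not a LAM facility: the function name boot_lam is hypothetical, and it assumes each tool exits with a non-zero status when it fails.

```shell
# Hypothetical wrapper (not part of LAM): verify, boot, and ping a cluster.
# Assumes recon, lamboot, and tping return non-zero on failure.
boot_lam() {
    # $1 is the boot schema; each step runs only if the previous one succeeded.
    recon -v "$1" &&
        lamboot -v "$1" &&
        tping -c1 N
}
```

Calling `boot_lam lamhosts` returns the status of the first step that failed, or of tping if every step ran.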
Compiling MPI Programs
Refer to “MPI: It’s Easy to Get Started” for a simple MPI program.
mpicc (likewise mpiCC and mpif77) is a wrapper around the C
(respectively C++ and Fortran 77) compiler. It passes the underlying
compiler all the command-line switches needed to find the LAM include
files, the relevant LAM libraries, etc.
shell$ mpicc -o foo foo.c
shell$ mpif77 -o foo foo.f
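In a build script, the choice of wrapper can follow the source file's extension. The sketch below is illustrative only: mpi_wrapper_for is a hypothetical helper, and the extension-to-wrapper mapping (for example .cc, .cpp, and .C for C++) is an assumption rather than something LAM defines.

```shell
# Hypothetical helper (not part of LAM): pick a wrapper compiler by extension.
# The extension list is an assumption; adjust it to local conventions.
mpi_wrapper_for() {
    case $1 in
        *.c)             echo mpicc ;;
        *.cc|*.cpp|*.C)  echo mpiCC ;;
        *.f)             echo mpif77 ;;
        *)               echo "unknown source type: $1" >&2; return 1 ;;
    esac
}
```

A possible use is `"$(mpi_wrapper_for foo.c)" -o foo foo.c`, which expands to the mpicc command shown above.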
Executing MPI Programs
An MPI application is started by one invocation of the mpirun command.
An SPMD (single program, multiple data) application can be started
directly on the mpirun command line.
shell$ mpirun -v -np 2 foo
2445 foo running on n0 (o)
361 foo running on n1
An application with multiple programs must be described in an
application schema, a file that lists each program and its target
node(s).
shell$ cat appfile
# 1 master, 2 slaves
n0 master
n0-1 slave
shell$ mpirun -v appfile
3292 master running on n0 (o)
3296 slave running on n0 (o)
412 slave running on n1
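To see which processes a schema would start, its lines can be expanded into one (node, program) pair per process. This sketch makes several assumptions: expand_schema is a hypothetical helper, each line has the form "<nodes> <program>", and only the node forms nA and nA-B are handled (other LAM node notations are not).

```shell
# Hypothetical helper (not part of LAM): expand an application schema into
# one "n<id> <program>" line per process. Handles only "nA" and "nA-B".
expand_schema() {
    awk '
        /^[ \t]*(#|$)/ { next }              # skip comments and blank lines
        {
            spec = substr($1, 2)             # drop the leading "n"
            k = split(spec, r, "-")          # "0-1" -> r[1]=0, r[2]=1
            first = r[1] + 0
            last = (k > 1) ? r[2] + 0 : first
            for (i = first; i <= last; i++) print "n" i, $2
        }' "$1"
}
```

Under these assumptions, a schema with the lines "n0 master" and "n0-1 slave" expands to a master on n0 and a slave on each of n0 and n1, which matches the mpirun output above.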
Monitoring MPI Applications
The full MPI synchronization status of all processes and messages can
be displayed at any time. This includes the source and destination
ranks, the message tag, count and datatype, the communicator, and the
function invoked.
shell$ mpitask
TASK (G/L)    FUNCTION    PEER|ROOT  TAG  COMM   COUNT  DATATYPE
0/0 master    Recv        ANY        ANY  WORLD  1      INT
1 slave       <running>
2 slave       <running>
Process rank 0 is blocked receiving a message consisting of a single
integer from any source rank and any message tag, using the
MPI_COMM_WORLD communicator. The other processes are running.
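When many processes are listed, it can help to filter the status table down to the ones that are blocked inside an MPI call. The sketch below is an illustration under stated assumptions: blocked_tasks is a hypothetical helper, the table is assumed to come from LAM's mpitask command with a one-line header, and program names are assumed to contain no spaces.

```shell
# Hypothetical filter (not part of LAM): read a task status table on stdin
# and print rank, program, and MPI function for every task that is not
# marked <running>. Assumes one header line and space-free program names.
blocked_tasks() {
    awk 'NR > 1 && NF >= 3 && $3 != "<running>" { print $1, $2, $3 }'
}
```

With the table above as input (for example `mpitask | blocked_tasks`), only the line for process rank 0, blocked in Recv, is printed.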
shell$ mpimsg
SRC (G/L)    DEST (G/L)    TAG  COMM   COUNT  DATATYPE  MSG
0/0          1/1           7    WORLD  4      INT       n0,#0
Later, we see that a message sent by process rank 0 to process rank 1
is buffered and waiting to be received. It was sent with tag 7 using
the MPI_COMM_WORLD communicator and contains 4 integers.
All user processes and messages can be removed, without rebooting.
shell$ lamclean -v
killing processes, done
sweeping messages, done
closing files, done
sweeping traces, done
It is typical for users to mpirun a program, lamclean when it
finishes, and then mpirun another program. It is not necessary to
lamboot to run each user MPI program.
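The run-then-clean cycle can be wrapped so that cleanup happens even when the program fails. This is a minimal sketch: run_and_clean is a hypothetical helper, and it assumes mpirun reports failure through a non-zero exit status.

```shell
# Hypothetical wrapper (not part of LAM): run one MPI job, then clean up
# leftover processes and messages, preserving mpirun's exit status.
run_and_clean() {
    mpirun "$@"
    status=$?          # remember how the job ended
    lamclean -v        # always clean, whether or not the job succeeded
    return "$status"
}
```

For example, `run_and_clean -v -np 2 foo` followed by `run_and_clean -v appfile` runs two jobs back to back on the same booted LAM.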
The lamhalt tool removes all traces of the LAM session on the network.
This is only performed when LAM/MPI is no longer needed (i.e., no more
mpirun/lamclean commands will be issued).
shell$ lamhalt
In the case of a catastrophic failure (e.g., one or more LAM nodes
crash), the lamhalt utility will hang, and the wipe tool must be used
instead. wipe needs the same boot schema that was given to lamboot,
because it lists each node where the LAM run-time environment is
running:
shell$ wipe -v lamhosts
Executing tkill on n0 (node1.cluster.example.com)…
Executing tkill on n1 (node2.cluster.example.com)…
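The shutdown procedure can be sketched as "try lamhalt, fall back to wipe". The sketch relies on assumptions that are not part of LAM: halt_lam is a hypothetical helper, the GNU coreutils timeout command is available to interrupt a hung lamhalt, the 30-second limit is arbitrary, and lamhalt is assumed to exit non-zero when it fails.

```shell
# Hypothetical wrapper (not part of LAM): halt LAM, falling back to wipe.
# Assumes GNU timeout, which exits non-zero (124) if lamhalt is still
# running after 30 seconds. $1 is the boot schema used with lamboot.
halt_lam() {
    timeout 30 lamhalt || wipe -v "$1"
}
```

Calling `halt_lam lamhosts` runs wipe only when lamhalt fails or does not finish within the time limit.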