libmpi(3)
libmpi − LAM MPI implementation
LAM features a full implementation of the Message-Passing Interface (MPI), with the exception that the MPI_CANCEL function will not properly cancel messages that have been sent. Pending receive requests can be canceled; canceling sends was judged too difficult to implement (and not enough LAM users asked for it).
Compliant applications are source-code portable between LAM and any other implementation of MPI. In addition to meeting the standard in a high-quality manner, LAM offers extensive monitoring capabilities to support debugging. Monitoring happens on two levels. On the first level, LAM has hooks that allow a snapshot of process and message status to be taken at any time during an application run. The status includes all aspects of synchronization plus datatype map / signature, communicator group membership, and message contents. On the second level, the MPI library is instrumented to produce a cumulative record of communication, which can be visualized either at runtime or post-mortem.
Another strength of this MPI implementation is the movement of non-blocking communication requests, including those that result from buffered sends. This is the real challenge of implementing MPI; everything else is mostly a straightforward wrapping of an underlying communication mechanism. LAM allows messages to be buffered on the source end in any state of progression, including partially transmitted packets. This capability leads to great portability and robustness.
Up-to-Date Information
The LAM home page can be found on the World Wide Web at:
http://www.lam-mpi.org/. It should be consulted for
the most current information about LAM, as well as updates,
patches, etc.
Direct MPI Communication
The sophisticated message advancing engine at the heart of
the MPI library uses only a handful of routines to drive the
underlying communication system. Runtime flags decide which
implementation of these low-level routines is used, so
recompilation is not necessary. The default implementation
uses LAM’s network message-passing subsystem,
including its buffer daemon. In this "daemon"
mode, LAM’s extensive monitoring features are fully
available. The main purpose of daemon-based communication is
development, but depending on the application’s
decomposition granularity and communication requirements, it
may also be entirely adequate for production execution.
The other implementation of the MPI library’s low-level communication aims to use the highest-performance underlying mechanism, bypassing the LAM daemon and connecting application processes directly. This is the "client to client" (C2C) mode.
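The runtime choice between the two modes can be pictured as a table of function pointers selected by a flag. The following is a minimal sketch in plain C, not LAM's internal code: the names `transport_ops`, `select_transport`, and `advance_message`, the flag string `"c2c"`, and the toy hop counts are all hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical table of low-level routines; not LAM's real interface. */
typedef struct {
    const char *name;
    int (*send_bytes)(const void *buf, size_t len); /* returns a toy hop count */
    int full_monitoring;  /* 1 if monitoring tools see the full message state */
} transport_ops;

/* Toy stand-ins: the daemon path goes through an intermediary (2 hops),
 * the client-to-client path connects processes directly (1 hop). */
static int daemon_send(const void *buf, size_t len) { (void)buf; (void)len; return 2; }
static int c2c_send(const void *buf, size_t len)    { (void)buf; (void)len; return 1; }

static const transport_ops DAEMON_OPS = { "daemon", daemon_send, 1 };
static const transport_ops C2C_OPS    = { "c2c",    c2c_send,    0 };

/* Choose an implementation from a runtime flag; daemon mode is the default,
 * so switching modes needs no recompilation of the caller. */
const transport_ops *select_transport(const char *flag)
{
    if (flag != NULL && strcmp(flag, "c2c") == 0)
        return &C2C_OPS;
    return &DAEMON_OPS;
}

/* Library code calls through the table and never names a mode directly. */
int advance_message(const transport_ops *ops, const void *buf, size_t len)
{
    assert(ops != NULL && ops->send_bytes != NULL);
    return ops->send_bytes(buf, len);
}
```

A caller would obtain the table once, for example `const transport_ops *ops = select_transport(flag);`, and pass it to every `advance_message` call.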
The availability of optimal C2C implementations will continue to change as architectures come and go. At the least, LAM includes a combined TCP/IP and shared-memory implementation of C2C that bypasses the LAM daemon.
MPI process and message monitoring commands and tools will be much less effective in C2C mode, usually reporting running processes and empty message queues. Signal delivery with doom(1) is unaffected.
Guaranteed Envelope Resources
Applications may fail, legitimately, on some implementations
but not others due to an escape hatch in the MPI Standard
called "resource limitations". Most resources are
managed locally and it is easy for an implementation to
provide a helpful error code and/or message when a resource
is exhausted. Buffer space for message envelopes is often a
remote resource (as in LAM) which is difficult to manage. An
overflow may not be reported (as in some other
implementations) to the process that caused the overflow.
Moreover, interpretation of the MPI guarantee on message
progress may confuse the debugging of an application that
actually died on envelope overflow.
LAM has a property called "Guaranteed Envelope Resources" (GER) which serves two purposes. First, it is a promise from the implementation to the application that a minimum amount of envelope buffering will be available to each process pair. Second, it ensures that a producer of messages that overflows this resource will be throttled or cited with an error as necessary.
A minimum GER is configured when LAM is built. The MPI library uses a protocol to ensure GER when running in daemon mode. The default C2C mode (TCP/IP) does not use a protocol, because process-pair protection is provided by TCP/IP itself. Errors are only reported to the receiving process in C2C mode. An option to mpirun(1) disables GER.
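The idea of a per-pair envelope guarantee can be illustrated with a simple counter for each (source, destination) pair. The sketch below is a toy model in plain C and is not the protocol LAM uses: `NPROCS`, `GER_ENVELOPES`, and the `ger_*` names are hypothetical, and a real implementation would have to coordinate this state between processes.

```c
#include <assert.h>

#define NPROCS 4          /* hypothetical job size */
#define GER_ENVELOPES 8   /* hypothetical guaranteed envelopes per process pair */

typedef enum { GER_OK = 0, GER_WOULD_OVERFLOW = 1 } ger_status;

typedef struct {
    int in_flight[NPROCS][NPROCS];  /* [src][dst] envelopes not yet consumed */
} ger_state;

/* Start with no outstanding envelopes for any pair. */
void ger_init(ger_state *g)
{
    for (int s = 0; s < NPROCS; s++)
        for (int d = 0; d < NPROCS; d++)
            g->in_flight[s][d] = 0;
}

/* Sender side: claim one envelope slot for the pair (src, dst).
 * When the guaranteed amount is used up, the producer is told so and
 * must wait (be throttled) or report an error. */
ger_status ger_try_send(ger_state *g, int src, int dst)
{
    assert(src >= 0 && src < NPROCS && dst >= 0 && dst < NPROCS);
    if (g->in_flight[src][dst] >= GER_ENVELOPES)
        return GER_WOULD_OVERFLOW;
    g->in_flight[src][dst]++;
    return GER_OK;
}

/* Receiver side: consuming a message frees its envelope slot. */
void ger_consume(ger_state *g, int src, int dst)
{
    assert(src >= 0 && src < NPROCS && dst >= 0 && dst < NPROCS);
    assert(g->in_flight[src][dst] > 0);
    g->in_flight[src][dst]--;
}
```

In this model each pair has its own budget, so one sender flooding a single receiver cannot exhaust the envelopes available to other pairs.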
Input and Output
The MPI standard does not specify standard I/O
functionality. LAM does not interfere with the I/O
capabilities of the underlying system but it does make
special provisions for remote terminal I/O using the
ANSI/POSIX routines. See mpirun(1) and tstdio(3).
LAM now includes the ROMIO distribution for MPI-2 file input and output. If ROMIO support is compiled into LAM, the functionality from Chapter 9 of the MPI-2 standard is provided.
ROMIO has some important limitations under LAM; the RELEASE_NOTES file in the LAM distribution should be consulted before writing MPI programs that use MPI I/O.
Dynamic Processes
LAM includes an implementation of MPI-2 dynamic process
creation. The non-blocking functions are not
implemented.
Debugging Aids
Opaque MPI objects are monitored by LAM commands, but the
user needs a way to cross-reference the identification given
by the commands with variables within the MPI application. LAM
provides extensions that extract identifiers from MPI
communicators and datatypes. See MPIL_Comm_id(2) and MPIL_Type_id(2).
Additionally, LAM provides the capability to launch non-MPI programs on remote nodes. This includes shell scripts, debuggers, etc. As long as an MPI program is eventually launched (as a child, grandchild, etc.), LAM can handle executing as many intermediate programs as necessary. This can greatly help debugging and logging of user programs.
Trace Generation
To avoid being swamped with trace data from a long running
application, LAM supplies collective operations to turn the
tap on and off. See MPIL_Trace_on(2) and
MPIL_Trace_off(2).
Asynchronous Signals
LAM has a signal-handling package which mirrors but does
not interfere with POSIX signal handling. An MPI extension
routine delivers a signal to a process. See
MPIL_Signal(2).
Overview of Commands and Libraries
introu(1), introc(2), INTROF(2)
Starting / Stopping LAM
recon(1), lamboot(1), lamhalt(1), lamnodes(1), wipe(1),
tping(1), lamgrow(1), lamshrink(1)
Compiling MPI Applications
mpicc(1), mpiCC(1), mpif77(1)
Running MPI Applications
mpirun(1), lamclean(1)
Running Non-MPI Applications
lamexec(1)
Monitoring MPI Applications
mpitask(1), mpimsg(1)
Unloading MPI Trace Data
lamtrace(1)
Enhanced I/O in MPI Programs
lam_rfposix(2), tstdio(3), cubix(3)
Reference Documents
"LAM Frequently Asked Questions"
"MPI Primer / Developing with LAM", Ohio Supercomputer Center
"MPI: A Message-Passing Interface Standard", Message-Passing Interface Forum, version 1.1
"MPI-2: Extensions to the Message Passing Interface", Message Passing Interface Forum, version 2.0
at http://www.mpi-forum.org/
MPI Quick Tutorials
"LAM/MPI ND User Guide / Introduction"
"MPI: It’s Easy to Get Started"
"MPI: Everyday Datatypes"
"MPI: Everyday Collective Communication"
"Robust MPI Message Delivery Through Guaranteed Resources", MPI Developer’s Conference, 1995