User's Guide for mpich, a Portable Implementation of MPI, Version 1.2.2

William Gropp and Ewing Lusk

This User's Guide corresponds to Version 1.2.2 of mpich.

MPI (Message-Passing Interface) is a standard specification for message-passing libraries. mpich is a portable implementation of the full MPI specification for a wide variety of parallel and distributed computing environments. This paper describes how to build and run MPI programs using the mpich implementation of MPI.

This document describes how to use mpich [9], the portable implementation of the MPI Message-Passing Standard. Details on acquiring and installing the mpich implementation are presented in a separate Installation Guide for mpich [6]. Version 1.2.2 of mpich is primarily a bug-fix release that also improves portability, particularly for LINUX-based clusters.

New and improved in 1.2.2:

* A greatly improved ch_p4mpd device.

* Improved support for assorted Fortran 77 and Fortran 90 compilers, including compile-time evaluation of Fortran constants used in the mpich implementation.

* An improved globus2 device, providing better performance.

* A new bproc mode for the ch_p4 device supports Scyld Beowulfs.

* Many TCP performance improvements for the ch_p4 and ch_p4mpd devices, as well as

* Many bug fixes and code improvements. See www.mcs.anl.gov/mpi/mpich/r1_2_2changes.html for a complete list of changes.

Features that were new in 1.2.1 included:

* Improved support for assorted Fortran and Fortran 90 compilers. In particular, a single version of mpich can now be built to use several different Fortran compilers; see the installation manual (in doc/install.ps.gz) for details.

* It is also easier now to compile MPI programs with a C compiler different from the one that mpich was built with; see the installation manual.

* Known problems and bugs with this release are documented in the file mpich/KnownBugs.

* There is an FAQ at http://www.mcs.anl.gov/mpi/mpich/faq.html. See this if you get "permission denied", "connection reset by peer", or "poll: protocol failure in circuit setup" when trying to run mpich.

* There is a paper on jumpshot available at ftp://ftp.mcs.anl.gov/pub/mpi/jumpshot.ps.gz. A paper on MPD is available at ftp://ftp.mcs.anl.gov/pub/mpd.ps.gz.


Contents

  • Introduction
  • Linking and running programs
  • Scripts to Compile and Link Applications
  • Using Shared Libraries
  • Fortran 90 and the MPI module
  • Compiling and Linking without the Scripts
  • Running with mpirun
  • Special features of different systems
  • MPMD Programs
  • Workstation clusters and the ch_p4 device
  • Checking your machines list
  • Using the Secure Shell
  • Using the Secure Server
  • SMP Clusters
  • Heterogeneous networks and the ch_p4 device
  • The P4 Procgroup File
  • Tuning P4 Performance
  • Using special interconnects
  • Using Shared Libraries with the ch_p4 device
  • Fast Startup with the Multipurpose Daemon and the ch_p4mpd Device
  • Goals
  • Introduction
  • Examples
  • How the Daemons Work
  • Running MPICH Jobs under MPD
  • Debugging MPI Programs
  • The printf Approach
  • Using a Commercial Debugger
  • Using mpigdb
  • Symmetric Multiprocessors (SMPs) and the ch_shmem device
  • Computational Grids: the globus2 device
  • MPPs
  • Sample MPI programs
  • The MPE library of useful extensions
  • Logfile Creation
  • Logfile Format
  • Parallel X Graphics
  • Other MPE Routines
  • Profiling Libraries
  • Accumulation of Time Spent in MPI routines
  • Automatic Logging
  • Customized Logging
  • Real-Time Animation
  • Logfile Viewers
  • Upshot and Nupshot
  • Jumpshot-2 and Jumpshot-3
  • Automatic generation of profiling libraries
  • Tools for Profiling Library Management
  • Debugging MPI programs with built-in tools
  • Error handlers
  • Command-line arguments for mpirun
  • MPI arguments for the application program
  • Debugging with the ch_p4 Device
  • p4 Debugging
  • Setting the Working Directory for the p4 Device
  • Command-line arguments for the application program
  • Starting jobs with a debugger
  • Starting the debugger when an error occurs
  • Attaching a debugger to a running program
  • Signals
  • Related tools
  • Debugging MPI programs with TotalView
  • Preparing mpich for TotalView debugging
  • Starting an mpich program under TotalView control
  • Attaching to a running program
  • Debugging with TotalView
  • Other MPI Documentation
  • In Case of Trouble
  • Problems compiling or linking Fortran programs
  • General
  • Problems Linking C Programs
  • General
  • Sun Solaris
  • HPUX
  • LINUX
  • Problems starting programs
  • General
  • Workstation Networks
  • IBM RS6000
  • IBM SP
  • Programs fail at startup
  • General
  • Workstation Networks
  • Programs fail after starting
  • General
  • HPUX
  • ch_shmem device
  • LINUX
  • Workstation Networks
  • Trouble with Input and Output
  • General
  • IBM SP
  • Workstation Networks
  • Upshot and Nupshot
  • General
  • HP-UX
  • Appendices
  • Automatic generation of profiling libraries
  • Writing wrapper definitions
  • Options for mpirun
  • mpirun and Globus
  • Using mpirun To Construct An RSL Script For You
  • Using mpirun By Supplying Your Own RSL Script
  • Acknowledgments
  • Deprecated Features
  • More detailed control over compiling and linking
  • Bibliography