Point-to-Point Blocking and Nonblocking Communications

  1. Blocking Communications
    1. Blocking Implementation on SGI Systems
  2. Nonblocking Communications
    1. Checking on Nonblocking Communications


MPI point-to-point communications come in two types: blocking and nonblocking.

1.0 Blocking Communications

A blocking MPI call suspends program execution until the message buffer is safe to use. The MPI standard specifies that a blocking SEND or RECV does not return until the send buffer is safe to reuse (for MPI_SEND) or the receive buffer is ready to use (for MPI_RECV). Consider the following code segment:

   real*8 a(ndim, ndim)
   real*8 b(ndim)

   .....

   call MPI_SEND(a,ndim*ndim,MPI_DOUBLE_PRECISION,dest,tag,comm,ierr)
c  The array a can be safely updated here.

   ...
   call MPI_RECV(b,ndim,MPI_DOUBLE_PRECISION,source,tag,comm,status,ierr)
c  The array b can be safely used here.
   ...

The statement after MPI_SEND can safely modify the memory location of the array a because the return from MPI_SEND indicates either a successful completion of the SEND process, or that the buffer containing a has been copied to a safe place. In either case, a's buffer can be safely reused.

Also, the return of MPI_RECV indicates that the buffer containing the array b is full and is ready to use, so the code segment after MPI_RECV can safely use b.

A blocking SEND can be implemented in two different ways:

  • The sender can buffer the message to a system buffer; then it can proceed to do other work. The system buffer serves as a safe place to hold the message data until some process is ready to receive it.
  • The sender does not buffer the outgoing message, but instead it waits for a receiver process to start receiving the message.

1.1 Blocking Implementation on SGI Systems

The implementation of blocking on the SGI platform is a mix of both techniques. By default, messages smaller than 64 bytes are buffered to a system buffer. Messages larger than 64 bytes are not buffered, and the sender waits for a receiver to receive the message before it returns.

To force buffering of messages larger than 64 bytes, raise the maximum size of a buffered message by setting the environment variable MPI_SHMEM_BUFFER_THRESHOLD to a value larger than 64.

2.0 Nonblocking Communications

An MPI nonblocking call returns immediately after the call is initiated and does not wait to be certain that the communication buffer is safe to use. You must make sure that the send buffer has been copied out before reusing it, or that the receive buffer is full before using it.

The nonblocking MPI_ISEND and MPI_IRECV are distinguished by the letter I, for immediate return. Their argument lists match those of the blocking versions except for one additional argument, a request handle, that can later be used to wait for, or check on, the completion of the call. In MPI_IRECV the request takes the place of the status argument of MPI_RECV.

FORTRAN   call MPI_ISEND(data,count,datatype,dest,tag,comm,request,ierr)
          call MPI_IRECV(data,count,datatype,source,tag,comm,request,ierr)
C         MPI_Isend(data,count,datatype,dest,tag,comm,&request)
          MPI_Irecv(data,count,datatype,source,tag,comm,&request)

Nonblocking communications have the following advantages:

  1. Computation can proceed immediately after a nonblocking MPI communication call, without waiting for the call to complete, which can improve program performance.
  2. Because the call returns immediately, communication and computation can proceed concurrently.

2.1 Checking on Nonblocking Communications

Make sure that the nonblocking communication is complete before using or reusing the communication buffer. This is done by calling the MPI functions MPI_Test and MPI_Wait with the request handle returned from the nonblocking Isend/Irecv.

  • MPI_Test(request, flag, status)
    Tests for completion of the operation specified by the handle request and returns immediately. It does not wait for completion. The argument flag is true if the operation is complete and false otherwise.

  • MPI_Wait(request, status)
    Waits until the operation specified by request is complete, and then returns.

In both calls, status can be used to get information about the completed operation.

Other variations of MPI_Test and MPI_Wait can be used to check on multiple operations.

  • MPI_Testany(count, array_of_requests, index, flag, status)
    Tests for completion of any of the operations specified by the group of operations (array_of_requests) of length count. flag is true if one operation has completed, and the argument index has the index of the operation that completed in the group list.

  • MPI_Waitany(count, array_of_requests, index, status)
    Returns if any one of the operations specified by the group of operations (array_of_requests) of length count has completed. It blocks if none of them has completed.

  • MPI_Testall(count, array_of_requests, flag, array_of_statuses)
    Tests for completion of all operations in the group array_of_requests. Upon return, flag is true only if all operations have completed; otherwise it is false.

  • MPI_Waitall(count, array_of_requests, array_of_statuses)
    Blocks waiting for all operations in the group array_of_requests to complete. This call has the same effect as looping on a call to MPI_Wait for every request in the group.

There are also MPI_Waitsome and MPI_Testsome, which work in a similar way, except that they wait or test for at least one operation in the group and report every operation that has completed.