NCSA Intel 64 Cluster Abe Upgrade
The NCSA Intel 64 Cluster (Abe) will go into production with the
upgraded software stack and expanded filesystems on Monday Dec 22, 2008.
Details of the Upgrade
Below is a brief summary of the differences and other upgrades:
Component                Current     Upgrade
----------------------   --------    --------
Linux Kernel             2.6.9       2.6.18
Lustre                   1.4         1.6
OFED                     1.2         1.2.5.5
Default Intel Compiler   10.0        10.1
Intel MKL                9.0         10.0
MVAPICH2                 0.9.8p2     1.2
HDF4                     4.2r1       4.2r3
HDF5, PHDF5              1.6.5       1.6.7
python                   2.3.4       2.5.2
java                     1.4.2       1.5.0
PAPI                     3.5.0       3.6.2
The Linux distribution (Red Hat Enterprise Linux 4) and glibc version
(2.3.4) remain the same.
User impact
- Your $HOME directory, including the $HOME/.soft file from the previous
production environment, has been copied over to the upgraded environment.
Any data generated in the friendly-user $HOME filesystem has been deleted.
- In general, programs built in the previous production Abe environment
should run without problems in the upgraded environment. However, because of
the various upgrades to the software environment, we recommend that you
recompile your programs.
- Third party software applications will work in the upgraded environment
with no changes.
- Jobs that were in the queue prior to the downtime on Dec 15 will be
automatically scheduled to run in the new environment.
- Codes utilizing the HDF libraries should be relinked. The provided default soft keys have changed.
- Codes utilizing the Intel MKL library should be relinked. The +intel-mkl soft key now points to the upgraded version.
- Detailed linking information is available at Intel MKL 10 on Abe.
- Codes dynamically linked to Intel MKL 9.1 should be relinked to the Intel MKL 10 libraries, although in some cases relinking may not be necessary.
- Intel MKL 10 provides ScaLAPACK/BLACS for MVAPICH2/MPICH2 MPI.
- When linking to threaded MKL libraries, the environment variable
MKL_NUM_THREADS (set to 1 in +intel-mkl keys) allows control of MKL threading independent of the setting of the environment variable OMP_NUM_THREADS.
Contact mathsoft@ncsa.uiuc.edu with issues regarding MKL or other math libraries.
- system(), fork(), and popen() calls can now be made in MPI codes over Infiniband.
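As a small illustration of the MKL threading note in the list above, the following sketch sets the two variables independently. It is written in POSIX sh syntax; under csh the equivalent form is `setenv NAME value`. The thread counts are illustrative assumptions, not recommended values for Abe.

```shell
#!/bin/sh
# Illustrative values only (assumptions, not site recommendations):
# 4 threads for the application's own OpenMP regions,
# 1 thread inside MKL routines (the value the +intel-mkl keys set).
OMP_NUM_THREADS=4
MKL_NUM_THREADS=1
export OMP_NUM_THREADS MKL_NUM_THREADS

# Show what a program started from this shell would inherit.
echo "OpenMP threads: $OMP_NUM_THREADS, MKL threads: $MKL_NUM_THREADS"
```

Raising MKL_NUM_THREADS instead would let MKL routines run threaded while leaving the OMP_NUM_THREADS setting untouched.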
For details about MVAPICH2 1.2 please click here.
Known Issues:
- async HCA thread disabled - An intermittent issue was discovered with an MVAPICH2 HCA thread that would occasionally cause a hang in MPI_Finalize(). The workaround disables the thread that catches asynchronous events from the HCA. This thread is not related to asynchronous progress; it re-posts buffers if the buffer pool gets too low. A final patch that re-enables the async HCA thread is expected from the MVAPICH2 developers soon.
Some new features:
- Lustre support in mvapich2-1.2 ROMIO/ADIO
Lustre example
Lustre runtime issue
- MPE logging
- MVAPICH2 pins MPI tasks to cores. You can disable this by setting
setenv MV2_ENABLE_AFFINITY 0
or you can specify which cores are to be used; for example, when running with ppn=4,
setenv MV2_CPU_MAPPING 0:1:4:5
would use only a single core on each dual-core die.
- hybrid MPI+OpenMP
- The usual mpd start-up works fine:
mvapich2-start-mpd
mpiexec -machinefile $PBS_NODEFILE -n x ./a.out [...]
mpdallexit
For larger node counts, consider using the more scalable mpirun_rsh method:
mpirun_rsh -ssh -np X -hostfile $PBS_NODEFILE ./a.out [...]
Note that this method does not export environment variables. You need to
pass them on the command line:
mpirun_rsh -ssh -np X -hostfile $PBS_NODEFILE ENV1=val1 ENV2=val2 ./a.out
or use the -paramfile option:
mpirun_rsh -ssh -np X -hostfile $PBS_NODEFILE -paramfile=pfile ./a.out
where pfile is:
ENV1=val1
ENV2=val2
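The parameter-file approach can be sketched as a job-script fragment. This is a minimal sketch under stated assumptions: the task count, the ./pfile name, and the hosts.txt fallback are hypothetical, and the command is only printed (a dry run) rather than executed. The variable names MV2_CPU_MAPPING and MKL_NUM_THREADS and the mpirun_rsh options are the ones described above.

```shell
#!/bin/sh
# Sketch: collect the variables the MPI tasks need in a parameter file,
# since mpirun_rsh does not export the submitting shell's environment.

NP=8                                  # hypothetical task count (2 nodes, ppn=4)
PFILE=./pfile                         # hypothetical parameter file name
HOSTFILE=${PBS_NODEFILE:-hosts.txt}   # fallback name is an assumption for dry runs

# One NAME=value pair per line, the same layout as the pfile shown above.
cat > "$PFILE" <<EOF
MV2_CPU_MAPPING=0:1:4:5
MKL_NUM_THREADS=1
EOF

# Dry run: build and print the launch command instead of executing it.
CMD="mpirun_rsh -ssh -np $NP -hostfile $HOSTFILE -paramfile=$PFILE ./a.out"
echo "$CMD"
```

Inside a real batch job, the final `echo "$CMD"` would be replaced by the mpirun_rsh command itself. Keeping the settings in a file makes it easy to reuse the same set of variables across several runs.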
Please report any problems or issues to the NCSA Consulting Office via
electronic mail at consult@ncsa.uiuc.edu or by telephone at (217) 244-1144.