
To use MPI with Code-Aster in CAELinux 2011, you don't need (and should not install) MPICH2 or any other MPI package. Everything is already there: Code-Aster 11.0 is compiled against the OpenMPI libraries, and having several MPI libraries installed on the same system may create configuration problems.

Personally, this is how I proceed, starting from two PCs with a fresh install of CAELinux 2011 (it also works in LiveDVD/LiveUSB mode). Here is a small "How To" for you:

1) set up the network so that the machines can reach each other: I use Network Manager to set static IP addresses. Then set the hostnames:

on machine 1:

sudo hostname caepc1

on machine 2:

sudo hostname caepc2

2) edit /etc/hosts of both machines to define host/ip relationships

sudo nano /etc/hosts

add lines such as these after xxxx, one per machine, with each machine's static IP address in front of its hostname: caepc1 caepc2
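Before going further, it can help to confirm that both hostnames really ended up in the hosts file. The following is only a small sketch: missing_hosts is a hypothetical helper name (not part of CAELinux or Code-Aster), and it only inspects the file you give it, it does not test the network itself.

```shell
#!/bin/sh
# missing_hosts FILE HOST...: print every HOST that has no entry in FILE.
# (hypothetical helper, for illustration only)
missing_hosts() {
    file=$1
    shift
    for host in "$@"; do
        # Drop comments, then look for the name in any column after the address.
        awk -v h="$host" '
            { sub(/#.*/, "") }
            { for (i = 2; i <= NF; i++) if ($i == h) found = 1 }
            END { exit (found ? 0 : 1) }
        ' "$file" || echo "$host"
    done
}
```

Usage on each machine: missing_hosts /etc/hosts caepc1 caepc2. If nothing is printed, both names are defined.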

3) edit your configuration settings directly in /opt/aster110/etc/codeaster/aster-mpihosts

for example (use OpenMPI syntax):

caepc1 slots=1

caepc2 slots=1

4) optional: if you have more than 8 GB of RAM per node or more than 16 cores in the cluster, also edit /opt/aster110/etc/codeaster/asrun to tune "interactif_memmax" (max memory per node) and "interactif_mpi_nbpmax" (number of cores in the cluster)

(optional) passwords: if you are using LiveDVD/LiveUSB mode, you need to set a password for the default user caelinux. On each node, run "passwd" in a terminal to set a new password (the default password is empty)

5) ssh setup: you need ssh login without passwords between the two hosts. On the first node, run:

scp /home/caelinux/.ssh/id* caepc2:/home/caelinux/.ssh/

scp /home/caelinux/.ssh/authorized* caepc2:/home/caelinux/.ssh/

ssh-keyscan caepc1 >> /home/caelinux/.ssh/known_hosts

ssh-keyscan caepc2 >> /home/caelinux/.ssh/known_hosts

scp /home/caelinux/.ssh/known_hosts caepc2:/home/caelinux/.ssh/
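To check that the key setup above really gives a login without password, something like the following can be used. It is a minimal sketch: check_ssh is a hypothetical helper name, and the 5 second timeout is an arbitrary choice. It relies on the standard OpenSSH option BatchMode=yes, which makes ssh fail instead of prompting for a password.

```shell
#!/bin/sh
# check_ssh HOST...: report whether each HOST accepts a key-based login.
# (hypothetical helper, for illustration only)
check_ssh() {
    for host in "$@"; do
        # BatchMode=yes: never ask for a password, just fail.
        if ssh -o BatchMode=yes -o ConnectTimeout=5 "$host" true; then
            echo "$host: ok"
        else
            echo "$host: passwordless login failed"
        fi
    done
}
```

Usage, to be run on both nodes: check_ssh caepc1 caepc2. Both lines should end with "ok" before you continue.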

6) set up a shared temp directory with NFS. On node 1:

sudo mkdir /srv/shared_tmp

sudo chmod a+rwx /srv/shared_tmp

sudo nano /etc/exports

then add the following line and save:

/srv/shared_tmp    *(rw,async)


sudo exportfs -a

Now create the mount point and mount the shared folder. Run this on all nodes:

sudo mkdir /mnt/shared_tmp

sudo chmod a+rwx /mnt/shared_tmp

sudo mount -t nfs -o rw,rsize=8192,wsize=8192 caepc1:/srv/shared_tmp /mnt/shared_tmp
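Once the folder is mounted, a quick write test on every node shows whether the share is really usable by the current user. This is just a sketch under simple assumptions: check_writable is a hypothetical helper name, and it only proves that a file can be created and removed in the given directory.

```shell
#!/bin/sh
# check_writable DIR: create and remove a marker file to show DIR is writable.
# (hypothetical helper, for illustration only)
check_writable() {
    dir=$1
    # Node name and PID keep the marker unique if several nodes test at once.
    marker="$dir/.write_test_$(uname -n)_$$"
    if { echo "written from $(uname -n)" > "$marker"; } 2>/dev/null; then
        rm -f "$marker"
        echo "$dir: writable"
    else
        echo "$dir: NOT writable"
        return 1
    fi
}
```

Usage on each node: check_writable /mnt/shared_tmp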

7) set up the Aster config to use this shared temp directory:

nano /opt/aster110/etc/codeaster/asrun

edit the line with "shared_tmp" as follows:

shared_tmp : /mnt/shared_tmp

then save
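After saving, you can read the value back to be sure the edit was taken into account. The sketch below assumes that the settings in the asrun file are written as "key : value" lines, like the shared_tmp line shown above; asrun_value is a hypothetical helper name and not a Code-Aster tool.

```shell
#!/bin/sh
# asrun_value FILE KEY: print the value of the first "KEY : value" line in FILE.
# (hypothetical helper; assumes the "key : value" layout shown above)
asrun_value() {
    awk -F: -v key="$2" '
        {
            k = $1
            gsub(/^[ \t]+|[ \t]+$/, "", k)      # trim spaces around the key
            if (k == key) {
                v = substr($0, index($0, ":") + 1)
                gsub(/^[ \t]+|[ \t]+$/, "", v)  # trim spaces around the value
                print v
                exit
            }
        }
    ' "$1"
}
```

Usage: asrun_value /opt/aster110/etc/codeaster/asrun shared_tmp should print /mnt/shared_tmp on every node.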

8) Open ASTK, go to server and refresh, then create your job.

In Options, select:

ncpus = 1 (no OpenMP),

mpi_nbcpu = total number of cores to use (mpi_nbnoeud * cores per host),

mpi_nbnoeud = number of compute nodes
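If you are not sure which numbers to enter, they can be derived from the hostfile edited in step 3. The following is a rough sketch, not an official tool: hostfile_totals is a hypothetical helper name, it counts one node per non-empty line, and it assumes one slot for any line without an explicit slots= entry.

```shell
#!/bin/sh
# hostfile_totals FILE: print node count and total slots of an OpenMPI hostfile.
# (hypothetical helper, for illustration only)
hostfile_totals() {
    awk '
        { sub(/#.*/, "") }          # drop comments
        NF == 0 { next }            # skip blank lines
        {
            nodes++
            slots_here = 1          # assumption: 1 slot if slots= is absent
            for (i = 2; i <= NF; i++)
                if ($i ~ /^slots=/) slots_here = substr($i, 7) + 0
            total += slots_here
        }
        END { printf "mpi_nbnoeud=%d mpi_nbcpu=%d\n", nodes, total }
    ' "$1"
}
```

Usage: hostfile_totals /opt/aster110/etc/codeaster/aster-mpihosts. With the two-line example from step 3 this gives mpi_nbnoeud=2 mpi_nbcpu=2.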

And finally it should run on several nodes!!

Actually, the hard part is that you NEED a shared tmp folder to run jobs on a cluster.