For 32-bit only:
To get the Rmpi and doMPI packages working on Windows HPC, using Microsoft’s MSMPI libraries:
- Install the “foreach” package into R.
- Install Rtools for your version of R from Duncan Murdoch's page.
- Install the Windows HPC 2008 SDK from Microsoft Download Center.
- Download and extract the package source for the development version (0.6) from the Rmpi download page.
- Download and extract the package source for doMPI from the CRAN package page.
- Modify Rmpi\src\Makevars.win to read:
PKG_CFLAGS = -I"C:\Program Files\Microsoft HPC Pack 2008 SDK\Include" -DMPI2 -DWin32 "-D__int64=long long"
PKG_LIBS = -L"C:\Program Files\Microsoft HPC Pack 2008 SDK\Lib\i386" -L"C:\Program Files\Microsoft HPC Pack 2008 SDK\Lib\amd64" -lmsmpi
- Open an administrative command prompt (so that your R installation can be updated) and do the following:
cd <folder above the extracted packages>
"c:\Program Files\R\R-2.14.0\bin\R.exe" --vanilla CMD INSTALL --build Rmpi
"c:\Program Files\R\R-2.14.0\bin\R.exe" --vanilla CMD INSTALL --build doMPI
- Copy the zip files created, Rmpi_0.6-0.zip and doMPI_0.1-5.zip, to your cluster head node and pass the file paths to R’s install.packages function together with “foreach”.
- library(doMPI) should then state that it has loaded Rmpi.
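The last two steps can be sketched in R roughly as follows. The folder name is a hypothetical placeholder for wherever you copied the zip files on the head node; the file names are the ones produced by the build above.

```r
# Hypothetical folder on the head node holding the two zip files built above.
zip_dir <- "C:/Temp/Rbuild"
zips <- file.path(zip_dir, c("Rmpi_0.6-0.zip", "doMPI_0.1-5.zip"))

# foreach is installed from CRAN as usual.
install.packages("foreach")

# repos = NULL tells install.packages() to treat its arguments as local files
# rather than package names to look up in a repository.
install.packages(zips, repos = NULL)

# Loading doMPI should report that Rmpi has been loaded as well.
library(doMPI)
```

Installing foreach first means doMPI's dependency is already in place when the local zip files are installed.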
That all worked for i386 (32-bit), but I got an access violation when trying to load the Rmpi package for x64. So instead, I built the Rmpi sources using the Visual C++ 2010 compiler (some fixing required) and copied the new DLLs over the ones installed in the R library folder.
Using parallel foreach on Windows HPC
MSMPI (Microsoft’s implementation of MPI) doesn’t support spawning (at least not in the 2008 R2 version), so you need to use the non-spawning method:
cl <- startMPIcluster()
registerDoMPI(cl)
… use of foreach and %dopar% …
closeCluster(cl)
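As a fuller illustration of the non-spawning pattern, here is a minimal sketch of a script intended to be launched by mpiexec. The workload (means of random draws) is a made-up placeholder, not part of the original setup; the doMPI and Rmpi calls are the standard ones for this mode.

```r
# Sketch of a script meant to be launched by mpiexec (non-spawning mode).
library(doMPI)

# With no arguments, startMPIcluster() uses the processes that mpiexec has
# already started: rank 0 carries on as the master, the other ranks become
# workers and wait for tasks.
cl <- startMPIcluster()
registerDoMPI(cl)

cat("Workers available:", clusterSize(cl), "\n")

# Hypothetical workload: the mean of 1000 random draws, repeated 20 times.
results <- foreach(i = 1:20, .combine = c) %dopar% {
  mean(rnorm(1000))
}
print(summary(results))

# Shut the workers down and finalize MPI before the script ends.
closeCluster(cl)
mpi.quit()
```

Printing clusterSize(cl) is a quick way to check that the job's resource settings gave you the number of workers you expected.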
Finally, queue the job by running mpiexec on R (or on a batch file that calls R).
To run one worker per core:
Set the job resource type to core, and set the minimum and maximum number of cores for the task to the number of cores on your worker nodes.
Use the command mpiexec -n * myRrunner.bat
To run one worker per node:
Set the job resource type to node, and set the minimum and maximum number of nodes for the task to the number of nodes in your cluster.
Use the command mpiexec -cores 1 myRrunner.bat