Questions

1) What is NBODY6++?

2) Where can I download the code?

3) Where can I download the nbody6++ manual by Emil Khalisi / Rainer Spurzem?

4) How can I restart a run?

5) Which Fortran Compiler is the fastest?

6) Formula for the scaling of NBODY6++ and description of timing output?

7) How do I use the ARI SVN Repository?

Answers

1) What is NBODY6++?

NBODY6++ is a direct, high-accuracy (4th order) N-body code using a Hermite scheme. It is suitable for many massively parallel computers, but not yet for GRAPE clusters. It can also run on a single workstation. NBODY6++ is based on Sverre Aarseth's series of NBODY codes, specifically on the version NBODY6 with the Ahmad-Cohen neighbour scheme and many other nice features. Details and citations can be found in the manual.

WARNING: NBODY6++ is not one-to-one compatible with NBODY6 (single CPU only). Synchronisation between Sverre's new version and NBODY6++ therefore takes place only at larger time intervals. This time the lag has grown to MORE THAN 3 YEARS (since June 2003), so a new version is in preparation and overdue! You will find notice of it here and at the ftp URL.

2) Where can I download the code?

ftp://ftp.ari.uni-heidelberg.de/staff/spurzem/nb6mpi/

(This URL only works from outside ARI; from inside, look at
/work/ftpserver/pub/... ).
There you will find a README with recent news and the file
nb6mpi-june2003.tar.gz.

There is also a recent beta snapshot, used at the Cambody School (nb6mpi-new.tar.gz),
in case you want to try it.

In the future you might also try the ARI SVN repository (please check the README file there!) at

http://svn.ari.uni-heidelberg.de/repos/nbody/

3) Where can I download the nbody6++ manual by Emil Khalisi / Rainer Spurzem?

Documentation:

At the given location you will find:

nb6++manual-new.pdf (complete manual by E. Khalisi and R. Spurzem)
nbody6++.ps (list of subroutines in Sverre style)

When you unpack the code you will find more documentation in define.f and
brief comments on the input parameters in the file in1000.comment.

4) How can I restart a run?

For the restart you need a special input file, which looks like this:

5 1.E20 10000. 40 40
0.0 0.0 0.0 0.0 10000.0 2.0E-04 0 0
0.01 0.02 0.1 1.0E-04 0.01 80
--
KSTART, TCOMP, TCRITp, isernb,iserreg
DTADJ, DELTAT, TADJ, TNEXT, TCRIT, QE & J KZ(J)
ETAI, ETAR, ETAU, DTMIN, RMIN, NNBOPT
If J<>0, KZ(J) is set to the given value.
For more details see the routine modify.f.

The value of the first number, KSTART, determines which parameters
are changed for the restart. If KSTART=2, only the first line is read;
if KSTART=3, the first and second lines are read; if KSTART=5, all three
lines are read. A value of zero generally leaves the parameter unchanged, but check
this yourself in the routine modify.f.
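
The KSTART rule above can be sketched in a few lines of Python. This is only an illustration of the rule as stated here, not code from NBODY6++. The names LINES_READ_BY_KSTART, restart_lines_read and lines_used_for_restart are hypothetical, and KSTART values not mentioned above are deliberately rejected. The authoritative logic is in modify.f.

```python
# Number of restart input lines read for each KSTART value,
# taken only from the description above (assumption: other
# values are not covered here, so this sketch refuses to guess).
LINES_READ_BY_KSTART = {2: 1, 3: 2, 5: 3}


def restart_lines_read(kstart):
    """Return how many restart lines are read for this KSTART."""
    if kstart not in LINES_READ_BY_KSTART:
        raise ValueError("KSTART=%d is not described here; see modify.f" % kstart)
    return LINES_READ_BY_KSTART[kstart]


def lines_used_for_restart(input_text):
    """Return the leading non-empty lines of a restart file that would be read."""
    lines = [ln for ln in input_text.splitlines() if ln.strip()]
    kstart = int(lines[0].split()[0])  # KSTART is the first number in the file
    return lines[:restart_lines_read(kstart)]


# The example restart file from above (numeric lines only).
example = """5 1.E20 10000. 40 40
0.0 0.0 0.0 0.0 10000.0 2.0E-04 0 0
0.01 0.02 0.1 1.0E-04 0.01 80
"""

for line in lines_used_for_restart(example):
    print(line)
```

With KSTART=5, as in the example file, all three numeric lines are printed; changing the first number to 2 would leave only the first line.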

5) Which Fortran Compiler is the fastest?

I made a comparison run on a single-processor machine with N=1k, TCRIT=10.0 (10 N-body time units), using the Cambody version of Nbody6++:

pgf77 -O4 (Version 6.2-4, highest optimization...):

Total CPU= 70.86000680923462

ifort -O5 (Version 8.0, highest optimization...):

Total CPU= 55.1000009155273

The result was that ifort needed about 22% less CPU time than pgf77 in this test. Has anybody had similar experiences? (A.E.)
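
The comparison can be checked with a little arithmetic. The sketch below only uses the two CPU times quoted above; the function names cpu_time_saving and speed_ratio are made up for this illustration.

```python
def cpu_time_saving(t_slow, t_fast):
    """Fraction of CPU time saved by the faster run, relative to the slower one."""
    return (t_slow - t_fast) / t_slow


def speed_ratio(t_slow, t_fast):
    """How many times faster the faster run is."""
    return t_slow / t_fast


# Total CPU values quoted above.
t_pgf77 = 70.86000680923462
t_ifort = 55.1000009155273

print("CPU time saved by ifort: %.1f%%" % (100.0 * cpu_time_saving(t_pgf77, t_ifort)))
print("Speed ratio pgf77/ifort: %.2f" % speed_ratio(t_pgf77, t_ifort))
```

Note that "percent faster" depends on which run is taken as the reference, which is why both numbers are shown. A single run on one machine is of course not a general statement about the compilers.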

6) Formula for the scaling of NBODY6++ and description of timing output?

(These notes are still preliminary and will hopefully be extended, R. Sp. 16.7.07)

The code scales well in nearly all cases, provided that the computing speed and the communication speed (between nodes) are sufficient.

The formula to estimate this is

T = alpha N + beta N**2 / gamma + delta N*Nn + Tcomm*N

where T is the wall clock time to compute some physical time (usually we take
one N-body time unit, approx. 1/3 crossing time), and N is the problem size.

alpha: CPU time constant to advance one particle on one node
beta: CPU time constant for one pairwise force calculation
delta = beta (different only for SPH codes)
Nn: average neighbour number (50-200)
gamma: efficiency factor of neighbour scheme.
Tcomm: average communication time (wall clock)
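
As a rough illustration, the formula can be evaluated term by term. The function name estimate_wall_time and all numerical constants below are placeholders chosen for this sketch, not measured values for any machine.

```python
def estimate_wall_time(n, nn, alpha, beta, gamma, delta, tcomm):
    """Evaluate T = alpha N + beta N**2 / gamma + delta N*Nn + Tcomm*N.

    Returns the individual terms and their sum, in the time unit of
    the constants (wall clock time per N-body time unit).
    """
    terms = {
        "advance": alpha * n,              # alpha N
        "regular": beta * n**2 / gamma,    # beta N**2 / gamma
        "irregular": delta * n * nn,       # delta N*Nn
        "communication": tcomm * n,        # Tcomm*N
    }
    terms["total"] = sum(terms.values())
    return terms


# Placeholder constants (assumptions for illustration only).
result = estimate_wall_time(n=10000, nn=100, alpha=1.0e-5, beta=1.0e-7,
                            gamma=10.0, delta=1.0e-7, tcomm=1.0e-6)
for name, value in result.items():
    print("%-14s %.4f" % (name, value))
```

Returning the separate terms makes it easy to see which part dominates for a given N.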

The code delivers in its own profiling scheme the following:

tirr = delta N*Nn
treg = beta N**2 / gamma
tsub + tsub2 = Tcomm*N
ttot = T
NSTEPIRR/NSTEPREG = gamma (first number after NSTEPS= in output divided
by third one after NSTEPS = )

From a collection of benchmarks with these profiling data it is possible to
determine alpha, beta, delta and Tcomm.
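
A minimal sketch of this step, assuming a single benchmark and simply inverting the relations listed above. The function name constants_from_profile and the input numbers are hypothetical; with several benchmarks one would rather average or fit the constants.

```python
def constants_from_profile(n, nn, nsteps_irr, nsteps_reg,
                           tirr, treg, tsub, tsub2, ttot):
    """Invert the profiling relations for one benchmark run.

    tirr = delta N*Nn, treg = beta N**2 / gamma,
    tsub + tsub2 = Tcomm*N, ttot = T, gamma = NSTEPIRR/NSTEPREG.
    """
    gamma = nsteps_irr / float(nsteps_reg)
    delta = tirr / (n * nn)
    beta = treg * gamma / n**2
    tcomm = (tsub + tsub2) / n
    # alpha N is what remains of T after the other three terms.
    alpha = (ttot - treg - tirr - tsub - tsub2) / n
    return {"alpha": alpha, "beta": beta, "gamma": gamma,
            "delta": delta, "tcomm": tcomm}


# Made-up profiling numbers, for illustration only.
consts = constants_from_profile(n=1000, nn=100,
                                nsteps_irr=10000, nsteps_reg=1000,
                                tirr=2.0, treg=5.0,
                                tsub=0.3, tsub2=0.2, ttot=8.0)
for name in sorted(consts):
    print("%-6s %.3e" % (name, consts[name]))
```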

One can also determine how treg, tirr and Tcomm scale with the processor number NPE.
For sufficiently large N we expect treg propto 1/NPE (near-ideal scaling),
tirr propto 1/NPE**(0.7-0.8) (not so ideal scaling, but the dominant part is treg), and Tcomm propto NPE (if NPE >> 1), i.e. the communication time is always linear in NPE.

(Note that this is easy to see for any ring-like systolic communication scheme, but surprisingly it also holds for tree-like schemes if latency is negligible.)
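
These expected scalings can be used to extrapolate measured timings from one processor number to another. The sketch below is hypothetical: the function name scale_timings is invented, the exponent 0.75 is just the midpoint of the 0.7-0.8 range, and the linear Tcomm scaling is only expected for NPE >> 1.

```python
def scale_timings(treg_ref, tirr_ref, tcomm_ref, npe_ref, npe, irr_exponent=0.75):
    """Extrapolate timings measured at npe_ref processors to npe processors.

    Assumes treg propto 1/NPE, tirr propto 1/NPE**irr_exponent
    and Tcomm propto NPE.
    """
    ratio = float(npe) / float(npe_ref)
    return {
        "treg": treg_ref / ratio,
        "tirr": tirr_ref / ratio**irr_exponent,
        "tcomm": tcomm_ref * ratio,
    }


# Made-up reference timings at 16 processors, extrapolated to 64.
print(scale_timings(treg_ref=8.0, tirr_ref=2.0, tcomm_ref=0.5, npe_ref=16, npe=64))
```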

We have

Tcomm*N = p * x * Tlatency + N Tbandwidth + Tmov

under the assumption that x*s = N (e.g. for a Plummer model, the mean block size is s propto N**(2/3) and the mean number of blocks per N-body time unit is x propto N**(1/3)).

Tmov is the overhead in the code for parallelization (copying of data). Its scaling has not been measured yet, but it increases with NPE and N. If we increase NPE and hold N constant, Tmov and latency dominate at some point.
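
The communication model can be sketched as follows. Several things here are assumptions made only for this illustration: p is taken to be the processor number, the proportionality constants in the Plummer scalings are set to one (x = N**(1/3), s = N/x), Tmov is passed in as a plain number because its scaling is not known, and the function name comm_time is invented.

```python
def comm_time(n, p, t_latency, t_bandwidth, t_mov):
    """Evaluate Tcomm*N = p * x * Tlatency + N Tbandwidth + Tmov."""
    x = n ** (1.0 / 3.0)   # assumed number of blocks per N-body time unit
    s = n / x              # mean block size, chosen so that x*s = N
    latency_term = p * x * t_latency
    bandwidth_term = n * t_bandwidth
    return {"x": x, "s": s,
            "latency": latency_term,
            "bandwidth": bandwidth_term,
            "total": latency_term + bandwidth_term + t_mov}


# Made-up hardware numbers, for illustration only.
for p in (4, 64):
    print(p, comm_time(n=1000, p=p, t_latency=1.0e-3, t_bandwidth=1.0e-6, t_mov=0.01))
```

For fixed N the latency term grows linearly with p, which illustrates why latency takes over when the processor number is increased at constant N.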

If we have a faster processor, alpha, beta and delta typically scale in roughly the same way. If we have faster communication, Tcomm scales accordingly. The reader interested in more detail should look at:

Harfst, S., Gualandris, A., Merritt, D., Spurzem, R., Portegies Zwart, S., Berczik, P., Performance analysis of direct N-body algorithms on special-purpose supercomputers, New Astron. 12, 357 (2007).

Dorband, E., Hemsendorf, M., Merritt, D., Systolic and hyper-systolic algorithms for the gravitational N-body problem, with an application to Brownian motion, Journal of Computational Physics, Volume 185, Issue 2, p. 484-511.

7) How do I use the ARI SVN Repository?

The ARI SVN Repository is located at

http://svn.ari.uni-heidelberg.de/repos/nbody/

Check the README file first. To work with the Repository, obtain a password from Thomas Brüsemeister. You can put your own variant of Nbody6++ into the Repository.

You can put your code version under "branches" with the following command:

svn import /path/to/your/code http://svn.ari.uni-heidelberg.de/repos/nbody/branches/branch-name

To work with the code from the Repository you must generate a local working copy, i.e. do a "checkout" with the command:

svn co http://svn.ari.uni-heidelberg.de/repos/nbody/branches/branch-name

To commit the changes in your local working copy back to the Repository, type:

cd /path/to/my/working-copy
svn ci

Update the working copy:

svn up

Information about the working copy:

svn info

Most recent changes to the working copy:

svn log

