benchmark 1.10


Francesco Potorti`-9
I looked at the SPEC web page, and they have no other CPU benchmark
ready after SPEC95.  They plan a SPEC98, but that isn't ready yet
(obvious, isn't it? :-).

Intel architectures are not as well defined as a Sun Sparc is.  So,
unless someone comes up with a solution to this problem, I'd continue
to use Sparc 10/40 as the reference machine, which won't be a problem
in any case, since the numbers published in bm_results contain all the
data necessary to rebuild the reference vectors for any machine.
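Since the script stores each result as the reference time divided by
the measured time, any machine's absolute per-test times can be rebuilt
from its published relative scores together with the reference vector.
A minimal Python sketch (the relative scores below are hypothetical,
not taken from bm_results):

```python
# Rebuild absolute per-test times from a relative score vector.
# score = reftime / mytime  =>  mytime = reftime / score
bm_reftime = [1.63, 6.66, 3.05, 2.09, 1.51]  # Sparc 10/40, octave 2.0.5
relative = [1.50, 0.80, 2.10, 1.20, 3.00]    # hypothetical machine's scores

mytime = [ref / rel for ref, rel in zip(bm_reftime, relative)]
print([round(t, 3) for t in mytime])
```

The recovered mytime vector is exactly the reference vector that
machine would use, which is why the published relative data suffices.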

The new benchmark is for octave 2.0.5 and can be found, as usual, in
<URL:ftp://fly.cnuce.cnr.it/pub/benchmark.m>, while the results are in
<URL:ftp://fly.cnuce.cnr.it/pub/bm_results> (mirrors welcome).  No new
results have been added yet.

Everyone is encouraged to run the benchmark with octave 2.0.5 and send
the results to me.

I have not yet added the results for the Alpha because I cannot find
a moment when the machine is idle, but it seems that, with respect to
version 1.1.1, the for loop is about 30% faster (expected), the
differential equation test is about 10% faster (expected), and the
Schur decomposition is about 30% slower (unexpected!).

These informal results seem confirmed by the new reference time vector
for the Sun Sparc 10/40:

1.1.1   bm_reftime = [1.61 4.54 3.88 2.12 2.47];
2.0.5   bm_reftime = [1.63 6.66 3.05 2.09 1.51];

As you can see, the third (diff. eq.) and the fifth (for loop) tests
run quicker, but the second (Schur decomp.) is slower!
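The per-test change between the two versions can be read off these
vectors directly as a percentage.  A short Python sketch using the two
Sparc 10/40 reference vectors quoted above:

```python
# Percentage change in reference time from octave 1.1.1 to 2.0.5
# on the Sun Sparc 10/40 (positive = slower, negative = faster).
old = [1.61, 4.54, 3.88, 2.12, 2.47]  # 1.1.1 bm_reftime
new = [1.63, 6.66, 3.05, 2.09, 1.51]  # 2.0.5 bm_reftime

change = [100.0 * (n - o) / o for o, n in zip(old, new)]
for name, pct in zip(["inv", "schur", "lsode", "fft", "for"], change):
    print("%-6s %+6.1f%%" % (name, pct))
```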

For convenience I append the benchmark here.

Discussions on help-octave, results to me.

--
Francesco Potorti` (researcher)        Voice:    +39-50-593203
Computer Network Division              Operator: +39-50-593211
CNUCE-CNR, Via Santa Maria 36          Fax:      +39-50-904052
56126 Pisa - Italy                     Email: [hidden email]



----------------------   benchmark.m   ----------------------------------

bm_version = ["bm ", "1.10"];

# Benchmark for octave.
# Francesco Potorti` <[hidden email]>
# 1997/03/10 14:49:20
#
#   Send the results you get on your machine to the address above,
#   so that I can include them in the result list.
# latest benchmark.m version in <URL:ftp://fly.cnuce.cnr.it/pub/benchmark.m>
# latest result list in <URL:ftp://fly.cnuce.cnr.it/pub/bm_results>

printf ("Octave benchmark version %s\n", bm_version);

# To add reference times for your machine run the benchmark and
# add the values contained in the bm_mytime vector.
#
if (strcmp(version(), "1.1.1"))
  # Matthias Roessler <[hidden email]>
  bm_refname = "Sun Sparc 10/40";
  bm_reftime = [1.61 4.54 3.88 2.12 2.47];
elseif (strcmp(version(), "ss-960323"))
  # Rick Niles <[hidden email]>
  bm_refname = "Sun Sparc 10/50";
  bm_reftime = [2.00 8.95 2.60 3.03 1.11];
elseif (strcmp(version(), "2.0.5"))
  # Christian Jönsson ISY/DTR <[hidden email]>
  bm_refname = "Sun Sparc 10/40";
  bm_reftime = [1.63 6.66 3.05 2.09 1.51];
else
  error ("No reference time for this version of octave.\n")
endif


# Use clock() if cputime() does not work on this particular port of octave.
# In this case, time will be computed on a wall clock, and will make sense
# only on a machine where no other processes are consuming significant cpu
# time while the benchmark is running.
global bm_uses_cputime = (cputime() != 0);
if (!bm_uses_cputime)
  disp ...
    ("WARNING: if other processes are running the figures will be inaccurate");
endif
function t = bm_start ()
  global bm_uses_cputime
  if (bm_uses_cputime)
    t = cputime();
  else
    t  = clock();
  endif
endfunction
function et = bm_stop (t);
  global bm_uses_cputime
  if (bm_uses_cputime)
    et = cputime()-t;
  else
    et = etime(clock(),t);
  endif
endfunction
 
# Used for the lsode test.  
clear xdot
function xdot = xdot (x, t)
  r = 0.25; k = 1.4; a = 1.5; b = 0.16; c = 0.9; d = 0.8;
  xdot(1) = r*x(1)*(1 - x(1)/k) - a*x(1)*x(2)/(1 + b*x(1));
  xdot(2) = c*a*x(1)*x(2)/(1 + b*x(1)) - d*x(2);
endfunction

#
# Do benchmark
#
function [name, time] = bm_test(f,rep) # Actual test functions
  global t;
  start = bm_start();
  for i = 1:rep
    if     (f==1) name="Matrix inversion (LAPACK)";
                  bm_x=inv(hadamard(8));
    elseif (f==2) name="Schur decomposition (LAPACK)";
                  bm_x=schur(hadamard(7));
    elseif (f==3) name="Differential equation (LSODE)";
                  bm_x=lsode("xdot",[1;2],(t=linspace(0,50,200)'));
    elseif (f==4) name="Fourier transforms (FFTPACK)";
                  bm_x=ifft2(fft2(hadamard(8)));
    elseif (f==5) name="for loop";  
                  for i=1:6000;bm_x=i^2;endfor
    endif
  endfor
  time = bm_stop(start)/rep;
endfunction

bm_targetaccuracy = 0.025; # target accuracy of mean of times
bm_minrepetitions = 7; # min number of repetitions per test
bm_maxtime = 60; # max runtime per test [seconds]
bm_mintime = 0.3; # min runtime per test [seconds]
bm_runtime = 3; # target runtime per test [seconds]

printf ("Speed of octave %s on %s relative to %s\n", ...
        version(), computer(), bm_refname); fflush(stdout);
bm_mytime = zeros(size(bm_reftime));
for f = 1:length(bm_reftime)
  res = [];
  bm_test(f,1); # increase the RSS, load things
  rep = 1; # number of repetitions per run
  while (1) # we would need a do..while really
    [name,time] = bm_test(f,rep); # evaluate name and time
    if (time*rep > bm_mintime) # run for at least bm_mintime
      break; # found approximate time
    endif
    rep = 2*rep; # approaching min run time
  endwhile
  printf("%-33s", name); fflush(stdout);# print name
  rep = round(bm_runtime/time); # no. of repetitions per run
  rep = max(1,rep); # slow machines need this
  for runs = 1:bm_maxtime/bm_runtime # do runs
    [name,time] = bm_test(f,rep); # run
    res(runs) = bm_reftime(f)/time; # store relative performance
    if (runs < bm_minrepetitions) # jump rest of for loop
      continue
    endif
    res = sort(res);
    bm_mean = mean(res(2:runs-1)); # remove min and max results
    if (std(res)/bm_mean < bm_targetaccuracy)
      break
    endif
  endfor # end of repetitions loop
  bm_mytime(f) = bm_reftime(f)/bm_mean;
  # print 95% confidence interval
  printf("%5.2f +/- %.1f%% (%d runs)\n", ...
         bm_mean, 200*std(res)/bm_mean, runs*rep); fflush(stdout);
endfor
clear bm_x
# Display the geometric mean of the results
printf("-- Performance index (%s): %.2g\n\n", bm_version, ...
       prod(bm_reftime./bm_mytime)^(1/length(bm_reftime)));
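The final "Performance index" line prints the geometric mean of the
per-test relative speeds: since bm_mytime(f) = bm_reftime(f)/bm_mean,
the expression prod(bm_reftime./bm_mytime)^(1/length(bm_reftime))
reduces to the geometric mean of the bm_mean values.  A Python sketch
with hypothetical scores:

```python
import math

# Geometric mean of per-test relative speeds, mirroring the script's
# "Performance index" line.  The scores below are hypothetical.
scores = [1.50, 0.80, 2.10, 1.20, 3.00]
index = math.prod(scores) ** (1.0 / len(scores))
print("Performance index: %.2g" % index)
```

The geometric mean is the conventional choice for averaging ratios,
since it does not let one very fast or very slow test dominate the
single summary figure.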


Re: benchmark 1.10

John A. Turner
Francesco Potorti` writes:

[snip]

 > The new benchmark is for octave 2.0.5 and can be found, as usual, in
 > <URL:ftp://fly.cnuce.cnr.it/pub/benchmark.m>, while the results are in
 > <URL:ftp://fly.cnuce.cnr.it/pub/bm_results> (mirrors welcome).  No new
 > results have been added yet.
 >
 > Everyone is encouraged to run the benchmark with octave 2.0.5 and send
 > the results to me.
 >
 > I have not added yet the results for Alpha because I cannot find a
 > moment when the machine is idle, but it seems that, with respect to
 > version 1.1.1, the for loop is about 30% faster (expected), the
 > differential equation test is about 10% faster (expected), and the
 > Schur decomposition is about 30% slower (unexpected!).

[snip]

While I applaud this effort, I took a look at the results, and there
is far too little information from which to draw real conclusions.

Each entry should include:

o a more complete hardware specification.  For example, one of the
  entries is "Ultra 1".  Is that a 1/140, a 1/170, or a 1/200?  They
  have three different clock speeds.

  Maybe nitpicking, but there's a listing for an Ultra 167.  There's
  no such thing.  It must actually be a 1/170.  (I'm omitting the E
  because it shouldn't matter.)

  It's difficult, though, because should it be the model or the chip?
  For example, there's an entry for a "DEC Alpha 400".  It must be one
  of the models with a 400MHz A21164 chip, and that's probably more
  important than the actual model.

  Still, I'm thinking the precise model should be listed, maybe with
  additional columns for the chip and clock speed.  Something like the
  SPECmark table at ftp://ftp.cdf.toronto.edu/pub/spectable does.
  Here's an excerpt:

-------------------------------------------------------------------------------
System            CPU        ClkMHz  Cache      SPECint SPECfp  Info  Source
Name              (NUMx)Type ext/in  Ext+I/D    base95  base95  Date  Obtained
================= ========== ======= ========== ======= ======= ===== =========
DEC 8[24]00/5/300 A21164     75/300  4M+96+8/8    7.43   11.7   Feb96 Digital
DEC 8[24]00/5/350 A21164     88/350  4M+96+8/8    8.82   13.2   Feb96 Digital
DEC 8[24]00/5/440 A21164     88/440  4M+96+8/8   11.2    16.0   Oct96 Digital
Intel XXpress     Pentium    66/166  1M+8/8       4.76    3.37  Jan96 www.intel
Intel Alder       PentiumPro 166     512+8/8      7.11    5.47  Jan96 www.intel
Intel Alder       PentiumPro 200     256+8/8      8.09    5.99  Jan96 www.intel
SGI O2-R5kSC      R5000      180     512+32/32    4.76    5.37  Oct96 www.specb
SGI Indigo2-R10k  R10000     195     1M+32/32     8.50   10.2   Jul96 www.specb
Sun SS10/40       SuprSP     40      20/16        1.06    1.13  Mar96 c.bmarks
Sun SS[45]/110    MicroSP2   110     16/8         1.37    1.88  Mar96 c.bmarks
Sun Ultra1/140    UltraSP    71/143  512+16/16    4.52    7.73  Mar96 c.bmarks
Sun Ultra1/170    UltraSP    83/167  512+16/16    5.26    8.45  Mar96 c.bmarks

o a precise specification of the operating system, including the
  version (e.g. SunOS 4.1.3, Solaris 2.5.1, etc.).

o a precise specification of the compiler used to compile Octave,
  including version *and compile flags*.  Also, whether an F77
  compiler was used for the Fortran, and if so, version and compile
  flags for it.

So again, I applaud the effort, but it's difficult to do any real
comparison without the entire picture.

--
John Turner
http://www.lanl.gov/home/turner


Re: benchmark 1.10

Francesco Potorti`-9
   Each entry should include:
   
   o a more complete hardware specification.  For example, one of the
     entries is "Ultra 1".  Is that a 1/140, a 1/170, or a 1/200?  They
     have three different clock speeds.

Ehr..  I have no idea, really.  I just write the name of the box as
people send it to me.  Sometimes I asked for more detail, when I knew
there was not enough, but I had no idea about the different types of
Sun Ultra 1 stations out there.  As you say, that data is
meaningless.  I will ask.
   
     It's difficult, though, because should it be the model or the chip?
     For example, there's an entry for a "DEC Alpha 400".  It must be one
     of the models with a 400MHz A21164 chip, and that's probably more
     important than the actual model.

Since there are problems with two of the four data points for the
beta version of octave, which is now obsolete and moreover has a
different reference machine, I think I'll simply delete them :)
   
     Still, I'm thinking the precise model should be listed, maybe with
     additional columns for the chip and clock speed.  Something like the
     SPECmark table at ftp://ftp.cdf.toronto.edu/pub/spectable does.

The SPEC table is nicely detailed, but I don't plan to ask people to
send me all that data.  For example, it is not always obvious what the
bus clock is, or how much cache there is, so I'd like to stick with
just the model name and the clock speed, as I've done until now.

Unless, of course, someone else is willing to take on this burden, in
which case I'll hand it over to them :-).
   
   o a precise specification of the operating system, including the
     version (e.g. SunOS 4.1.3, Solaris 2.5.1, etc.).

In general, it should not matter, as the benchmark is designed so
that octave runs 99.99...% of the time in user mode.  I usually list
the OS name, but not the version.
   
   o a precise specification of the compiler used to compile Octave,
     including version *and compile flags*.  Also, whether an F77
     compiler was used for the Fortran, and if so, version and compile
     flags for it.

I ask people to send me data obtained with the *precompiled* versions
of octave.  If that's not the case, it is noted.  Ask John Eaton for
the details about the precompiled versions :-)
   
   So again, I applaud the effort, but it's difficult to do any real
   comparison without the entire picture.

Even with the above clarifications?

--
Francesco Potorti` (researcher)        Voice:    +39-50-593203
Computer Network Division              Operator: +39-50-593211
CNUCE-CNR, Via Santa Maria 36          Fax:      +39-50-904052
56126 Pisa - Italy                     Email:    [hidden email]


Re: benchmark 1.10

Francesco Potorti`-9
In reply to this post by John A. Turner
For obscure reasons, my previous mail was missing the first 10 lines
or so (computers are so fascinating :-).  I'll repost them here.

"John A. Turner" <[hidden email]> writes:

   Each entry should include:
   
   o a more complete hardware specification.  For example, one of the
     entries is "Ultra 1".  Is that a 1/140, a 1/170, or a 1/200?  They
     have three different clock speeds.

Ehr..  I have no idea, really.  I just write the name of the box as
people send it to me.  Sometimes I asked for more detail, when I knew
there was not enough, but I had no idea about the different types of
Sun Ultra 1 stations out there.  As you say, that data is
meaningless.  I will ask.

--
Francesco Potorti` (researcher)        Voice:    +39-50-593203
Computer Network Division              Operator: +39-50-593211
CNUCE-CNR, Via Santa Maria 36          Fax:      +39-50-904052
56126 Pisa - Italy                     Email:    [hidden email]