I am posting the results of this Sciviews benchmark test on the help
list to hit the widest possible audience. John Eaton and the other contributors to octave and octave-forge have my applause for the effort that they have put in over the last couple of months to extend and speed octave up. I also wish to encourage you to go the whole hog and update to ATLAS, FFW, octave-2.1.56(when it becomes the recommended version) and octave-forge (in that order). On an Athlon 1700 RH9 machine with only 256Mbyte memory (of which more in a moment), the Sciviews benchmark tests (see http://www.sciviews.org/other/benchmark.htm ) gave: Octave Benchmark 2 ================== Number of times each test is run__________________________: 3 I. Matrix calculation octave-2.1.56 (Matlab6.5) --------------------- Creation, transp., deformation of a 1500x1500 matrix (sec): 1.117 (0.6733) 800x800 normal distributed random matrix ^1000______ (sec): 0.1454 (0.2846) Sorting of 2,000,000 random values__________________ (sec): 1.696 (0.9027) 700x700 cross-product matrix (b = a' * a)___________ (sec): 0.4062 (0.5264) Linear regression over a 600x600 matrix (c = a \ b') (sec): 0.247 (0.2885) ------------------------------------------------------ Trimmed geom. mean (2 extremes eliminated): 0.4822 II. Matrix functions -------------------- FFT over 800,000 random values______________________ (sec): 0.3724 (0.5914) Eigenvalues of a 320x320 random matrix______________ (sec): 0.9604 (1.0192) Determinant of a 650x650 random matrix______________ (sec): 0.2894 (0.2450) Cholesky decomposition of a 900x900 matrix__________ (sec): 0.3212 (0.3306) Inverse of a 400x400 random matrix__________________ (sec): 0.1626 (0.1957) ------------------------------------------------------ Trimmed geom. mean (2 extremes eliminated): 0.3259 III. Programmation ------------------ 750,000 Fibonacci numbers calculation (vector calc)_ (sec): 0.6075 (0.9792) Creation of a 2250x2250 Hilbert matrix (matrix calc) (sec): 1.137**** (1.1501) Grand common divisors of 70,000 pairs (recursion)___ (sec): 0.6918 (0.4376) Creation of a 220x220 Toeplitz matrix (loops)_______ (sec): 1.449 (0.0021) Escoufier's method on a 37x37 matrix (mixed)________ (sec): 2.2 (1.1663) ------------------------------------------------------ Trimmed geom. mean (2 extremes eliminated): 1.045 Total time for all 15 tests_________________________ (sec): 11.8 (8.7927) Overall mean (sum of I, II and III trimmed means/3)_ (sec): 0.5475 --- End of test --- **** The Hilbert matrix test uses a lot of memory. With one memory bank AWOL, 256Mbyte is not enough. It took 7-8 repeats to allow the system to clean up the memory enough that no swapping occurred during the test. It will be noted that most of the difference now lies in the last two tests, where Matlab must either be employing automatic vectorisation or an on the fly compiler, when it can. This seems to be where Matworks have been concentrating their effort since the Sciviews tests were posted). Octave, with all the trimmings, now has roughly the performance of Matlab6.0. Vectorising the Toeplitz test, reduces its execution time in octave to 0.018s. TAKE NOTE! ------------------------------------------------------------- Octave is freely available under the terms of the GNU GPL. Octave's home on the web: http://www.octave.org How to fund new projects: http://www.octave.org/funding.html Subscription information: http://www.octave.org/archive.html ------------------------------------------------------------- |
Daprès Paul Thomas <[hidden email]> (le 07/03/2004):
> Sorting of 2,000,000 random values__________________ (sec): 1.696 > (0.9027) This slowup with octave is also a choice. The sort code in octave-forge uses the mergesort code from python, while matlab uses a quicksort algorithm. The benchmark above is reversed if partially ordered lists are used, as for instance often happen in interpolation codes. So octaves loss in the above benchmark shows more about the choices of the person writing the benchmark than the octaves relative performance. Benchmarks are all figments of someones deranged imagination... D. -- David Bateman [hidden email] Motorola CRM +33 1 69 35 48 04 (Ph) Parc Les Algorithmes, Commune de St Aubin +33 1 69 35 77 01 (Fax) 91193 Gif-Sur-Yvette FRANCE The information contained in this communication has been classified as: [x] General Business Information [ ] Motorola Internal Use Only [ ] Motorola Confidential Proprietary ------------------------------------------------------------- Octave is freely available under the terms of the GNU GPL. Octave's home on the web: http://www.octave.org How to fund new projects: http://www.octave.org/funding.html Subscription information: http://www.octave.org/archive.html ------------------------------------------------------------- |
I agree entirely, where details (ie. less factor of 1.5-2 either way)
are concerned. However, you and others gained factors of 5 or more that have made the difference that I find impressive. Whether deranged or not....... Paul T David Bateman wrote: >Daprès Paul Thomas <[hidden email]> (le 07/03/2004): > > >>Sorting of 2,000,000 random values__________________ (sec): 1.696 >>(0.9027) >> >> > >This slowup with octave is also a choice. The sort code in octave-forge >uses the mergesort code from python, while matlab uses a quicksort >algorithm. The benchmark above is reversed if partially ordered lists >are used, as for instance often happen in interpolation codes. So octaves >loss in the above benchmark shows more about the choices of the person >writing the benchmark than the octaves relative performance. Benchmarks >are all figments of someones deranged imagination... > >D. > > > ------------------------------------------------------------- Octave is freely available under the terms of the GNU GPL. Octave's home on the web: http://www.octave.org How to fund new projects: http://www.octave.org/funding.html Subscription information: http://www.octave.org/archive.html ------------------------------------------------------------- |
In reply to this post by Paul Thomas-10
Let me say at the start that I am a big fan of octave. I use both octave and matlab, and I prefer octave for the most part. It is less capricious and cantankerous than matlab for the most part. Also, with matlab, I had better be able to see the license server or I am out of luck. That is not always possible and there are times when I am in a mission critical situation and cannot use matlab because of this. There are places, however, where matlab simply outruns octave by a big margin to the point where octave is nearly unusable for a given task. The run time difference may be as much as 100x. For one type of analysis, matlab's response takes a minute and while octave's is two hours. Even as a die hard octave fan, I will think twice before I run those sorts of analyses in octave as opposed to matlab. Given that some benchmarks have been posted, I will take the liberty of posting some of my results too. I hope it will generate a profitable discussion. The following is the result for the total time of the Octave2 tests on a 2GHz G5 under OS X using both matlab (6.5.1) and octave (2.1.53). (I have posted all the results at the end in case they are of interest) Octave: Total time for all 15 tests_________________________ (sec): 15.53 Matlab: Total time for all 15 tests_________________________ (sec): 5.0555 As you can see, about a 3x difference in the total time. The big offender is the sort, 7.348 for octave and 0.41325 for matlab. The next largest is the Toeplitz matrix, 1.5 for octave and 0.012 for matlab. The third largest difference is the eigenvalues of a 320x320 matrix, 1.144 for octave and 0.43039 for matlab. The forth largest is Escoufier's method, 1.713 for octave and 0.96309 for matlab. All told that accounts for almost 10 seconds of the 10.5 second difference. Now some of the differences are quite likely due to algorithmic differences. The difference in sorting is certainly likely algorithmic, though given that my other numbers are slightly better than the times in Paul Thomas' post, I have to wonder why my sort is quite so much worse than his. Another observation is that this test, Octave2, is mostly testing algorithmic differences between matlab and octave. The individual test cases generally execute a particular function or set of functions repeatedly with matrices being operated on. The major portion of the execution time would be expected to be in the function call(s). It has been my experience that octave for the most part does as well as matlab, and in a few cases (depending upon the edition of matlab) much better when one gets down to this level, namely function calls. Where I see problems in octave is looping & array functions. Let me add some results from my own benchmarks. I ran this benchmark on matlab and octave. The times for each test are reported and then matlab's speed relative to octave is posted below the test result. Float test octave: a = 2001.000000, err = 0.000000, time = 7.752197 matlab: a = 2001.000000, err = 0.000000, time = 0.180680 speedX: 42.906 Sqrt test octave: time = 0.114017 matlab: time = 0.184528 speedX: 0.61788 Log test octave: time = 0.150121 matlab: time = 0.158714 speedX: 0.94586 Exp test octave: time = 0.087117 matlab: time = 0.021750 speedX: 4.0054 Atan test octave: time = 0.113081 matlab: time = 0.045912 speedX: 2.4630 Tan test octave: time = 0.166375 matlab: time = 0.221086 speedX: 0.75254 Matrix test octave: time = 0.001770 matlab: time = 0.001488 speedX: 1.1895 Sieve test using arrays octave: time = 0.020089 matlab: time = 0.001444 speedX: 13.912 Sieve test using loops octave: time = 0.092604 matlab: time = 0.001667 speedX: 55.551 The float test is a rather interesting result. The code for it is as follows: %% Float test --------------------------------------------------- fprintf('Float test\n'); tic for j = 1:100 a = 1; for i = 1:2000 a = tan(atan(exp(log(sqrt(a*a))))) + 1; end end elapsedTime = toc; fprintf ('a = %f, err = %f, time = %f\n\n', a, a - 2001, elapsedTime); As you can see, there is nothing remarkable about this test, other than it runs some 42x slower on octave than matlab. I use it all the time to test numerical stability of transcendental functions on whatever computer I am running on (though usually in C). It is the individual transcendental function tests that may be revealing the answer, at least in part. Each of those test takes a 100x100 matrix and performs a given transcendental function on each element of the matrix (e.g. result = log(matrix)). On the sqrt, log, and tan tests, octave comes out ahead of matlab. But on the exp octave is 4x slower and on the atan, octave is 2.5x slower. Algorithmic certainly, but if one deals with these transcendental functions a lot, then the slow down can be crushing, as was observed in the float test. The other observation is the Sieve tests. This is the classic sieve of Eratosthenes. The first method uses array operators to implement the algorithm, the second method uses loops, much like one would do in a C program. There is a very large slowdown with the arrays (13X), and then the looping version takes a 55x hit over matlab. If there are folks working on speed improvements, I would suggest that you have parity with matlab as far as the built-in functions go for the most part, but arrays and looping definitely need work. [Yes, I know that conversion of some of these sorts of algorithms to C & .oct files will improve speed considerably, but I do not always have that option for a whole host of reasons.] ---------------------------------------------------------------------- Michael W. Martin Phone: (281) 333-2177 Draper Laboratory FAX: (281) 333-5276 2200 Space Park Dr. EMail: [hidden email] Houston, TX 77058 WWW: http://www.jsc.draper.com/ USA Mail Code: EG/Draper ---------------------------------------------------------------------- Octave Benchmark 2 ================== Number of times each test is run__________________________: 3 --------------------- Creation, transp., deformation of a 1500x1500 matrix (sec): 1.251 [0.48235] 800x800 normal distributed random matrix ^1000______ (sec): 0.1284 [0.21914] Sorting of 2,000,000 random values__________________ (sec): 7.348 [0.41325] 700x700 cross-product matrix (b = a' * a)___________ (sec): 0.1312 [0.19244] Linear regression over a 600x600 matrix (c = a \ b') (sec): 0.09014 [0.10584] ------------------------------------------------------ Trimmed geom. mean (2 extremes eliminated): 0.2762 II. Matrix functions -------------------- FFT over 800,000 random values______________________ (sec): 0.2844 [0.28988] Eigenvalues of a 320x320 random matrix______________ (sec): 1.144 [0.43039] Determinant of a 650x650 random matrix______________ (sec): 0.1085 [0.095085] Cholesky decomposition of a 900x900 matrix__________ (sec): 0.1051 [0.13952] Inverse of a 400x400 random matrix__________________ (sec): 0.06585 [0.073918] ------------------------------------------------------ Trimmed geom. mean (2 extremes eliminated): 0.148 III. Programmation ------------------ 750,000 Fibonacci numbers calculation (vector calc)_ (sec): 0.5173 [0.77387] Creation of a 2250x2250 Hilbert matrix (matrix calc) (sec): 0.7473 [0.6179] Grand common divisors of 70,000 pairs (recursion)___ (sec): 0.3986 [0.24719] Creation of a 220x220 Toeplitz matrix (loops)_______ (sec): 1.5 [0.011642] Escoufier's method on a 37x37 matrix (mixed)________ (sec): 1.713 [0.96309] ------------------------------------------------------ Trimmed geom. mean (2 extremes eliminated): 0.8338 Total time for all 15 tests_________________________ (sec): 15.53 [5.0555] Overall mean (sum of I, II and III trimmed means/3)_ (sec): 0.3243 --- End of test --- My speed.m file: %% Speed tests %% Float test --------------------------------------------------- fprintf('Float test\n'); tic for j = 1:100 a = 1; for i = 1:2000 a = tan(atan(exp(log(sqrt(a*a))))) + 1; end end elapsedTime = toc; fprintf ('a = %f, err = %f, time = %f\n\n', a, a - 2001, elapsedTime); %% Needed for following ---------------------------------------- mat = reshape(1:10000, 100,100); %% sqrt test --------------------------------------------------- fprintf('Sqrt test\n'); tic for j = 1:100 mata = sin(mat); end elapsedTime = toc; fprintf ('time = %f\n\n', elapsedTime); %% log test --------------------------------------------------- fprintf('Log test\n'); tic for j = 1:100 mata = log(mat); end elapsedTime = toc; fprintf ('time = %f\n\n', elapsedTime); %% exp test --------------------------------------------------- fprintf('Exp test\n'); tic for j = 1:100 mata = exp(mat); end elapsedTime = toc; fprintf ('time = %f\n\n', elapsedTime); %% Atan test --------------------------------------------------- fprintf('Atan test\n'); tic for j = 1:100 mata = atan(mat); end elapsedTime = toc; fprintf ('time = %f\n\n', elapsedTime); %% Tan test --------------------------------------------------- fprintf('Tan test\n'); tic for j = 1:100 mata = tan(mat); end elapsedTime = toc; fprintf ('time = %f\n\n', elapsedTime); %% Matrix test --------------------------------------------------- fprintf('Matrix test\n'); tic res = mat*mat; elapsedTime = toc; fprintf ('time = %f\n\n', elapsedTime); %% sieve test --------------------------------------------------- fprintf('Sieve test using arrays\n'); tic xlen = 2000; x = linspace(1, xlen, xlen); for i = 2:xlen/2 if x(i) ~= 0 x(2*i:i:xlen) = 0; end end elapsedTime = toc; fprintf ('time = %f\n\n', elapsedTime); fprintf('Sieve test using loops\n'); tic x = linspace(1, xlen, xlen); for i = 2:xlen/2 if x(i) ~= 0 for j = 2*i:i:xlen x(j) = 0; end end end elapsedTime = toc; fprintf ('time = %f\n\n', elapsedTime); ------------------------------------------------------------- Octave is freely available under the terms of the GNU GPL. Octave's home on the web: http://www.octave.org How to fund new projects: http://www.octave.org/funding.html Subscription information: http://www.octave.org/archive.html ------------------------------------------------------------- |
Since we are speaking about benchmarks, there is an old benchmark of
mine at <ftp://fly.isti.cnr.it/pub/software/octave/benchmark.m>, and the companion <ftp://fly.isti.cnr.it/pub/software/octave/bm_results>. Both are outdated, that is, they list results for machines that are now obsolete (1997). At the time, I stopped collecting results because of some inconcistencies caused by compilation options which were not clear. Anyway, it would be easy to update that bechmark and use it for a quick assessment of how Octave performs on a given box. Probably, because of the new Octave features, the set of tests should be updated, which is fairly easy. So, if anyone feels like upgrading this, I will put it under the GPL. -- Francesco Potortì (ricercatore) Voice: +39 050 315 3058 (op.2111) ISTI - Area della ricerca CNR Fax: +39 050 313 8091 via G. Moruzzi 1, I-56124 Pisa Email: [hidden email] Web: http://fly.isti.cnr.it/ Key: fly.isti.cnr.it/public.key ------------------------------------------------------------- Octave is freely available under the terms of the GNU GPL. Octave's home on the web: http://www.octave.org How to fund new projects: http://www.octave.org/funding.html Subscription information: http://www.octave.org/archive.html ------------------------------------------------------------- |
In reply to this post by Michael Martin-8
On Mar 8, 2004, at 12:58 PM, Michael Martin wrote: > Now some of the differences are quite likely due to algorithmic > differences. The difference in sorting is certainly likely > algorithmic, though given that my other numbers are slightly better > than the times in Paul Thomas' post, I have to wonder why my sort is > quite so much worse than his. He is using David Bateman's sort from octave-forge (http://octave.sf.net). It is reported to have better performance on partially ordered lists, but worse on random data compared to matlab. Hmmm... I wonder if it is worthwhile to run through the data once to see how ordered it is, then do quicksort or merge sort as needed? Also, somebody with access to matlab might want to see if "[y,idx]=sort(x)" is any slower than "y=sort(x)". Maybe they have a fast stable sort algorithm that we don't know about. Paul Kienzle [hidden email] ------------------------------------------------------------- Octave is freely available under the terms of the GNU GPL. Octave's home on the web: http://www.octave.org How to fund new projects: http://www.octave.org/funding.html Subscription information: http://www.octave.org/archive.html ------------------------------------------------------------- |
Paul Kienzle wrote:
> It is reported to have better performance on partially ordered lists, > but worse on random data compared to matlab. Well, may be I misunderstood what does "partially ordered means", but here is what I got. (Matlab 6.5): >> x1=rand(3000); % random >> x2=[1:3000]; % ordered >> x3=repmat(x2,3000,1); % >> x=x3+2*x1; % partially ordered ? >> tic ; sort(x1) ; toc elapsed_time = 1.4351 >> tic ; sort(x3) ; toc elapsed_time = 0.8567 >> tic ; sort(x) ; toc elapsed_time = 1.4285 >> x=x3+1.1*x1; >> tic ; sort(x) ; toc elapsed_time = 1.4355 >> tic ; [y,idx]=sort(x) ; toc elapsed_time = 2.2073 >> tic ; [y,idx]=sort(x) ; toc % So we will not need to allocate y and idx elapsed_time = 2.0607 (Octave 2.1.56): octave:1> x1=rand(3000); octave:2> x2=[1:3000]; octave:3> x3=repmat(x2,3000,1); octave:4> x=x3+2*x1; octave:5> tic ; sort(x) ; toc ans = 2.1968 octave:6> tic ; sort(x1) ; toc ans = 1.9328 octave:7> tic ; sort(x3) ; toc ans = 0.58155 octave:8> tic ; [y,idx]= sort(x) ; toc ans = 3.5003 octave:9> tic ; [y,idx]= sort(x) ; toc ans = 3.4638 > Paul Kienzle > [hidden email] > Dmitri. ------------------------------------------------------------- Octave is freely available under the terms of the GNU GPL. Octave's home on the web: http://www.octave.org How to fund new projects: http://www.octave.org/funding.html Subscription information: http://www.octave.org/archive.html ------------------------------------------------------------- |
On Mar 8, 2004, at 7:17 PM, Dmitri A. Sergatskov wrote: > Paul Kienzle wrote: > >> It is reported to have better performance on partially ordered lists, >> but worse on random data compared to matlab. > > Well, may be I misunderstood what does "partially ordered means", but > here is > what I got. Thanks. I don't know the details of the algorithm, but I believe it is a merge sort variant. These tend to do better if you start with a small number of long 'runs' rather than a large number of short 'runs', as in the following: octave:42> w=[1:1000000]; octave:43> x=repmat([1:500000],1,2); octave:44> y=repmat([1:100000],1,10); octave:45> z=rand(1000000,1); octave:46> tic; sort(w); toc ans = 0.60063 octave:47> tic; sort(x); toc ans = 0.80005 octave:48> tic; sort(y); toc ans = 1.0342 octave:49> tic; sort(z); toc ans = 1.4839 x is an imporant case for table lookups. It would be nice if the random case were 1x rather 1.5x matlab. Still, 1.5x is loads better than the 7x it used to be. Paul Kienzle [hidden email] ------------------------------------------------------------- Octave is freely available under the terms of the GNU GPL. Octave's home on the web: http://www.octave.org How to fund new projects: http://www.octave.org/funding.html Subscription information: http://www.octave.org/archive.html ------------------------------------------------------------- |
In reply to this post by Michael Martin-8
Michael,
Please do the following 1) Upgrade to 2.1.56 from 2.1.53 due to the bugs in 2.1.53 and some speed-up 2) Please install the latest version of octave-forge where the sort and randn codes are sped-up, which you are clearly missing. 3) Please check that you are linked against Atlas and FFTW for acceleration of matrix and fft operations. Use the 'octave_config_info' command and look for BLAS_LIBS and FFTW_LIBS and see what they are set to. 4) For the 100x case please vectorize your code.... Do this and then we can talk on an even footing about speed.... According to Michael Martin <[hidden email]> (on 03/08/04): > > The float test is a rather interesting result. The code for it is as > follows: > > %% Float test --------------------------------------------------- > fprintf('Float test\n'); > tic > for j = 1:100 > a = 1; > for i = 1:2000 > a = tan(atan(exp(log(sqrt(a*a))))) + 1; > end > end > elapsedTime = toc; > fprintf ('a = %f, err = %f, time = %f\n\n', a, a - 2001, > elapsedTime); Vectorize code wherever possible!!!!! Octave doesn't have JIT which is where Matlab wins on this case. The above is a null example, since the core of the algorithm results in a null operation as 'tan(atan(exp(log(sqrt(a*a))))) = a', so a vectorized version is a = 2001; But if you are getting such big differences between matlab and octave it is time to consider how you have written your code, and probably make similar simplifications to the above. Regards David -- David Bateman [hidden email] Motorola CRM +33 1 69 35 48 04 (Ph) Parc Les Algorithmes, Commune de St Aubin +33 1 69 35 77 01 (Fax) 91193 Gif-Sur-Yvette FRANCE The information contained in this communication has been classified as: [x] General Business Information [ ] Motorola Internal Use Only [ ] Motorola Confidential Proprietary ------------------------------------------------------------- Octave is freely available under the terms of the GNU GPL. Octave's home on the web: http://www.octave.org How to fund new projects: http://www.octave.org/funding.html Subscription information: http://www.octave.org/archive.html ------------------------------------------------------------- |
In reply to this post by Dmitri A. Sergatskov-2
Dmitri,
Please find attached my test code for the speed of the sort code. This is based on a simialr benchmark within Python itself. Code is called like octave:1> sort_time = testsort('sort'); Test of the sorting function (sort) \ N |1.00e+02 1.00e+03 1.00e+04 1.00e+05 1.00e+06 Test \ | *sort |3.82e-05 5.26e-04 6.41e-03 8.65e-02 1.18e+00 \sort |1.15e-05 1.12e-04 1.13e-03 1.75e-02 2.08e-01 /sort |1.10e-05 1.07e-04 1.09e-03 1.49e-02 1.69e-01 3sort |2.06e-05 1.26e-04 1.19e-03 1.79e-02 2.22e-01 +sort |1.69e-05 1.19e-04 1.14e-03 1.66e-02 1.99e-01 =sort |1.09e-05 1.06e-04 1.11e-03 1.52e-02 1.67e-01 Normalized times against Matlab R12 on an IBM T23 \ N |1.00e+02 1.00e+03 1.00e+04 1.00e+05 1.00e+06 Test \ | *sort |2.25e+00 2.30e+00 2.35e+00 2.40e+00 2.31e+00 \sort |9.94e-01 8.17e-01 6.47e-01 7.78e-01 6.33e-01 /sort |1.03e+00 8.33e-01 6.59e-01 6.75e-01 5.49e-01 3sort |1.88e+00 9.63e-01 7.28e-01 8.20e-01 7.15e-01 +sort |1.12e+00 6.38e-01 5.17e-01 5.43e-01 4.27e-01 =sort |8.97e-01 6.76e-01 5.22e-01 5.64e-01 4.43e-01 octave:2> sort_time_index = testsort('sort, [], 1); Test of the sorting function (sort) with indexing \ N |1.00e+02 1.00e+03 1.00e+04 1.00e+05 1.00e+06 Test \ | *sort |6.39e-05 8.74e-04 1.15e-02 1.80e-01 2.55e+00 \sort |2.50e-05 2.54e-04 2.66e-03 3.95e-02 4.70e-01 /sort |2.46e-05 2.43e-04 2.51e-03 3.33e-02 4.03e-01 3sort |4.03e-05 2.73e-04 2.63e-03 3.66e-02 4.30e-01 +sort |3.34e-05 2.61e-04 2.56e-03 3.52e-02 4.20e-01 =sort |2.49e-05 2.49e-04 2.54e-03 3.40e-02 4.07e-01 Normalized times against Matlab R12 on an IBM T23 \ N |1.00e+02 1.00e+03 1.00e+04 1.00e+05 1.00e+06 Test \ | *sort |2.69e+00 2.89e+00 3.09e+00 3.12e+00 3.04e+00 \sort |1.54e+00 1.37e+00 1.19e+00 1.24e+00 1.14e+00 /sort |1.75e+00 1.50e+00 1.25e+00 1.18e+00 1.11e+00 3sort |2.67e+00 1.68e+00 1.31e+00 1.30e+00 1.19e+00 +sort |1.59e+00 1.08e+00 8.78e-01 8.42e-01 7.50e-01 =sort |1.28e+00 9.49e-01 6.93e-01 6.62e-01 6.07e-01 Note the times calibrated against Matlab R12 are only valid on my machine. D. According to Dmitri A. Sergatskov <[hidden email]> (on 03/09/04): > Paul Kienzle wrote: > > >It is reported to have better performance on partially ordered lists, > >but worse on random data compared to matlab. > > Well, may be I misunderstood what does "partially ordered means", but here > is > what I got. > (Matlab 6.5): > > >> x1=rand(3000); % random > >> x2=[1:3000]; % ordered > >> x3=repmat(x2,3000,1); % > >> x=x3+2*x1; % partially ordered ? > > > > >> tic ; sort(x1) ; toc > > elapsed_time = > > 1.4351 > > >> tic ; sort(x3) ; toc > > elapsed_time = > > 0.8567 > > >> tic ; sort(x) ; toc > > elapsed_time = > > 1.4285 > > >> x=x3+1.1*x1; > >> tic ; sort(x) ; toc > > elapsed_time = > > 1.4355 > > >> tic ; [y,idx]=sort(x) ; toc > > elapsed_time = > > 2.2073 > > >> tic ; [y,idx]=sort(x) ; toc % So we will not need to allocate y > and idx > > elapsed_time = > > 2.0607 > > > (Octave 2.1.56): > > octave:1> x1=rand(3000); > octave:2> x2=[1:3000]; > octave:3> x3=repmat(x2,3000,1); > octave:4> x=x3+2*x1; > octave:5> tic ; sort(x) ; toc > ans = 2.1968 > octave:6> tic ; sort(x1) ; toc > ans = 1.9328 > octave:7> tic ; sort(x3) ; toc > ans = 0.58155 > octave:8> tic ; [y,idx]= sort(x) ; toc > ans = 3.5003 > octave:9> tic ; [y,idx]= sort(x) ; toc > ans = 3.4638 > > > >Paul Kienzle > >[hidden email] > > > > Dmitri. > > > > ------------------------------------------------------------- > Octave is freely available under the terms of the GNU GPL. > > Octave's home on the web: http://www.octave.org > How to fund new projects: http://www.octave.org/funding.html > Subscription information: http://www.octave.org/archive.html > ------------------------------------------------------------- -- David Bateman [hidden email] Motorola CRM +33 1 69 35 48 04 (Ph) Parc Les Algorithmes, Commune de St Aubin +33 1 69 35 77 01 (Fax) 91193 Gif-Sur-Yvette FRANCE The information contained in this communication has been classified as: [x] General Business Information [ ] Motorola Internal Use Only [ ] Motorola Confidential Proprietary ------------------------------------------------------------- Octave is freely available under the terms of the GNU GPL. Octave's home on the web: http://www.octave.org How to fund new projects: http://www.octave.org/funding.html Subscription information: http://www.octave.org/archive.html ------------------------------------------------------------- |
In reply to this post by David Bateman-3
On Mar 9, 2004, at 3:44 AM, David Bateman wrote: > 1) Upgrade to 2.1.56 from 2.1.53 due to the bugs in 2.1.53 and some > speed-up I will do that if I get a chance. My installation is via fink, which is generally clean and easy, if not always the latest ad greatest. Since I am in a production mode, I would prefer not to live on the bleeding edge. Fink is quite acceptable to me. Mac OSX has ATLAS & BLAS built-in via its Frameworks, and my octave/fink does indeed use them. I would say that my results show this given the similarity between my numbers for the matrix operations and Paul Thomas' numbers. Though we are running on different architectures, the same sorts of numbers are seen in both or our octave and matlab results. fftw libs is also required by fink, by the way. I have experimented over the last several years with compiler settings, compilers etc. I have run different level of both octave and matlab, and on different platforms, etc. The results have shown pretty much the same trends wrt looping. I doubt an upgrade will change the results significantly. > 2) Please install the latest version of octave-forge where the sort > and randn > codes are sped-up, which you are clearly missing. Yes, I will try and do that. I am sure the discrepancy in the sort times was due to different codes being run. Again, that is consistent with Paul Thomas' numbers and mine. Ours match pretty closely EXCEPT for the sort. Ergo different codes are likely the source of the difference. Taking the next two out of order.... > Vectorize code wherever possible!!!!! Octave doesn't have JIT which is > where Matlab wins on this case. The above is a null example, since the > core of the algorithm results in a null operation as > 'tan(atan(exp(log(sqrt(a*a))))) = a', so a vectorized version is > > a = 2001; You have missed the point of the test. If the test had been to calculate 2001 as quickly as possible, then I would certainly agree! :-) However, the test was not to see how quickly one can calculate 2001, but to test the speed and numerical stability of ocatve for the common transcendental functions. Since 2001 and is a nice human-readable number, it is easy too subtract it from the result get a feel for the error. The repeated use of the transcendental functions and inverses will usually quickly show if there are any serious problems with them. I always run this test whenever I port code to a new platform or whenever a platform has seen a significant OS or compiler update. Usually I use my standard C program and Fortran program that I have had for what must be close to 20 years now. However, when I am working with a package such as octave or matlab, I will code it into the package's langauge and test it there. In the past this test has revealed significant problems with the code of transcendental functions on more than one platform/compiler/package. All of my applications make extensive use of transcendental functions and any significant errors in them will lead to erroneous results. Call me paranoid :-), but since my algorithms ends up in manned spacecraft, I would rather know **before** flight that my codes have problems in them as opposed to **in** flight, at which time it would invariably be too late. :-) Now octave certainly passed the numerical stability test, so that is not an issue. However, the test did reveal that for whatever reasons, this particular piece of code does not run as fast in octave as opposed to matlab. My intent on this test was simply to inform the community. Perhaps someone is currently working on some octave code that may find the data of use. > 4) For the 100x case please vectorize your code.... Regrettably, this is not possible in the particular case I mentioned. Not all codes can be vectorized as much as one might like. There a great number of different reasons for this. Let me touch upon just two of them; namely algorithmic similarity, and heuristic type codes. In the case of algorithmic similarity, if one has a code that produces a result independently of other systems, then one does indeed have great flexibility in coding style, etc. However, if one has a code or algorithm that one is testing and that will be used in the "real world", then it is best to keep the code as reasonably close to what will run in the "real world". In other words, if the goal of the analysis is to ensure the algorithm will behave properly under a wide range of input scenarios in the real world, then the algorithm should be kept as close as possible to the actual algorithm as it will be used in the real world. Only then can one have confidence that the algorithm will behave the same in the real world as it is behaving in the laboratory. In the case of heuristic type codes, at least the ones I have, there are no linear/matrix formulations of the problem. Yes, vectors are involved. Vectors are associated with the control system effectors and vector/matrix calculations are taking place, so at that level the code can be vectorized. However, these vectors are searched for suitability to satisfy the control system requirements. The problem is combinatorial and cannot really be vectorized. Not only does this control system algorithm not lend itself very well to vectorization, it itself is used in a non-vectorized fashion. A great number of cases with varying control system requirements are run in a non-linear fashion in order to determine control system performance and robustness at extrema. Again, my intent on mentioning looping & arrays (where the inefficiency seems to lie), was simply to inform the community. Perhaps someone is currently working on some octave code that may find the observations of use. Given how well octave does with most of the common vector and matrix functions, I would propose that if folks are looking at speed improvement strategies for the octave code, then there is much gold to mine here with looping & arrays. More so, I would say then further improvements to matrix & vector functions, since for the most part they are on par with matlab, if not better. ---------------------------------------------------------------------- Michael W. Martin Phone: (281) 333-2177 Draper Laboratory FAX: (281) 333-5276 2200 Space Park Dr. EMail: [hidden email] Houston, TX 77058 WWW: http://www.jsc.draper.com/ USA Mail Code: EG/Draper ---------------------------------------------------------------------- ------------------------------------------------------------- Octave is freely available under the terms of the GNU GPL. Octave's home on the web: http://www.octave.org How to fund new projects: http://www.octave.org/funding.html Subscription information: http://www.octave.org/archive.html ------------------------------------------------------------- |
According to Michael Martin <[hidden email]> (on 03/09/04):
> > On Mar 9, 2004, at 3:44 AM, David Bateman wrote: > > >1) Upgrade to 2.1.56 from 2.1.53 due to the bugs in 2.1.53 and some > >speed-up > > I will do that if I get a chance. My installation is via fink, which > is generally clean and easy, if not always the latest ad greatest. > Since I am in a production mode, I would prefer not to live on the > bleeding edge. Fink is quite acceptable to me. Mac OSX has ATLAS & BLAS > built-in via its Frameworks, and my octave/fink does indeed use them. I > would say that my results show this given the similarity between my > numbers for the matrix operations and Paul Thomas' numbers. Though we > are running on different architectures, the same sorts of numbers are > seen in both or our octave and matlab results. fftw libs is also > required by fink, by the way. Humm, as for bleeding-edge 2.1.53 is as bleeding as it gets. Since then has mostly be bug fixes. Try octave:1> a(:,:,1) = ones(2,2); octave:2> a(:,:,2) = 2*ones(2,2) a = ans(:,:,1) = 1 1 1 1 ans(:,:,2) = 2 2 2 2 and see what your answer is. There are lots of other fixes, though. octave as of 2.1.54 needs FFTW3 and not FFTW2. So the presence of FFTW on your system is not a guarantee that it can be used. > > I have experimented over the last several years with compiler > settings, compilers etc. I have run different level of both octave and > matlab, and on different platforms, etc. The results have shown pretty > much the same trends wrt looping. I doubt an upgrade will change the > results significantly. Loop performance is related to JIT. Vectorize ALL loops... If you can avoid the loop, the the code is a strong candidate for conversion to an oct-file. In this manor I have virtually no performance difference between matlab/octave. > >2) Please install the latest version of octave-forge where the sort > >and randn > > codes are sped-up, which you are clearly missing. > > Yes, I will try and do that. I am sure the discrepancy in the sort > times was due to different codes being run. Again, that is consistent > with Paul Thomas' numbers and mine. Ours match pretty closely EXCEPT > for the sort. Ergo different codes are likely the source of the > difference. > > Taking the next two out of order.... > > >Vectorize code wherever possible!!!!! Octave doesn't have JIT which is > >where Matlab wins on this case. The above is a null example, since the > >core of the algorithm results in a null operation as > >'tan(atan(exp(log(sqrt(a*a))))) = a', so a vectorized version is > > > >a = 2001; > > You have missed the point of the test. If the test had been to > calculate 2001 as quickly as possible, then I would certainly agree! > :-) However, the test was not to see how quickly one can calculate > 2001, but to test the speed and numerical stability of ocatve for the > common transcendental functions. Since 2001 and is a nice > human-readable number, it is easy too subtract it from the result get a > feel for the error. The repeated use of the transcendental functions > and inverses will usually quickly show if there are any serious > problems with them. > > I always run this test whenever I port code to a new platform or > whenever a platform has seen a significant OS or compiler update. > Usually I use my standard C program and Fortran program that I have had > for what must be close to 20 years now. However, when I am working with > a package such as octave or matlab, I will code it into the package's > langauge and test it there. In the past this test has revealed > significant problems with the code of transcendental functions on more > than one platform/compiler/package. All of my applications make > extensive use of transcendental functions and any significant errors in > them will lead to erroneous results. Call me paranoid :-), but since my > algorithms ends up in manned spacecraft, I would rather know **before** > flight that my codes have problems in them as opposed to **in** flight, > at which time it would invariably be too late. :-) > > Now octave certainly passed the numerical stability test, so that is > not an issue. However, the test did reveal that for whatever reasons, > this particular piece of code does not run as fast in octave as opposed > to matlab. My intent on this test was simply to inform the community. > Perhaps someone is currently working on some octave code that may find > the data of use. I didn't miss the point as such..... We all know that Matlab is faster in loops by many orders of magnitude. However, there are no reason to use loops in virtually all cases. As a test of numerical stability, sure your test is fine. But its not a benchmark and this type of abuse of loops should be avoided at all costs (even in Matlab) > >4) For the 100x case please vectorize your code.... > > Regrettably, this is not possible in the particular case I > mentioned. Not all codes can be vectorized as much as one might like. There > a great number of different reasons for this. Let me touch upon just two > of them; namely algorithmic similarity, and heuristic type codes. > > In the case of algorithmic similarity, if one has a code that > produces a result independently of other systems, then one does indeed have > great flexibility in coding style, etc. However, if one has a code or > algorithm that one is testing and that will be used in the "real > world", then it is best to keep the code as reasonably close to what > will run in the "real world". In other words, if the goal of the > analysis is to ensure the algorithm will behave properly under a wide > range of input scenarios in the real world, then the algorithm should > be kept as close as possible to the actual algorithm as it will be used > in the real world. Only then can one have confidence that the algorithm > will behave the same in the real world as it is behaving in the > laboratory. > > In the case of heuristic type codes, at least the ones I have, there > are no linear/matrix formulations of the problem. Yes, vectors are > involved. Vectors are associated with the control system effectors and > vector/matrix calculations are taking place, so at that level the code > can be vectorized. However, these vectors are searched for suitability > to satisfy the control system requirements. The problem is > combinatorial and cannot really be vectorized. Not only does this > control system algorithm not lend itself very well to vectorization, it > itself is used in a non-vectorized fashion. A great number of cases > with varying control system requirements are run in a non-linear > fashion in order to determine control system performance and robustness > at extrema. > > Again, my intent on mentioning looping & arrays (where the > inefficiency seems to lie), was simply to inform the community. > Perhaps someone is currently working on some octave code that may find > the observations of use. Given how well octave does with most of the > common vector and matrix functions, I would propose that if folks are > looking at speed improvement strategies for the octave code, then there > is much gold to mine here with looping & arrays. More so, I would say > then further improvements to matrix & vector functions, since for the > most part they are on par with matlab, if not better. Yeah, I have some viterbi codes that also couldn't be vectorized. In this case, they should be implemented as oct-files.... The improvement in loops can only come from a complete rewrite of the octave parser, trying to get it to predict the types of arguments early and spit out byte-code for stuff it runs several times. Much like what matlab does. However, the amount of work involved in this is overwhelming and you won't see anything soon on this front. Regards David -- David Bateman [hidden email] Motorola CRM +33 1 69 35 48 04 (Ph) Parc Les Algorithmes, Commune de St Aubin +33 1 69 35 77 01 (Fax) 91193 Gif-Sur-Yvette FRANCE The information contained in this communication has been classified as: [x] General Business Information [ ] Motorola Internal Use Only [ ] Motorola Confidential Proprietary ------------------------------------------------------------- Octave is freely available under the terms of the GNU GPL. Octave's home on the web: http://www.octave.org How to fund new projects: http://www.octave.org/funding.html Subscription information: http://www.octave.org/archive.html ------------------------------------------------------------- |
In reply to this post by David Bateman-3
> > Vectorize code wherever possible!!!!! Octave doesn't have JIT which is > where Matlab wins on this case. Can someone please tell me what JIT is, and explain why Octave does not have/could not do JIT, if Octave has/does almost everything else that Matlab has/does? Henry ------------------------------------------------------------- Octave is freely available under the terms of the GNU GPL. Octave's home on the web: http://www.octave.org How to fund new projects: http://www.octave.org/funding.html Subscription information: http://www.octave.org/archive.html ------------------------------------------------------------- |
On Mar 9, 2004, at 5:50 PM, Henry F. Mollet wrote: > >> >> Vectorize code wherever possible!!!!! Octave doesn't have JIT which is >> where Matlab wins on this case. > > Can someone please tell me what JIT is, and explain why Octave does not > have/could not do JIT, if Octave has/does almost everything else that > Matlab > has/does? Just In Time compiler. For best effect, you want to turn: function total=sum(x) total=0; for i=1:n, total+=x(i); end into something like: double total = 0; int i; for (i=0; i < n; i++) total += x[i]; return total; If that looks easy, consider that x may be real or complex or a sparse matrix or a galois field or a indefinite precision number, ... And consider that the size of x may be less than n. And consider that x may be a function (matlab syntax does not distinguish between functions and array indexing), which returns a different value type for each i. You don't know until you call the function what the type of x is. The function case is particularly tricky since there are no guarantees. For example, I can override the builtin sin function so that it returns strings. Or the function that I'm calling could replace a builtin function during the loop. Yes you can build in checks to make sure the results are as expected. If you are not careful, though, you will end up rebuilding the octave interpreter, and it won't be any faster. We could probably hack in some simple cases which would make the benchmarks look good. Some of them would even be practical. I'm sure if you poke at the matlab interpreter enough you will find places where their optimization doesn't result in any speed up. Try for example putting an eval statement in the loop (after eval, all bets are off). Paul Kienzle [hidden email] ------------------------------------------------------------- Octave is freely available under the terms of the GNU GPL. Octave's home on the web: http://www.octave.org How to fund new projects: http://www.octave.org/funding.html Subscription information: http://www.octave.org/archive.html ------------------------------------------------------------- |
And for those of you that live in North America, here is an opportunity
that I found, whilst researching JIT a couple of weeks back: Work on the Just In Time (JIT) Compiler Team to improve MATLAB performance. JIT Compiler Team Member 10/14/2003 Natick, MA The JIT (Just In Time compiler) team is a small group within The MathWorks charged with improving the performance of MATLAB to near FORTRAN / C levels. Because MATLAB is a typeless, interactive language, this effort involves technical challenges rarely seen in other languages. We are looking for another team member who is experienced, creative, can work effectively independently as well as in a group, and is willing to expand the state of the art while delivering value to tens of thousands of customers. For the reasons that Paul describes below, "technical challenge" is an understatement. I prefer the Japanese phrase - chotto muri desu-ka > "it's slightly impossible" Paul T Paul Kienzle wrote: > > If that looks easy, consider that x may be real or complex or > a sparse matrix or a galois field or a indefinite precision > number, ... And consider that the size of x may be less than > n. And consider that x may be a function (matlab syntax does > not distinguish between functions and array indexing), which > returns a different value type for each i. You don't know until > you call the function what the type of x is. > ------------------------------------------------------------- Octave is freely available under the terms of the GNU GPL. Octave's home on the web: http://www.octave.org How to fund new projects: http://www.octave.org/funding.html Subscription information: http://www.octave.org/archive.html ------------------------------------------------------------- |
I have been asking around about destructors and octave classes. I
couldn't get a satisafactory answer, so here are the questions for the list: (i) Why do the derived classes at the bottom of the inheritance tree, like Matrix or ColumnVector, not have destructors? (ii) The following compiles and doesn't obviously cause any runtime problems: Matrix x( irow, icol ); ...... delete [] &x ( 0 , 0 ); Is this accomplishing deallocation of the space allocated to x, or does it cause memory leakage? If these seem trivial to you, please excuse me, I am still something of a C++ tyro. Paul T ------------------------------------------------------------- Octave is freely available under the terms of the GNU GPL. Octave's home on the web: http://www.octave.org How to fund new projects: http://www.octave.org/funding.html Subscription information: http://www.octave.org/archive.html ------------------------------------------------------------- |
Powered by Nabble | Edit this page |