I was just wondering: how well is GNU Octave compiled for recent CPU architectures?
When I run the script below, my CPU stays at its lowest speed, 1.6 GHz.
'Turbo mode' should kick in and the CPU should run at 3.8 GHz, but it doesn't.
When I load the CPU with another program at the same time, the script runs twice as fast.
Is there a solution for this problem?
Linux kernel: 3.5.0-17
GNU Octave: 3.6.4-rc0
% Note: 'runs' (number of repetitions) and 'times' come from the surrounding benchmark script.
cumulate = 0; p = 0; vt = 0; vr = 0; vrt = 0; rvt = 0; RV = 0; j = 0; k = 0;
x2 = 0; R = 0; Rxx = 0; Ryy = 0; Rxy = 0; Ryx = 0; Rvmax = 0; f = 0;
for i = 1:runs
  x = abs(randn(100,100));
  tic; % start the timer for this run
  % Calculation of Escoufier's equivalent vectors
  p = size(x, 2);
  vt = [1:p]; % Variables to test
  vr = []; % Result: ordered variables
  RV = [1:p]; % Result: correlations
  for j = 1:p % loop on the variable number
    Rvmax = 0;
    for k = 1:(p-j+1) % loop on the variables
      if j == 1
        x2 = [x, x(:, vt(k))];
      else
        x2 = [x, x(:, vr), x(:, vt(k))]; % New table to test
      end
      R = corr(x2); % Correlations table
      Ryy = R(1:p, 1:p);
      Rxx = R(p+1:p+j, p+1:p+j);
      Rxy = R(p+1:p+j, 1:p);
      Ryx = Rxy';
      rvt = trace(Ryx*Rxy)/((trace(Ryy^2)*trace(Rxx^2))^0.5); % RV calculation
      if rvt > Rvmax
        Rvmax = rvt; % test of RV
        vrt(j) = vt(k); % temporary held variable
      end
    end
    vr(j) = vrt(j); % Result: variable
    RV(j) = Rvmax; % Result: correlation
    f = find(vt~=vr(j)); % identify the held variable
    vt = vt(f); % reidentify variables to test
  end
  timing = toc;
  cumulate = cumulate + timing;
end
times(5, 3) = timing;
disp(['Escoufier''s method on a 100x100 matrix (mixed)________ (sec): ' num2str(timing)])
clear x; clear p; clear vt; clear vr; clear vrt; clear rvt; clear RV; clear j; clear k;
clear x2; clear R; clear Rxx; clear Ryy; clear Rxy; clear Ryx; clear Rvmax; clear f;
----- Original Message -----
> From: Graaf_van_Vlaanderen <[hidden email]>
> To: [hidden email]
> Cc:
> Sent: Saturday, October 20, 2012 11:08 AM
> Subject: Sandy Bridge
> Is there a solution for this problem?
If and when your CPU steps to a different speed has absolutely nothing
to do with how a program is compiled, it solely depends on the settings
of the CPU governor in your operating system.
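To check which governor is active and what clock speed the kernel currently reports, the cpufreq files under sysfs can be read directly. The following is only a minimal sketch, assuming the standard Linux cpufreq sysfs layout (/sys/devices/system/cpu/cpuN/cpufreq); the function name read_cpufreq is made up for this example, and the files may be missing in virtual machines or when no cpufreq driver is loaded.

```python
from pathlib import Path

# Standard Linux cpufreq sysfs location (assumption: a cpufreq driver is loaded).
CPUFREQ_ROOT = Path("/sys/devices/system/cpu")


def read_cpufreq(cpu=0, root=CPUFREQ_ROOT):
    """Return governor and frequencies in MHz for one CPU.

    Returns None if the cpufreq directory for that CPU does not exist.
    Individual entries are None if the corresponding file is missing.
    """
    base = Path(root) / f"cpu{cpu}" / "cpufreq"
    if not base.is_dir():
        return None

    def read(name):
        path = base / name
        return path.read_text().strip() if path.is_file() else None

    def mhz(name):
        value = read(name)
        # sysfs reports frequencies in kHz
        return int(value) / 1000 if value is not None else None

    return {
        "governor": read("scaling_governor"),
        "current_mhz": mhz("scaling_cur_freq"),
        "min_mhz": mhz("scaling_min_freq"),
        "max_mhz": mhz("scaling_max_freq"),
    }


if __name__ == "__main__":
    info = read_cpufreq(0)
    print(info if info is not None else "cpufreq interface not available")
```

Running this once while the Octave script is busy and once while the machine is idle shows whether the reported frequency changes at all. How turbo frequencies appear in these files depends on the cpufreq driver in use, so a value that never exceeds the nominal maximum is not by itself proof that turbo is inactive.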
Help-octave mailing list
[hidden email] https://mailman.cae.wisc.edu/listinfo/help-octave
> From: Martin Helm <[hidden email]>
> To: Graaf_van_Vlaanderen <[hidden email]>; "[hidden email]" <[hidden email]>
> Sent: Monday, October 22, 2012 2:05 AM
> Subject: Re: Sandy Bridge
> If and when your CPU steps to a different speed has absolutely nothing
> to do with how a program is compiled, it solely depends on the settings
> of the CPU governor in your operating system.
I am not sure that the situation is that straightforward.
For example, it is known that RAM accesses are much slower than cache accesses. If the CPU hardware sees too many RAM accesses, it may not raise the clock speed, since the CPU would still be waiting for data from RAM.
If, on the other hand, there are a lot of cache accesses, it makes sense for the CPU to run faster.
The balance between cache and RAM accesses may depend on compiler optimizations, since compilers are often aware of cache friendliness.
And ATLAS, specifically, is built with cache friendliness in mind.
Thanks already for the feedback.
When I run the script on older Intel CPUs, like a Q6600 or E5400, the system steps up to the maximum CPU speed. So yes, there is probably something missing in the Linux kernel for this.
Regarding compilation, the GNU compiler has some tweaks available for Sandy Bridge and the more recent Ivy Bridge.