eig on several matrices in a vectorized manner

classic Classic list List threaded Threaded
5 messages Options
Reply | Threaded
Open this post in threaded view
|

eig on several matrices in a vectorized manner

niconeuman
Greetings,
this is my first post in this mailing-list. I used Matlab for many years but
since 2018 I switched to Octave, and plan to keep using it in my research.
I have to calculate eigenvalues of multiple matrices which have all the same
size (2x2,4x4,8x8, etc) and are Hermitian, repeatedly.
Running this on a loop seems to me to be inefficient. I have tried to do a
small benchmark test in which I compare looping, changing the size of the
matrices and the number of matrices; against generating a supermatrix by
using blkdiag, and taking the eigenvalues all at once.
The blkdiag approach is only faster when the size of the matrices are 2x2,
but for larger systems it becomes quite slower than looping. I imagined this
has to do with the use of memory by the large block supermatrices. So I
tried to convert the individual matrices to sparse before making the block
supermatrix. This has not improved the timings for the block approach.
Perhaps I'm doing something wrong, but in any case, is there some way of
broadcasting the eig function.
I imagine I can write the Cholesky algorithm so that it acts on the elements
of multiple matrices at the same time, and taylor that to my problem. But if
there is some solution already implemented I think it would be better.
Thank you very much!
Best regards,
Nicolás Neuman



--
Sent from: http://octave.1599824.n4.nabble.com/Octave-General-f1599825.html


Reply | Threaded
Open this post in threaded view
|

Re: eig on several matrices in a vectorized manner

Juan Pablo Carbajal-2
Hi,

It would be useful to have a simple code to test your results.

Also, how many matrices do you need to process? for a few hundrerds,
looping is not the source of cost, it is just the cost of eig itself.
I do not see much point on vectorizing eig (maybe I can be convinced
otehrwise) so I think the options for speed up here are:

1. paralelization
2. sending the loop a level bellow e.g. cellfun (do not expect huge
improvements)

Alternatively, (and risky) if your matrix set is homogenous in size
(ie.e. all matrice have the same size), try to reduce the set into a
basis of matrices and try exploit Horn's conjecture[1].

[1]: https://terrytao.wordpress.com/tag/schur-horn-inequalities/
On Fri, Oct 19, 2018 at 5:02 PM niconeuman <[hidden email]> wrote:

>
> Greetings,
> this is my first post in this mailing-list. I used Matlab for many years but
> since 2018 I switched to Octave, and plan to keep using it in my research.
> I have to calculate eigenvalues of multiple matrices which have all the same
> size (2x2,4x4,8x8, etc) and are Hermitian, repeatedly.
> Running this on a loop seems to me to be inefficient. I have tried to do a
> small benchmark test in which I compare looping, changing the size of the
> matrices and the number of matrices; against generating a supermatrix by
> using blkdiag, and taking the eigenvalues all at once.
> The blkdiag approach is only faster when the size of the matrices are 2x2,
> but for larger systems it becomes quite slower than looping. I imagined this
> has to do with the use of memory by the large block supermatrices. So I
> tried to convert the individual matrices to sparse before making the block
> supermatrix. This has not improved the timings for the block approach.
> Perhaps I'm doing something wrong, but in any case, is there some way of
> broadcasting the eig function.
> I imagine I can write the Cholesky algorithm so that it acts on the elements
> of multiple matrices at the same time, and taylor that to my problem. But if
> there is some solution already implemented I think it would be better.
> Thank you very much!
> Best regards,
> Nicolás Neuman
>
>
>
> --
> Sent from: http://octave.1599824.n4.nabble.com/Octave-General-f1599825.html
>
>


Reply | Threaded
Open this post in threaded view
|

Re: eig on several matrices in a vectorized manner

niconeuman
Hi, thank you for the prompt response!
The problem I want to apply this for is simulation of magnetic resonance spectra. Basically the matrices are functions of a magnetic field B, and I have to find the values B0 for which energy (eigenvalue) differences in the energy matrix match a certain value (microwave frequency). Finding of the values B0 is done adaptively, by dividing the interval [Bmin,Bmax] in halfs, and finding by approximation the solutions. My algorithm is probably very far from efficient at this point, but I estimate I need to diagonalize tens of matrices per cycle of the adaptive scheme.
It is highly possible that the actual diagonalization is not the time consuming part, but then the short loops I run to find the eigenvalue differences are what is slowing everything down. I should rethink the whole thing.
Anyway, below I send the code for the small benchmark I made. Perhaps there is a much faster way of doing this.
Thank you very much!
Best,
Nicolas

%begin

clear all, close all
more off

disp(['Size of each matrix: ',' Number of matrices: ',' Loop to block ratio: ']);
for pMatSize = [2,4,8,16,32,64,128]
    for nMat = 2:20
        SizeMatrices = pMatSize;
        Nmatrices = nMat;
        eigA = zeros(SizeMatrices,Nmatrices);
        timeLoop = zeros(Nmatrices,1);

        augA = cell(Nmatrices,1);
        for k = 1:Nmatrices       
            tmpA = rand(SizeMatrices,SizeMatrices);
            augA{k} = tmpA'*tmpA;
        end
       
        tic
        for k = 1:Nmatrices   
            eigA(:,k) = eig(augA{k});  
        end
        timeLoop = toc;
               
        augAmat = blkdiag(augA{:});
        tic
        eigaugA = eig(augAmat);
        timeBlock = toc;
       
        disp([num2str(SizeMatrices) '        ' num2str(Nmatrices) '        ' num2str(timeLoop/timeBlock)]);
    end
end


%end



On Sun, Oct 21, 2018 at 11:06 PM Juan Pablo Carbajal <[hidden email]> wrote:
Hi,

It would be useful to have a simple code to test your results.

Also, how many matrices do you need to process? for a few hundrerds,
looping is not the source of cost, it is just the cost of eig itself.
I do not see much point on vectorizing eig (maybe I can be convinced
otehrwise) so I think the options for speed up here are:

1. paralelization
2. sending the loop a level bellow e.g. cellfun (do not expect huge
improvements)

Alternatively, (and risky) if your matrix set is homogenous in size
(ie.e. all matrice have the same size), try to reduce the set into a
basis of matrices and try exploit Horn's conjecture[1].

[1]: https://terrytao.wordpress.com/tag/schur-horn-inequalities/
On Fri, Oct 19, 2018 at 5:02 PM niconeuman <[hidden email]> wrote:
>
> Greetings,
> this is my first post in this mailing-list. I used Matlab for many years but
> since 2018 I switched to Octave, and plan to keep using it in my research.
> I have to calculate eigenvalues of multiple matrices which have all the same
> size (2x2,4x4,8x8, etc) and are Hermitian, repeatedly.
> Running this on a loop seems to me to be inefficient. I have tried to do a
> small benchmark test in which I compare looping, changing the size of the
> matrices and the number of matrices; against generating a supermatrix by
> using blkdiag, and taking the eigenvalues all at once.
> The blkdiag approach is only faster when the size of the matrices are 2x2,
> but for larger systems it becomes quite slower than looping. I imagined this
> has to do with the use of memory by the large block supermatrices. So I
> tried to convert the individual matrices to sparse before making the block
> supermatrix. This has not improved the timings for the block approach.
> Perhaps I'm doing something wrong, but in any case, is there some way of
> broadcasting the eig function.
> I imagine I can write the Cholesky algorithm so that it acts on the elements
> of multiple matrices at the same time, and taylor that to my problem. But if
> there is some solution already implemented I think it would be better.
> Thank you very much!
> Best regards,
> Nicolás Neuman
>
>
>
> --
> Sent from: http://octave.1599824.n4.nabble.com/Octave-General-f1599825.html
>
>


--
Dr. Nicolas Neuman

Institut für Chemie und Biochemie, Anorganische Chemie
Freie Universität Berlin,
Fabeckstraße 34–36, 14195, Berlin (Germany)

Instituto de Desarrollo Tecnológico para la Industria Química - INTEC
UNL-CONICET
CCT-CONICET Santa Fe
S3000ZAA Santa Fe - Santa Fe - Argentina

Jefe de Trabajos Prácticos
Departamento de Física
Fac. de Bioquímica y Ciencias Biológicas
Universidad Nacional del Litoral
Ciudad Universitaria - Paraje El Pozo
S3000ZAA Santa Fe - Santa Fe - Argentina


Reply | Threaded
Open this post in threaded view
|

Re: eig on several matrices in a vectorized manner

Juan Pablo Carbajal-2
Hi,

Here is (at least I think so) a simplified version of your example

N = 500;
sz = 16;
printf ('Run test for %d matrices of size %d\n', N, sz);

for k = 1:N
 M{k} = rand (sz); M{k} = M{k}.' * M{k};
endfor

printf ('** Interpreter loop\n');
tic
for k = 1:N
  eig (M{k});
endfor
toc

printf ('** C++ loop\n');
tic
cellfun (@eig, M, 'UniformOutput', false);
toc

printf ('** Blk matrix\n');
printf ('  - Assembly\n');
tic
MM = blkdiag (cellfun (@sparse, M, 'UniformOutput', false){:});
toc
printf ('  - eig\n');
tic
eig (MM);
toc

printf ('** Parallel loop\n');
pkg load parallel
tic
parcellfun (nproc (), @eig, M, 'UniformOutput', false);
toc


Run test for 500 matrices of size 16
** Interpreter loop
Elapsed time is 0.0342941 seconds.
** C++ loop
Elapsed time is 0.0255151 seconds.
** Blk matrix
  - Assembly
Elapsed time is 0.105091 seconds.
  - eig
Elapsed time is 57.415 seconds.
** Parallel loop
parcellfun: 500/500 jobs done
Elapsed time is 0.783254 seconds.

The timing results show that the block matrix is slower and also has
memory issues (for moderate matrix sizes (not FEM!!) will need batch
blk matrices).
The other solutions are comparable, the parallel one getting better
with the length of the loop.

What is the costly part? the length of the loop (number of matrices)
or the call to eig (size of the matrices)

If it is the length of the loop then parallelization and low level
loops are your answer (Octave doesn't get much faster than that
regarding loops).
If it is the call to eig, then: can you use eigs? (i.e. how many and
which eigenvalues do you care about?). Can you compress you matrix set
and then call eig in only a few generating matrices?

On Mon, Oct 22, 2018 at 8:57 AM nicolas neuman <[hidden email]> wrote:

>
> Hi, thank you for the prompt response!
> The problem I want to apply this for is simulation of magnetic resonance spectra. Basically the matrices are functions of a magnetic field B, and I have to find the values B0 for which energy (eigenvalue) differences in the energy matrix match a certain value (microwave frequency). Finding of the values B0 is done adaptively, by dividing the interval [Bmin,Bmax] in halfs, and finding by approximation the solutions. My algorithm is probably very far from efficient at this point, but I estimate I need to diagonalize tens of matrices per cycle of the adaptive scheme.
> It is highly possible that the actual diagonalization is not the time consuming part, but then the short loops I run to find the eigenvalue differences are what is slowing everything down. I should rethink the whole thing.
> Anyway, below I send the code for the small benchmark I made. Perhaps there is a much faster way of doing this.
> Thank you very much!
> Best,
> Nicolas
>
> %begin
>
> clear all, close all
> more off
>
> disp(['Size of each matrix: ',' Number of matrices: ',' Loop to block ratio: ']);
> for pMatSize = [2,4,8,16,32,64,128]
>     for nMat = 2:20
>         SizeMatrices = pMatSize;
>         Nmatrices = nMat;
>         eigA = zeros(SizeMatrices,Nmatrices);
>         timeLoop = zeros(Nmatrices,1);
>
>         augA = cell(Nmatrices,1);
>         for k = 1:Nmatrices
>             tmpA = rand(SizeMatrices,SizeMatrices);
>             augA{k} = tmpA'*tmpA;
>         end
>
>         tic
>         for k = 1:Nmatrices
>             eigA(:,k) = eig(augA{k});
>         end
>         timeLoop = toc;
>
>         augAmat = blkdiag(augA{:});
>         tic
>         eigaugA = eig(augAmat);
>         timeBlock = toc;
>
>         disp([num2str(SizeMatrices) '        ' num2str(Nmatrices) '        ' num2str(timeLoop/timeBlock)]);
>     end
> end
>
>
> %end
>
>
>
> On Sun, Oct 21, 2018 at 11:06 PM Juan Pablo Carbajal <[hidden email]> wrote:
>>
>> Hi,
>>
>> It would be useful to have a simple code to test your results.
>>
>> Also, how many matrices do you need to process? for a few hundrerds,
>> looping is not the source of cost, it is just the cost of eig itself.
>> I do not see much point on vectorizing eig (maybe I can be convinced
>> otehrwise) so I think the options for speed up here are:
>>
>> 1. paralelization
>> 2. sending the loop a level bellow e.g. cellfun (do not expect huge
>> improvements)
>>
>> Alternatively, (and risky) if your matrix set is homogenous in size
>> (ie.e. all matrice have the same size), try to reduce the set into a
>> basis of matrices and try exploit Horn's conjecture[1].
>>
>> [1]: https://terrytao.wordpress.com/tag/schur-horn-inequalities/
>> On Fri, Oct 19, 2018 at 5:02 PM niconeuman <[hidden email]> wrote:
>> >
>> > Greetings,
>> > this is my first post in this mailing-list. I used Matlab for many years but
>> > since 2018 I switched to Octave, and plan to keep using it in my research.
>> > I have to calculate eigenvalues of multiple matrices which have all the same
>> > size (2x2,4x4,8x8, etc) and are Hermitian, repeatedly.
>> > Running this on a loop seems to me to be inefficient. I have tried to do a
>> > small benchmark test in which I compare looping, changing the size of the
>> > matrices and the number of matrices; against generating a supermatrix by
>> > using blkdiag, and taking the eigenvalues all at once.
>> > The blkdiag approach is only faster when the size of the matrices are 2x2,
>> > but for larger systems it becomes quite slower than looping. I imagined this
>> > has to do with the use of memory by the large block supermatrices. So I
>> > tried to convert the individual matrices to sparse before making the block
>> > supermatrix. This has not improved the timings for the block approach.
>> > Perhaps I'm doing something wrong, but in any case, is there some way of
>> > broadcasting the eig function.
>> > I imagine I can write the Cholesky algorithm so that it acts on the elements
>> > of multiple matrices at the same time, and taylor that to my problem. But if
>> > there is some solution already implemented I think it would be better.
>> > Thank you very much!
>> > Best regards,
>> > Nicolás Neuman
>> >
>> >
>> >
>> > --
>> > Sent from: http://octave.1599824.n4.nabble.com/Octave-General-f1599825.html
>> >
>> >
>
>
>
> --
> Dr. Nicolas Neuman
>
> Institut für Chemie und Biochemie, Anorganische Chemie
> Freie Universität Berlin,
> Fabeckstraße 34–36, 14195, Berlin (Germany)
>
> Instituto de Desarrollo Tecnológico para la Industria Química - INTEC
> UNL-CONICET
> CCT-CONICET Santa Fe
> S3000ZAA Santa Fe - Santa Fe - Argentina
>
> Jefe de Trabajos Prácticos
> Departamento de Física
> Fac. de Bioquímica y Ciencias Biológicas
> Universidad Nacional del Litoral
> Ciudad Universitaria - Paraje El Pozo
> S3000ZAA Santa Fe - Santa Fe - Argentina


Reply | Threaded
Open this post in threaded view
|

Re: eig on several matrices in a vectorized manner

niconeuman
Thank you very much for your response!
I have been busy this week with experimental work and I couldn't answer before. Your suggestions seem really useful, I had seen the cellfun function before but I had no idea of its capabilities; will explore it on my code.
I probably need to run the eig on a loop, collect matrices containing the eigenvalues of all the individual matrices, and then do the rest of the operations in the function, which involve find() and vector-matrix multiplications, in a vectorized manner.
Thank you once again!

On Wed, Oct 24, 2018 at 12:56 AM Juan Pablo Carbajal <[hidden email]> wrote:
Hi,

Here is (at least I think so) a simplified version of your example

N = 500;
sz = 16;
printf ('Run test for %d matrices of size %d\n', N, sz);

for k = 1:N
 M{k} = rand (sz); M{k} = M{k}.' * M{k};
endfor

printf ('** Interpreter loop\n');
tic
for k = 1:N
  eig (M{k});
endfor
toc

printf ('** C++ loop\n');
tic
cellfun (@eig, M, 'UniformOutput', false);
toc

printf ('** Blk matrix\n');
printf ('  - Assembly\n');
tic
MM = blkdiag (cellfun (@sparse, M, 'UniformOutput', false){:});
toc
printf ('  - eig\n');
tic
eig (MM);
toc

printf ('** Parallel loop\n');
pkg load parallel
tic
parcellfun (nproc (), @eig, M, 'UniformOutput', false);
toc


Run test for 500 matrices of size 16
** Interpreter loop
Elapsed time is 0.0342941 seconds.
** C++ loop
Elapsed time is 0.0255151 seconds.
** Blk matrix
  - Assembly
Elapsed time is 0.105091 seconds.
  - eig
Elapsed time is 57.415 seconds.
** Parallel loop
parcellfun: 500/500 jobs done
Elapsed time is 0.783254 seconds.

The timing results show that the block matrix is slower and also has
memory issues (for moderate matrix sizes (not FEM!!) will need batch
blk matrices).
The other solutions are comparable, the parallel one getting better
with the length of the loop.

What is the costly part? the length of the loop (number of matrices)
or the call to eig (size of the matrices)

If it is the length of the loop then parallelization and low level
loops are your answer (Octave doesn't get much faster than that
regarding loops).
If it is the call to eig, then: can you use eigs? (i.e. how many and
which eigenvalues do you care about?). Can you compress you matrix set
and then call eig in only a few generating matrices?

On Mon, Oct 22, 2018 at 8:57 AM nicolas neuman <[hidden email]> wrote:
>
> Hi, thank you for the prompt response!
> The problem I want to apply this for is simulation of magnetic resonance spectra. Basically the matrices are functions of a magnetic field B, and I have to find the values B0 for which energy (eigenvalue) differences in the energy matrix match a certain value (microwave frequency). Finding of the values B0 is done adaptively, by dividing the interval [Bmin,Bmax] in halfs, and finding by approximation the solutions. My algorithm is probably very far from efficient at this point, but I estimate I need to diagonalize tens of matrices per cycle of the adaptive scheme.
> It is highly possible that the actual diagonalization is not the time consuming part, but then the short loops I run to find the eigenvalue differences are what is slowing everything down. I should rethink the whole thing.
> Anyway, below I send the code for the small benchmark I made. Perhaps there is a much faster way of doing this.
> Thank you very much!
> Best,
> Nicolas
>
> %begin
>
> clear all, close all
> more off
>
> disp(['Size of each matrix: ',' Number of matrices: ',' Loop to block ratio: ']);
> for pMatSize = [2,4,8,16,32,64,128]
>     for nMat = 2:20
>         SizeMatrices = pMatSize;
>         Nmatrices = nMat;
>         eigA = zeros(SizeMatrices,Nmatrices);
>         timeLoop = zeros(Nmatrices,1);
>
>         augA = cell(Nmatrices,1);
>         for k = 1:Nmatrices
>             tmpA = rand(SizeMatrices,SizeMatrices);
>             augA{k} = tmpA'*tmpA;
>         end
>
>         tic
>         for k = 1:Nmatrices
>             eigA(:,k) = eig(augA{k});
>         end
>         timeLoop = toc;
>
>         augAmat = blkdiag(augA{:});
>         tic
>         eigaugA = eig(augAmat);
>         timeBlock = toc;
>
>         disp([num2str(SizeMatrices) '        ' num2str(Nmatrices) '        ' num2str(timeLoop/timeBlock)]);
>     end
> end
>
>
> %end
>
>
>
> On Sun, Oct 21, 2018 at 11:06 PM Juan Pablo Carbajal <[hidden email]> wrote:
>>
>> Hi,
>>
>> It would be useful to have a simple code to test your results.
>>
>> Also, how many matrices do you need to process? for a few hundrerds,
>> looping is not the source of cost, it is just the cost of eig itself.
>> I do not see much point on vectorizing eig (maybe I can be convinced
>> otehrwise) so I think the options for speed up here are:
>>
>> 1. paralelization
>> 2. sending the loop a level bellow e.g. cellfun (do not expect huge
>> improvements)
>>
>> Alternatively, (and risky) if your matrix set is homogenous in size
>> (ie.e. all matrice have the same size), try to reduce the set into a
>> basis of matrices and try exploit Horn's conjecture[1].
>>
>> [1]: https://terrytao.wordpress.com/tag/schur-horn-inequalities/
>> On Fri, Oct 19, 2018 at 5:02 PM niconeuman <[hidden email]> wrote:
>> >
>> > Greetings,
>> > this is my first post in this mailing-list. I used Matlab for many years but
>> > since 2018 I switched to Octave, and plan to keep using it in my research.
>> > I have to calculate eigenvalues of multiple matrices which have all the same
>> > size (2x2,4x4,8x8, etc) and are Hermitian, repeatedly.
>> > Running this on a loop seems to me to be inefficient. I have tried to do a
>> > small benchmark test in which I compare looping, changing the size of the
>> > matrices and the number of matrices; against generating a supermatrix by
>> > using blkdiag, and taking the eigenvalues all at once.
>> > The blkdiag approach is only faster when the size of the matrices are 2x2,
>> > but for larger systems it becomes quite slower than looping. I imagined this
>> > has to do with the use of memory by the large block supermatrices. So I
>> > tried to convert the individual matrices to sparse before making the block
>> > supermatrix. This has not improved the timings for the block approach.
>> > Perhaps I'm doing something wrong, but in any case, is there some way of
>> > broadcasting the eig function.
>> > I imagine I can write the Cholesky algorithm so that it acts on the elements
>> > of multiple matrices at the same time, and taylor that to my problem. But if
>> > there is some solution already implemented I think it would be better.
>> > Thank you very much!
>> > Best regards,
>> > Nicolás Neuman
>> >
>> >
>> >
>> > --
>> > Sent from: http://octave.1599824.n4.nabble.com/Octave-General-f1599825.html
>> >
>> >
>
>
>
> --
> Dr. Nicolas Neuman
>
> Institut für Chemie und Biochemie, Anorganische Chemie
> Freie Universität Berlin,
> Fabeckstraße 34–36, 14195, Berlin (Germany)
>
> Instituto de Desarrollo Tecnológico para la Industria Química - INTEC
> UNL-CONICET
> CCT-CONICET Santa Fe
> S3000ZAA Santa Fe - Santa Fe - Argentina
>
> Jefe de Trabajos Prácticos
> Departamento de Física
> Fac. de Bioquímica y Ciencias Biológicas
> Universidad Nacional del Litoral
> Ciudad Universitaria - Paraje El Pozo
> S3000ZAA Santa Fe - Santa Fe - Argentina


--
Dr. Nicolas Neuman

Institut für Chemie und Biochemie, Anorganische Chemie
Freie Universität Berlin,
Fabeckstraße 34–36, 14195, Berlin (Germany)

Instituto de Desarrollo Tecnológico para la Industria Química - INTEC
UNL-CONICET
CCT-CONICET Santa Fe
S3000ZAA Santa Fe - Santa Fe - Argentina

Jefe de Trabajos Prácticos
Departamento de Física
Fac. de Bioquímica y Ciencias Biológicas
Universidad Nacional del Litoral
Ciudad Universitaria - Paraje El Pozo
S3000ZAA Santa Fe - Santa Fe - Argentina