Critical matrix sparsity

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
6 messages Options
Reply | Threaded
Open this post in threaded view
|

Critical matrix sparsity

dariodematties
Hello People,

If I have a two-dimensional matrix with double elements.
Is there a sparsity threshold which let me know if it is convenient to save such
matrix as sparse or as dense?
Is there any general formula to compute this decision?

Thanks
Reply | Threaded
Open this post in threaded view
|

Re: Critical matrix sparsity

siko1056
dariodematties wrote
Hello People,

If I have a two-dimensional matrix with double elements.
Is there a sparsity threshold which let me know if it is convenient to save such
matrix as sparse or as dense?
Is there any general formula to compute this decision?

Thanks
For me there is no "golden rule" to use sparse matrices. If you know your matrices are really sparse you definitely should work with them. Otherwise I would consider the following things:

1. Do I face memory limitations?
a) A dense m*n-matrix requires about  m*n*sizeof(double) Bytes of memory [1]
b) A sparse m*n* matrix about [(n+1) + nnz(matrix)]*sizeof(int) + nnz(matrix)*sizeof(double) Bytes of memory [2,3]

=> So the number of nonzero matrix entries should at least be below 50%. If your matrix is really sparse, then the number of columns n should be smaller than the number of rows m (transpose if not). Otherwise you waste space for storing zero columns.

2. Do I often extract parts of the matrix or insert elements?
Randomly extracting or adding parts of a sparse matrix might be a real performance killer! If you only intent to call system functions on the sparse matrices (like factorizations or solving systems of equations, which are optimized for those data structures) you'll face the desired effect. Otherwise you wisely have to profile your code to detect operations that slow down the calculation by improper memory access.

HTH,
Kai

[1]: sizeof(double) usually 8 Byte
[2]: nnz(matrix) is the number of nonzero matrix elements, sizeof(int) usually 4 or 8 Byte
[3]: Read more at https://arxiv.org/abs/cs/0604006
Reply | Threaded
Open this post in threaded view
|

Re: Critical matrix sparsity

dariodematties
Thanks siko1056,

Up to now, my concern is about taking the decision of whether to save the matrix in sparse or dense format (in hard disk).
I am working with dense matrices in my code.
Yet, ram memory shortages can come soon, then I think all your advises are very handy.

Thank you very much,

Dario
Reply | Threaded
Open this post in threaded view
|

Re: Critical matrix sparsity

NJank
In reply to this post by siko1056
On Jun 8, 2017 5:52 PM, "siko1056" <[hidden email]> wrote:
dariodematties wrote

 If
your matrix is really sparse, then the number of columns n should be smaller
than the number of rows m (transpose if not). Otherwise you waste space for
storing zero columns.

Maybe a naive side question, but wouldn't it make sense to have the transposition automatic and transparent as part of the storage class? A "transposed?" flag and the check would be the only overhead.

_______________________________________________
Help-octave mailing list
[hidden email]
https://lists.gnu.org/mailman/listinfo/help-octave
Reply | Threaded
Open this post in threaded view
|

Re: Critical matrix sparsity

siko1056
NJank wrote
siko1056 wrote
If your matrix is really sparse, then the number of columns n should be smaller
than the number of rows m (transpose if not). Otherwise you waste space for
storing zero columns.
Maybe a naive side question, but wouldn't it make sense to have the
transposition automatic and transparent as part of the storage class? A
"transposed?" flag and the check would be the only overhead.
I don't think it is a naive question, I also thought about it. In my opinion the programmer should control the sparse matrix orientation, especially in regard of my second assertion:

siko1056 wrote
2. Do I often extract parts of the matrix or insert elements?
Randomly extracting or adding parts of a sparse matrix might be a real performance killer!
If I construct a sparse matrix and manipulate it occasionally, and despite this inefficiency, Octave transposes back and forth (because the number of columns is temporarily below a threshold value) I have a serious not controllable performance breakdown. So Octave should remain predictive. To mention this nuance came up to my mind, as I work with matrices like this one, where it saves about 2/3 space to choose the "right" orientation:

>> typeinfo (A)
ans = sparse matrix
>> size (A)
ans = 2,569,260      7,230
>> nnz(A)
ans =  855,800
>> nnz(A) / prod (size (A)) % less than 1% filled
ans =    4.6071e-05
>> whos A
Variables in the current scope:

   Attr Name         Size                     Bytes  Class
   ==== ====         ====                     =====  =====
        A      2,569,260x7,230                13,834,168  double

Total is 18,575,749,800 elements using 13,834,168 bytes

>> A = A'; size (A)
ans = 7230   2569260

>> whos A
Variables in the current scope:

   Attr Name        Size                     Bytes  Class
   ==== ====        ====                     =====  =====
        A        7,230x2,569,260             34,246,888  double

Total is 18,575,749,800 elements using 34,246,888 bytes

>> spy(A) % as attachment



Best,
Kai
Reply | Threaded
Open this post in threaded view
|

Re: Critical matrix sparsity

NJank
On Jun 9, 2017 3:05 AM, "siko1056" <[hidden email]> wrote:
NJank wrote
>
> siko1056 wrote
>> If your matrix is really sparse, then the number of columns n should be
>> smaller
>> than the number of rows m (transpose if not). Otherwise you waste space
>> for
>> storing zero columns.
> Maybe a naive side question, but wouldn't it make sense to have the
> transposition automatic and transparent as part of the storage class? A
> "transposed?" flag and the check would be the only overhead.

I don't think it is a naive question, I also thought about it. In my opinion
the programmer should control the sparse matrix orientation, especially in
regard of my second assertion:


siko1056 wrote
> 2. Do I often extract parts of the matrix or insert elements?
> Randomly extracting or adding parts of a sparse matrix might be a real
> performance killer!

If I construct a sparse matrix and manipulate it occasionally, and despite
this inefficiency, Octave transposes back and forth (because the number of
columns is temporarily below a threshold value) I have a serious not
controllable performance breakdown. So Octave should remain predictive. To
mention this nuance came up to my mind, as I work with matrices like this
one, where it saves about 2/3 space to choose the "right" orientation:

>> typeinfo (A)
ans = sparse matrix
>> size (A)
ans = 2,569,260      7,230
>> nnz(A)
ans =  855,800
>> nnz(A) / prod (size (A)) % less than 1% filled
ans =    4.6071e-05
>> whos A
Variables in the current scope:

   Attr Name         Size                     Bytes  Class
   ==== ====         ====                     =====  =====
        A      2,569,260x7,230                13,834,168  double

Total is 18,575,749,800 elements using 13,834,168 bytes

>> A = A'; size (A)
ans = 7230   2569260

>> whos A
Variables in the current scope:

   Attr Name        Size                     Bytes  Class
   ==== ====        ====                     =====  =====
        A        7,230x2,569,260             34,246,888  double

Total is 18,575,749,800 elements using 34,246,888 bytes

>> spy(A) % as attachment

<http://octave.1599824.n4.nabble.com/file/n4683608/A.png>

Best,
Kai

Perhaps it would be a decision on sparse matrix creation only to avoid the continued reflipping possibility.

_______________________________________________
Help-octave mailing list
[hidden email]
https://lists.gnu.org/mailman/listinfo/help-octave