On 29 April 2013 06:25, Miroslaw Kwasniak <[hidden email]> wrote:
> it's something wrong whith sparse matrices A(n,n) when n is a multiple > of 65536=2^16. > > Demonstration code ====================================== > > for i=1:3; > for n=i*2^16+(1:1); > A=spdiags(ones(n,1),0,n,n); > t=trace(A); > printf("n=%8d trace=%8d %s\n",n,t,["ERR";"ok"]((t==n)+1,:)); > endfor; > endfor > > Results ====================================== > > n= 65535 trace= 65535 ok > n= 65536 trace= 0 ERR > n= 65537 trace= 65537 ok > n= 131071 trace= 131071 ok > n= 131072 trace= 0 ERR > n= 131073 trace= 131073 ok > n= 196607 trace= 196607 ok > n= 196608 trace= 0 ERR > n= 196609 trace= 196609 ok Confirmed. The problem is that the numel function is limited to returning octave_idx_type, which ordinarily of size 2^32, and certainly is so for Debian. This makes sense, since you can only index that many elements in a matrix. You're hitting the indexing limit. To get 64bit indexing, you would need to recompile all of Octave's Fortran dependencies with fdefaultinteger8. I'm not sure exactly what the bug is here. For instance, you can't index your matrix A either, and this is checked for correctly: A(end) Perhaps the best thing to do would be to forbid creation of sparse matrices where numel(A) > std::numeric_limits<int>::max(). Your matrix is simply too large to be indexed, and this breaks assumptions elsewhere in our code.  Jordi G. H. 
On Mon, Apr 29, 2013 at 6:10 AM, Jordi Gutiérrez Hermoso <[hidden email]> wrote: On 29 April 2013 06:25, Miroslaw Kwasniak <[hidden email]> wrote: I'm confused  this is a diagonal sparse matrix so you should be able to trace() (or any other op) up to n = 2^32, not n^2 = 2^32. The limit on sparse matrices should be number of nonzeros < 2^32
Ed Meyer 
On 29 April 2013 12:40, Ed Meyer <[hidden email]> wrote:
> > > On Mon, Apr 29, 2013 at 6:10 AM, Jordi Gutiérrez Hermoso > <[hidden email]> wrote: >> >> On 29 April 2013 06:25, Miroslaw Kwasniak <[hidden email]> >> wrote: >> > it's something wrong whith sparse matrices A(n,n) when n is a multiple >> > of 65536=2^16. >> > >> > Demonstration code ====================================== >> > >> > for i=1:3; >> > for n=i*2^16+(1:1); >> > A=spdiags(ones(n,1),0,n,n); >> > t=trace(A); >> > printf("n=%8d trace=%8d %s\n",n,t,["ERR";"ok"]((t==n)+1,:)); >> > endfor; >> > endfor >> > >> > Results ====================================== >> > >> > n= 65535 trace= 65535 ok >> > n= 65536 trace= 0 ERR >> > n= 65537 trace= 65537 ok >> > n= 131071 trace= 131071 ok >> > n= 131072 trace= 0 ERR >> > n= 131073 trace= 131073 ok >> > n= 196607 trace= 196607 ok >> > n= 196608 trace= 0 ERR >> > n= 196609 trace= 196609 ok >> >> Confirmed. The problem is that the numel function is limited to >> returning octave_idx_type, which ordinarily of size 2^32, and >> certainly is so for Debian. This makes sense, since you can only index >> that many elements in a matrix. You're hitting the indexing limit. To >> get 64bit indexing, you would need to recompile all of Octave's >> Fortran dependencies with fdefaultinteger8. >> >> I'm not sure exactly what the bug is here. For instance, you can't >> index your matrix A either, and this is checked for correctly: >> >> A(end) >> >> Perhaps the best thing to do would be to forbid creation of sparse >> matrices where numel(A) > std::numeric_limits<int>::max(). >> >> Your matrix is simply too large to be indexed, and this breaks >> assumptions elsewhere in our code. >> >>  Jordi G. H. > > > I'm confused  this is a diagonal sparse matrix so you should be able to > trace() (or any other op) > up to n = 2^32, not n^2 = 2^32. The limit on sparse matrices should be > number of nonzeros < 2^32 All matrices need to be linearly indexable, and of course, this is how they are actually stored in memory, as a single long array indexed by a single index. Thus, the total number of indexable elements of a matrix can't be larger than std::numeric_limits<octave_idx_type>::max(). There could be some tricks we could do to relax this requirement for sparse matrices, but it would require some pretty deep surgery of the current code.  Jordi G. H. 
On Mon, Apr 29, 2013 at 9:47 AM, Jordi Gutiérrez Hermoso <[hidden email]> wrote:
true for full matrices but sparse matrices are stored as three arrays and the nonzero and row index arrays are the only ones that need be limited. So you are saying that sparse matrices are treated as full in some places?
Ed Meyer 
On 29 April 2013 13:21, Ed Meyer <[hidden email]> wrote:
> > > On Mon, Apr 29, 2013 at 9:47 AM, Jordi Gutiérrez Hermoso > <[hidden email]> wrote: >> >> >> All matrices need to be linearly indexable, and of course, this is how >> they are actually stored in memory, as a single long array indexed by >> a single index. Thus, the total number of indexable elements of a >> matrix can't be larger than >> std::numeric_limits<octave_idx_type>::max(). >> >> There could be some tricks we could do to relax this requirement for >> sparse matrices, but it would require some pretty deep surgery of the >> current code. > true for full matrices but sparse matrices are stored as three arrays and > the nonzero > and row index arrays are the only ones that need be limited. So you are > saying that > sparse matrices are treated as full in some places? No, the issue is that *all* indices are limited to octave_idx_type. We would have to do some sort of deep surgery to introduce a special indexing type for sparse matrices only. It seems like a lot of work for potentially little benefit. And yes, sparse matrices can be indexed by a single index instead of two, like any other matrix. Internally in Octave's source, the assumption that a single index of octave_idx_type is available is used throughout.  Jordi G. H. 
On 29 April 2013 14:00, Jordi Gutiérrez Hermoso <[hidden email]> wrote:
> And yes, sparse matrices can be indexed by a single index instead of > two, like any other matrix. Internally in Octave's source, the > assumption that a single index of octave_idx_type is available is used > throughout. The specific case where this fails in this instance is octave_idx_type dim_vector::numel (), which is obtained simply by multiplying each of the dimensions of the matrix, even for sparse matrices (this is unlike nnz, so a workaround just for trace.m would be to use nnz instead of numel, but I think this would still leave some pretty broken sparse matrices lying around with other problems). We would have to change numel () to use some other type that can hold the result of a larger size, but this is a pretty fundamental function in Octave. The overall assumption is that you can linearly index up to numel (). Every place that calls this function would need to be checked to see what happens if we change its return type to be some special bigint.  Jordi G. H. 
On Mon, Apr 29, 2013 at 11:07 AM, Jordi Gutiérrez Hermoso <[hidden email]> wrote:
I'm not proposing using anything but octave_idx_type for indexing or changing the return type of numel()  I just question why numel() is used for sparse matrices. It should be irrelevant for anything but ccs2full().
Rather than restrict the size of sparse matrices I think it would make more sense to fix problems like this as they come up so that sparse storage is used as it was intended  to reduce storage & op count.
Ed Meyer 
On 29 April 2013 14:50, Ed Meyer <[hidden email]> wrote:
> I'm not proposing using anything but octave_idx_type for indexing or > changing the return type of numel()  I just question why numel() is > used for sparse matrices. It should be irrelevant for anything but > ccs2full(). The numel function is just one example where this sparse matrix with big dimensions broke. Sparse matrices this large can still break in other ways. Furthermore, what do you propose to do if numel is called for sparse matrices, despite your suggestion to not call it for sparse matrices? A lot of code out there already assumes that you can treat a sparse matrix like any other matrix. I don't think numel is to blame. Another way I can think of would be that A(idx) would also break if idx is greater than the maximum index size. We would have to introduce special rules to handle that for sparse matrices of large dimensions. The problem isn't the storage size, it's the index size, which is why for other matrices Octave's error message says out of memory or index too big. I really don't see a way around this other than introducing a special index type for sparse matrices, and I don't see this as hugely useful. It also looks like a lot of boring work. But if you insist on doing this, don't let me discourage you. If you can figure out a consistent behaviour that doesn't break current code, you write the patch, and all tests pass, I'll happily apply it.  Jordi G. H. 
On 04/29/2013 03:12 PM, Jordi Gutiérrez
Hermoso wrote:
Will 64bit Octave "automatically" fix this? Michael 
On 29 April 2013 16:46, Michael D. Godfrey <[hidden email]>
wrote: > > On 04/29/2013 03:12 PM, Jordi Gutiérrez Hermoso wrote: > > > > On 29 April 2013 14:50, Ed Meyer <[hidden email]> wrote: > > > >> I'm not proposing using anything but octave_idx_type for indexing > >> or changing the return type of numel()  I just question why > >> numel() is used for sparse matrices. It should be irrelevant for > >> anything but ccs2full(). > > > > The numel function is just one example where this sparse matrix > > with big dimensions broke. Sparse matrices this large can still > > break in other ways. Furthermore, what do you propose to do if > > numel is called for sparse matrices, despite your suggestion to > > not call it for sparse matrices? A lot of code out there already > > assumes that you can treat a sparse matrix like any other matrix. > > I don't think numel is to blame. > Will 64bit Octave "automatically" fix this? Yes, if you mean compiling Octave with 64bit indexing. Or at least it will delay the problem until someone tries to create a matrix of size 2^64 by 2^64.  Jordi G. H. 
In reply to this post by Jordi Gutiérrez Hermoso2
On Mon, Apr 29, 2013 at 12:12 PM, Jordi Gutiérrez Hermoso <[hidden email]> wrote:
I'm not saying numel() is to blame or should be changed, only that I see no reason to ever use numel when handling sparse matrices unless you are converting it
to full in which case the current behavior is ok.
Here again, why would you ever want A(idx) for a sparse matrix?
I agree that a special index type is the wrong approach; what I'm saying is that with sparse matrices you should never run into this problem in the first
place if you don't try to treat them the same as full. Otherwise why have sparse matrices at all?
this doesn't seem to be a burning issue so I'll shut up Ed Meyer 
On 29 April 2013 17:51, Ed Meyer <[hidden email]> wrote:
> I'm not saying numel() is to blame or should be changed, only that I > see no reason to ever use numel when handling sparse matrices unless > you are converting it to full in which case the current behavior is > ok. Well, look at the current implementation of trace (), in which numel is a perfectly reasonable function to call. If you don't want to ever call numel () for sparse matrices, then you would need to rewrite this function to check for sparsity as well as any other function that calls numel (). > Here again, why would you ever want A(idx) for a sparse matrix? Because this is the only way to do arbitrary indexing, say, indexing with a logical index: n = 2^16; A = sprandsym (n, 1e7); idx = A > 0.5; A(idx) *= 1; The alternative to arbitrary indexing is looping. > I agree that a special index type is the wrong approach; what I'm > saying is that with sparse matrices you should never run into this > problem in the first place if you don't try to treat them the same > as full. Otherwise why have sparse matrices at all? It is desirable to have sparse matrices behave like ordinary matrices under most circumstances, such as indexing and when writing the trace function. Otherwise, you would have to write special code all over the place to make sure that if the matrix is sparse, you don't linearly index it nor do you call numel or who knows what other functions that I haven't thought about yet.  Jordi G. H. 
On Tue, Apr 30, 2013 at 6:37 AM, Jordi Gutiérrez Hermoso <[hidden email]> wrote: On 29 April 2013 17:51, Ed Meyer <[hidden email]> wrote: I don't see numel() used in trace(), nor in diag() which it calls  what am I missing? > Here again, why would you ever want A(idx) for a sparse matrix? or if octave_idx_type were a class hierarchy with a (row,col) class in addition to linear indexing
Not only is it desirable to have sparse and full matrices behave similarly, I believe the user should not need to be aware of which storage format is used so functions like eig() would work for either.
The key is to use the C++ class system to have different implementations for each storage format.
Ed Meyer 
On 30 April 2013 12:56, Ed Meyer <[hidden email]> wrote:
> > > On Tue, Apr 30, 2013 at 6:37 AM, Jordi Gutiérrez Hermoso > <[hidden email]> wrote: >> >> On 29 April 2013 17:51, Ed Meyer <[hidden email]> wrote: >> >> > I'm not saying numel() is to blame or should be changed, only that I >> > see no reason to ever use numel when handling sparse matrices unless >> > you are converting it to full in which case the current behavior is >> > ok. >> >> Well, look at the current implementation of trace (), in which numel >> is a perfectly reasonable function to call. If you don't want to ever >> call numel () for sparse matrices, then you would need to rewrite this >> function to check for sparsity as well as any other function that >> calls numel (). > > > I don't see numel() used in trace(), nor in diag() which it calls  what am > I missing? Sorry, I was going from memory of a gdb session yesterday. The C++ isempty function ends up calling numel. I really don't see another way to reliably implement this function. >> > Here again, why would you ever want A(idx) for a sparse matrix? >> >> Because this is the only way to do arbitrary indexing, say, indexing >> with a logical index: >> >> n = 2^16; >> A = sprandsym (n, 1e7); >> idx = A > 0.5; >> A(idx) *= 1; >> >> The alternative to arbitrary indexing is looping. > > > or if octave_idx_type were a class hierarchy with a (row,col) class > in addition to linear indexing This seems again like that special indexing type that seems to me like a lot of boring work. >> > I agree that a special index type is the wrong approach; what I'm >> > saying is that with sparse matrices you should never run into this >> > problem in the first place if you don't try to treat them the same >> > as full. Otherwise why have sparse matrices at all? >> >> It is desirable to have sparse matrices behave like ordinary matrices >> under most circumstances, such as indexing and when writing the trace >> function. Otherwise, you would have to write special code all over the >> place to make sure that if the matrix is sparse, you don't linearly >> index it nor do you call numel or who knows what other functions that >> I haven't thought about yet. > The key is to use the C++ class system to have different implementations > for each storage format. This again sounds like support for a special indexing type just for sparse matrices.  Jordi G. H. 
Administrator

In reply to this post by eem2314
On 04/30/2013 12:56 PM, Ed Meyer wrote:
> Not only is it desirable to have sparse and full matrices behave similarly, > I believe the user should not need to be aware of which storage format > is used so functions like eig() would work for either. > The key is to use the C++ class system to have different implementations > for each storage format. I haven't been following this thread closely and I haven't thought much about the details but I have no objection to trying to do a better job with handling numel and dimensions/indices generally. Is there some way we can get the better behavior in a minimally invasive way? Even if it requires significant changes, maybe we should consider what the options are anyway. What changes are needed to make octave_idx_type behave the way you way you want? jwe 
On 04/30/2013 01:19 PM, John W. Eaton wrote:
> On 04/30/2013 12:56 PM, Ed Meyer wrote: > >> Not only is it desirable to have sparse and full matrices behave >> similarly, >> I believe the user should not need to be aware of which storage format >> is used so functions like eig() would work for either. >> The key is to use the C++ class system to have different implementations >> for each storage format. > > I haven't been following this thread closely and I haven't thought > much about the details but I have no objection to trying to do a > better job with handling numel and dimensions/indices generally. > > Is there some way we can get the better behavior in a minimally > invasive way? > > Even if it requires significant changes, maybe we should consider what > the options are anyway. > > What changes are needed to make octave_idx_type behave the way you way > you want? > > jwe > current system against completing the 64bit compiling system. The 64bit system will be useful in other ways and should be done in any case. Of course, any improvement in the current system is good, but the full implementation of large matrices could be a 64bit system feature. Jordi seemed to think that 2^63 indicies would work with "no trouble" in the 64bit system... Michael 
In reply to this post by Jordi Gutiérrez Hermoso2
On Tue, Apr 30, 2013 at 6:37 AM, Jordi Gutiérrez Hermoso <[hidden email]> wrote:  On 29 April 2013 17:51, Ed Meyer <[hidden email]> wrote: I think I see why numel() is getting called from trace(). isempty() is called which calls is_empty() which should be virtual. Were it to call the sparse version of is_empty() there would be no problem because it tests the row & column
dimensions instead of calling numel(). So shouldn't is_empty() be virtual? Ed Meyer 
On 4 May 2013 20:50, Ed Meyer <[hidden email]> wrote:
> I think I see why numel() is getting called from trace(). isempty() > is called which calls is_empty() which should be virtual. Were it to > call the sparse version of is_empty() there would be no problem > because it tests the row & column dimensions instead of calling > numel(). So shouldn't is_empty() be virtual? Sure, this patches this hole. What about linear indexing, how will you patch that one?  Jordi G. H. 
On 05/06/2013 04:24 PM, Jordi Gutiérrez Hermoso wrote:
> On 4 May 2013 20:50, Ed Meyer <[hidden email]> wrote: >> I think I see why numel() is getting called from trace(). isempty() >> is called which calls is_empty() which should be virtual. Were it to >> call the sparse version of is_empty() there would be no problem >> because it tests the row & column dimensions instead of calling >> numel(). So shouldn't is_empty() be virtual? > Sure, this patches this hole. What about linear indexing, how will you > patch that one? > >  Jordi G. H. > shouldn't work either n = 2^16; s = sparse (1:n,1:n,1); t = s * s; as t will have more elements than can be indexed with octave_idx_type. Clear it works. So essentially you're saying that sparse matrices with 32bit indexing and numel larger than 2^31 are useless!! A lot of attention was made in the sparse implementation to not rely either on linear indexing or the value of numel to avoid this issue. I'd think that any function or operator that relies on either for sparse matrices, or at least when it doesn't have to, is buggy. Ed is right the is_empty method of the octave_base_value class should be specialized for sparse matrices as in the attached changeset as it has no need to rely on numel. Then any function that relies on isempty should now work for sparse matrices. David patch.is_empty (2K) Download Attachment 
On 15 June 2013 11:00, David Bateman <[hidden email]> wrote:
> Ed is right the is_empty method of the octave_base_value class should be > specialized for sparse matrices as in the attached changeset as it has > no need to rely on numel. Then any function that relies on isempty > should now work for sparse matrices. Now what should happen if numel is actually called for sparse matrices? And how do you propose to fix linear indexing? I'm not saying you shouldn't do it, just that it's difficult and boring. Also, can't you push your own patches? I thought you were still in the access list.  Jordi G. H. 
Powered by Nabble  Edit this page 