full/sparse/banded/triangular matrices...

classic Classic list List threaded Threaded
5 messages Options
Reply | Threaded
Open this post in threaded view
|

full/sparse/banded/triangular matrices...

David Bateman-3
In thinking a little bit about what to do with Andy's spinv function,
I found myself asking lots of questions about how to address triangular
matrices, since the heart of his function is the back subsitution on
a sparse upper triangular matrix. It seemed to me that this routine
has applications beyond just this function, for example in a cholesky
based solver. To achieve something like this it would make sense to
have sparse triangular matrices, perhaps in a manner like that in
ov-re-tri.cc in octave-forge. The risk is an explosion in the number of
mixed operators. This is even more true if banded matrices are
addressed at the same time.

Why do we want sparse/triangular/banded matrices? What is the ideal
implementation of full, sparse, triangular and banded matrices? What
should the classes in liboctave look like? Is there some way to get to
that vision in easier steps?

Why seems to be clear. Reduced memory usage and faster computation.
However, the two issues aren't necessarily related. There is nothing
to stop us storing the zero values, as a fast way of at least getting
some of the benefits of banded or triangular matrices in terms of more
rapid computation. The LAPACK routines for banded routines can easily
be used in this case just by selecting the LDA parameter appropriately
and there are special version of the LAPACK triangular back substitute
functions for a full matrix with only the upper/lower part defined.

In the longer run there should be ideally three different storage
classes, Full, Sparse and Banded. There is only a factor of two saving
in memory use in a triangular storage class and I'm not sure its worth
having a special storage class for this, but rather flag for the Full
and Sparse classes as being triangular (an attribute in matlab
terminology). This means that all of the operators on sparse/full
matrices also directly apply to triangular matrices, with only the
things like "/", "\", etc needing special casing for acceleration.
There would also need to be a means of preserving, modifying or
destroying the attribute during other operations. For example

UpperTriangular .* Full -> UpperTriangular
UpperTriangular .' -> LowerTriangular
UpperTriangular + Scalar -> Full

At at the outset something similar might be done for a banded matrix.
This might be either done by defining classes for triangular/ banded
matrices in the same way as in ov-re-tri.cc in octave-forge, and
expanding the number of specialized operators on these class to define
the preservation of the attribute, or the attribute might be explicitly
included in the Array<T> class and the operators in MArray, dNDArray,
etc modified to preserve the attributes. In the long run it would be better
to do it in the Array<T> class as then even applications that don't
like to liboctinterp.so can have access to the benefits of these
classes. But in the short term it is probably easier and certainly
less distruptive to do it in the same manner as ov-re-tri.cc.

So longer term I see the best situation as having Array<T> evolve into
something like

template <class T>
class
Array
{
protected
  class FullRep
  {
    ...
  };

  class SparseRep
  {
    ...
  };

  class BandRep
  {
    ...
  };

public:
  enum storage_type
  {
     DENSE,
     SPARSE,
     BANDED
  };
  enum attribute_type
  {
     UNKNOWN,
     FULL,
     UPPER,
     LOWER,
  };

protected:
  typename Array<T>::FullRep *fullrep;
  typename Array<T>::SparseRep *sparserep;
  typename Array<T>::BandedRep *bandedrep;
 
  storage_type storage;
  attribute_type attribute;

  dim_vector dimensions;

  idx_vector *idx;
  int idx_count;

public:
  ...

  Array (const Array<T>& a)
    : attribute (a.attribute), dimensions (a.dimensions), idx (0), idx_count (0)
    {
      switch (a.storage)
      {
      case SPARSE:
        sparserep = a.sparserep;
        sparserep->count++;
        break;
      case BANDED:
        bandedrep = a.bandedrep;
        bandedrep->count++;
        break;        
      case DENSE:
      default:
        fullrep = a.fullrep;
        fullrep->count++;
        break;
      }
    }
  ...
};

with the consequent changes to all of the operators in the derived classes.
This would allow such crazy things as banded cell arrays as being possible.  
However such changes are likely to be very disruptive as it imposes changes
at the very lowest level of how octave currently does things, and so is
probably not something that should be done before 3.0.

Should the other approach of having the classes

class octave_triangular_matrix : public octave_matrix { ... };
class octave_triangular_complex_matrix : public octave_complex_matrix { ...};
class octave_banded_matrix : public octave_matrix { ... };
class octave_banded_complex_matrix : public octave_complex_matrix { ... };
class octave_triangular_sparse_matrix : public octave_sparse_matrix { ... };
class octave_triangular_sparse_complex_matrix :
        public octave_sparse_complex_matrix { ... };

be implemented? I believe yes, as the implementation is relatively straight
forward. I didn't include banded sparse matrices as there seems to be no
point.

If we missed a few of the attribute preserving operations in the
op-*.cc files then it just falls back to the current case of
converting to a full/sparse version.. Also even the majority of the
attribute preserving functions use the existing code, with just the
returned octave_value being adapted appropriately, so there is a low
risk of introduces major errors.

This also means that only cases where the attribute is preserved need
to be treated, and a conversion to a full matrix can fall back to the
full matrix code. Then it just comes done to making tables like

Unary Ops

OP            INPUT        
           UP   LO   BND
---        --   --   ---
!          UP   LO   BND
+          UP   LO   BND
-          UP   LO   BND
transpose  LO   UP   BND
hermitian  LO   UP   BND

Binary OPS     UP/UP  UP/LO   UP/BND  UP/FULL  UP/SCAL
----------     -----  -----   ------  -------  -------
+                UP   FULL     FULL    FULL     FULL
-                UP   FULL     FULL    FULL     FULL
*                UP   FULL     FULL    FULL      UP
/                **    **       **     FULL      UP
^                NA    NA       NA     NA        UP
\                **    **       **     **        **
<                UP    FULL     FULL   FULL     FULL
>                UP    FULL     FULL   FULL     FULL
==              FULL   FULL     FULL   FULL     FULL
>=              FULL   FULL     FULL   FULL     FULL
<=              FULL   FULL     FULL   FULL     FULL
.*               UP     UP     UP/BND   UP       UP  
.^              FULL   FULL     FULL   FULL      UP (!=0)    

** Accelerated special case


to identify what to do for the op-*.cc files, and the rest is straight
forward. If we want to get a little bit more complicated we could include
the acceleration for things like BND+BND or UP+UP, that don't calculate
the zero terms for a bit of extra speed. Though its not these operations
that are going to dominant any calculation, but rather the "/" and "\"
operators, so I'm not sure I see any advantage.

As a final simplification the constructors of the octave_values might be
modified like

enum attribute_type
{
     UNKNOWN,
     FULL,
     BANDED
     UPPER,
     LOWER,
};

octave_value::octave_value (const NDArray& a, attribute_type typ = UNKNOWN,
        int band_size = -1)
{
  switch (typ)
  {
  case BANDED:
    rep = new octave_banded_matrix (a, band_size);
    break;
  case UPPER:
    rep = new octave_triangular_matrix (a, octave_triangular_matrix::UPPER);
    break;
  case LOWER:
    rep = new octave_triangular_matrix (a, octave_triangular_matrix::LOWER);
    break;
  default
    rep = new octave_matrix (a);
    break;
  }
  rep->count = 1;
  maybe_mutate ();
}

and similarly for all of the other constructors, so that the process of
writing the op-*.cc files might then be simplified to a cut-and-paste
with a new macro to call the constructor with the correct matrix type.
The default args allow the exist code to work as is.

Finally the functions chol, lu, triu, tril would need to be modified
to return the correct matrix type, and if a matrix is flagged as
unknown the "/" and "\" operators should probably attempt to
determine the matrix type, so the attribute should be cached for
the octave_matrix and octave_complex_matrix type.

Anyway, these are just my late night ruminations on the topic. It all seems
relatively straight forward, and seems a natural addition at the same
time as sparse matrices in the octave core. But its up to John to see what
he whats to do. In any case I hope this rather long mail might kick off
a small discussion.

Regards
David



--
David Bateman                                [hidden email]
Motorola CRM                                 +33 1 69 35 48 04 (Ph)
Parc Les Algorithmes, Commune de St Aubin    +33 1 69 35 77 01 (Fax)
91193 Gif-Sur-Yvette FRANCE

The information contained in this communication has been classified as:

[x] General Business Information
[ ] Motorola Internal Use Only
[ ] Motorola Confidential Proprietary


Reply | Threaded
Open this post in threaded view
|

Re: full/sparse/banded/triangular matrices...

Paul Kienzle
LAPACK also has tridiagonal functions.  In the spline code we are
doing a couple of rotations outside the solver so that the cyclic
case doesn't need a full matrix solve.  I doubt you want to support
a mostly banded matrix class though.

Another matrix attribute to consider is a transpose marker, but like
triangular matrices, this only gives a small speedup compared to the
cost of the op.

- Paul


Reply | Threaded
Open this post in threaded view
|

full/sparse/banded/triangular matrices...

John W. Eaton-6
In reply to this post by David Bateman-3
On 30-Nov-2004, David Bateman <[hidden email]> wrote:

| The risk is an explosion in the number of
| mixed operators. This is even more true if banded matrices are
| addressed at the same time.

Yes, we will need to come up with a systematic way of handling all
these so they can be generated automatically (at least as much as is
possible).

| There would also need to be a means of preserving, modifying or
| destroying the attribute during other operations. For example
|
| UpperTriangular .* Full -> UpperTriangular
| UpperTriangular .' -> LowerTriangular
| UpperTriangular + Scalar -> Full

Wouldn't this last one only have to convert to Full if Scalar is
nonzero?  In that case, we really do want the storage class to be a
dynamic attribute of the array class instead of a fixed type attribute
that would be required if we had separate "top-level" classes for each
storage type.

| So longer term I see the best situation as having Array<T> evolve into
| something like
|
| template <class T>
| class
| Array
| {
| protected
|   class FullRep
|   {
|     ...
|   };
|
|   class SparseRep
|   {
|     ...
|   };

Yes, I think this is a good direction.  The rep classes don't have to
be nested, they are just handled that way now because it seemed good
to hide them.

I'm not sure that we want to define math operations on the Array
object, so the operators would be separate.

To be able to take advantage of external libraries like lapack, we
would probably need to be able to expose some details about the
internals (we do that already, with fortran_vec).

For efficiency, the operators would be defined on

|   class BandRep
|   {
|     ...
|   };
|
| public:
|   enum storage_type
|   {
|      DENSE,
|      SPARSE,
|      BANDED
|   };
|   enum attribute_type
|   {
|      UNKNOWN,
|      FULL,
|      UPPER,
|      LOWER,
|   };

Do we really need these enums?  Wouldn't you want to dispatch to the
rep class and have it do the right thing instead of using the enum
value?  It seems redundant to have both and switches are bad form in
C++ when a dispatch would do the job (think about the trouble caused
by the enum when you add a new rep type that requires a new enum value).

jwe


Reply | Threaded
Open this post in threaded view
|

Re: full/sparse/banded/triangular matrices...

David Bateman-3
According to John W. Eaton <[hidden email]> (on 12/01/04):

> On 30-Nov-2004, David Bateman <[hidden email]> wrote:
>
> | The risk is an explosion in the number of
> | mixed operators. This is even more true if banded matrices are
> | addressed at the same time.
>
> Yes, we will need to come up with a systematic way of handling all
> these so they can be generated automatically (at least as much as is
> possible).
>
> | There would also need to be a means of preserving, modifying or
> | destroying the attribute during other operations. For example
> |
> | UpperTriangular .* Full -> UpperTriangular
> | UpperTriangular .' -> LowerTriangular
> | UpperTriangular + Scalar -> Full
>
> Wouldn't this last one only have to convert to Full if Scalar is
> nonzero?  In that case, we really do want the storage class to be a
> dynamic attribute of the array class instead of a fixed type attribute
> that would be required if we had separate "top-level" classes for each
> storage type.

Yes, I caught the Scalar zero case in the table I sent though. However,
I don't see why this implicitly means that we are forced to having
dynamic attributes... I see there as being two approaches, the better
one that we both agree in the longer term is to have the storage classes
as a dynamic property of the Array<T> class.

However, it seems to me that that is a really long term goal, but we
can still get a lot of the benefits of such storage classes, at least
in terms of computational speed with using seperate top level storage
classes, of the form I mentioned. With using the differentiation at
the top level class the dynamic conversion is handled at the top
level.  So in the short term where the triangular matrices are just a
flag on the full (or sparse) matrix storage, the binop for add with a
scalar might look like

static octave_value
oct_binop_add (const octave_value& a1, const octave_value& a2)
{
    CAST_BINOP_ARGS (const octave_triangular_matrix &, const octave_scalar&);
    double d = v2.double_value ();
    if (d == 0)
      return octave_value (a1);
    else
      return octave_value (v1.matrix_value () + d, v1.attribute);
}

In fact in the long term the seperate top-level classes could also stay,
and use different storage classes of the underlying Array<T> class. In
that case the dynamic change between classes would be handled in a similar
manner. The advantage of this approach is that if the top level
classes are defined like

template <class T> octave_triangular_matrix : puble octave_matrix { ... };

and inherit from the underlying matrix class, then we only need to explicitly
include the operations that preserve the attribute of the matrix class. This
remains true even with additional storage classes in the Array<T> type.

> | So longer term I see the best situation as having Array<T> evolve into
> | something like
> |
> | template <class T>
> | class
> | Array
> | {
> | protected
> |   class FullRep
> |   {
> |     ...
> |   };
> |
> |   class SparseRep
> |   {
> |     ...
> |   };
>
> Yes, I think this is a good direction.  The rep classes don't have to
> be nested, they are just handled that way now because it seemed good
> to hide them.
>
> I'm not sure that we want to define math operations on the Array
> object, so the operators would be separate.

True, in fact if the operators are on derived classes of Array<T> then
it would be a single operator with special cases for each storage class.

> To be able to take advantage of external libraries like lapack, we
> would probably need to be able to expose some details about the
> internals (we do that already, with fortran_vec).
>
> For efficiency, the operators would be defined on

If we only have full, sparse and banded storage classes, then fortran_vec
plus the size of the band for banded matrices is sufficient. The sparse
class would need to expose the row and column indexing...


>
> |   class BandRep
> |   {
> |     ...
> |   };
> |
> | public:
> |   enum storage_type
> |   {
> |      DENSE,
> |      SPARSE,
> |      BANDED
> |   };
> |   enum attribute_type
> |   {
> |      UNKNOWN,
> |      FULL,
> |      UPPER,
> |      LOWER,
> |   };
>
> Do we really need these enums?  Wouldn't you want to dispatch to the
> rep class and have it do the right thing instead of using the enum
> value?  It seems redundant to have both and switches are bad form in
> C++ when a dispatch would do the job (think about the trouble caused
> by the enum when you add a new rep type that requires a new enum value).

The reason I see for an enum is that imagine a case like


a = zeros(n,n)

## Create n-by-n banded matrix with banded of +/-k
for i = -k:k
 if (i <= 0)
   a ((0:(n+k-1))*(n+1) - k + 1)  = foo (n+k);    # Returns n+k < n values
 else
   a ((k:(n-1))*(n+1) + 1) = foo (n-k);    # Returns n-k < n values
 endif
endfor  

b = a \ randn(n,1);

We don't know that "a" is banded. In fact we have no information about
it, and its attribute is known. However, if the "\" operator attempts to
determine the attribute of unknown matrices, then we can benefit from
the acceleration for banded operations in any case. However this needs
caching of the attribute. If the attribute wasn't cached then each time
"\" was called on a full matrix its attribute would be checked, even if
it had already been determined. I don't see how to treat this with some
sort of enum..

Regards
David

--
David Bateman                                [hidden email]
Motorola CRM                                 +33 1 69 35 48 04 (Ph)
Parc Les Algorithmes, Commune de St Aubin    +33 1 69 35 77 01 (Fax)
91193 Gif-Sur-Yvette FRANCE

The information contained in this communication has been classified as:

[x] General Business Information
[ ] Motorola Internal Use Only
[ ] Motorola Confidential Proprietary


Reply | Threaded
Open this post in threaded view
|

Re: full/sparse/banded/triangular matrices...

David Bateman-3
In reply to this post by Paul Kienzle
According to Paul Kienzle <[hidden email]> (on 12/01/04):
> LAPACK also has tridiagonal functions.  In the spline code we are
> doing a couple of rotations outside the solver so that the cyclic
> case doesn't need a full matrix solve.  I doubt you want to support
> a mostly banded matrix class though.

Isn't tridiagonal just a special case of a banded matrix? Does it need
special treatment relative to a banded matrix? Your cyclic solver would
though, humm...

> Another matrix attribute to consider is a transpose marker, but like
> triangular matrices, this only gives a small speedup compared to the
> cost of the op.

Yeah, I think I place that one in the nice but not worth it basket...

D.

--
David Bateman                                [hidden email]
Motorola CRM                                 +33 1 69 35 48 04 (Ph)
Parc Les Algorithmes, Commune de St Aubin    +33 1 69 35 77 01 (Fax)
91193 Gif-Sur-Yvette FRANCE

The information contained in this communication has been classified as:

[x] General Business Information
[ ] Motorola Internal Use Only
[ ] Motorola Confidential Proprietary