error saving empty matrix in HDF5 format


John W. Eaton-6
I just noticed the following error with the current CVS sources:

  octave:1> clear
  octave:2> x = [];
  octave:3> save -hdf5 x.hdf5
  HDF5-DIAG: Error detected in HDF5 library version: 1.6.1 thread 16384.  Back trace follows.
    #000: ../../../src/H5S.c line 1708 in H5Screate_simple(): zero sized dimension for non-unlimited dimension
      major(01): Function arguments
      minor(05): Bad value
  error: save: error while writing `x' to hdf5 file

I tried to fix this, but it seems that HDF5 does not like empty arrays.

Any ideas on what the right way is to save empty arrays and preserve
their dimensions when using HDF5?

Also, the following code seems a bit clumsy to me

  bool
  octave_matrix::save_hdf5 (hid_t loc_id, const char *name, bool save_as_floats)
  {
    dim_vector d = dims ();
    hsize_t hdims[d.length () > 2 ? d.length () : 3];
    hid_t space_hid = -1, data_hid = -1;
    int rank = ( (d (0) == 1) && (d.length () == 2) ? 1 : d.length ());
    bool retval = true;
    NDArray m = array_value ();

    // Octave uses column-major, while HDF5 uses row-major ordering
    for (int i = 0, j = d.length() - 1; i < d.length (); i++, j--)
      hdims[i] = d (j);

    space_hid = H5Screate_simple (rank, hdims, (hsize_t*) 0);

Why not just

    dim_vector d = dims ();
    int rank = d.length ();
    hsize_t hdims[rank];
    hid_t space_hid = -1, data_hid = -1;
    bool retval = true;
    NDArray m = array_value ();

    // Octave uses column-major, while HDF5 uses row-major ordering
    for (int i = 0; i < rank; i++)
      hdims[i] = d (rank-i-1);

    space_hid = H5Screate_simple (rank, hdims, (hsize_t*) 0);
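
As a rough illustration of the simplified loop, here is a minimal, self-contained sketch. The names `hsize_like` and `reversed_dims` are hypothetical and not part of Octave or HDF5; `hsize_like` only stands in for `hsize_t` so the example builds without the HDF5 headers. It also assumes a `std::vector` is acceptable in place of the variable-length array, which is a compiler extension in C++. Like the shorter version above, it does not keep the rank-1 special case for row vectors.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for HDF5's hsize_t (assumption: any unsigned 64-bit type will do
// for the purpose of this sketch).
typedef unsigned long long hsize_like;

// Reverse the dimension order: Octave stores arrays column-major, while
// HDF5 expects the dimensions listed in row-major order.
std::vector<hsize_like>
reversed_dims (const std::vector<int>& d)
{
  const std::size_t rank = d.size ();
  std::vector<hsize_like> hdims (rank);

  for (std::size_t i = 0; i < rank; i++)
    {
      assert (d[rank - i - 1] >= 0);  // a dimension is never negative
      hdims[i] = static_cast<hsize_like> (d[rank - i - 1]);
    }

  return hdims;
}
```

With real HDF5 code, the address of the first element and the rank would then be handed to H5Screate_simple, as in the snippet above.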

Finally, I suspect that there is very little difference in the code
for various matrix types.  It would be nice to implement this in the
base class (though that might not be possible) or as a template
function to avoid repeated code.

jwe



Re: error saving empty matrix in HDF5 format

David Bateman-3
According to John W. Eaton <[hidden email]> (on 02/27/04):

> I just noticed the following error with the current CVS sources:
>
>   octave:1> clear
>   octave:2> x = [];
>   octave:3> save -hdf5 x.hdf5
>   HDF5-DIAG: Error detected in HDF5 library version: 1.6.1 thread 16384.  Back trace follows.
>     #000: ../../../src/H5S.c line 1708 in H5Screate_simple(): zero sized dimension for non-unlimited dimension
>       major(01): Function arguments
>       minor(05): Bad value
>   error: save: error while writing `x' to hdf5 file
>
> I tried to fix this, but it seems that HDF5 does not like empty arrays.

Looking at it now...

>
> Any ideas on what the right way is to save empty arrays and preserve
> their dimensions when using HDF5?
>
> Also, the following code seems a bit clumsy to me
>
>   bool
>   octave_matrix::save_hdf5 (hid_t loc_id, const char *name, bool save_as_floats)
>   {
>     dim_vector d = dims ();
>     hsize_t hdims[d.length () > 2 ? d.length () : 3];
>     hid_t space_hid = -1, data_hid = -1;
>     int rank = ( (d (0) == 1) && (d.length () == 2) ? 1 : d.length ());
>     bool retval = true;
>     NDArray m = array_value ();
>
>     // Octave uses column-major, while HDF5 uses row-major ordering
>     for (int i = 0, j = d.length() - 1; i < d.length (); i++, j--)
>       hdims[i] = d (j);
>
>     space_hid = H5Screate_simple (rank, hdims, (hsize_t*) 0);
>
> Why not just
>
>     dim_vector d = dims ();
>     int rank = d.length ();
>     hsize_t hdims[rank];
>     hid_t space_hid = -1, data_hid = -1;
>     bool retval = true;
>     NDArray m = array_value ();
>
>     // Octave uses column-major, while HDF5 uses row-major ordering
>     for (int i = 0; i < rank; i++)
>       hdims[i] = d (rank-i-1);
>
>     space_hid = H5Screate_simple (rank, hdims, (hsize_t*) 0);


Well the reason for this was that I had very little knowledge of HDF5
stuff when I rewrote the load-save code. Therefore, I based what I did
heavily on the existing code. I also found the above a bit messy, but
since I didn't understand the intention of the original author, rather
than breaking code that read the HDF5 stuff on the other end, I kept
similar code.

However, I've already broken HDF5 compatibility in that I've now
implemented each saved variable as a group with a string containing the
name of the octave_value and a second value with the actual data.

So, go ahead and do it as above.

> Finally, I suspect that there is very little difference in the code
> for various matrix types.  It would be nice to implement this in the
> base class (though that might not be possible) or as a template
> function to avoid repeated code.

Hmm, this is probably true for the ascii formats, and it might even
be true for the binary formats, since we could use for instance the
"<<" and ">>" operators for the actual work. However, I doubt it is
true for the HDF5 type due to the use of compound types for complex,
etc.
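
To make the concern concrete, here is a small sketch of how a template could still share the common code while a per-type trait carries the part that differs. Everything in it is hypothetical: `hdf5_layout`, `scalars_written`, and the member names "real" and "imag" are assumptions for illustration, and real code would return an HDF5 type id rather than a string.

```cpp
#include <cassert>
#include <complex>
#include <string>

// Hypothetical trait describing the on-disk element layout for each
// element type. The primary template is left undefined on purpose, so an
// unsupported type fails at compile time.
template <typename T>
struct hdf5_layout;

template <>
struct hdf5_layout<double>
{
  static std::string name () { return "native double"; }
  static int fields () { return 1; }
};

// Complex values need a compound type with two members. This is the part
// that a generic template does not provide for free.
template <>
struct hdf5_layout<std::complex<double> >
{
  static std::string name () { return "compound {real, imag}"; }
  static int fields () { return 2; }
};

// Shared code: the number of scalar values written for nelem elements
// depends on the element type only through the trait.
template <typename T>
long
scalars_written (long nelem)
{
  assert (nelem >= 0);
  return nelem * hdf5_layout<T>::fields ();
}
```

Whether this is worth it depends on how much of each save_hdf5 function is really type independent, which is the open question here.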

I'd propose putting this idea on the back burner and just adding a
"// XXX FIXME XXX Implement as template class ??" comment above the
relevant code...

Cheers
David

--
David Bateman                                [hidden email]
Motorola CRM                                 +33 1 69 35 48 04 (Ph)
Parc Les Algorithmes, Commune de St Aubin    +33 1 69 35 77 01 (Fax)
91193 Gif-Sur-Yvette FRANCE

The information contained in this communication has been classified as:

[x] General Business Information
[ ] Motorola Internal Use Only
[ ] Motorola Confidential Proprietary



Re: error saving empty matrix in HDF5 format

David Bateman-3
In reply to this post by John W. Eaton-6

Ok there is a thread on this exact problem at

http://www.unidata.ucar.edu/projects/coohl/mhonarc/MailArchives/netcdf-hdf-list/msg00097.html

The bottom line is that the HDF people appear to be implementing
something like "H5Screate(H5S_EMPTY)". However, that doesn't help us, since
the dimensionality of our empty matrices won't be maintained. That is,
a rank=n matrix will become "zeros(zeros(n))" when loaded, and not the
correct dimensionality.

Frankly, I see no simple way to address this. The only thing I can see is
to save an attribute in the HDF5 file flagging an empty matrix, and then
save the dimensions as a vector. This is slightly incompatible in that
if other software doesn't check this attribute, it will end up
loading a 1xN vector with the dimensions of the matrix instead of an
empty matrix.

Should I go this way?
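
The flag-plus-dimensions idea can be sketched without HDF5 at all. The following is only an in-memory model under that assumption: `saved_matrix`, `encode`, and `decode_empty_dims` are hypothetical names, and the struct stands in for a dataset plus one attribute rather than for the real file layout.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// In-memory stand-in for one saved variable (not the real HDF5 layout).
struct saved_matrix
{
  bool empty_flag;            // models the attribute flagging an empty matrix
  std::vector<double> value;  // the data, or the dimensions if empty_flag is set
};

// Store the data as usual, or store the dimension vector when the matrix
// has no elements. Dimensions are assumed to be non-negative.
saved_matrix
encode (const std::vector<int>& dims, const std::vector<double>& data)
{
  std::size_t nelem = 1;
  for (std::size_t i = 0; i < dims.size (); i++)
    nelem *= static_cast<std::size_t> (dims[i]);
  assert (nelem == data.size ());

  saved_matrix s;
  s.empty_flag = (nelem == 0);
  if (s.empty_flag)
    s.value.assign (dims.begin (), dims.end ());  // keep the shape instead
  else
    s.value = data;
  return s;
}

// A reader that checks the flag recovers the shape. A reader that ignores
// it just sees value as an ordinary vector of numbers.
std::vector<int>
decode_empty_dims (const saved_matrix& s)
{
  std::vector<int> dims;
  if (s.empty_flag)
    for (std::size_t i = 0; i < s.value.size (); i++)
      dims.push_back (static_cast<int> (s.value[i]));
  return dims;
}
```

Saving a 0x3 matrix this way leaves a two-element value holding 0 and 3, which is exactly what software that ignores the flag would load.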

Regards
David


Re: error saving empty matrix in HDF5 format

David Bateman-3
In reply to this post by John W. Eaton-6
Ok, here is a patch for this and the problem identified by Hans Ekkehard. In
passing, it also adds N-d array support for load/save to strings. It's not
necessarily the best approach, but since HDF5 doesn't support zero dimensioned
matrices I don't see another way of doing this...

Cheers
David

[Attachment: patch (28K)]

Re: error saving empty matrix in HDF5 format

Paul Kienzle
In reply to this post by David Bateman-3

On Mar 1, 2004, at 8:44 AM, David Bateman wrote:

> Ok there is a thread on this exact problem at
>
> http://www.unidata.ucar.edu/projects/coohl/mhonarc/MailArchives/netcdf-hdf-list/msg00097.html
>
> The bottom line is that, it appears that the HDF people are implementing
> something like "H5Screate(H5S_EMPTY)" however that doesn't help us, since
> the dimensionality of our empty matrices won't be maintained.... That is
> a rank=n matrix when loaded will become "zeros(zeros(n))" and not the
> correct dimensionality.
>
> Frankly, I see no simple way to address this. The only thing I can see is
> if we save an attribute in the HDF5 file flagging an empty matrix, then
> save the dimensions as a vector. This is slightly incompatible in that
> if other software doesn't check this attribute then they'll end up
> loading a 1xN vector with the dimensions of the matrix instead of an
> empty matrix.....
>
> Should I go this way?

Can't you put the dimensions themselves in an attribute?  Other
applications could read the empty array and treat it as an empty
array, but octave aware applications could read the empty_dim
attribute and act accordingly.
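
A rough model of this variant, again without HDF5: the dataset itself stays empty and the shape lives in a named attribute. The `dataset` struct and the functions `save_empty` and `load_dims` are hypothetical, and the attribute name "empty_dim" is simply taken from the paragraph above.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// In-memory stand-in for a dataset with attributes (not the HDF5 API).
struct dataset
{
  std::vector<double> data;                             // stays empty here
  std::map<std::string, std::vector<int> > attributes;  // named attributes
};

// Save an empty matrix: no data, dimensions go into the attribute.
dataset
save_empty (const std::vector<int>& dims)
{
  assert (! dims.empty ());
  dataset ds;
  ds.attributes["empty_dim"] = dims;
  return ds;
}

// An octave-aware reader uses the attribute when it is present. Without
// it, this sketch falls back to 0x0, as a generic reader might.
std::vector<int>
load_dims (const dataset& ds)
{
  std::map<std::string, std::vector<int> >::const_iterator it
    = ds.attributes.find ("empty_dim");
  if (it != ds.attributes.end ())
    return it->second;
  return std::vector<int> (2, 0);
}
```

The difference from the flag approach is that a reader which knows nothing about the attribute still sees an empty array rather than a vector of dimensions. It does rely on the file format being able to hold an empty dataset in the first place.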



Re: error saving empty matrix in HDF5 format

John W. Eaton-6
In reply to this post by David Bateman-3
On  2-Mar-2004, David Bateman <[hidden email]> wrote:

| Ok, here is a patch for this and the problem identified by Hans Ekkehard. It
| also in passing adds Nd array support for load/save to strings. It's not
| necessarily the best approach, but since HDF5 doesn't support zero dimensioned
| matrices I don't see another way of doing this...

I applied these changes, but I think maybe we should use Paul's
suggestion of storing the dimensions in the attributes section.

Thanks,

jwe



Re: error saving empty matrix in HDF5 format

David Bateman-3
In reply to this post by Paul Kienzle
According to Paul Kienzle <[hidden email]> (on 03/02/04):
> Can't you put the dimensions themselves in an attribute?  Other
> applications could read the empty array and treat it as an empty
> array, but octave aware applications could read the empty_dim
> attribute and act accordingly.

The dimensions could be in an attribute. But it appears that the empty
array is only available in the latest HDF sources, so it's a bit of a
pain...

D.


Re: error saving empty matrix in HDF5 format

David Bateman-3
In reply to this post by John W. Eaton-6
According to John W. Eaton <[hidden email]> (on 03/02/04):
> I applied these changes, but I think maybe we should use Paul's
> suggestion of storing the dimensions in the attributes section.

I agree, but we need to check that the empty array is available in
the hdf library. In any case this would just be a matter of fixing up
the ls-hdf5.cc{save_hdf5_empty, load_hdf5_empty} functions I added.
So a reasonable structure is there even in this patch...

Cheers
David


Re: error saving empty matrix in HDF5 format

Paul Kienzle

On Mar 2, 2004, at 11:19 AM, David Bateman wrote:

> According to John W. Eaton <[hidden email]> (on 03/02/04):
>> I applied these changes, but I think maybe we should use Paul's
>> suggestion of storing the dimensions in the attributes section.
>
> I agree, but we need to check that the empty array is available in
> the hdf library. In any case this would just be a matter of fixing up
> the ls-hdf5.cc{save_hdf5_empty, load_hdf5_empty} functions I added.
> So a reasonable structure is there even in this patch...

It would be nice to have the data files not so heavily dependent
on a particular release of hdf5.  Could you save it instead as an
empty string or empty something else, but with the vector attribute
empty_dims?  Or is nothing allowed to be empty?  How about
an empty group?

Paul Kienzle
[hidden email]



Re: error saving empty matrix in HDF5 format

David Bateman-3
According to Paul Kienzle <[hidden email]> (on 03/03/2004):

>
> On Mar 2, 2004, at 11:19 AM, David Bateman wrote:
>
> >According to John W. Eaton <[hidden email]> (on 03/02/04):
> >>I applied these changes, but I think maybe we should use Paul's
> >>suggestion of storing the dimensions in the attributes section.
> >
> >I agree, but we need to check that the empty array is available in
> >the hdf library. In any case this would just be a matter of fixing up
> >the ls-hdf5.cc{save_hdf5_empty, load_hdf5_empty} functions I added.
> >So a reasonable structure is there even in this patch...
>
> It would be nice to have the data files not so heavily dependent
> on a particular release of hdf5.  Could you save it instead as an
> empty string or empty something else, but with the vector attribute
> empty_dims?  Or is nothing allowed to be empty?  How about
> an empty group?
>
> Paul Kienzle
> [hidden email]

Ok, looking at this again it seems that the discussion I pointed out
was about the possibility of including empty objects in the HDF files.
At the moment this is not even implemented.

So, not sure what the best thing to do here is, as there are no empty array
structures in HDF files... At the moment the structure is

<variable name> <GROUP> -- >  type "octave_value" <DATASET>
                           >  value <DATASET>
                           >  OCTAVE_EMPTY_MATRIX <ATTRIBUTE>

where "OCTAVE_EMPTY_MATRIX" is a flag and "value" contains either the
data of a non-empty matrix or the dimensions of the matrix if
"OCTAVE_EMPTY_MATRIX" is set. What we would like is

<variable name> <GROUP> -- >  type "octave_value" <DATASET>
                           >  value "empty dataset" <DATASET>
                           >  OCTAVE_EMPTY_MATRIX <ATTRIBUTE>

where OCTAVE_EMPTY_MATRIX is now an attribute containing the array dimensions.
However, this is not possible even with the latest versions of HDF5. Are
there any other choices that make sense?

Regards
David


Re: error saving empty matrix in HDF5 format

John W. Eaton-6
On  3-Mar-2004, David Bateman <[hidden email]> wrote:

| Ok, looking at this again it seems that the discussion I pointed out
| was about the possibility of including empty objects in the HDF files.
| At the moment this is not even implemented.
|
| So, not sure what the best thing to do here is, as there are no empty array
| structures in HDF files... At the moment the structure is
|
| <variable name> <GROUP> -- >  type "octave_value" <DATASET>
|                            >  value <DATASET>
|                            >  OCTAVE_EMPTY_MATRIX <ATTRIBUTE>
|
| where "OCTAVE_EMPTY_MATRIX" is a flag and "value" contains either the
| data of a non empty matrix or the dimensions of the matrix if
| "OCTAVE_EMPTY_MATRIX" is set. What we would like is
|
| <variable name> <GROUP> -- >  type "octave_value" <DATASET>
|                            >  value "empty dataset" <DATASET>
|                            >  OCTAVE_EMPTY_MATRIX <ATTRIBUTE>
|
| where OCTAVE_EMPTY_MATRIX is now an attribute containing the array dimensions.
| However, this is not possible even with the latest versions of HDF5. Are
| there any other choices that make sense?

I don't know what the best solution is.

Unfortunately, since HDF does not handle zero-length dimensions
directly, that means that we have to insert kluges for every N-d
type.  For example, it seems that with the latest CVS I can save and
reload an empty N-d numeric array, but not an empty N-d cell array.

Perhaps we need to get the attention of the HDF people and get them to
handle zero-length dimensions?

jwe



Re: error saving empty matrix in HDF5 format

David Bateman-3
According to John W. Eaton <[hidden email]> (on 03/05/04):
>
> I don't know what the best solution is.
>
> Unfortunately, since HDF does not handle zero-length dimensions
> directly, that means that we have to insert kluges for every N-d
> type.  For example, it seems that with the latest CVS I can save and
> reload an empty N-d numeric array, but not an empty N-d cell array.

Ok, that was an oversight... Just add

  dim_vector d = dims ();
  int empty = save_hdf5_empty (loc_id, name, d);
  if (empty != 0)
    return (empty > 0);

at the top of the save_hdf5 functions, and

  dim_vector dv;
  int empty = load_hdf5_empty (loc_id, name, dv);
  if (empty > 0)
    matrix.resize (dv);
  if (empty != 0)
    return (empty > 0);

at the top of the load_hdf5 functions. That should be all that is needed
for all N-d array types.
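
The early-return pattern in those snippets reads as a three-way result. The sketch below spells that out with stand-ins. The meaning of the return values is inferred from the snippets, not from the patch itself, and `save_empty_stub`, `save_stub`, and the `simulate_failure` parameter are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for save_hdf5_empty. Return value as inferred from the snippet:
//   > 0  the matrix was empty and has been handled
//   = 0  the matrix is not empty, the caller continues as usual
//   < 0  an error occurred
int
save_empty_stub (const std::vector<int>& dims, bool simulate_failure)
{
  bool is_empty = false;
  for (std::size_t i = 0; i < dims.size (); i++)
    {
      assert (dims[i] >= 0);
      if (dims[i] == 0)
        is_empty = true;
    }

  if (! is_empty)
    return 0;
  return simulate_failure ? -1 : 1;
}

// Caller using the same pattern. normal_path_taken records whether the
// ordinary (non-empty) save code would have run.
bool
save_stub (const std::vector<int>& dims, bool simulate_failure,
           bool& normal_path_taken)
{
  normal_path_taken = false;

  int empty = save_empty_stub (dims, simulate_failure);
  if (empty != 0)
    return (empty > 0);

  normal_path_taken = true;  // the regular dataset write would go here
  return true;
}
```

One consequence of the pattern is that the same few lines can be pasted into every N-d type without the caller needing to know how the empty case is stored.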

> Perhaps we need to get the attention of the HDF people and get them to
> handle zero-length dimensions?

Probably... Has anyone already been in contact with them? In any case this
would have to be done quickly, since if my hack gets entrenched it will
be harder to replace. It would also mean that octave would only
be compatible with HDF5 versions that can save empty matrices,
so everyone using HDF5 would have to upgrade...

D.


Re: error saving empty matrix in HDF5 format

Paul Kienzle

On Mar 7, 2004, at 8:03 AM, David Bateman wrote:

>> Perhaps we need to get the attention of the HDF people and get them to
>> handle zero-length dimensions?
>
> Probably.... Anyone already been in contact with them?

I'm about to be.  The source tarball for 1.6.2 compiles and tests
perfectly on irix and debian, but fails with the cygwin 3.2 and mingw 3.3
gcc compilers.  Can anybody who has successfully compiled their own
version of hdf for windows tell me what's going on?

- Paul



Re: error saving empty matrix in HDF5 format

John W. Eaton-6
In reply to this post by David Bateman-3
On  7-Mar-2004, David Bateman <[hidden email]> wrote:

| According to John W. Eaton <[hidden email]> (on 03/05/04):
| >
| > I don't know what the best solution is.
| >
| > Unfortunately, since HDF does not handle zero-length dimensions
| > directly, that means that we have to insert kluges for every N-d
| > type.  For example, it seems that with the latest CVS I can save and
| > reload an empty N-d numeric array, but not an empty N-d cell array.
|
| Ok, that was an oversight... Just add
|
|   dim_vector d = dims ();
|   int empty = save_hdf5_empty (loc_id, name, d);
|   if (empty != 0)
|     return (empty > 0);
|
| At the top of save_hdf5 functions and
|
|   dim_vector dv;
|   int empty = load_hdf5_empty (loc_id, name, dv);
|   if (empty > 0)
|     matrix.resize(dv);
|   if (empty != 0)
|       return (empty > 0);
|
| At the top of the load_hdf5 function should be all that is needed for all
| Nd-array types..

I made this change.

Thanks,

jwe