octave_idx_type vs mwSize, mwIndex, and mwSignedIndex

John W. Eaton
Administrator
In Octave, we currently declare both array sizes and indices with
octave_idx_type, which is normally defined to be int, or, if compiling
with --enable-64, it is whatever type is required to get a 64-bit
signed integer.
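
For readers unfamiliar with the setup, here is a minimal sketch of what
such a definition could look like. It is not copied from the Octave
sources: SKETCH_ENABLE_64, sketch_idx_type, and to_idx are hypothetical
names standing in for whatever the build defines under --enable-64.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <type_traits>

// SKETCH_ENABLE_64 is a hypothetical macro, assumed to be set by a
// build configured with --enable-64. It is not a real Octave macro.
#ifdef SKETCH_ENABLE_64
typedef std::int64_t sketch_idx_type;  // 64-bit signed integer
#else
typedef int sketch_idx_type;           // the normal case
#endif

// The same signed type serves for both sizes and indices.
static_assert(std::is_signed<sketch_idx_type>::value,
              "the index type is expected to be signed");

// Convert an element count to the index type, asserting that the
// value is non-negative and representable.
inline sketch_idx_type to_idx(long long n) {
  assert(n >= 0 && n <= std::numeric_limits<sketch_idx_type>::max());
  return static_cast<sketch_idx_type>(n);
}
```

With the macro undefined, to_idx(10) returns an int holding 10.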

Looking at the Matlab docs, it has

  mwSize:        normally int, or, with large array dimensions, size_t
  mwIndex:       normally int, or, with large array dimensions, size_t
  mwSignedIndex: normally int, or, with large array dimensions, ptrdiff_t

Does anyone else think it would be good for us to also have a size
type in addition to an index type?

Do we actually need a signed index type?  What is it used for?

Even if we do not decide to use both size and index types and use
only octave_idx_type, then should we just define it to be size_t?

The reason we are using a signed integer now is so that it matches the
signed integer values in Fortran 77 (which has no unsigned types).
But I don't think there is any harm in passing an object of size_t
(which should be an unsigned 64-bit value on a system with 64-bit
pointers) to a function that expects an INTEGER*8 value, unless the
value passed is larger than 2^63 or less than 0.  An array dimension
or index that is greater than 2^63 is probably not useful for a system
like Octave (or pretty much any actual system today, as that is one
gigantic number of array elements).
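
A small sketch of that range argument, assuming a two's complement
machine and using std::uint64_t to stand in for size_t on a system with
64-bit pointers. The helper names are made up for illustration. Note
that 2^63 itself is already outside the range of a signed 64-bit
integer, so the safe range is 0 through 2^63 - 1.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>

// True when n would arrive unchanged in a signed 64-bit integer, which
// is what an INTEGER*8 argument holds.
inline bool fits_in_integer8(std::uint64_t n) {
  return n <= static_cast<std::uint64_t>(
                  std::numeric_limits<std::int64_t>::max());
}

// The value the callee would see if the same 8 bytes were read as a
// signed integer (assumes two's complement representation).
inline std::int64_t as_integer8(std::uint64_t n) {
  std::int64_t out;
  std::memcpy(&out, &n, sizeof out);
  return out;
}

// Checked variant: refuse values the signed type cannot represent.
inline std::int64_t to_integer8(std::uint64_t n) {
  assert(fits_in_integer8(n));
  return as_integer8(n);
}
```

Any dimension below 2^63 passes through untouched; a value with the top
bit set would be seen as negative on the Fortran side.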

Currently when building Octave with --enable-64, we compile all the
Fortran bits with an option like gfortran's -fdefault-integer-8 switch
that makes all INTEGER values 8-bytes wide.  For values that are array
dimensions or indices, 8-bytes makes sense, but there are other values
that are passed as option flags or error indicators that don't really
need to be 8 bytes wide.  So it would probably be better to not have
the Fortran code use 8-byte values for all integer values, but fixing
that correctly would probably be a lot of work.

There are also other cases where blindly using 8-byte values causes
some real trouble.  For example, the algorithms used in ranlib and
also the {S,D}LARUV random number generators in LAPACK apparently
assume that the integers they are working with are 32-bits wide and
will not work properly if they are not.  So compiling a correctly
functioning version of the reference LAPACK that uses large array
dimensions is not quite as simple as just setting an option like
-fdefault-integer-8.

Comments?

jwe

Re: octave_idx_type vs mwSize, mwIndex, and mwSignedIndex

Jaroslav Hajek-2
On Thu, Feb 18, 2010 at 7:04 PM, John W. Eaton <[hidden email]> wrote:

> In Octave, we currently declare both array sizes and indices with
> octave_idx_type, which is normally defined to be int, or, if compiling
> with --enable-64, it is whatever type is required to get a 64-bit
> signed integer.
>
> Looking at the Matlab docs, it has
>
>  mwSize:        normally int, or, with large array dimensions, size_t
>  mwIndex:       normally int, or, with large array dimensions, size_t
>  mwSignedIndex: normally int, or, with large array dimensions, ptrdiff_t
>
> Does anyone else think it would be good for us to also have a size
> type in addition to an index type?
>
> Do we actually need a signed index type?  What is it used for?
>

Having a signed index type has several advantages:
1. -1 is commonly used as a special value. Using 0xFFFFFFFF instead is
clumsy and less elegant.
2. With unsigned indices you generally have to be more careful, for
instance with tests such as ub-lb < 2 (which won't work if ub < lb).
3. Signed integer arithmetic is better optimized by C++ compilers,
because overflow behavior is not mandated.
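
The first two points can be illustrated with a short sketch. The
function names are made up for this example and do not come from
Octave.

```cpp
#include <cassert>
#include <cstddef>

// Point 1: a signed return type lets -1 mean "not found".
inline std::ptrdiff_t find_first(const int* v, std::ptrdiff_t n, int x) {
  assert(n >= 0);
  for (std::ptrdiff_t i = 0; i < n; ++i)
    if (v[i] == x) return i;
  return -1;
}

// Point 2, unsigned version: when ub < lb the subtraction wraps around
// to a huge positive value, so the test is false.
inline bool narrow_unsigned(std::size_t lb, std::size_t ub) {
  return ub - lb < 2;
}

// Point 2, signed version: when ub < lb the difference is negative,
// so the test is true, as intended.
inline bool narrow_signed(std::ptrdiff_t lb, std::ptrdiff_t ub) {
  return ub - lb < 2;
}
```

For lb = 5 and ub = 3, narrow_signed gives true while narrow_unsigned
gives false, which is the kind of surprise meant in point 2.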

> Even if we do not decide to use both size and index types and use
> only octave_idx_type, then should we just define it to be size_t?

size_t is unsigned. If anything, use ptrdiff_t.

> The reason we are using a signed integer now is so that it matches the
> signed integer values in Fortran 77 (which has no unsigned types).
> But I don't think there is any harm in passing an object of size_t
> (which should be an unsigned 64-bit value on a system with 64-bit
> pointers) to a function that expects an INTEGER*8 value, unless the
> value passed is larger than 2^63 or less than 0.  An array dimension
> or index that is greater than 2^63 is probably not useful for a system
> like Octave (or pretty much any actual system today, as that is one
> gigantic number of array elements).

The reasons why Fortran basically only uses signed integers are the
same as above.

> Currently when building Octave with --enable-64, we compile all the
> Fortran bits with an option like gfortran's -fdefault-integer-8 switch
> that makes all INTEGER values 8-bytes wide.  For values that are array
> dimensions or indices, 8-bytes makes sense, but there are other values
> that are passed as option flags or error indicators that don't really
> need to be 8 bytes wide.  So it would probably be better to not have
> the Fortran code use 8-byte values for all integer values, but fixing
> that correctly would probably be a lot of work.
>
> There are also other cases where blindly using 8-byte values causes
> some real trouble.  For example, the algorithms used in ranlib and
> also the {S,D}LARUV random number generators in LAPACK apparently
> assume that the integers they are working with are 32-bits wide and
> will not work properly if they are not.  So compiling a correctly
> functioning version of the reference LAPACK that uses large array
> dimensions is not quite as simple as just setting an option like
> -fdefault-integer-8.
>
> Comments?
>
> jwe
>



--
RNDr. Jaroslav Hajek, PhD
computing expert & GNU Octave developer
Aeronautical Research and Test Institute (VZLU)
Prague, Czech Republic
url: www.highegg.matfyz.cz