

hi,
I'm working on a patch to improve integer arithmetics in Octave. While
the semantics for homogeneous operations is basically a POC, I'm not
confident about the semantics of mixed-integer and real-integer
operations.
Currently, it seems that Matlab allows homogeneous integer
arithmetics, and mixed integer-double arithmetics with a scalar double,
while at the same time it disallows int64 arithmetics altogether. Octave
is somewhat more permissive in that the double part doesn't need to be
a scalar.
Currently, the semantics of a double-expr OP int-expr is "convert
the int-expr to double, do OP, convert the result to int". Although this
behaviour is incompatible with almost all other languages, it makes
some sense - the primary motivation probably being that m-code like "a
= int32 (1); a += 1;" should work and produce "a = int32 (2)".
The problem is that these semantics do not work with 64-bit
integers, because they do not all fit into doubles without losing
digits.
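(A minimal C++ sketch of the digit-loss problem, just to make the issue concrete; `via_double' is a made-up helper name, not Octave code:)

```cpp
#include <cstdint>

// Round-trip an int64 value through double, as the current
// "int OP double" semantics effectively does. A double has a 53-bit
// mantissa, so values beyond 2^53 need not survive the trip.
int64_t
via_double (int64_t x)
{
  return static_cast<int64_t> (static_cast<double> (x));
}
```

For example, via_double ((INT64_C (1) << 53) + 1) does not return its argument, while every int32 value round-trips exactly - which is why the current semantics are harmless for the narrower types.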
So, the question is what to do? There are several paths:
1. Solve nothing and just mimic Matlab.
Okay, not quite. IMO, not having int64 arithmetics is not dignified
enough for a 64-bit-ready system (yeah, Octave is not quite 64-bit
ready, but I hope it aims to be). So at least int64 OP int64 should be
allowed, but we simply disallow int64 OP double.
2. Simply extend the current behaviour to int64-double ops. This sort
of kills the initial point, because then int64 (a) + 1 is not the same
as int64 (a) + int64 (1).
3. make `a OP b' (a is intXX, b is double) mean `a OP intXX (b)'. This
solves the "add small constant" problem elegantly, but breaks some
other tricks: for example, a * 0.5 would no longer mean a / 2.
4. use 3, but only for addition and subtraction. Although it sounds
weird, I actually like this one, because I think that few people care
about multiplying and dividing integer and real numbers together.
5. preserve the current semantics for <= 32-bit integers, and do 3 or 4
for int64. Though this is close to a "least disturbance" solution, I
think it is (a little?) nasty to have different integer types work
differently.
6. curse all of this mess and make Octave compatible with the rest of
the world, i.e. integer literals have integer type, and the numeric
tower is logical > integers > reals. Okay, I don't mean this really
seriously - that would break backward and Matlab compatibility, and
would take a lot of work. It's here just for completeness.
7. pretend that there always exists a real type wide enough to hold
all 64-bit integers, and use the semantics "convert both operands to
wide real, do OP, convert to intXX". On systems where this type really
exists (like the 80-bit long double on x86) it could be implemented
directly; otherwise, it would need software emulation (painfully slow
and laborious).
I have the homogeneous arithmetic working, but before proceeding, I
need to decide this matter and I found out it's not that easy. Perhaps
someone can suggest some particularly good argument for some option,
or suggest something fresh.
Mixed-integer operations are another question. Currently, they're not
allowed from within the interpreter, but they are implemented in liboctave,
using "convert to double, do OP, convert back".
The question is, should they be allowed? And if so, what meaning
should they have? If not, they should probably be removed from
liboctave, because they apparently consume a not quite negligible
amount of compilation time.
regards,

RNDr. Jaroslav Hajek
computing expert
Aeronautical Research and Test Institute (VZLU)
Prague, Czech Republic
url: www.highegg.matfyz.cz


Quoting Jaroslav Hajek < [hidden email]>:
> 7. pretend that there always exists a real type wide enough to hold
> all 64-bit integers, and use the semantics "convert both operands to
> wide real, do OP, convert to intXX". On systems where this type really
> exists (like the 80-bit long double on x86) it could be implemented
> directly; otherwise, it would need software emulation (painfully slow
> and laborious).
Just out of curiosity, but would it be possible to have a long_double
type in Octave, and would such a type solve the difficulties? Having a
long_double type might be nice to have anyway.
Søren


[hidden email] wrote:
> Quoting Jaroslav Hajek < [hidden email]>:
>> 7. pretend that there always exists a real type wide enough to hold
>> all 64-bit integers, and use the semantics "convert both operands to
>> wide real, do OP, convert to intXX". On systems where this type really
>> exists (like the 80-bit long double on x86) it could be implemented
>> directly; otherwise, it would need software emulation (painfully slow
>> and laborious).
>
> Just out of curiosity, but would it be possible to have a long_double
> type in Octave, and would such a type solve the difficulties? Having a
> long_double type might be nice to have anyway.
>
> Søren
>
>
Then we definitely want to clean up Array<T> before doing that, as this
is basically the same change as I went through when adding the single
precision type.
D.

David Bateman [hidden email]
Motorola Labs - Paris +33 1 69 35 48 04 (Ph)
Parc Les Algorithmes, Commune de St Aubin +33 6 72 01 06 33 (Mob)
91193 Gif-Sur-Yvette FRANCE +33 1 69 35 77 01 (Fax)
The information contained in this communication has been classified as:
[x] General Business Information
[ ] Motorola Internal Use Only
[ ] Motorola Confidential Proprietary


On Mon, Sep 8, 2008 at 10:14 AM, < [hidden email]> wrote:
> Quoting Jaroslav Hajek < [hidden email]>:
>>
>> 7. pretend that there always exists a real type wide enough to hold
>> all 64-bit integers, and use the semantics "convert both operands to
>> wide real, do OP, convert to intXX". On systems where this type really
>> exists (like the 80-bit long double on x86) it could be implemented
>> directly; otherwise, it would need software emulation (painfully slow
>> and laborious).
>
> Just out of curiosity, but would it be possible to have a long_double type
> in Octave, and would such a type solve the difficulties? Having a
> long_double type might be nice to have anyway.
>
> Søren
>
>
Well, the main problem with this is that only basic arithmetic
operations would work on these (no BLAS/LAPACK).
It could still be beneficial, though. But what to do on architectures
where 80-bit extended precision is not present? Emulate? What if
quad-precision reals are available instead? Anyway, David's single
precision implementation showed us that adding a new type to Octave is
far from smooth.
So, should I take it that you vote for 7.?

RNDr. Jaroslav Hajek
computing expert
Aeronautical Research and Test Institute (VZLU)
Prague, Czech Republic
url: www.highegg.matfyz.cz


Quoting Jaroslav Hajek < [hidden email]>:
> So, should I take it that you vote for 7.?
No, I was just asking out of curiosity.
Søren


On 8-Sep-2008, Jaroslav Hajek wrote:
| Mixed-integer operations are another question. Currently, they're not
| allowed from within interpreter, but they're implemented in liboctave,
| using "convert to double, do OP, convert back".
| The question is, should they be allowed? And if yes, what meaning
| should they have? If no, they should probably be removed from
| liboctave, because they apparently consume a not quite negligible
| amount of compilation time.
Looking at the files like mx-i32-i16.cc, it seems that we only
instantiate mixed comparison and boolean ops. These are used in the
interpreter, and are also allowed in Matlab.
However, we seem to be missing the | and & operators for mixed type
integer operations. Oops.
jwe


On 8-Sep-2008, Jaroslav Hajek wrote:
| I'm working on a patch to improve integer arithmetics in Octave.
What do you mean by improve? Make them faster? I think that's OK,
but is it really worth the effort?
If you want to change the semantics so that they make better sense (or
at least are more like the semantics of similar operations in other
languages) then I think I'd have to vote against doing that, since
it will just be confusing for most users if integer operations work
differently in Octave and Matlab. Also, while it might be tempting to
make mixed type operations work, it is a bit risky, since whatever we
choose could end up being done differently in some future version of
Matlab.
jwe


On Tue, Sep 9, 2008 at 4:24 AM, John W. Eaton < [hidden email]> wrote:
> On 8-Sep-2008, Jaroslav Hajek wrote:
>
> | I'm working on a patch to improve integer arithmetics in Octave.
>
> What do you mean by improve? Make them faster? I think that's OK,
> but is it really worth the effort?
>
I mean:
1. Provide int64 arithmetics.
2. Make them faster (while, of course, preserving the saturation
semantics and the warnings). Use integer arithmetics where possible.
The primary targets for speed are the homogeneous operations.
3. Complete the warnings. I don't get why Matlab possibly gripes at the
conversion int32(0.5) but always accepts int32(1) + 0.5 silently. I
want to make Octave possibly warn here.
4. Restructure the classes in oct-inttypes.h somewhat, replace macros
with templates where appropriate, and reduce code duplication.
I personally think that 1. alone is worth the effort, and that was my
original aim. But I realized that the current implementation (convert
to double, do OP, convert back) would not work for 64-bit integers,
and is not particularly fast anyway for 32-bit and smaller ints
either.
> If you want to change the semantics so that they make better sense (or
> at least are more like the semantics of similar operations in other
> languages) then I think I'd have to vote no against doing that since
> it will just be confusing for most users if integer operations work
> differently in Octave and Matlab. Also, while it might be tempting to
> make mixed type operations work, it is a bit risky since whatever we
> choose could end up being done differently in some future version of
> Matlab.
>
Of course, it's Matlab versus common sense again. But I didn't want to
go any farther than making 64-bit arithmetics work, and I think that
most of the behaviour is already clear in Matlab, just not
implemented. The problem is just that the obvious way of implementing
int + double does not work well for int64.
I am all for simply not providing mixed integer ops. It's just that
they are defined in oct-inttypes.h, and so I thought they might be
used somewhere.
The easiest solution is to just leave the templates there using the
current approach, and not care that they don't work for octave_int64.
Or put them away, and if they happen to be used anywhere in the
sources (I didn't check yet), then replace the usage with an explicit
conversion. I'd rather do this, if you're fine with it, as it seems
nasty to leave broken code around. The mixed-type comparisons will of
course remain, as there is no problem with these.
The question is what to do with int64 OP double. Options I see:
1. leave the current (broken) way, losing some digits with high numbers.
2. forbid it, and wait to see what Matlab will do (but I think they're bound
to do something close to 3).
3. use long double if possible (x86), and if not, emulate.
Which one do you prefer? Now that I think of it, 3 is probably the
best in terms of completeness as well as consistency with Matlab,
though likely the most laborious.
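(For illustration, option 3 could look roughly like this; `add_int64_double' is a made-up name, not the patch's code, and the sketch only helps on platforms where long double has a mantissa of at least 64 bits - where long double is no wider than double it degrades to the current behaviour:)

```cpp
#include <cstdint>
#include <limits>

// Hypothetical sketch of option 3: on x86 the 80-bit long double has a
// 64-bit mantissa, so every int64 value converts to it exactly, and
// "convert both operands, do OP, fit the result back" loses no digits
// in the conversions themselves.
int64_t
add_int64_double (int64_t x, double y)
{
  long double r = static_cast<long double> (x) + static_cast<long double> (y);

  const long double mx
    = static_cast<long double> (std::numeric_limits<int64_t>::max ());
  const long double mn
    = static_cast<long double> (std::numeric_limits<int64_t>::min ());

  if (r >= mx)
    return std::numeric_limits<int64_t>::max ();  // saturate high
  if (r <= mn)
    return std::numeric_limits<int64_t>::min ();  // saturate low

  // Round to nearest, half away from zero, then truncate.
  return static_cast<int64_t> (r + (r >= 0 ? 0.5L : -0.5L));
}
```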

RNDr. Jaroslav Hajek
computing expert
Aeronautical Research and Test Institute (VZLU)
Prague, Czech Republic
url: www.highegg.matfyz.cz


On Tue, Sep 9, 2008 at 4:18 AM, John W. Eaton < [hidden email]> wrote:
> On 8-Sep-2008, Jaroslav Hajek wrote:
>
> | Mixed-integer operations are another question. Currently, they're not
> | allowed from within interpreter, but they're implemented in liboctave,
> | using "convert to double, do OP, convert back".
> | The question is, should they be allowed? And if yes, what meaning
> | should they have? If no, they should probably be removed from
> | liboctave, because they apparently consume a not quite negligible
> | amount of compilation time.
>
> Looking at the files like mx-i32-i16.cc, it seems that we only
> instantiate mixed comparison and boolean ops. These are used in the
> interpreter, and are also allowed in Matlab.
>
> However, we seem to be missing the | and & operators for mixed type
> integer operations. Oops.
>
What do you mean? int32(1) & int8(2) works, of course, but it does not
do a bitwise operation.
> jwe
>

RNDr. Jaroslav Hajek
computing expert
Aeronautical Research and Test Institute (VZLU)
Prague, Czech Republic
url: www.highegg.matfyz.cz


On 9-Sep-2008, Jaroslav Hajek wrote:
| What do you mean? int32(1) & int8(2) works, of course, but it does not
| do bitwise operation.
It works for you? This is what I see with the latest sources:
octave:1> int32(1) & int8(2)
error: binary operator `&' not implemented for `int32 scalar' by `int8 scalar' operations
octave:1> int32(1) | int8(2)
error: binary operator `|' not implemented for `int32 scalar' by `int8 scalar' operations
jwe


On 9-Sep-2008, Jaroslav Hajek wrote:
| On Tue, Sep 9, 2008 at 4:24 AM, John W. Eaton < [hidden email]> wrote:
| > On 8-Sep-2008, Jaroslav Hajek wrote:
| >
| > | I'm working on a patch to improve integer arithmetics in Octave.
| >
| > What do you mean by improve? Make them faster? I think that's OK,
| > but is it really worth the effort?
| >
| I mean:
| 1. Provide int64 arithmetics.
| 2. Make them faster (while, of course, preserving the saturation
| semantics and the warnings). Use integer arithmetics where possible.
| The primary target for speed are the homogeneous operations.
| 3. Complete the warnings. I don't get why Matlab possibly gripes at
| conversion int32(0.5) but always accepts int32(1) + 0.5 silently. I
| want to make Octave possibly warn here.
| 4. Restructure the classes in oct-inttypes.h somewhat, replace macros
| with templates where appropriate, reduce code duplication.
OK. I'd suggest attacking each of these separately. Maybe do the
cleanup of oct-inttypes.h first.
How do you propose to avoid the conversion to double and still provide
the saturation semantics? Do the operation in the next widest integer
type? Something else?
| Of course, it's Matlab versus common sense again. But I didn't want to
| go any farther than making 64-bit arithmetics work, and I think that
| most of the behaviour is already clear in Matlab, just not
| implemented. The problem is just that the obvious way of implementing
| int + double does not work well for int64.
| I am all for simply not providing mixed integer ops. It's just that
| they are defined in oct-inttypes.h, and so I thought they might be
| used somewhere.
OK. I think the templates could easily be rewritten so that they only
work when both operands have the same type.
| The easiest solution is to just leave the templates there using the
| current approach, and not care that they don't work for octave_int64.
| Or put them away, and if they happen to be used anywhere in the
| sources (I didn't check yet), then replace the usage with an explicit
| conversion. I'll rather do this, if you're fine with it, as it seems
| nasty to leave broken code around.
How is the code broken?
| The mixed-type comparisons will of
| course remain, as there is no problem with these.
|
| The question is what to do with int64 OP double. Options I see:
|
| 1. leave the current (broken) way. Losing some digits with high numbers.
Is this what you mean when you say broken above, that the templates
won't work properly for 64-bit ints? OK, if we can't come up with a
portable way to implement 64-bit operations on all systems, then I
think we will need specializations for the 64-bit types that throw
errors.
| 2. forbid it, and wait what Matlab will do (but I think they're bound
| to do something close to 3)
| 3. use long double if possible (x86), and if not, emulate.
|
| which one do you prefer? Now that I think of it, 3 is probably the
| best in terms of completeness as well as consistency with Matlab,
| though likely the most laborious.
I think 3 would be fine, if you want to tackle it. I just don't think
of it as a high priority project.
jwe


On Tue, Sep 9, 2008 at 4:07 PM, John W. Eaton < [hidden email]> wrote:
> On 9-Sep-2008, Jaroslav Hajek wrote:
>
> | What do you mean? int32(1) & int8(2) works, of course, but it does not
> | do bitwise operation.
>
> It works for you? This is what I see with the latest sources:
>
> octave:1> int32(1) & int8(2)
> error: binary operator `&' not implemented for `int32 scalar' by `int8 scalar' operations
> octave:1> int32(1) | int8(2)
> error: binary operator `|' not implemented for `int32 scalar' by `int8 scalar' operations
>
> jwe
>
Sorry, I meant it works in Matlab. But it does a boolean operation.

RNDr. Jaroslav Hajek
computing expert
Aeronautical Research and Test Institute (VZLU)
Prague, Czech Republic
url: www.highegg.matfyz.cz


On 9-Sep-2008, Jaroslav Hajek wrote:
| On Tue, Sep 9, 2008 at 4:07 PM, John W. Eaton < [hidden email]> wrote:
| > On 9-Sep-2008, Jaroslav Hajek wrote:
| >
| > | What do you mean? int32(1) & int8(2) works, of course, but it does not
| > | do bitwise operation.
| >
| > It works for you? This is what I see with the latest sources:
| >
| > octave:1> int32(1) & int8(2)
| > error: binary operator `&' not implemented for `int32 scalar' by `int8 scalar' operations
| > octave:1> int32(1) | int8(2)
| > error: binary operator `|' not implemented for `int32 scalar' by `int8 scalar' operations
| >
| > jwe
| >
|
| Sorry, I meant it works in Matlab. But it does a boolean operation.
Right, that's what it should do, since in Matlab the | and & operators
work on matrix elements, not bits.
jwe


On Tue, Sep 9, 2008 at 7:42 PM, John W. Eaton < [hidden email]> wrote:
> On 9-Sep-2008, Jaroslav Hajek wrote:
>
> | On Tue, Sep 9, 2008 at 4:24 AM, John W. Eaton < [hidden email]> wrote:
> | > On 8-Sep-2008, Jaroslav Hajek wrote:
> | >
> | > | I'm working on a patch to improve integer arithmetics in Octave.
> | >
> | > What do you mean by improve? Make them faster? I think that's OK,
> | > but is it really worth the effort?
> | >
> | I mean:
> | 1. Provide int64 arithmetics.
> | 2. Make them faster (while, of course, preserving the saturation
> | semantics and the warnings). Use integer arithmetics where possible.
> | The primary target for speed are the homogeneous operations.
> | 3. Complete the warnings. I don't get why Matlab possibly gripes at
> | conversion int32(0.5) but always accepts int32(1) + 0.5 silently. I
> | want to make Octave possibly warn here.
> | 4. Restructure the classes in oct-inttypes.h somewhat, replace macros
> | with templates where appropriate, reduce code duplication.
>
> OK. I'd suggest attacking each of these separately. Maybe do the
> cleanup of oct-inttypes.h first.
A good suggestion. The problem is that I already started doing
everything together; right now, I'm not even particularly sure how to
split it apart. But I'll have a look at it.
>
> How do you propose to avoid the conversion to double and still provide
> the saturation semantics? Do the operation in the next widest integer
> type? Something else?
>
Addition and subtraction can be done using just the native operations.
For instance, addition can be done like this:
T add (T x, T y)
{
  T u;
  if (x >= 0)
    {
      u = std::numeric_limits<T>::max () - x;
      if (y > u)
        ftruncate = true;
      else
        u = y;
    }
  else
    {
      u = std::numeric_limits<T>::min () - x;
      if (y < u)
        ftruncate = true;
      else
        u = y;
    }
  return x + u;
}
or like this (it uses the bit properties of the two's complement signed
int representation, and is significantly faster):
T add (T x, T y)
{
  T u = x + y;
  T ux = u ^ x, uy = u ^ y;
  if ((ux & uy) < 0)
    {
      u = std::numeric_limits<T>::max () + signbit (~u);
      ftruncate = true;
    }
  return u;
}
Both versions seem to be faster than going via double, the current
implementation.
While version 1 is a million percent portable, version 2 may not be if
the target machine does not use two's complement signed integers (i.e.
where signed arithmetic is bitwise identical to unsigned). I don't know
if there are any such architectures we actually care about that do not
support this. In fact, even the autoconf manual says that assuming two's
complement is, for practical purposes, safe.
OTOH, on most, if not all, architectures, version 2 will work and is
still considerably faster than version 1.
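(A configure-time probe for the two's complement assumption behind version 2 could be sketched roughly like this; `twos_complement_like' is a hypothetical name, not the actual Octave configure test:)

```cpp
#include <cstdint>
#include <cstring>

// In two's complement, the stored bit pattern of -1 is all ones.
// memcpy inspects the raw representation directly, avoiding the
// signed-to-unsigned *value* conversion, which is defined modulo 2^32
// and would therefore succeed on any representation.
bool
twos_complement_like ()
{
  int32_t m1 = -1;
  uint32_t bits;
  std::memcpy (&bits, &m1, sizeof bits);
  return bits == 0xffffffffu;
}
```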
Multiplication is best done by promoting to a wider integer type,
multiplying, and fitting the result to range. Again, avoiding the
int-real conversions is a performance win. If a 128-bit int is not
available, 64-bit multiplication can be done using bit shifts.
Division can be done using integer division.
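(The promote-multiply-clamp scheme can be sketched like this for int32 via int64; `mul_sat' is a made-up name for illustration, not the patch's code:)

```cpp
#include <cstdint>
#include <limits>

// Promote both factors to int64, where any product of two int32
// values is exact, then saturate the result back into the int32 range.
int32_t
mul_sat (int32_t x, int32_t y)
{
  int64_t p = static_cast<int64_t> (x) * static_cast<int64_t> (y);
  if (p > std::numeric_limits<int32_t>::max ())
    return std::numeric_limits<int32_t>::max ();  // saturate high
  if (p < std::numeric_limits<int32_t>::min ())
    return std::numeric_limits<int32_t>::min ();  // saturate low
  return static_cast<int32_t> (p);
}
```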
> | Of course, it's Matlab versus common sense again. But I didn't want to
> | go any farther than making 64-bit arithmetics work, and I think that
> | most of the behaviour is already clear in Matlab, just not
> | implemented. The problem is just that the obvious way of implementing
> | int + double does not work well for int64.
> |
> | I am all for simply not providing mixed integer ops. It's just that
> | they are defined in oct-inttypes.h, and so I thought they might be
> | used somewhere.
>
> OK. I think the templates could easily be rewritten so that they only
> work when both operands have the same type.
>
> | The easiest solution is to just leave the templates there using the
> | current approach, and not care that they don't work for octave_int64.
> | Or put them away, and if they happen to be used anywhere in the
> | sources (I didn't check yet), then replace the usage with an explicit
> | conversion. I'll rather do this, if you're fine with it, as it seems
> | nasty to leave broken code around.
>
> How is the code broken?
>
> | The mixed-type comparisons will of
> | course remain, as there is no problem with these.
> |
> | The question is what to do with int64 OP double. Options I see:
> |
> | 1. leave the current (broken) way. Losing some digits with high numbers.
>
> Is this what you mean when you say broken above, that the templates
> won't work properly for 64-bit ints?
Yes.
> OK, if we can't come up with a
> portable way to implement 64-bit operations on all systems, then I
> think we will need specializations for the 64-bit types that throw
> errors.
>
> | 2. forbid it, and wait what Matlab will do (but I think they're bound
> | to do something close to 3)
> | 3. use long double if possible (x86), and if not, emulate.
> |
> | which one do you prefer? Now that I think of it, 3 is probably the
> | best in terms of completeness as well as consistency with Matlab,
> | though likely the most laborious.
>
> I think 3 would be fine, if you want to tackle it. I just don't think
> of it as a high priority project.
>
Probably not, given that current Matlab does not bother with 64-bit
arithmetics. But I think it is a useful thing to have. And the
performance boosts for int32 and narrower arithmetics may make it more
usable for real applications, DSP for example.
>
> jwe
>

RNDr. Jaroslav Hajek
computing expert
Aeronautical Research and Test Institute (VZLU)
Prague, Czech Republic
url: www.highegg.matfyz.cz


On 9-Sep-2008, Jaroslav Hajek wrote:
| Addition and subtraction can be done using just the native operations.
| For instance, addition can be done like this:
|
| T add (T x, T y)
| {
|   T u;
|   if (x >= 0)
|     {
|       u = std::numeric_limits<T>::max () - x;
|       if (y > u)
|         ftruncate = true;
|       else
|         u = y;
|     }
|   else
|     {
|       u = std::numeric_limits<T>::min () - x;
|       if (y < u)
|         ftruncate = true;
|       else
|         u = y;
|     }
|   return x + u;
| }
Where is ftruncate declared and how is it used?
Maybe you would also want to check x == 0 and y == 0? Something like
template <typename T>
T
add (const T& x, const T& y)
{
  if (x == 0)
    return y;
  else if (y == 0)
    return x;
  else if (x > 0)
    {
      static T mx = std::numeric_limits<T>::max ();
      return (y > mx - x) ? mx : x + y;
    }
  else
    {
      static T mn = std::numeric_limits<T>::min ();
      return (y < mn - x) ? mn : x + y;
    }
}
?
| or like this (uses the bit properties of two's complement signed int
| representation,
| is significantly faster)
|
| T add (T x, T y)
| {
|   T u = x + y;
|   T ux = u ^ x, uy = u ^ y;
|   if ((ux & uy) < 0)
|     {
|       u = std::numeric_limits<T>::max () + signbit (~u);
|       ftruncate = true;
|     }
|   return u;
| }
I'd think that which version is faster would depend on the compiler
and hardware, at least to some degree. And I'd really rather not use
bit twiddling tricks that are not guaranteed to work for all
arithmetic models unless they are accompanied by some configure checks
so that this code won't cause trouble in the future if/when someone
tries to port Octave to some exotic system.
| Both versions seem to be faster than going via double, the current
| implementation.
| While version 1 is million percent portable, version 2 may not be if
| the target machine does not use two's complement signed integers (i.e.
| the signed arithmetic is identical to unsigned). I don't know if there
| are any such architectures we actually care of that do not support
| this. In fact, even the autoconf manual says that assuming two's
| complement is, for practical purposes, safe.
| OTOH, on most, if not all, architectures, version 2 will work and is
| still considerably faster than version 1.
|
| Multiplication is best done by promoting to a wider integer type,
| multiplying, and fit-to-range. Again, avoiding the int-real
| conversions is a performance win. If 128-bit int is not available,
| 64-bit multiplication can be done by using bit shifts.
Before making claims about what is faster, I guess I'd like to see
some actual numbers for the different versions on several different
architectures.
BTW, how much faster are we talking about here?
Although speed is nice, I think it would be better to spend time
working on missing features first.
jwe


On Tue, Sep 9, 2008 at 10:38 PM, John W. Eaton < [hidden email]> wrote:
> On 9-Sep-2008, Jaroslav Hajek wrote:
>
> | Addition and subtraction can be done using just the native operations.
> | For instance, addition can be done like this:
> |
> | T add (T x, T y)
> | {
> |   T u;
> |   if (x >= 0)
> |     {
> |       u = std::numeric_limits<T>::max () - x;
> |       if (y > u)
> |         ftruncate = true;
> |       else
> |         u = y;
> |     }
> |   else
> |     {
> |       u = std::numeric_limits<T>::min () - x;
> |       if (y < u)
> |         ftruncate = true;
> |       else
> |         u = y;
> |     }
> |   return x + u;
> | }
>
> Where is ftruncate declared and how is it used?
I converted the single int flag to individual boolean flags for each
exception (truncate, nan, non-int). After all, they were always used
separately.
And it's faster to raise a flag using flag = true (a single write)
than int_flag |= mask (a read, an or, and a write). The first operation is
also atomic and thus thread-safe.
>
> Maybe you would also want to check x == 0 and y == 0? Something like
Why? It's handled in the code above. But I can try and report the
benchmark results.
>
> template <typename T>
> T
> add (const T& x, const T& y)
> {
>   if (x == 0)
>     return y;
>   else if (y == 0)
>     return x;
>   else if (x > 0)
>     {
>       static T mx = std::numeric_limits<T>::max ();
>       return (y > mx - x) ? mx : x + y;
>     }
>   else
>     {
>       static T mn = std::numeric_limits<T>::min ();
>       return (y < mn - x) ? mn : x + y;
>     }
> }
>
> ?
>
> | or like this (uses the bit properties of two's complement signed int
> | representation,
> | is significantly faster)
> |
> | T add (T x, T y)
> | {
> |   T u = x + y;
> |   T ux = u ^ x, uy = u ^ y;
> |   if ((ux & uy) < 0)
> |     {
> |       u = std::numeric_limits<T>::max () + signbit (~u);
> |       ftruncate = true;
> |     }
> |   return u;
> | }
>
> I'd think that which version is faster would depend on the compiler
> and hardware, at least to some degree.
I bet version 2 is faster on any current architecture (if it works).
Using modest optimization, it translates to machine code with only a
single conditional jump, which is taken only on overflow and thus
handled gracefully by branch prediction. The question is, of course,
how much faster.
> And I'd really rather not use
> bit twiddling tricks that are not guaranteed to work for all
> arithmetic models unless they are accompanied by some configure checks
> so that this code won't cause trouble in the future if/when someone
> tries to port Octave to some exotic system.
>
Exactly my idea - I wanted to do a configure check and define a flag
if this is usable.
> | Both versions seem to be faster than going via double, the current
> | implementation.
> | While version 1 is million percent portable, version 2 may not be if
> | the target machine does not use two's complement signed integers (i.e.
> | the signed arithmetic is identical to unsigned). I don't know if there
> | are any such architectures we actually care of that do not support
> | this. In fact, even the autoconf manual says that assuming two's
> | complement is, for practical purposes, safe.
> | OTOH, on most, if not all, architectures, version 2 will work and is
> | still considerably faster than version 1.
> |
> | Multiplication is best done by promoting to a wider integer type,
> | multiplying, and fit-to-range. Again, avoiding the int-real
> | conversions is a performance win. If 128-bit int is not available,
> | 64-bit multiplication can be done by using bit shifts.
>
> Before making claims about what is faster, I guess I'd like to see
> some actual numbers for the different versions on several different
> architectures.
>
You'll get them. I contributed the benchmark to Octave-Forge, and
I'd like people to try the patch and report what results they've seen.
> BTW, how much faster are we talking about here?
I already did some benchmarks, but I have changed the code
significantly since, so I don't have up-to-date numbers now.
From the old results, I see that in adding two signed 32-bit integers
there was a 113% speedup with version 1 compared to the current
implementation, and 173% with version 2 (the speedup is measured as
`old_time / new_time - 1' (in %)). This may not correspond exactly to
the codes above. Also, especially with version 2, it somewhat depends
on how often the overflow occurs (normally, it occurs infrequently,
and I think this gives version 2 another speed advantage on
architectures using branch prediction).
>
> Although speed is nice, I think it would be better to spend time
> working on missing features first.
>
Well, as I said before, it did start as a hunt for a missing feature
(int64 arithmetics). And I realized that the present method (using
doubles) would not work, and when I wrote addition and subtraction for
64-bit ints, I could just as well make it a template and use it for
other integers, especially if it would improve performance, which it
did, etc.
Maybe you're right, all this may really be quite useless. Perhaps
someone can comment on whether they use (or want to use) integer
arithmetics in Octave, and whether they care about its speed?

RNDr. Jaroslav Hajek
computing expert
Aeronautical Research and Test Institute (VZLU)
Prague, Czech Republic
url: www.highegg.matfyz.cz


On 10-Sep-2008, Jaroslav Hajek wrote:
| On Tue, Sep 9, 2008 at 10:38 PM, John W. Eaton < [hidden email]> wrote:
| > On 9-Sep-2008, Jaroslav Hajek wrote:
| >
| > | Addition and subtraction can be done using just the native operations.
| > | For instance, addition can be done like this:
| > |
| > | T add (T x, T y)
| > | {
| > |   T u;
| > |   if (x >= 0)
| > |     {
| > |       u = std::numeric_limits<T>::max () - x;
| > |       if (y > u)
| > |         ftruncate = true;
| > |       else
| > |         u = y;
| > |     }
| > |   else
| > |     {
| > |       u = std::numeric_limits<T>::min () - x;
| > |       if (y < u)
| > |         ftruncate = true;
| > |       else
| > |         u = y;
| > |     }
| > |   return x + u;
| > | }
| >
| > Where is ftruncate declared and how is it used?
|
| I converted the single int flag to individual boolean flags for each
| exception (truncate, nan, non-int). After all, they were always used
| separately.
OK, I forgot about exceptions.
| > Maybe you would also want to check x == 0 and y == 0? Something like
|
| Why? It's handled in the code above. But I can try and report the
| benchmark results.
It seems like it would be faster to avoid the extra operations when
either x or y is exactly zero.
| > BTW, how much faster are we talking about here?
|
| I already did some benchmarks, but I have changed the code
| significantly since, so I don't have up-to-date numbers now.
|
| From the old results, I see that in adding two signed 32-bit integers
| there was 113% speedup with version 1 compared to current
| implementation and 173% with version 2 (the speedup is measured as
| `old_time / new_time - 1' (in %)). This may not correspond exactly to
| the codes above. Also, especially with version 2, it somewhat depends
| on how often the overflow occurs (normally, it occurs infrequently,
| and I think this gives version 2 another speed advantage on
| architectures using branch prediction).
OK.
> > Although speed is nice, I think it would be better to spend time
> > working on missing features first.
> >
> Well, as I said before, it did start as a hunt for a missing feature
> (int64 arithmetic). And I realized that the present method (using
> doubles) would not work, and when I wrote addition and subtraction for
> 64-bit ints, I could just as well make it a template and use it for
> other integers, especially if it would improve performance, which it
> did, etc.
By "missing features" I was thinking of things that are present in
Matlab but not Octave, so they would be missed by people trying to
port code from Matlab to Octave. I think we see a lot more complaints
about those kinds of things than we do about features that are just
generally missing from both Octave and Matlab.
But if you've already done most of the work, then I don't see why we
can't include it. I would prefer to see it done in stages if
possible. Also, we really must have configure checks to make sure the
assumptions required for your code to work are valid.
> Maybe you're right, all this may really be quite useless. Perhaps
> someone can comment on whether they use (or want to use) integer
> arithmetic in Octave, and whether they care about its speed?
It could certainly help for image processing, since reading an image
gives you an integer matrix.
jwe


On Wed, Sep 10, 2008 at 8:06 PM, John W. Eaton <[hidden email]> wrote:
> On 10-Sep-2008, Jaroslav Hajek wrote:
>
> [...]
> > > Although speed is nice, I think it would be better to spend time
> > > working on missing features first.
> > >
> > Well, as I said before, it did start as a hunt for a missing feature
> > (int64 arithmetic). And I realized that the present method (using
> > doubles) would not work, and when I wrote addition and subtraction for
> > 64-bit ints, I could just as well make it a template and use it for
> > other integers, especially if it would improve performance, which it
> > did, etc.
>
> By "missing features" I was thinking of things that are present in
> Matlab but not Octave, so they would be missed by people trying to
> port code from Matlab to Octave. I think we see a lot more complaints
> about those kinds of things than we do about features that are just
> generally missing from both Octave and Matlab.
>
I know what you mean, and I agree, but 64-bit int arithmetic is,
IMHO, better described as a feature that is not *yet* in Matlab. I
think we can safely bet that it will, at some point, and maybe quite
soon, be implemented, and it is also fairly clear *how* it will be
implemented.
The question is: is it worth going ahead and risking minor
incompatibilities, or is it better to wait for MathWorks?
BTW, I know that Matlab compatibility is what many users want, but I'm
curious how many users, and how much they want it. In other words,
it could be interesting to try to get some numbers. I'm thinking about
setting up a "user feedback" webpage on Savannah, containing some
questionnaires and the like, to let users express their preferences.
To let users know, we could include a simple "user_feedback" command
in Octave that would launch the page in a browser or something
similar. Also, a note could be added to the Windows installer. Perhaps
that would give us interesting information about what users want from
Octave.
cheers

RNDr. Jaroslav Hajek
computing expert
Aeronautical Research and Test Institute (VZLU)
Prague, Czech Republic
url: www.highegg.matfyz.cz

