Re: Random number generation quirk


Re: Random number generation quirk

Jordi Gutiérrez Hermoso-2
Moving this to the dev list...

On Fri, 2014-02-28 at 12:14 -0500, Jordi Gutiérrez Hermoso wrote:

> Ah, you're using single. I can reproduce now on 3.8.0:
>
>     octave:1> x = rand(3000,"single"); max(round(x*10+0.5)(:))
>     ans =  10
>     octave:2> x = rand(3000,"single"); max(round(x*10+0.5)(:))
>     ans =  11
>     octave:3> max(x(:))
>     ans =  1
>
> This is a bug one way or another, since the documentation clearly
> states that this should not happen. Can you please file a bug
> report?

So I think I see the bug:

    http://hg.savannah.gnu.org/hgweb/octave/file/c579bd4e12c9/liboctave/numeric/randmtzig.c#l392

Here the cast to float is pointless. The constants are doubles, not
floats, so the computation is done in doubles, which hold all 32 bits
of the random integer. However, a float mantissa only has 23 bits, so
casting to float throws away precision bits, and I think the cast
also rounds.
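
Paraphrasing the linked revision, the line in question is essentially
this (my reading; take the exact constants as inferred from the
discussion):

    static float
    randu32 (void)
    {
      /* (float) randi32 () squeezes a 32-bit integer into a 24-bit
         significand -- 2^32-1 already rounds up to 2^32 here -- and
         the final double -> float demotion rounds a second time.  */
      return ((float) randi32 () + 0.5) * (1.0 / 4294967296.0);
    }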

I think it might be better to do bit manipulations to light the
appropriate mantissa bits in the float representation than to rely on
implicit casts, explicit casts, and the arithmetic between them.

- Jordi G. H.





Re: Random number generation quirk

mpender
I have had the problem occur with double rands, but it is much easier to reproduce with singles.

I think loss of precision could account for the problem.  I was under the impression that the division was taking place as a single-precision calculation, not a double-precision one.

I would be more confident in the correctness of the result if all operands were doubles, the division took place in double precision, and only the final result were cast to single.
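
A minimal sketch of that approach, reusing the randi32() name from
randmtzig.c (whether the final demotion can itself round up to 1.0f
is exactly what's at issue):

    /* all operands double, a double-precision multiply, and a
       demotion to single only at the very end */
    static float
    randu32_alldouble (void)
    {
      return (float) (((double) randi32 () + 0.5) * (1.0 / 4294967296.0));
    }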

That said, it would probably be much faster to do bit manipulations to place the bits in the appropriate locations.  However, I am not sure whether assumptions about x86 versus other architectures will require specialized coding.  I don't completely understand the comments on lines 104-108 about the performance impact of the i686 64-bit architecture.

- Mike




Re: Random number generation quirk

Jordi Gutiérrez Hermoso-2
On Fri, 2014-02-28 at 15:14 -0500, Michael Pender wrote:
> I have had the problem occur with double rands, but it is much
> easier to reproduce with singles.

Alternatively, we might decide this is not a problem at all and just
update the docs to say that sometimes rand is exactly 1 (or exactly
0?).

> That said, it would probably be much faster to do bit manipulations
> place the bits in the appropriate locations. However, I am not sure
> whether assumptions about x86 versus other architectures will
> require specialized coding.

We have dropped support for anything other than IEEE 754 float
arithmetic. We could do bit manipulations instead of casting around
and doing arithmetic. But come to think of it, this might be
complicated. I just remembered that we can't use the same exponent
bits for all mantissas, due to the implicit leading bit.
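
To make that concrete: the classic bit trick pins the exponent, so it
only covers one binade. A sketch (untested; memcpy is just one
portable way to do the type punning):

    #include <stdint.h>
    #include <string.h>

    /* Fix the exponent to that of 1.0f and fill the 23 mantissa bits
       with random bits.  That yields a float uniform on [1, 2), so
       subtracting 1 gives [0, 1) -- but only multiples of 2^-23,
       because the implicit leading bit pins the exponent.  */
    static float
    randu32_bits (uint32_t r)
    {
      uint32_t u = UINT32_C (0x3F800000) | (r >> 9);
      float f;
      memcpy (&f, &u, sizeof f);
      return f - 1.0f;
    }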

- Jordi G. H.



Re: Random number generation quirk

Philipp Kutin
That's a pretty interesting problem!

On Fri, Feb 28, 2014 at 7:39 PM, Jordi Gutiérrez Hermoso
<[hidden email]> wrote:
> So I think I see the bug:
>
>     http://hg.savannah.gnu.org/hgweb/octave/file/c579bd4e12c9/liboctave/numeric/randmtzig.c#l392
>
> Here the cast to float is pointless. The constants are doubles, not
> floats, so the computation is done in doubles, with 32 bits. However,
> a float mantissa only has 23 bits, so casting to float loses precision
> bits, and I think this casting also performs rounding.

Yup, looking at the border case shows this nicely:

octave:4> q = ((2^32-1) + 0.5) * (1.0/4294967296.0)
q =  1.00000
octave:5> q==1
ans = 0
octave:6> q = (single(2^32-1) + 0.5) * (1.0/4294967296.0)
q =  1
octave:7> q==1
ans =  1

This makes perfect sense, of course. If memory serves me right, FP
default rounding is defined so that if the mathematical result of an
operation falls between two finite FP numbers, the closer one is
chosen (and ties are broken such that there's no bias on average,
IIRC). With epsf := eps('single') being 2^-23, the predecessor of 1
is 1-2^-23.
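
The same border case in C, for the record (a standalone check, not
code from randmtzig.c):

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint32_t i = 4294967295u;  // 2^32 - 1
    double q_d = ((double)i + 0.5) * (1.0/4294967296.0);
    float  q_f = ((float)i + 0.5) * (1.0/4294967296.0);
    // q_d stays just below 1; q_f collapses to exactly 1.0f
    printf("%d %d\n", q_d == 1.0, q_f == 1.0f);  // prints: 0 1
    return 0;
}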


On Fri, Feb 28, 2014 at 11:00 PM, Jordi Gutiérrez Hermoso
<[hidden email]> wrote:
> Alternatively, we might decide this is not a problem at all and we
> just update the docs to say that sometimes rand is exactly 1 (or
> exactly 0?).

I think that besides being incompatible with Mathworks' definition of
rand(), it would be inappropriate because Octave uses randu32() for
derived calculations (sampling of other probability distributions, it
seems).

> We could do bit manipulations instead of casting around
> and doing arithmetic. But come to think of it, this might be
> complicated. I just remember that we can't use the same exponent bits
> for all mantissas, due to the implicit bit.

Indeed, it's only easy at first sight :-). Here's a function that
samples the numbers epsf * [1 .. 2^23-1] uniformly, assuming randi32()
is uniform on [0 .. 2^31-1]:

#include <stdint.h>

uint32_t randi32(void);  // the 32-bit integer source (randmtzig.c)

float randu32(void)  // (0, 1)
{
    const float C = 1.0f / 8388608.0f;  // 2^-23
    uint32_t i;

    do {
        i = randi32()>>9;  // [0, 2^23-1]
    } while (i == 0);
    // i now in [1, 2^23-1]

    return C * (float)i;
}

The catch is that while in the interval [0.5, 1), all FP numbers are
sampled this way, in [0.25, 0.5) it's every second one, in
[0.125, 0.25) every fourth one, and so on. So the question is: are
you fine with this "coarseness", or do you want all FP numbers in
[0.25, 0.5) sampled too, which would then have to be with halved
probability for each one (quartered probability for [0.125, 0.25),
and so on)?


--Philipp

Re: Random number generation quirk

Philipp Kutin
On Sat, Mar 1, 2014 at 11:38 AM, Philipp Kutin <[hidden email]> wrote:
> Here's a function that
> samples the numbers epsf * [1 .. 2^23-1] uniformly, assuming randi32()
> is uniform on [0 .. 2^31-1]:

Oops. That should be "assuming randi32() is uniform on [0 .. 2^32-1]".

--Philipp

Re: Random number generation quirk

Philipp Kutin
In reply to this post by Philipp Kutin
Thinking about this some more. First, another correction:

On Sat, Mar 1, 2014 at 11:38 AM, Philipp Kutin <[hidden email]> wrote:
> With epsf := eps('single') being 2^-23, the predecessor of 1 is
> 1-2^-23.

That should be "the predecessor of 1 is 1 - 2^-24".
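
That value is easy to double-check with nextafterf (a standalone
snippet, not part of any patch):

#include <math.h>
#include <stdio.h>

int main(void)
{
    // the single-precision predecessor of 1.0
    printf("%.9g\n", nextafterf(1.0f, 0.0f));  // 0.999999940, i.e. 1 - 2^-24
    return 0;
}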

If you want to do away with the coarseness, you may think along the
lines of: well, let's just do the intermediate calculation in double
precision, and only demote to float at the end, repeating as long as
the result equals one of the boundary values. Here's what you might
come up with then:

float randu32(void)  // (0, 1)
{
    static const double C = 1.0 / (65536.0 * 65536.0);  // 2^-32
    float f;

    do {
        f = C * (double)randi32();
    } while (f == 0.0 || f == 1.0);

    return f;
}

As an optimization, since f == 0.0 if and only if randi32() returns
0, we can do the f == 0.0 comparison on the integer beforehand. How
does this function compare with the one posted earlier? First, it's
clear that it samples more values (though it's less obvious exactly
which ones). However, comparing the converted result is slower than
doing the check on the integer early. These are some timings for a
whole 0 .. 2^32-1 cycle:

first version: 12.8 s
above version: 23.7 s
above version with early i==0 check: 20.7 s
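
For completeness, the "early i==0 check" variant timed above would
look something like this (a sketch under the same assumptions as
before):

float randu32_early(void)  // (0, 1)
{
    static const double C = 1.0 / (65536.0 * 65536.0);  // 2^-32
    uint32_t i;
    float f;

    do {
        do {
            i = randi32();
        } while (i == 0);       // reject 0 on the integer
        f = C * (double)i;
    } while (f == 1.0f);        // reject the rounded-up boundary

    return f;
}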

But besides these efficiency issues, there's another numeric one that
relates to the FP rounding in the case of ties.

octave:37> for i=0:10, fprintf('%ld %ld\n', 2^24+i, single(2^24+i)); end
16777216 16777216
16777217 16777216  % round down
16777218 16777218
16777219 16777220  % round up
16777220 16777220
16777221 16777220  % round down
16777222 16777222
16777223 16777224  % round up
16777224 16777224
16777225 16777224  % round down
16777226 16777226

Because of this rounding definition, if we denote the difference
between two consecutive numbers in an interval [2^(-n-1), 2^-n)
(n >= 0) as d(n), then we don't sample the FP numbers in that
interval uniformly whenever numbers "finer" than d(n) are used as
intermediate values.

So much for "fun with floating point" for now. I'm attaching a C
source file with the three variants of randu32(), plus some simple
checking in main().

--Philipp

[Attachment: randfloat.c (2K)]

Re: Random number generation quirk

Philipp Kutin
Alright... thinking further, to avoid the floating-point comparison,
we can simply determine the greatest integer n for which
"(float)(2^-32 * n) < 1.0" holds and plug that in. The result has
been submitted [1] to the patch tracker.
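
A brute-force way to find that n, assuming the default
round-to-nearest-even mode (just a check, not the patch itself):

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    const double C = 1.0 / 4294967296.0;  // 2^-32
    uint32_t n = UINT32_MAX;

    // scan down from 2^32 - 1 until the demoted product drops below 1
    while ((float)(C * n) >= 1.0f)
        n--;

    printf("%u\n", n);  // expect 4294967167, i.e. 2^32 - 129
    return 0;
}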

This leaves the rounding issue. In retrospect, I think it's not
*that* grave. As before, let d(n) denote the difference between two
consecutive numbers in [2^(-n-1), 2^-n) for n >= 0. Since
d(0) == 2^-24, we have d(i) == 2^(-24-i). The problem is greatest
where d(i) is twice the "base factor" 2^-32 -- this is the case for
i==8, or an interval of [2^-9, 2^-8) which itself has an extent of
1/512.

As can be seen from the test C file I posted at the patch tracker, I
tried playing around with setting the rounding mode dynamically. That
wasn't very fruitful though. First, it made the whole thing really
slow -- a full cycle took over a minute instead of ~20 seconds.
Second and more concerning, GCC doesn't seem to support doing this
cleanly, at least not in a way that satisfies the C Standard. In this
regard, I found an amusing report [2] on the GCC bug tracker.

[1] https://savannah.gnu.org/patch/index.php?8389
[2] http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34678

--Philipp

Re: Random number generation quirk

Philipp Kutin
On Mon, Mar 10, 2014 at 11:26 PM, Philipp Kutin <[hidden email]> wrote:
> The problem is greatest
> where d(i) is twice the "base factor" 2^-32 -- this is the case for
> i==8, or an interval of [2^-9, 2^-8) which itself has an extent of
> 1/512.

It looks like the ability to do basic arithmetic and sitting in
front of a computer screen are mutually exclusive for me :-(. This
should be i==7 --> d(i)==2^-31, with an interval of [2^-8, 2^-7) and
an extent of 1/256. This is slightly annoying... maybe the patch
could be spun further and amended later on.

--Philipp