That's a pretty interesting problem!

On Fri, Feb 28, 2014 at 7:39 PM, Jordi Gutiérrez Hermoso

<

[hidden email]> wrote:

> So I think I see the bug:

>

>

http://hg.savannah.gnu.org/hgweb/octave/file/c579bd4e12c9/liboctave/numeric/randmtzig.c#l392>

> Here the cast to float is pointless. The constants are doubles, not

> floats, so the computation is done in doubles, with 32 bits. However,

> a float mantissa only has 23 bits, so casting to float loses precision

> bits, and I think this casting also performs rounding.

Yup, looking at the border case shows this nicely:

octave:4> q = ((2^32-1) + 0.5) * (1.0/4294967296.0)

q = 1.00000

octave:5> q==1

ans = 0

octave:6> q = (single(2^32-1) + 0.5) * (1.0/4294967296.0)

q = 1

octave:7> q==1

ans = 1

This makes perfect sense, of course. If memory serves me right, FP

default rounding is defined that if the mathematical result of an

operation falls between two finite FP numbers, then the closer one is

chosen (in case of ties, such that there's no bias on average, IIRC).

With epsf := eps('single') being 2^-23, the predecessor of 1 is

1-2^-23.

On Fri, Feb 28, 2014 at 11:00 PM, Jordi Gutiérrez Hermoso

<

[hidden email]> wrote:

> Alternatively, we might decide this is not a problem at all and we

> just update the docs to say that sometimes rand is exactly 1 (or

> exactly 0?).

I think that besides this being incompatible with Mathworks'

definition of M rand(), it would be inappropriate due to Octave using

randu32() for derived calculations (sampiling of other probability

distributions, it seems).

> We could do bit manipulations instead of casting around

> and doing arithmetic. But come to think of it, this might be

> complicated. I just remember that we can't use the same exponent bits

> for all mantissas, due to the implicit bit.

Indeed, it's only easy at first sight :-). Here's a function that

samples the numbers epsf * [1 .. 2^23-1] uniformly, assuming randi32()

is uniform on [0 .. 2^31-1]:

float randu32(void) // (0, 1)

{

const float C = 1.0f / 8388608.0f; // 2^-23

uint32_t i;

do {

i = randi32()>>9; // [0, 2^23-1]

} while (i == 0);

// i now in [1, 2^23-1]

return C * (float)i;

}

The catch is that while in the interval [0.5, 1), all FP numbers are

sampled this way, in [0.25, 0.5) it's every second one, in [0.125,

0.25) every fourth one, and so on. So the question is, are you fine

with this "coarseness", or do you want all FP numers in [0.25, 0.5)

sampled too, which would then have to be with halved probability for

each one (quartered probability for [0.125, 0.25), and so on).

--Philipp