Question about RNGs used in oct files, and multiple instances of Octave

classic Classic list List threaded Threaded
8 messages Options
Reply | Threaded
Open this post in threaded view
|

Question about RNGs used in oct files, and multiple instances of Octave

Michael Creel
Hi all,

I'm using some .oct files that generate random numbers, using Octave's oct-rand. Typical lines are

octave_rand::distribution("uniform");
e4s = octave_rand::vector((n+burnin)*M*MMM+100);

If I simultaneously run multiple copies of Octave (using MPI, on Linux), I understand that each copy will have it's own seed, even if I don't explicitly set the seed. Is there any particular reason to worry about the states of the RNGs becoming the same on the different instances? From what I see, I would say no, but I'd appreciate getting information from those who know. Also, would it be a better idea to set different seed explicitly?

TIA,
Michael
Reply | Threaded
Open this post in threaded view
|

Re: Question about RNGs used in oct files, and multiple instances of Octave

Sebastian Schöps
It depends what you are doing but one good reason is reproducibility!

Seb.

Michael Creel wrote
Hi all,

I'm using some .oct files that generate random numbers, using Octave's oct-rand. Typical lines are

octave_rand::distribution("uniform");
e4s = octave_rand::vector((n+burnin)*M*MMM+100);

If I simultaneously run multiple copies of Octave (using MPI, on Linux), I understand that each copy will have it's own seed, even if I don't explicitly set the seed. Is there any particular reason to worry about the states of the RNGs becoming the same on the different instances? From what I see, I would say no, but I'd appreciate getting information from those who know. Also, would it be a better idea to set different seed explicitly?

TIA,
Michael
Reply | Threaded
Open this post in threaded view
|

Re: Question about RNGs used in oct files, and multiple instances of Octave

Michael Creel
I understand that setting the seed could be useful for other reasons, but I'm mostly interested that the sequences generated on the separate instances of Octave should not be dependent. I also know that there are sophisticated solutions for generating random numbers on clusters. However, with the long period of the MT generator, I think that different initial starting points will be sufficient for my purposes. If I'm mistaken about that, though, I'd like to know. Thanks, M.
Reply | Threaded
Open this post in threaded view
|

Re: Question about RNGs used in oct files, and multiple instances of Octave

Gordon Haverland
On Fri, 30 May 2014 10:08:40 -0700 (PDT)
Michael Creel <[hidden email]> wrote:

> I understand that setting the seed could be useful for other reasons,
> but I'm mostly interested that the sequences generated on the
> separate instances of Octave should not be dependent. I also know
> that there are sophisticated solutions for generating random numbers
> on clusters. However, with the long period of the MT generator, I
> think that different initial starting points will be sufficient for
> my purposes. If I'm mistaken about that, though, I'd like to know.

If you are using multiple generators in parallel, if at some point the
internal states of some generators become the same, they are now
dependent.  They are perfectly correlated with each other.

It is also possible for generators to become in step with each other,
which is just as bad (one sequence becomes the other sequence after a
lag).

I would suggest that if you are running a stochastic program on a
cluster of machines, and need to be sure about randomness, you get
hardware random number generators to generate true random numbers.  If
nothing else, they are probably faster (or could be).

Gord


_______________________________________________
Help-octave mailing list
[hidden email]
https://lists.gnu.org/mailman/listinfo/help-octave
Reply | Threaded
Open this post in threaded view
|

Re: Question about RNGs used in oct files, and multiple instances of Octave

Francesco Potortì
>If you are using multiple generators in parallel, if at some point the
>internal states of some generators become the same, they are now
>dependent.  They are perfectly correlated with each other.
>
>It is also possible for generators to become in step with each other,
>which is just as bad (one sequence becomes the other sequence after a
>lag).
>
>I would suggest that if you are running a stochastic program on a
>cluster of machines, and need to be sure about randomness, you get
>hardware random number generators to generate true random numbers.  If
>nothing else, they are probably faster (or could be).

From a mathematical point of view, you are right.  From a human point of
view, that depends on numbers.  The Mersenne twister used by Octavés
rand function has a period of 2^{19937}-1, which is about 1e6000.

Unless I'm mistaken, this means that if you run, say, 1e5 independent
generators each choosing a different seed and generating 1e9 random
numbers, the collision probability is around 10^-(6000-9-5-5), which is
incredibly low.

--
Francesco Potortì (ricercatore)        Voice:  +39.050.621.3058
ISTI - Area della ricerca CNR          Mobile: +39.348.8283.107
via G. Moruzzi 1, I-56124 Pisa         Skype:  wnlabisti
(entrance 20, 1st floor, room C71)     Web:    http://fly.isti.cnr.it


_______________________________________________
Help-octave mailing list
[hidden email]
https://lists.gnu.org/mailman/listinfo/help-octave
Reply | Threaded
Open this post in threaded view
|

Re: Question about RNGs used in oct files, and multiple instances of Octave

Francesco Potortì
>Francesco Potortì:
>> >If you are using multiple generators in parallel, if at some point
>> >the internal states of some generators become the same, they are now
>> >dependent.  They are perfectly correlated with each other.
>> >
>> >It is also possible for generators to become in step with each other,
>> >which is just as bad (one sequence becomes the other sequence after a
>> >lag).
>> >
>> >I would suggest that if you are running a stochastic program on a
>> >cluster of machines, and need to be sure about randomness, you get
>> >hardware random number generators to generate true random numbers.
>> >If nothing else, they are probably faster (or could be).
>>
>> From a mathematical point of view, you are right.  From a human point
>> of view, that depends on numbers.  The Mersenne twister used by
>> Octavés rand function has a period of 2^{19937}-1, which is about
>> 1e6000.
>>
>> Unless I'm mistaken, this means that if you run, say, 1e5 independent
>> generators each choosing a different seed and generating 1e9 random
>> numbers, the collision probability is around 10^-(6000-9-5-5), which
>> is incredibly low.

Ghaverla:
>If someone has the money for a cluster, I think they could afford
>hardware RNGs.  Yes, the Mersenne Twister is a good RNG for most casual
>uses.  I have run across a couple of hints that it isn't perfect.

Unless you have cryptographic needs, I see no point in using a hardware
rng.  Unless technology has significantly changed since when I have
looked at hardware generators (say, ten years ago), they are far less
convenient than software rngs for simulation purposes, which is what I
think we are speaking here.

The Mersenne twister is an excellent rng for practically all non
cryptographic uses.  And yes, it is not perfect: no rng is perfect :)

As far as I know, its main drawback is its dimensionality, which may
introduce unwanted correlations if you happen to run simulations where a
single sequence is used to feed hundreds of different independent
sequences of events, rather than using several sequences based on
different feeds.

>Most of my use of the twister is from Perl, as I don't yet understand
>Octave well enough.  And a billion random deviates was somewhere
>between 1 and 12 hours (I was using a bootstrap to change the uniform
>PDF into a non-uniform PDF).

You can easily change my example numbers above to vastly exceed the
computational capabilities of any modern computer, and you still get an
incredibly low collision probability, if that is what you mean.

That said, I am not an expert of rng, and I would be grateful to anyone
who can point to weaknesses in my reasoning.

--
Francesco Potortì (ricercatore)        Voice:  +39.050.621.3058
ISTI - Area della ricerca CNR          Mobile: +39.348.8283.107
via G. Moruzzi 1, I-56124 Pisa         Skype:  wnlabisti
(entrance 20, 1st floor, room C71)     Web:    http://fly.isti.cnr.it

_______________________________________________
Help-octave mailing list
[hidden email]
https://lists.gnu.org/mailman/listinfo/help-octave
Reply | Threaded
Open this post in threaded view
|

Re: Question about RNGs used in oct files, and multiple instances of Octave

Michael Creel
I agree with Francesco's analysis of the probability of an overlap being very low. This is good enough for my purposes. My only concern is with the details of how different instances of octave might interact with one another, when they are all using the same .oct file, and that .oct file calls the Octave random number code through oct-rand.h. If it is true that they share no memory, then I don't see how there could be any problem. I believe that this is the case.

The reason I ask is because Dynare, www.dynare.org, when run using MPI, causes the random number seeds to become synchronized across all MPI ranks, even if they are different at the outset. I believe that this is an intentional feature of Dynare, unrelated to Octave's own behavior. However, I'm not absolutely sure about that.
Reply | Threaded
Open this post in threaded view
|

Re: Question about RNGs used in oct files, and multiple instances of Octave

Gordon Haverland
If one looks at the English Wikipedia page on comparison of hardware
random number generators, the second last entry is from Flying Stone
Technology of Japan, for a USB based hardware RNG for $35, $36 or $37
(without enclosure, with heat shrink tube for enclosure, or "fancy"
enclosure).  It apparently operates at about 588 kbit/s and it has been
tested:

http://www.cacert.at/cgi-bin/rngresults

It generates bits via ADC quantization error (NeuG-1.0).

http://www.gniibe.org/memo/development/gnuk/rng/neug.html
http://www.seeedstudio.com/wiki/FST-01

Can a person combine that with haveged, egd (this is a Perl script) or
similar to get faster production of RNs?  Among some methods mentioned,
was to use the hardware RNG to generate starting seeds for a PRNG.
Draw a seed, feed it into PRNG, generate N RNs, repeat.

For N not being too large, even if two (or more) RNGs in a cluster
became synchronized (same internal state), at the next re-seeding they
would likely not be synchronized any more.

That list of test results has prices in it (usually), so you might be
able to roll your own.

If you had multiple FST-01 units in a cluster, I would still want to
compare their outputs to see if there are correlations.  If you open a
door (to provide ventilation), do bit patterns change?  (periodic
heating would probably be easiest to use)

Gord


_______________________________________________
Help-octave mailing list
[hidden email]
https://lists.gnu.org/mailman/listinfo/help-octave