You are viewing a plain text version of this content. The canonical link for it is here.

Posted to dev@commons.apache.org by Gilles <gi...@harfang.homelinux.org> on 2016/01/10 05:09:21 UTC

[Math] Add 64-bits based RNG (Was: [Math] New base class for all RNGs)

Hi.

Relevant excerpt of previous posts:

>> [...]
>>
>> Something implicit in "BitStreamGenerator": the maximum number of
>> bits is 32 (cf. return type of "next(int)" and the ubiquitous use
>> of hard-coded "32".
>>
>> What about the possibility of using a "long" as a building block
>> for another family of RNG?
>
> Why?  We don't have contributions of other RNGs implemented using
> 64-bit blocks of data.  In theory, I guess you could do that, but I
> don't know of any and all the ones we have use 32-bit words.
>
> Phil
>>
>> Gilles

(1)
CPUs and OS are nowadays commonly 64-bits based.
Thus one could legitimately wonder whether, on such systems, generating
   one "long" value
would not be faster than generating
   two "int" values,
especially when 64 bits are necessary in order to produce the next 
random
"long" or "double" value.

(2)
There indeed exist 64-bits implementations of RNGs:
   
http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/VERSIONS/C-LANG/mt19937-64.c
   http://burtleburtle.net/bob/rand/isaacafa.html


I thus propose to create a feature branch that will contain
* a new interface, for when "nextLong()" is the "random bits" producer
* the (adapted) "xorshift1024*" RNG implementation from
     http://xorshift.di.unimi.it/
* an "adapter" to the existing generic methods
* benchmarking code


Gilles


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org

Re: [Math] Add 64-bits based RNG (Was: [Math] New base class for all RNGs)

Posted by Thomas Neidhart <th...@gmail.com>.

On Mon, Jan 11, 2016 at 1:10 PM, Gilles <gi...@harfang.homelinux.org>
wrote:

> On Mon, 11 Jan 2016 07:47:40 +0100, Thomas Neidhart wrote:
>
>> On 01/10/2016 05:09 AM, Gilles wrote:
>>
>>> Hi.
>>>
>>> Relevant excerpt of previous posts:
>>>
>>> [...]
>>>>>
>>>>> Something implicit in "BitStreamGenerator": the maximum number of
>>>>> bits is 32 (cf. return type of "next(int)" and the ubiquitous use
>>>>> of hard-coded "32".
>>>>>
>>>>> What about the possibility of using a "long" as a building block
>>>>> for another family of RNG?
>>>>>
>>>>
>>>> Why?  We don't have contributions of other RNGs implemented using
>>>> 64-bit blocks of data.  In theory, I guess you could do that, but I
>>>> don't know of any and all the ones we have use 32-bit words.
>>>>
>>>> Phil
>>>>
>>>>>
>>>>> Gilles
>>>>>
>>>>
>>> (1)
>>> CPUs and OS are nowadays commonly 64-bits based.
>>> Thus one could legitimately wonder whether, on such systems, generating
>>>   one "long" value
>>> would not be faster than generating
>>>   two "int" values,
>>> especially when 64 bits are necessary in order to produce the next random
>>> "long" or "double" value.
>>>
>>> (2)
>>> There indeed exist 64-bits implementations of RNGs:
>>>
>>>
>>>
>>> http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/VERSIONS/C-LANG/mt19937-64.c
>>>
>>>   http://burtleburtle.net/bob/rand/isaacafa.html
>>>
>>>
>>> I thus propose to create a feature branch that will contain
>>> * a new interface, for when "nextLong()" is the "random bits" producer
>>>
>>
>> be aware that (at least some of) the 64-bit variants rely on unsigned
>> long arithmetic operations which can't be done in java without losing a
>> *lot* performance.
>>
>
> What do you mean?
>
> Wasn't there the same problem when implementing RNGs that rely on
> unsigned int arithmetic, with Java's signed int?
>

in which case you can perform the operation in longs and convert back to
ints.
This is actually done for some of the RNG implementations in CM.

Normally, the addition, subtraction and multiplication should be
unaffected. Problematic are division (modulo) and comparison,
which need to be treated specially. At least the Well type rngs do modulo
operations, but I did not look into detail if this would create problems.

I just wanted to raise this issue before any decision is being made.


>
> But IIUC, the RNG codes generally rely on bit operations (and bit
> operations disguised as arithmetic).[1]
> Fortunately, the representation is the same in C and Java, which
> allows us to mostly copy source code.
> There *are* caveats, though:
>   * ">>" in C code must (in general) be replaced with ">>>"
>   * Some constants (that exceed MAX_VALUE) cannot be written using
>     their decimal representation (hexadecimal must be used instead).
>

the ">>" operator in C is implementation specific for signed values (see
http://stackoverflow.com/questions/7622/shift-operator-in-c).
Replacing it blindly with ">>>" is not correct imho.

Thomas


> Gilles
>
> [1] This is related to the "next(int bits)" debate: Ideally, an
>     RNG indeed provides a "bit stream" but, as we are now all
>     hopefully convinced, those that concern us in CM actually
>     manipulate "int" (or "long") values, not "boolean" sequences.
>
>
> Thomas
>>
>> * the (adapted) "xorshift1024*" RNG implementation from
>>>     http://xorshift.di.unimi.it/
>>> * an "adapter" to the existing generic methods
>>> * benchmarking code
>>>
>>>
>>> Gilles
>>>
>>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> For additional commands, e-mail: dev-help@commons.apache.org
>
>

Re: [Math] Add 64-bits based RNG (Was: [Math] New base class for all RNGs)

Posted by Gilles <gi...@harfang.homelinux.org>.

On Mon, 11 Jan 2016 07:47:40 +0100, Thomas Neidhart wrote:
> On 01/10/2016 05:09 AM, Gilles wrote:
>> Hi.
>>
>> Relevant excerpt of previous posts:
>>
>>>> [...]
>>>>
>>>> Something implicit in "BitStreamGenerator": the maximum number of
>>>> bits is 32 (cf. return type of "next(int)" and the ubiquitous use
>>>> of hard-coded "32".
>>>>
>>>> What about the possibility of using a "long" as a building block
>>>> for another family of RNG?
>>>
>>> Why?  We don't have contributions of other RNGs implemented using
>>> 64-bit blocks of data.  In theory, I guess you could do that, but I
>>> don't know of any and all the ones we have use 32-bit words.
>>>
>>> Phil
>>>>
>>>> Gilles
>>
>> (1)
>> CPUs and OS are nowadays commonly 64-bits based.
>> Thus one could legitimately wonder whether, on such systems, 
>> generating
>>   one "long" value
>> would not be faster than generating
>>   two "int" values,
>> especially when 64 bits are necessary in order to produce the next 
>> random
>> "long" or "double" value.
>>
>> (2)
>> There indeed exist 64-bits implementations of RNGs:
>>
>> 
>> http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/VERSIONS/C-LANG/mt19937-64.c
>>
>>   http://burtleburtle.net/bob/rand/isaacafa.html
>>
>>
>> I thus propose to create a feature branch that will contain
>> * a new interface, for when "nextLong()" is the "random bits" 
>> producer
>
> be aware that (at least some of) the 64-bit variants rely on unsigned
> long arithmetic operations which can't be done in java without losing 
> a
> *lot* performance.

What do you mean?

Wasn't there the same problem when implementing RNGs that rely on
unsigned int arithmetic, with Java's signed int?

But IIUC, the RNG codes generally rely on bit operations (and bit
operations disguised as arithmetic).[1]
Fortunately, the representation is the same in C and Java, which
allows us to mostly copy source code.
There *are* caveats, though:
   * ">>" in C code must (in general) be replaced with ">>>"
   * Some constants (that exceed MAX_VALUE) cannot be written using
     their decimal representation (hexadecimal must be used instead).

Gilles

[1] This is related to the "next(int bits)" debate: Ideally, an
     RNG indeed provides a "bit stream" but, as we are now all
     hopefully convinced, those that concern us in CM actually
     manipulate "int" (or "long") values, not "boolean" sequences.

> Thomas
>
>> * the (adapted) "xorshift1024*" RNG implementation from
>>     http://xorshift.di.unimi.it/
>> * an "adapter" to the existing generic methods
>> * benchmarking code
>>
>>
>> Gilles
>>


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org

Re: [Math] Add 64-bits based RNG (Was: [Math] New base class for all RNGs)

Posted by Thomas Neidhart <th...@gmail.com>.

On 01/10/2016 05:09 AM, Gilles wrote:
> Hi.
> 
> Relevant excerpt of previous posts:
> 
>>> [...]
>>>
>>> Something implicit in "BitStreamGenerator": the maximum number of
>>> bits is 32 (cf. return type of "next(int)" and the ubiquitous use
>>> of hard-coded "32".
>>>
>>> What about the possibility of using a "long" as a building block
>>> for another family of RNG?
>>
>> Why?  We don't have contributions of other RNGs implemented using
>> 64-bit blocks of data.  In theory, I guess you could do that, but I
>> don't know of any and all the ones we have use 32-bit words.
>>
>> Phil
>>>
>>> Gilles
> 
> (1)
> CPUs and OS are nowadays commonly 64-bits based.
> Thus one could legitimately wonder whether, on such systems, generating
>   one "long" value
> would not be faster than generating
>   two "int" values,
> especially when 64 bits are necessary in order to produce the next random
> "long" or "double" value.
> 
> (2)
> There indeed exist 64-bits implementations of RNGs:
>  
> http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/VERSIONS/C-LANG/mt19937-64.c
> 
>   http://burtleburtle.net/bob/rand/isaacafa.html
> 
> 
> I thus propose to create a feature branch that will contain
> * a new interface, for when "nextLong()" is the "random bits" producer

be aware that (at least some of) the 64-bit variants rely on unsigned
long arithmetic operations which can't be done in java without losing a
*lot* performance.

Thomas

> * the (adapted) "xorshift1024*" RNG implementation from
>     http://xorshift.di.unimi.it/
> * an "adapter" to the existing generic methods
> * benchmarking code
> 
> 
> Gilles
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> For additional commands, e-mail: dev-help@commons.apache.org
> 


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org

Re: [Math] Add 64-bits based RNG (Was: [Math] New base class for all RNGs)

Posted by Phil Steitz <ph...@gmail.com>.

On Sat, Jan 9, 2016 at 9:09 PM, Gilles <gi...@harfang.homelinux.org> wrote:

> Hi.
>
> Relevant excerpt of previous posts:
>
> [...]
>>>
>>> Something implicit in "BitStreamGenerator": the maximum number of
>>> bits is 32 (cf. return type of "next(int)" and the ubiquitous use
>>> of hard-coded "32".
>>>
>>> What about the possibility of using a "long" as a building block
>>> for another family of RNG?
>>>
>>
>> Why?  We don't have contributions of other RNGs implemented using
>> 64-bit blocks of data.  In theory, I guess you could do that, but I
>> don't know of any and all the ones we have use 32-bit words.
>>
>> Phil
>>
>>>
>>> Gilles
>>>
>>
> (1)
> CPUs and OS are nowadays commonly 64-bits based.
> Thus one could legitimately wonder whether, on such systems, generating
>   one "long" value
> would not be faster than generating
>   two "int" values,
> especially when 64 bits are necessary in order to produce the next random
> "long" or "double" value.
>
(2)
> There indeed exist 64-bits implementations of RNGs:
>
> http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/VERSIONS/C-LANG/mt19937-64.c
>   http://burtleburtle.net/bob/rand/isaacafa.html
>
>
> I thus propose to create a feature branch that will contain
> * a new interface, for when "nextLong()" is the "random bits" producer
>

You mean another abstract base class.  Fine by me.


> * the (adapted) "xorshift1024*" RNG implementation from
>     http://xorshift.di.unimi.it/


+1 to add a new one, assuming someone is willing to do the research and get
a clean impl.  Looks like the above is GPL, so we have to be careful will
adapting code.  The benchmarks and discussion there look promising, though.


>
> * an "adapter" to the existing generic methods
>

What do you mean by this?

* benchmarking code
>

+1

Phil

>
>
> Gilles
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> For additional commands, e-mail: dev-help@commons.apache.org
>
>