Posted to dev@mahout.apache.org by "Lance Norskog (JIRA)" <ji...@apache.org> on 2011/05/05 07:39:03 UTC

[jira] [Created] (MAHOUT-687) Random generator objects- slight refactor

Random generator objects- slight refactor
-----------------------------------------

                 Key: MAHOUT-687
                 URL: https://issues.apache.org/jira/browse/MAHOUT-687
             Project: Mahout
          Issue Type: Improvement
            Reporter: Lance Norskog
            Priority: Minor


Problems:
* Uncommons MersenneTwisterRNG, the default returned by RandomUtils.getRandom(), silently ignores setSeed() instead of throwing an error.
* The project wants to move off Uncommons anyway.

This patch uses the org.apache.commons.math.random.RandomGenerator classes instead of the org.uncommons.maths.random.RepeatableRNG classes.

Test cases: All math test cases pass except for org.apache.mahout.math.stats.LogLikelihoodTest.
Tests in other packages also fail, but these mostly test random-oriented classes, so that is not a surprise.
Almost all tests that use random numbers inside algorithms still pass, which is a good sign of their stability.
Still, a lot of tests have to be adjusted to make this commit.




--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

Re: [jira] [Commented] (MAHOUT-687) Random generator objects- slight refactor

Posted by Ted Dunning <te...@gmail.com>.
My suggestion is that the random matrix never store any data (more than a
tiny cache at most) and compute each random value as
murmurHash(i+":"+j+salt).  Instead of string concatenation, of course, you
should use successive updates to the hash.
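
A minimal sketch of that idea, under stated assumptions: the class name
HashedRandomMatrix and its methods are hypothetical, and the mixing step is
only the 64-bit finalizer from MurmurHash3, standing in for a complete
MurmurHash implementation. It folds salt, row and column into the hash one
after another and stores nothing.

```java
// Sketch only: HashedRandomMatrix and its methods are hypothetical names.
// mix() is the MurmurHash3 64-bit finalizer, a stand-in for a full MurmurHash.
final class HashedRandomMatrix {
  private final long salt;

  HashedRandomMatrix(long salt) {
    this.salt = salt;
  }

  // MurmurHash3 fmix64 finalizer: scrambles all 64 bits of the input.
  static long mix(long h) {
    h ^= h >>> 33;
    h *= 0xff51afd7ed558ccdL;
    h ^= h >>> 33;
    h *= 0xc4ceb9fe1a85ec53L;
    h ^= h >>> 33;
    return h;
  }

  // Successive updates: fold salt, row and column into the state one at a
  // time instead of hashing a string such as i + ":" + j + salt.
  long hash(int row, int col) {
    long h = mix(salt);
    h = mix(h ^ row);
    h = mix(h ^ col);
    return h;
  }

  // Value in [0, 1) built from the top 53 bits of the hash.
  double get(int row, int col) {
    return (hash(row, col) >>> 11) * 0x1.0p-53;
  }
}
```

Two instances built with the same salt return the same value for the same
(row, col), so an element can be recomputed on demand instead of cached.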

On Fri, May 20, 2011 at 9:14 PM, Lance Norskog <go...@gmail.com> wrote:

> But still, how should a random based on MurmurHash work?
>

Re: [jira] [Commented] (MAHOUT-687) Random generator objects- slight refactor

Posted by Lance Norskog <go...@gmail.com>.
Right! Ok. Yes, that makes a lot more sense. I'm parking my random
vector/matrix stuff because it's clear (to me at least) that
Vector&Matrix need some revamping.

But still, how should a random based on MurmurHash work?

Lance




-- 
Lance Norskog
goksron@gmail.com

Re: [jira] [Commented] (MAHOUT-687) Random generator objects- slight refactor

Posted by Ted Dunning <te...@gmail.com>.
No.  I was referring to the specific use of random number generators to
generate an ephemeral random matrix.  I prefer to hash the element location
to generate the element value rather than use a random number generator
seeded by the element location.  These are formally equivalent but
practically quite different, since hash functions are often designed for
fast startup while PRNGs are often designed without much regard to startup or
reseeding cost.  MersenneTwister in particular is a very bad generator with
respect to the cost of reseeding.  MurmurHash is a very good example of a
lean hash function.

On Fri, May 20, 2011 at 7:25 PM, Lance Norskog (JIRA) <ji...@apache.org>wrote:

>
> Ted, you mentioned wanting a MurmurHash Random class. Is this what you
> envisioned? (It is not finished code; see below).
>
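> {code}
> import java.nio.ByteBuffer;
> import java.util.Random;
>
> // Unfinished sketch. MurmurHash, RandomUtils, SeedGenerator and
> // FastRandomSeedGenerator are not JDK classes; their imports are omitted.
> public class MurmurHashRandom extends Random {
>  // Current state; also the next value to be returned.
>  private long murmurSeed;
>  // Fixed bytes that are re-hashed on every step.
>  private final ByteBuffer buf;
>
>  public MurmurHashRandom() {
>    this(0);
>  }
>
>  public MurmurHashRandom(int seed) {
>    SeedGenerator gen = new FastRandomSeedGenerator();
>    byte[] bits = RandomUtils.longSeedtoBytes(gen.generateSeed());
>    buf = ByteBuffer.wrap(bits);
>    this.murmurSeed = MurmurHash.hash64A(bits, seed);
>  }
>
>  // Returns the current state, then advances it by hashing the fixed
>  // buffer with the low 32 bits of the state as the hash seed.
>  @Override
>  public long nextLong() {
>    long oldSeed = murmurSeed;
>    murmurSeed = MurmurHash.hash64A(buf, (int) murmurSeed);
>    return oldSeed;
>  }
> }
> {code}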
>
>

Re: [jira] [Commented] (MAHOUT-687) Random generator objects- slight refactor

Posted by Ted Dunning <te...@gmail.com>.
Sean is exactly right.

Another way to put it is that hasNext() is free to have side effects, but it
must be idempotent.  That is, repeated calls to hasNext() must not change
the next value returned by next().  This is not the same as having no
effect; it just means that it must have no *user-visible* effect as long as
the user can't see the precise timing of accesses to the underlying sequence.
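
The idempotence requirement can be illustrated with a small look-ahead
iterator. This is only a sketch under assumptions: LookaheadSampler is a
hypothetical class written against the plain JDK, not Mahout's
SamplingIterator and not the Guava helper mentioned in the thread.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.Random;

// Sketch only: a hypothetical sampling iterator, not Mahout's implementation.
final class LookaheadSampler<T> implements Iterator<T> {
  private final Iterator<? extends T> delegate;
  private final Random random;   // owned by this iterator, not shared
  private final double rate;     // probability of keeping each element
  private T stashed;
  private boolean hasStashed;

  LookaheadSampler(Iterator<? extends T> delegate, double rate, long seed) {
    this.delegate = delegate;
    this.rate = rate;
    this.random = new Random(seed);
  }

  // Reads ahead in the delegate until an element is kept, then stashes it.
  // A repeated call finds the stash filled (or the delegate exhausted) and
  // changes nothing, so the value returned by next() is unaffected.
  @Override
  public boolean hasNext() {
    while (!hasStashed && delegate.hasNext()) {
      T candidate = delegate.next();
      if (random.nextDouble() < rate) {
        stashed = candidate;
        hasStashed = true;
      }
    }
    return hasStashed;
  }

  @Override
  public T next() {
    if (!hasNext()) {
      throw new NoSuchElementException();
    }
    hasStashed = false;
    T result = stashed;
    stashed = null;
    return result;
  }
}
```

Here hasNext() does have a side effect (it consumes elements from the
delegate), but calling it any number of times between two calls to next()
yields the same sequence as calling it once.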

On Sat, May 21, 2011 at 3:55 PM, Sean Owen <sr...@gmail.com> wrote:

> This just uses the Google Guava helper class to implement the Iterator and
> it does work. next() advances the iterator logically, and hasNext() can't
> advance the iteration. However, when and how the iterator accesses the
> underlying sequence is up to the iterator. It can, and must, access the next
> element to find out if there is a next element in a call to hasNext().
> However, this doesn't violate any contract.

Re: [jira] [Commented] (MAHOUT-687) Random generator objects- slight refactor

Posted by Sean Owen <sr...@gmail.com>.
This just uses the Google Guava helper class to implement the Iterator and
it does work. next() advances the iterator logically, and hasNext() can't
advance the iteration. However, when and how the iterator accesses the
underlying sequence is up to the iterator. It can, and must, access the next
element to find out if there is a next element in a call to hasNext().
However, this doesn't violate any contract.

On Sat, May 21, 2011 at 11:33 PM, Lance Norskog <go...@gmail.com> wrote:

> You're right, it can drop the first element.
>
> SamplingIterator.next() pulls a successor element from the delegate
> iterator and stashes it. This works, but I think the full semantics
> would require that the delegate iterator not advance until
> SamplingIterator.next() causes it to. Not sure.

Re: [jira] [Commented] (MAHOUT-687) Random generator objects- slight refactor

Posted by Lance Norskog <go...@gmail.com>.
You're right, it can drop the first element.

SamplingIterator.next() pulls a successor element from the delegate
iterator and stashes it. This works, but I think the full semantics
would require that the delegate iterator not advance until
SamplingIterator.next() causes it to. Not sure.

On Sat, May 21, 2011 at 3:06 PM, Sean Owen (JIRA) <ji...@apache.org> wrote:
>
>    [ https://issues.apache.org/jira/browse/MAHOUT-687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13037469#comment-13037469 ]
>
> Sean Owen commented on MAHOUT-687:
> ----------------------------------
>
> I don't know what you mean -- it most definitely can drop the first element.
> What hasNext() method are you referring to?
> Yes, it's easy to make a similar change elsewhere to remove static Random instances.



-- 
Lance Norskog
goksron@gmail.com

Re: [jira] [Commented] (MAHOUT-687) Random generator objects- slight refactor

Posted by Ted Dunning <te...@gmail.com>.
Also, adding a dependency on commons math doesn't necessarily give us
forward motion on the dependency front.

On Fri, May 20, 2011 at 7:23 PM, Sean Owen (JIRA) <ji...@apache.org> wrote:

>
>    [
> https://issues.apache.org/jira/browse/MAHOUT-687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13037229#comment-13037229]
>
> Sean Owen commented on MAHOUT-687:
> ----------------------------------
>
> OK, that is still not what the patch does though.
> I think there are two good ideas in the mix here that can be committed
> without controversy. First, work around setSeed() behavior by instantiating
> a new RNG when called. Second, don't use a shared RNG in the sampling
> Iterator. I suggest this is the substance of what to commit in this thread.

[jira] [Commented] (MAHOUT-687) Random generator objects- slight refactor

Posted by "Ted Dunning (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13037126#comment-13037126 ] 

Ted Dunning commented on MAHOUT-687:
------------------------------------

Is this still useful?  Like Sean, I have lost the thrust of the patch.



[jira] [Updated] (MAHOUT-687) Random generator objects- slight refactor

Posted by "Lance Norskog (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAHOUT-687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lance Norskog updated MAHOUT-687:
---------------------------------

    Description: 
Problems:
* The Uncommons RepeatableRNG classes are the basis of RandomUtils.
** These classes cheerfully ignore setSeed().
* Some people in the project want to move off Uncommons anyway.

This patch uses the org.apache.commons.math.random.RandomGenerator classes instead of the org.uncommons.maths.random.RepeatableRNG classes.





  was:
Problems:
* Uncommons MersenneTwisterRNG, the default returned by RandomUtils.getRandom(), silently ignores setSeed() instead of throwing an error.
* The project wants to move off Uncommons anyway.

This patch uses the org.apache.commons.math.random.RandomGenerator classes instead of the org.uncommons.maths.random.RepeatableRNG classes.

Test cases: All math test cases pass except for org.apache.mahout.math.stats.LogLikelihoodTest.
Tests in other packages also fail, but these mostly test random-oriented classes, so that is not a surprise.
Almost all tests that use random numbers inside algorithms still pass, which is a good sign of their stability.
Still, a lot of tests have to be adjusted to make this commit.






[jira] [Commented] (MAHOUT-687) Random generator objects- slight refactor

Posted by "Ted Dunning (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13029989#comment-13029989 ] 

Ted Dunning commented on MAHOUT-687:
------------------------------------

{quote}
A deterministic random matrix or vector needs to set the seed for each multiply. This fix would create too much garbage. (Each MersenneTwister has 2500 bytes!) Once you say you need Commons MersenneTwister instead, because it has a setSeed(long), the rest of the patch ticks over.
{quote}
MersenneTwister is unacceptable for this usage anyway.  It takes far too much startup time.  The Commons implementation just feeds the long through a weak generator to build the full seed, so there is no difference in the garbage created.  Besides, this is totally ephemeral garbage that won't even survive out of newspace.

A good implementation option is MurmurHash applied to row, column and salt.
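
To make the point about the multiply concrete, here is a hedged sketch of a
matrix-vector product in which the matrix is never stored and no generator is
reseeded. EphemeralRandomMatrix and times() are hypothetical names, and the
mixer is the MurmurHash3 64-bit finalizer used as a stand-in for a full
MurmurHash.

```java
// Sketch only: EphemeralRandomMatrix is a hypothetical name; mix() is the
// MurmurHash3 64-bit finalizer standing in for a full MurmurHash.
final class EphemeralRandomMatrix {
  private final int rows;
  private final int cols;
  private final long salt;

  EphemeralRandomMatrix(int rows, int cols, long salt) {
    this.rows = rows;
    this.cols = cols;
    this.salt = salt;
  }

  private static long mix(long h) {
    h ^= h >>> 33;
    h *= 0xff51afd7ed558ccdL;
    h ^= h >>> 33;
    h *= 0xc4ceb9fe1a85ec53L;
    h ^= h >>> 33;
    return h;
  }

  // Element (row, col) in [-1, 1), recomputed from its location on each call.
  double get(int row, int col) {
    long h = mix(mix(mix(salt) ^ row) ^ col);
    return (h >> 11) * 0x1.0p-52;
  }

  // y = A * x; the only allocation is the result vector.
  double[] times(double[] x) {
    if (x.length != cols) {
      throw new IllegalArgumentException("expected length " + cols);
    }
    double[] y = new double[rows];
    for (int i = 0; i < rows; i++) {
      double sum = 0.0;
      for (int j = 0; j < cols; j++) {
        sum += get(i, j) * x[j];
      }
      y[i] = sum;
    }
    return y;
  }
}
```

Because every element depends only on (salt, row, col), repeating the multiply
gives the same result without keeping any per-matrix generator state.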


[jira] [Commented] (MAHOUT-687) Random generator objects- slight refactor

Posted by "Ted Dunning (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13030289#comment-13030289 ] 

Ted Dunning commented on MAHOUT-687:
------------------------------------

{quote}
Why should the default creator of a random number generator include a system call to the random seed operating system device driver?
{quote}
Because it provides good seeds from real physical processes?



[jira] [Commented] (MAHOUT-687) Random generator objects- slight refactor

Posted by "Sean Owen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13037469#comment-13037469 ] 

Sean Owen commented on MAHOUT-687:
----------------------------------

I don't know what you mean -- it most definitely can drop the first element.
What hasNext() method are you referring to?
Yes, it's easy to make a similar change elsewhere to remove static Random instances.


[jira] [Commented] (MAHOUT-687) Random generator objects- slight refactor

Posted by "Sean Owen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13037229#comment-13037229 ] 

Sean Owen commented on MAHOUT-687:
----------------------------------

OK, that is still not what the patch does though.
I think there are two good ideas in the mix here that can be committed without controversy. First, work around setSeed() behavior by instantiating a new RNG when called. Second, don't use a shared RNG in the sampling Iterator. I suggest this is the substance of what to commit in this thread.
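
The first idea could look roughly like the sketch below. It is not the
committed patch: ReseedableRandom is a hypothetical name, and java.util.Random
stands in for whichever underlying generator ignores setSeed().

```java
import java.util.Random;

// Sketch only: ReseedableRandom is a hypothetical name, and java.util.Random
// stands in for a generator whose own setSeed() is ignored.
final class ReseedableRandom extends Random {
  // No field initializer on purpose: the superclass constructor may call
  // setSeed() before this class's constructor body runs.
  private Random delegate;

  ReseedableRandom(long seed) {
    super(seed);
    delegate = new Random(seed);
  }

  // Work around an ignored setSeed() by swapping in a fresh generator.
  @Override
  public synchronized void setSeed(long seed) {
    delegate = new Random(seed);
  }

  // The other java.util.Random methods are built on next(int), so routing
  // this one method forwards them all to the current delegate.
  @Override
  protected int next(int bits) {
    return delegate.nextInt() >>> (32 - bits);
  }
}
```

For the second idea, each sampling iterator would hold its own instance of
such a generator instead of reading from one static, shared instance, so one
iterator's draws cannot shift the sequence seen by another.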


[jira] [Commented] (MAHOUT-687) Random generator objects- slight refactor

Posted by "Sean Owen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13030663#comment-13030663 ] 

Sean Owen commented on MAHOUT-687:
----------------------------------

RandomWrapper most certainly does use the map. Look at the use of INSTANCES just below it in useTestSeed() which shows its purpose. It needs to reset all of the wrappers that have been made. setSeed() may be ignored, and like I've said there's an easy fix to that -- new RNG object. I don't see a need to make the setSeed() method fail.

I don't have a strong opinion about sharing RNGs per se -- obviously correctness is vital, followed by performance, as concerns. So for example I think it's reasonable to use an RNG per iterator, yes.

But at the moment I don't know that we have a determinism problem as a result of this?


This patch does something different than the first patch, so I think I'm losing the plot about what this is out to accomplish?



[jira] [Commented] (MAHOUT-687) Random generator objects- slight refactor

Posted by "Lance Norskog (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13030270#comment-13030270 ] 

Lance Norskog commented on MAHOUT-687:
--------------------------------------

Now we're getting somewhere! 

Why should the default creator of a random number generator include a system call to the random seed operating system device driver?


[jira] [Updated] (MAHOUT-687) Random generator objects- slight refactor

Posted by "Lance Norskog (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAHOUT-687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lance Norskog updated MAHOUT-687:
---------------------------------

    Attachment: MAHOUT-687.patch


[jira] [Updated] (MAHOUT-687) Random generator objects- slight refactor

Posted by "Lance Norskog (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAHOUT-687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lance Norskog updated MAHOUT-687:
---------------------------------

    Attachment: MAHOUT-687.patch

> Random generator objects- slight refactor
> -----------------------------------------
>
>                 Key: MAHOUT-687
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-687
>             Project: Mahout
>          Issue Type: Improvement
>            Reporter: Lance Norskog
>            Priority: Minor
>         Attachments: MAHOUT-687.patch, MAHOUT-687.patch
>
>
> Problems:
> * Uncommons MersenneTwisterRNG, the default RandomUtils.getRandom(), ignores setSeed without throwing an error.
> * The project wants to move off Uncommons anyway.
> This patch uses the org.apache.commons.math.random.RandomGenerator classes instead of org.apache.uncommons.maths.RepeatableRNG classes.
> Testcases: All math test cases pass except for org.apache.mahout.math.stats.LogLikelihoodTest. 
> Other package tests fail that are mostly about testing random-oriented classes; not a surprise.
> Almost all tests that use random numbers in algorithms still pass; this is a good sign of their stability.
> .
> Still, a lot of tests have to be fiddled to make this commit.


[jira] [Commented] (MAHOUT-687) Random generator objects- slight refactor

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13042690#comment-13042690 ] 

Hudson commented on MAHOUT-687:
-------------------------------

Integrated in Mahout-Quality #848 (See [https://builds.apache.org/hudson/job/Mahout-Quality/848/])
    

> Random generator objects- slight refactor
> -----------------------------------------
>
>                 Key: MAHOUT-687
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-687
>             Project: Mahout
>          Issue Type: Improvement
>    Affects Versions: 0.5
>            Reporter: Lance Norskog
>            Assignee: Sean Owen
>            Priority: Minor
>              Labels: random, rng, seed
>             Fix For: 0.6
>
>         Attachments: MAHOUT-687.patch, MAHOUT-687.patch
>
>
> Problems:
> * The uncommons RepeatableRNG classes are the basis of RandomUtils.
> ** These classes cheerfully ignore setSeed.
> * Some people in the project want to move off Uncommons anyway.
> This patch uses the org.apache.commons.math.random.RandomGenerator classes instead of org.apache.uncommons.maths.RepeatableRNG classes.
> .


[jira] [Commented] (MAHOUT-687) Random generator objects- slight refactor

Posted by "Lance Norskog (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13037230#comment-13037230 ] 

Lance Norskog commented on MAHOUT-687:
--------------------------------------

bq. Is this still useful? I am like Sean and have lost the thrust of the patch.

I'm still working on it.

Ted, you mentioned wanting a MurmurHash Random class. Is this what you envisioned? (It is not finished code; see below).

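{code}
import java.nio.ByteBuffer;
import java.util.Random;

// Demo only: SeedGenerator, FastRandomSeedGenerator, RandomUtils.longSeedtoBytes()
// and MurmurHash are resolved against the patched Mahout tree.
public class MurmurHashRandom extends Random {
  // Current state; it doubles as the next value to return.
  private long murmurSeed;
  // Fixed random bytes that are re-hashed on every call.
  private final ByteBuffer buf;
  
  public MurmurHashRandom() {
    this(0);
  }

  public MurmurHashRandom(int seed) {
    SeedGenerator gen = new FastRandomSeedGenerator();
    byte[] bits = RandomUtils.longSeedtoBytes(gen.generateSeed());
    buf = ByteBuffer.wrap(bits);
    this.murmurSeed = MurmurHash.hash64A(bits, seed);
  }
  
  // Returns the current state, then advances it by hashing the buffer
  // with the old state (truncated to int) as the hash seed.
  @Override
  public long nextLong() {
    long oldSeed = murmurSeed;
    murmurSeed = MurmurHash.hash64A(buf, (int) murmurSeed);
    return oldSeed;
  }
}
{code}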

It is coded against my patch, so it is only here for study purposes. It's coded against the MurmurHash class in encoders. encoders.MurmurHash works in ints, not longs, so the types are a bit confused in this demo code.
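
For comparison, here is a minimal sketch of the stateless variant Ted described on the list, where the value at (i, j) is computed from a hash of the indices plus a salt and nothing is stored. To keep it self-contained it does not call Mahout's MurmurHash; the mixer is the SplitMix64 finalizer, swapped in as a stand-in. The class name HashedRandomMatrix and its methods are hypothetical.

```java
final class HashedRandomMatrix {
  private final long salt;

  HashedRandomMatrix(long salt) {
    this.salt = salt;
  }

  // SplitMix64 finalizer: a 64-bit bit mixer used here instead of MurmurHash.
  private static long mix(long z) {
    z = (z ^ (z >>> 30)) * 0xBF58476D1CE4E5B9L;
    z = (z ^ (z >>> 27)) * 0x94D049BB133111EBL;
    return z ^ (z >>> 31);
  }

  // Value in [0, 1) that depends only on (row, col, salt).
  // The hash is updated successively instead of concatenating strings.
  double get(int row, int col) {
    long h = mix(salt + 0x9E3779B97F4A7C15L);
    h = mix(h ^ row);
    h = mix(h ^ col);
    // Keep the top 53 bits and scale by 2^-53 to get a double in [0, 1).
    return (h >>> 11) * 0x1.0p-53;
  }
}
```

Because get() has no mutable state, the same entry can be recomputed on every multiply without reseeding anything or allocating a generator.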

> Random generator objects- slight refactor
> -----------------------------------------
>
>                 Key: MAHOUT-687
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-687
>             Project: Mahout
>          Issue Type: Improvement
>            Reporter: Lance Norskog
>            Priority: Minor
>         Attachments: MAHOUT-687.patch, MAHOUT-687.patch
>
>
> Problems:
> * The uncommons RepeatableRNG classes are the basis of RandomUtils.
> ** These classes cheerfully ignore setSeed.
> * Some people in the project want to move off Uncommons anyway.
> This patch uses the org.apache.commons.math.random.RandomGenerator classes instead of org.apache.uncommons.maths.RepeatableRNG classes.
> .


[jira] [Commented] (MAHOUT-687) Random generator objects- slight refactor

Posted by "Lance Norskog (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13030595#comment-13030595 ] 

Lance Norskog commented on MAHOUT-687:
--------------------------------------

By popular demand, a less ambitious version. Four topics:

* SamplingLongPrimitiveIterator used one shared Random, which caused the unit-test "deterministic seed" trick to fail for some unknown reason. Changed the class to use one Random per instance.

* RandomUtils kept a private WeakHashMap of the RandomWrappers that it built. It never used the map.

RandomWrapper does exactly what it used to do, with one change and one addition:

* setSeed(long x) now throws UnsupportedOperationException, because all of the
 Uncommons RNG classes ignore setSeed(long x). Yes. All of them.
* You can now pull the seed that RandomWrapper makes so that you can make another Uncommons RNG object, or use it for any purpose.
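
A rough sketch of those two RandomWrapper changes, written against plain java.util.Random rather than the real RandomWrapper or any Uncommons class. FailLoudRandom and getSeed() are hypothetical names. The one wrinkle is that Random's constructor calls setSeed() on subclasses, so the override has to let that first call through.

```java
import java.util.Random;

final class FailLoudRandom extends Random {
  private static final long serialVersionUID = 1L;

  private final long seed;
  // Still false (the default) while Random's constructor is running.
  private final boolean constructed;

  FailLoudRandom(long seed) {
    super(seed); // for a subclass this calls setSeed(seed), see below
    this.seed = seed;
    this.constructed = true;
  }

  // Lets the caller build another generator from the same seed.
  long getSeed() {
    return seed;
  }

  @Override
  public synchronized void setSeed(long newSeed) {
    if (constructed) {
      // Fail loud instead of silently ignoring the request.
      throw new UnsupportedOperationException("setSeed is not supported; create a new instance");
    }
    super.setSeed(newSeed); // only reached from the superclass constructor
  }
}
```

Usage would look like `new FailLoudRandom(first.getSeed())` to get a second generator that replays the first one's sequence.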


> Random generator objects- slight refactor
> -----------------------------------------
>
>                 Key: MAHOUT-687
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-687
>             Project: Mahout
>          Issue Type: Improvement
>            Reporter: Lance Norskog
>            Priority: Minor
>         Attachments: MAHOUT-687.patch, MAHOUT-687.patch
>
>
> Problems:
> * Uncommons MersenneTwisterRNG, the default RandomUtils.getRandom(), ignores setSeed without throwing an error.
> * The project wants to move off Uncommons anyway.
> This patch uses the org.apache.commons.math.random.RandomGenerator classes instead of org.apache.uncommons.maths.RepeatableRNG classes.
> Testcases: All math test cases pass except for org.apache.mahout.math.stats.LogLikelihoodTest. 
> Other package tests fail that are mostly about testing random-oriented classes; not a surprise.
> Almost all tests that use random numbers in algorithms still pass; this is a good sign of their stability.
> .
> Still, a lot of tests have to be fiddled to make this commit.


[jira] [Commented] (MAHOUT-687) Random generator objects- slight refactor

Posted by "Sean Owen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13029162#comment-13029162 ] 

Sean Owen commented on MAHOUT-687:
----------------------------------

Let's tease apart several things going on here.

If you want setSeed() to work on MersenneTwisterRNG, that's easy with a different one-line change that makes a new generator. It isn't necessary to remove the implementation.

Removing Uncommons Maths is not necessarily a goal, but I'd support it. However, the project uses more than just MersenneTwisterRNG, so replacing that one class won't let you remove Uncommons Maths, and as a result this patch fails to compile. (Side note: I would only post a patch if the project still compiles and passes tests with it applied.)

But then this patch does a bit more. It's replacing seeding based on /dev/urandom or SecureRandom with a simple increasing counter. What is the reasoning behind that?

> Random generator objects- slight refactor
> -----------------------------------------
>
>                 Key: MAHOUT-687
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-687
>             Project: Mahout
>          Issue Type: Improvement
>            Reporter: Lance Norskog
>            Priority: Minor
>         Attachments: MAHOUT-687.patch
>
>
> Problems:
> * Uncommons MersenneTwisterRNG, the default RandomUtils.getRandom(), ignores setSeed without throwing an error.
> * The project wants to move off Uncommons anyway.
> This patch uses the org.apache.commons.math.random.RandomGenerator classes instead of org.apache.uncommons.maths.RepeatableRNG classes.
> Testcases: All math test cases pass except for org.apache.mahout.math.stats.LogLikelihoodTest. 
> Other package tests fail that are mostly about testing random-oriented classes; not a surprise.
> Almost all tests that use random numbers in algorithms still pass; this is a good sign of their stability.
> .
> Still, a lot of tests have to be fiddled to make this commit.


[jira] [Updated] (MAHOUT-687) Random generator objects- slight refactor

Posted by "Sean Owen (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAHOUT-687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Owen updated MAHOUT-687:
-----------------------------

             Due Date: 10/Jun/11
    Affects Version/s: 0.5
        Fix Version/s: 0.6
             Assignee: Sean Owen
               Labels: random rng seed  (was: )

I'm assigning to me to commit, after 0.5, changes for two aspects of this discussion:

- RandomWrapper.setSeed() will now work, by instantiating a new RNG
- static Random variables will be made into instance variables where possible
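
To illustrate the first point, a minimal sketch of a wrapper whose setSeed() works by instantiating a new underlying generator. It is an assumption-laden stand-in, not the committed code: ReseedableRandom is a hypothetical name and a plain java.util.Random plays the role of the wrapped RNG (MersenneTwisterRNG in Mahout).

```java
import java.util.Random;

final class ReseedableRandom extends Random {
  private static final long serialVersionUID = 1L;

  // The wrapped generator; replaced wholesale on every setSeed().
  private Random delegate;

  ReseedableRandom(long seed) {
    super(seed);   // for a subclass this already calls setSeed(seed)
    setSeed(seed); // explicit call so the delegate is set regardless
  }

  @Override
  public synchronized void setSeed(long seed) {
    super.setSeed(seed);         // also clears the cached nextGaussian() value
    delegate = new Random(seed); // "reseeding" means building a new generator
  }

  @Override
  protected int next(int bits) {
    // Take the top `bits` bits of a full 32-bit draw from the delegate.
    return delegate.nextInt() >>> (32 - bits);
  }
}
```

The cost is one allocation per setSeed() call, which is the garbage concern raised elsewhere in this thread for callers that reseed very frequently.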

> Random generator objects- slight refactor
> -----------------------------------------
>
>                 Key: MAHOUT-687
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-687
>             Project: Mahout
>          Issue Type: Improvement
>    Affects Versions: 0.5
>            Reporter: Lance Norskog
>            Assignee: Sean Owen
>            Priority: Minor
>              Labels: random, rng, seed
>             Fix For: 0.6
>
>         Attachments: MAHOUT-687.patch, MAHOUT-687.patch
>
>
> Problems:
> * The uncommons RepeatableRNG classes are the basis of RandomUtils.
> ** These classes cheerfully ignore setSeed.
> * Some people in the project want to move off Uncommons anyway.
> This patch uses the org.apache.commons.math.random.RandomGenerator classes instead of org.apache.uncommons.maths.RepeatableRNG classes.
> .


[jira] [Commented] (MAHOUT-687) Random generator objects- slight refactor

Posted by "Lance Norskog (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13029815#comment-13029815 ] 

Lance Norskog commented on MAHOUT-687:
--------------------------------------

bq. If you want setSeed() to work on MersenneTwisterRNG, that's easy with a different one-line change that makes a new generator.
A deterministic random matrix or vector needs to set the seed for each multiply. This fix would create too much garbage. (Each MersenneTwister holds about 2500 bytes of state!) Once you say you need the Commons MersenneTwister instead, because it has a setSeed(long), the rest of the patch follows from that.
bq. Removing Uncommons Maths is not necessarily a goal, but I'd support it. 
Other chatter on the list talked about pushing uncommons out completely. One step at a time.
bq. It's replacing seeding based on /dev/urandom or SecureRandom with a simple increasing counter. 
Oops- thought I changed that back. 
.
The patch is clearly not finished. If a test fails because it relies on a deterministic result, that's easy to fix. If a test fails otherwise, probably the test does not supply enough data points for the algorithm to function. From a quick look, LogLikelihoodTest may have this problem.



> Random generator objects- slight refactor
> -----------------------------------------
>
>                 Key: MAHOUT-687
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-687
>             Project: Mahout
>          Issue Type: Improvement
>            Reporter: Lance Norskog
>            Priority: Minor
>         Attachments: MAHOUT-687.patch
>
>
> Problems:
> * Uncommons MersenneTwisterRNG, the default RandomUtils.getRandom(), ignores setSeed without throwing an error.
> * The project wants to move off Uncommons anyway.
> This patch uses the org.apache.commons.math.random.RandomGenerator classes instead of org.apache.uncommons.maths.RepeatableRNG classes.
> Testcases: All math test cases pass except for org.apache.mahout.math.stats.LogLikelihoodTest. 
> Other package tests fail that are mostly about testing random-oriented classes; not a surprise.
> Almost all tests that use random numbers in algorithms still pass; this is a good sign of their stability.
> .
> Still, a lot of tests have to be fiddled to make this commit.


[jira] [Commented] (MAHOUT-687) Random generator objects- slight refactor

Posted by "Ted Dunning (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13030599#comment-13030599 ] 

Ted Dunning commented on MAHOUT-687:
------------------------------------

{quote}
SamplingLongPrimitiveIterator used one shared Random, which caused the unit-test "deterministic seed" trick to fail for some unknown reason. Changed the class to use one Random per instance.
{quote}

Random number generators should almost never be shared between objects, for many reasons.  The two I think are most important are that sharing makes the objects inherently thread-unsafe and that it makes the threads sharing the RNG very slow due to memory synchronization.  Some RNGs aren't even thread-safe themselves, so sharing is a complete disaster as opposed to just bad.
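
A small sketch of the per-object pattern, using a hypothetical CoinFlipper class and plain java.util.Random. Each instance owns its generator, so what one object draws cannot be disturbed by draws made through another.

```java
import java.util.Random;

final class CoinFlipper {
  // Owned by this instance; deliberately not static and not shared.
  private final Random random;

  CoinFlipper(long seed) {
    this.random = new Random(seed);
  }

  // Returns true with probability `rate`.
  boolean keep(double rate) {
    return random.nextDouble() < rate;
  }
}
```

With a shared static Random, the sequence seen by one object would depend on how calls from every other object happen to interleave, which is one plausible way a "deterministic seed" test can stop being deterministic.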

> Random generator objects- slight refactor
> -----------------------------------------
>
>                 Key: MAHOUT-687
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-687
>             Project: Mahout
>          Issue Type: Improvement
>            Reporter: Lance Norskog
>            Priority: Minor
>         Attachments: MAHOUT-687.patch, MAHOUT-687.patch
>
>
> Problems:
> * Uncommons MersenneTwisterRNG, the default RandomUtils.getRandom(), ignores setSeed without throwing an error.
> * The project wants to move off Uncommons anyway.
> This patch uses the org.apache.commons.math.random.RandomGenerator classes instead of org.apache.uncommons.maths.RepeatableRNG classes.
> Testcases: All math test cases pass except for org.apache.mahout.math.stats.LogLikelihoodTest. 
> Other package tests fail that are mostly about testing random-oriented classes; not a surprise.
> Almost all tests that use random numbers in algorithms still pass; this is a good sign of their stability.
> .
> Still, a lot of tests have to be fiddled to make this commit.


[jira] [Commented] (MAHOUT-687) Random generator objects- slight refactor

Posted by "Lance Norskog (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13037464#comment-13037464 ] 

Lance Norskog commented on MAHOUT-687:
--------------------------------------

The Sampling iterator never drops the first sample. 
A minor nit: Iterator.hasNext() is not supposed to change anything, but this makes it hard to "keep or drop" the final sample.

So, if you want to have a per-object random, that's great. It would be good to fix that everywhere as a sweep, but sweeps kill patches. Oh well.
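
One way to keep hasNext() free of side effects is to look ahead by one element, so the keep-or-drop decision for every sample (first and last included) is made before hasNext() is asked. This is only a sketch with a hypothetical generic SamplingIterator, not the SamplingLongPrimitiveIterator code; it also uses a per-instance Random.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.Random;

final class SamplingIterator<T> implements Iterator<T> {
  private final Iterator<T> source;
  private final double rate;
  private final Random random; // one generator per iterator
  private T pending;           // next element that survived sampling
  private boolean hasPending;

  SamplingIterator(Iterator<T> source, double rate, long seed) {
    this.source = source;
    this.rate = rate;
    this.random = new Random(seed);
    advance(); // the first element is sampled like any other
  }

  // Skips ahead to the next element that is kept, if there is one.
  private void advance() {
    hasPending = false;
    pending = null;
    while (source.hasNext()) {
      T candidate = source.next();
      if (random.nextDouble() < rate) {
        pending = candidate;
        hasPending = true;
        return;
      }
    }
  }

  @Override
  public boolean hasNext() {
    return hasPending; // pure query: no draws, no state change
  }

  @Override
  public T next() {
    if (!hasPending) {
      throw new NoSuchElementException();
    }
    T result = pending;
    advance();
    return result;
  }

  @Override
  public void remove() {
    throw new UnsupportedOperationException();
  }
}
```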

> Random generator objects- slight refactor
> -----------------------------------------
>
>                 Key: MAHOUT-687
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-687
>             Project: Mahout
>          Issue Type: Improvement
>            Reporter: Lance Norskog
>            Priority: Minor
>         Attachments: MAHOUT-687.patch, MAHOUT-687.patch
>
>
> Problems:
> * The uncommons RepeatableRNG classes are the basis of RandomUtils.
> ** These classes cheerfully ignore setSeed.
> * Some people in the project want to move off Uncommons anyway.
> This patch uses the org.apache.commons.math.random.RandomGenerator classes instead of org.apache.uncommons.maths.RepeatableRNG classes.
> .


[jira] [Resolved] (MAHOUT-687) Random generator objects- slight refactor

Posted by "Sean Owen (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAHOUT-687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Owen resolved MAHOUT-687.
------------------------------

    Resolution: Fixed

Committed the part I referred to in comments

> Random generator objects- slight refactor
> -----------------------------------------
>
>                 Key: MAHOUT-687
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-687
>             Project: Mahout
>          Issue Type: Improvement
>    Affects Versions: 0.5
>            Reporter: Lance Norskog
>            Assignee: Sean Owen
>            Priority: Minor
>              Labels: random, rng, seed
>             Fix For: 0.6
>
>         Attachments: MAHOUT-687.patch, MAHOUT-687.patch
>
>
> Problems:
> * The uncommons RepeatableRNG classes are the basis of RandomUtils.
> ** These classes cheerfully ignore setSeed.
> * Some people in the project want to move off Uncommons anyway.
> This patch uses the org.apache.commons.math.random.RandomGenerator classes instead of org.apache.uncommons.maths.RepeatableRNG classes.
> .


[jira] [Commented] (MAHOUT-687) Random generator objects- slight refactor

Posted by "Lance Norskog (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13030596#comment-13030596 ] 

Lance Norskog commented on MAHOUT-687:
--------------------------------------

Why throw an exception instead of helping the user by making the problem go away? Because code should not "help". If there's a problem, just tell me. Only I know how I want to handle the error. Don't "help" me- it just wastes my time.

*Fail Loud*
*Fail Early*

> Random generator objects- slight refactor
> -----------------------------------------
>
>                 Key: MAHOUT-687
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-687
>             Project: Mahout
>          Issue Type: Improvement
>            Reporter: Lance Norskog
>            Priority: Minor
>         Attachments: MAHOUT-687.patch, MAHOUT-687.patch
>
>
> Problems:
> * Uncommons MersenneTwisterRNG, the default RandomUtils.getRandom(), ignores setSeed without throwing an error.
> * The project wants to move off Uncommons anyway.
> This patch uses the org.apache.commons.math.random.RandomGenerator classes instead of org.apache.uncommons.maths.RepeatableRNG classes.
> Testcases: All math test cases pass except for org.apache.mahout.math.stats.LogLikelihoodTest. 
> Other package tests fail that are mostly about testing random-oriented classes; not a surprise.
> Almost all tests that use random numbers in algorithms still pass; this is a good sign of their stability.
> .
> Still, a lot of tests have to be fiddled to make this commit.
