You are viewing a plain text version of this content. The canonical link for it is here.
Posted to codereview@trafodion.apache.org by zellerh <gi...@git.apache.org> on 2018/04/24 22:50:09 UTC

[GitHub] trafodion pull request #1539: [TRAFODION-3042] RAND() function is not always...

GitHub user zellerh opened a pull request:

    https://github.com/apache/trafodion/pull/1539

    [TRAFODION-3042] RAND() function is not always random

    When called without a seed, we use a seed based on the system timestamp. The random generator we use generates similar output values for seeds that are close together. Adding another randomization step to avoid that.
    
    Also switched from a microsecond to a nanosecond-based timestamp.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/zellerh/trafodion bug/R23a

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/trafodion/pull/1539.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1539
    
----
commit 36e21b2c9c722bbcb551c35a761f069811051ea0
Author: Hans Zeller <hz...@...>
Date:   2018-04-24T21:39:16Z

    [TRAFODION-3042] RAND() function is not always random
    
    When called without a seed, we use a seed based on the
    system timestamp. The random generator we use generates similar
    output values for seeds that are close together. Adding another
    randomization step to avoid that.

commit 4f834729efa2c9ae74922f1f32226a29b6f8d7d0
Author: Hans Zeller <hz...@...>
Date:   2018-04-24T22:01:37Z

    [TRAFODION-3042] Switch to nanosecond-resolution time
    
    Getting ready for the day where we can do two RAND() calls
    in a microsecond - hopefully soon :-)

----


---

[GitHub] trafodion pull request #1539: [TRAFODION-3042] RAND() function is not always...

Posted by DaveBirdsall <gi...@git.apache.org>.
Github user DaveBirdsall commented on a diff in the pull request:

    https://github.com/apache/trafodion/pull/1539#discussion_r183904493
  
    --- Diff: core/sql/exp/exp_function.cpp ---
    @@ -5937,45 +5938,22 @@ void ExFunctionRandomNum::initSeed(char *op_data[])
     
           // Pick an initial seed.  According to the reference given below
           // (in the eval function), all initial seeds between 1 and
    -      // 2147483646 are equally valid.  So, we just need to pick one
    +      // 2147483647 are equally valid.  So, we just need to pick one
           // in this range.  Do this based on a timestamp.
    +      struct timespec seedTime;
     
    -      // Use ex_function_current to get timestamp.
    -      //
    -      char currBuff[32];
    -      char *opData[1];
    -      opData[0] = currBuff;
    -      ex_function_current currentFun;
    -      currentFun.eval(&opData[0], 0, 0);
    +      clock_gettime(CLOCK_REALTIME, &seedTime);
     
    -      // Extract year, month, etc.
    -      //
    -      char *p = currBuff;
    -      short year = *((short*) p);
    -      p += sizeof(short);
    -      char month = *p++;
    -      char day = *p++;
    -      char hour = *p++;
    -      char minute = *p++;
    -      char second = *p++;
    -      Lng32 fraction = *((Lng32*) p);
    -
    -      // Local variables year, ..., fraction are now initialized.
    -      // From the values of these variables, generate a seed in the
    -      // desired range.
    -
    -      Lng32 x = year * month * day;
    -      if (hour) x *= hour;
    -      p = (char*) &x;
    -
    -      assert(sizeof(Lng32)==4);
    -
    -      p[0] |= (second<<1);
    -      p[1] |= (minute<<1);
    -      p[2] |= (minute<<2);
    -      p[3] |= second;
    -
    -      seed_ = x + fraction;
    +      seed_  = (Int32) (seedTime.tv_sec  % 2147483648);
    +      seed_ ^= (Int32) (seedTime.tv_nsec % 2147483648L);
    +
    +      // Go through one step of a linear congruential random generator.
    +      // (https://en.wikipedia.org/wiki/Linear_congruential_generator).
    +      // This is to avoid seed values that are close to each other when
    +      // we call this method again within a short time. The eval() method
    +      // below doesn't handle seed values that are close to each other
    +      // very well.
    +      seed_ = (((Int64) seed_) * 1664525L + 1664525L) % 2147483648;
    --- End diff --
    
    Did you mean, (((Int64) seed_) * 1664525L + 1013904223L) & 2147483648? (These are the parameters suggested by the wiki article.)


---

[GitHub] trafodion pull request #1539: [TRAFODION-3042] RAND() function is not always...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/trafodion/pull/1539


---

[GitHub] trafodion pull request #1539: [TRAFODION-3042] RAND() function is not always...

Posted by zellerh <gi...@git.apache.org>.
Github user zellerh commented on a diff in the pull request:

    https://github.com/apache/trafodion/pull/1539#discussion_r183907638
  
    --- Diff: core/sql/exp/exp_function.cpp ---
    @@ -5937,45 +5938,22 @@ void ExFunctionRandomNum::initSeed(char *op_data[])
     
           // Pick an initial seed.  According to the reference given below
           // (in the eval function), all initial seeds between 1 and
    -      // 2147483646 are equally valid.  So, we just need to pick one
    +      // 2147483647 are equally valid.  So, we just need to pick one
           // in this range.  Do this based on a timestamp.
    +      struct timespec seedTime;
     
    -      // Use ex_function_current to get timestamp.
    -      //
    -      char currBuff[32];
    -      char *opData[1];
    -      opData[0] = currBuff;
    -      ex_function_current currentFun;
    -      currentFun.eval(&opData[0], 0, 0);
    +      clock_gettime(CLOCK_REALTIME, &seedTime);
     
    -      // Extract year, month, etc.
    -      //
    -      char *p = currBuff;
    -      short year = *((short*) p);
    -      p += sizeof(short);
    -      char month = *p++;
    -      char day = *p++;
    -      char hour = *p++;
    -      char minute = *p++;
    -      char second = *p++;
    -      Lng32 fraction = *((Lng32*) p);
    -
    -      // Local variables year, ..., fraction are now initialized.
    -      // From the values of these variables, generate a seed in the
    -      // desired range.
    -
    -      Lng32 x = year * month * day;
    -      if (hour) x *= hour;
    -      p = (char*) &x;
    -
    -      assert(sizeof(Lng32)==4);
    -
    -      p[0] |= (second<<1);
    -      p[1] |= (minute<<1);
    -      p[2] |= (minute<<2);
    -      p[3] |= second;
    -
    -      seed_ = x + fraction;
    +      seed_  = (Int32) (seedTime.tv_sec  % 2147483648);
    +      seed_ ^= (Int32) (seedTime.tv_nsec % 2147483648L);
    +
    +      // Go through one step of a linear congruential random generator.
    +      // (https://en.wikipedia.org/wiki/Linear_congruential_generator).
    +      // This is to avoid seed values that are close to each other when
    +      // we call this method again within a short time. The eval() method
    +      // below doesn't handle seed values that are close to each other
    +      // very well.
    +      seed_ = (((Int64) seed_) * 1664525L + 1664525L) % 2147483648;
    --- End diff --
    
    Thanks, yes, good catch! Looks like a copy & paste error.


---