You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@cassandra.apache.org by Tim Wintle <ti...@gmail.com> on 2012/09/03 10:20:51 UTC

Re: Order of the cyclic group of hashed partitioners

On Tue, 2012-08-28 at 16:57 +1200, aaron morton wrote:
> Sorry I don't understand your question. 
> 
> Can you explain it a bit more or maybe someone else knows.

I believe the question is why is the maximum 2**127 and not
0xffffffffffffffff

Tim

> 
> Cheers
> 
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
> 
> On 27/08/2012, at 7:16 PM, Romain HARDOUIN <ro...@urssaf.fr> wrote:
> 
> > 
> > Thank you Aaron. 
> > This limit was pushed down in RandomPartitioner but the question still exists... 
> > 
> >   
> > aaron morton <aa...@thelastpickle.com> a écrit sur 26/08/2012 23:35:50 :
> > 
> > > > AbstractHashedPartitioner 
> > > does not exist in the trunk. 
> > > https://git-wip-us.apache.org/repos/asf?p=cassandra.git;
> > > a=commitdiff;h=a89ef1ffd4cd2ee39a2751f37044dba3015d72f1
> > > 
> > > 
> > > Cheers
> > > 
> > > -----------------
> > > Aaron Morton
> > > Freelance Developer
> > > @aaronmorton
> > > http://www.thelastpickle.com
> > > 
> > > On 24/08/2012, at 10:51 PM, Romain HARDOUIN <ro...@urssaf.fr> wrote:
> > > 
> > > > 
> > > > Hi, 
> > > > 
> > > > AbstractHashedPartitioner defines a maximum of 2**127 hence an 
> > > order of (2**127)+1. 
> > > > I'd say that tokens of such partitioners are intented to be 
> > > distributed in Z/(127), hence a maximum of (2**127)-1. 
> > > > Could there be a mix up between maximum and order? 
> > > > This is a detail but could someone confirm/invalidate? 
> > > > 
> > > > Regards, 
> > > > 
> > > > Romain
> > > 
> 



Re: Order of the cyclic group of hashed partitioners

Posted by Romain HARDOUIN <ro...@urssaf.fr>.
> I meant that what the OP spotted was it's an inclusive maximum <=

That's it Tim, you understood what I meant. Thank you for taking the time 
to consider my question.
0 <= hash < (2**127) corresponds to the cyclic group Z/(2**127) so the 
maximum is (2**127)-1
So there is indeed a mix up in sources between the order -- 2**127 -- and 
the maximum  -- (2**127)-1

But I don't agree with you when you say:
> The code does exclusive comparisons and excludes the value 0 though.

CMIIW but 0 is allowed:
if (i.compareTo(ZERO) < 0) // if i is negative. 0 is OK.
    throw ...

The same goes for 2**127:
if (i.compareTo(MAXIMUM) > 0) // if i is greater than MAXIMUM. MAXIMUM is 
OK.
    throw ...

Exclusive bounds would be: "compareTo(...) == 0".

IMHO the maximum const should be:
public static final BigInteger MAXIMUM = new 
BigInteger("2").pow(127).subtract(BigInteger.ONE);

And the correct comment should be:
throw ...("Token must be < 2**127");

Are you agree?

Romain


Tim Wintle <ti...@gmail.com> a écrit sur 06/09/2012 08:41:27 :

> On Wed, 2012-09-05 at 13:23 +1200, aaron morton wrote:
> > > I believe the question is why is the maximum 2**127 and not
> > > 0xffffffffffffffff
> 
> oops - I got the wrong number of digits there.
> 
> > The maximum is the size of the digest created by MD5. 
> 
> (I may be mistaken) - isn't the range of MD5 values 
> 0 <= hash < (2**128)
> ?
> 
> If you're dropping one bit to store as a signed integer to give 127 bits
> of entropy then it would be in the range:
> 
> 0 <= hash < (2**127)
> 
> but the range being checked is:
> 
> 0 <= hash <= (2**127)
> 
> > Does that answer the question?
> 
> I meant that what the OP spotted was it's an inclusive maximum <=
> 
> 0 <= hash <= 2**127 gives (2**127) + 1 different values, and is
> mathematically the clock-arithmetic (cyclic) group:
>     Z/(2**127 + 1)   [0]
> 
> 
> I _believe_ the issue is actually the other way around in
> AbstractHashedPartitioner (upper and lower bounds are exclusive) - but
> the comments are incorrect.
> 
> i.e. both the code and the comments have off-by-one errors.
> 
> 
> {{{
> 
> if (i.compareTo(ZERO) < 0)
>     throw new ConfigurationException("Token must be >= 0");
> if (i.compareTo(MAXIMUM) > 0)
>     throw new ConfigurationException("Token must be <= 2**127");
> }}}
> 
> The comments imply that 0 and 2**127 are both valid tokens (which they
> shouldn't be).
> 
> The code does exclusive comparisons and excludes the value 0 though.
> 
> Tim
> > 
> [0] I believe the OP mistyped that as Z/(127+1)
> 
> > 
> > 
> > -----------------
> > Aaron Morton
> > Freelance Developer
> > @aaronmorton
> > http://www.thelastpickle.com
> > 
> > On 3/09/2012, at 8:20 PM, Tim Wintle <ti...@gmail.com> wrote:
> > 
> > > On Tue, 2012-08-28 at 16:57 +1200, aaron morton wrote:
> > > > Sorry I don't understand your question. 
> > > > 
> > > > Can you explain it a bit more or maybe someone else knows.
> > > 
> > > I believe the question is why is the maximum 2**127 and not
> > > 0xffffffffffffffff
> > > 
> > > Tim
> > > 
> > > > 
> > > > Cheers
> > > > 
> > > > -----------------
> > > > Aaron Morton
> > > > Freelance Developer
> > > > @aaronmorton
> > > > http://www.thelastpickle.com
> > > > 
> > > > On 27/08/2012, at 7:16 PM, Romain HARDOUIN
> > > > <ro...@urssaf.fr> wrote:
> > > > 
> > > > > 
> > > > > Thank you Aaron. 
> > > > > This limit was pushed down in RandomPartitioner but the question
> > > > > still exists... 
> > > > > 
> > > > > 
> > > > > aaron morton <aa...@thelastpickle.com> a écrit sur 26/08/2012
> > > > > 23:35:50 :
> > > > > 
> > > > > > > AbstractHashedPartitioner 
> > > > > > does not exist in the trunk. 
> > > > > > https://git-wip-us.apache.org/repos/asf?p=cassandra.git;
> > > > > > a=commitdiff;h=a89ef1ffd4cd2ee39a2751f37044dba3015d72f1
> > > > > > 
> > > > > > 
> > > > > > Cheers
> > > > > > 
> > > > > > -----------------
> > > > > > Aaron Morton
> > > > > > Freelance Developer
> > > > > > @aaronmorton
> > > > > > http://www.thelastpickle.com
> > > > > > 
> > > > > > On 24/08/2012, at 10:51 PM, Romain HARDOUIN
> > > > > > <ro...@urssaf.fr> wrote:
> > > > > > 
> > > > > > > 
> > > > > > > Hi, 
> > > > > > > 
> > > > > > > AbstractHashedPartitioner defines a maximum of 2**127 hence
> > > > > > > an 
> > > > > > order of (2**127)+1. 
> > > > > > > I'd say that tokens of such partitioners are intented to be 
> > > > > > distributed in Z/(127), hence a maximum of (2**127)-1. 
> > > > > > > Could there be a mix up between maximum and order? 
> > > > > > > This is a detail but could someone confirm/invalidate? 
> > > > > > > 
> > > > > > > Regards, 
> > > > > > > 
> > > > > > > Romain
> > > > > > 
> > > > 
> > > 
> > > 
> > 
> > 
> 
> 

Re: Order of the cyclic group of hashed partitioners

Posted by Romain HARDOUIN <ro...@urssaf.fr>.
A little clarification. When I talk about exclusive bounds, I mean:
"compareTo(ZERO) <= 0" or "compareTo(MAXIMUM) >= 0"

Re: Order of the cyclic group of hashed partitioners

Posted by Tim Wintle <ti...@gmail.com>.
On Wed, 2012-09-05 at 13:23 +1200, aaron morton wrote:
> > I believe the question is why is the maximum 2**127 and not
> > 0xffffffffffffffff

oops - I got the wrong number of digits there.

> The maximum is the size of the digest created by MD5. 

(I may be mistaken) - isn't the range of MD5 values 
0 <= hash < (2**128)
?

If you're dropping one bit to store as a signed integer to give 127 bits
of entropy then it would be in the range:

0 <= hash < (2**127)

but the range being checked is:

0 <= hash <= (2**127)

> Does that answer the question?

I meant that what the OP spotted was it's an inclusive maximum <=

0 <= hash <= 2**127 gives (2**127) + 1 different values, and is
mathematically the clock-arithmetic (cyclic) group:
    Z/(2**127 + 1)   [0]


I _believe_ the issue is actually the other way around in
AbstractHashedPartitioner (upper and lower bounds are exclusive) - but
the comments are incorrect.

i.e. both the code and the comments have off-by-one errors.


{{{

if (i.compareTo(ZERO) < 0)
    throw new ConfigurationException("Token must be >= 0");
if (i.compareTo(MAXIMUM) > 0)
    throw new ConfigurationException("Token must be <= 2**127");
}}}

The comments imply that 0 and 2**127 are both valid tokens (which they
shouldn't be).

The code does exclusive comparisons and excludes the value 0 though.

Tim
> 
[0] I believe the OP mistyped that as Z/(127+1)

> 
> 
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
> 
> On 3/09/2012, at 8:20 PM, Tim Wintle <ti...@gmail.com> wrote:
> 
> > On Tue, 2012-08-28 at 16:57 +1200, aaron morton wrote:
> > > Sorry I don't understand your question. 
> > > 
> > > Can you explain it a bit more or maybe someone else knows.
> > 
> > I believe the question is why is the maximum 2**127 and not
> > 0xffffffffffffffff
> > 
> > Tim
> > 
> > > 
> > > Cheers
> > > 
> > > -----------------
> > > Aaron Morton
> > > Freelance Developer
> > > @aaronmorton
> > > http://www.thelastpickle.com
> > > 
> > > On 27/08/2012, at 7:16 PM, Romain HARDOUIN
> > > <ro...@urssaf.fr> wrote:
> > > 
> > > > 
> > > > Thank you Aaron. 
> > > > This limit was pushed down in RandomPartitioner but the question
> > > > still exists... 
> > > > 
> > > > 
> > > > aaron morton <aa...@thelastpickle.com> a écrit sur 26/08/2012
> > > > 23:35:50 :
> > > > 
> > > > > > AbstractHashedPartitioner 
> > > > > does not exist in the trunk. 
> > > > > https://git-wip-us.apache.org/repos/asf?p=cassandra.git;
> > > > > a=commitdiff;h=a89ef1ffd4cd2ee39a2751f37044dba3015d72f1
> > > > > 
> > > > > 
> > > > > Cheers
> > > > > 
> > > > > -----------------
> > > > > Aaron Morton
> > > > > Freelance Developer
> > > > > @aaronmorton
> > > > > http://www.thelastpickle.com
> > > > > 
> > > > > On 24/08/2012, at 10:51 PM, Romain HARDOUIN
> > > > > <ro...@urssaf.fr> wrote:
> > > > > 
> > > > > > 
> > > > > > Hi, 
> > > > > > 
> > > > > > AbstractHashedPartitioner defines a maximum of 2**127 hence
> > > > > > an 
> > > > > order of (2**127)+1. 
> > > > > > I'd say that tokens of such partitioners are intented to be 
> > > > > distributed in Z/(127), hence a maximum of (2**127)-1. 
> > > > > > Could there be a mix up between maximum and order? 
> > > > > > This is a detail but could someone confirm/invalidate? 
> > > > > > 
> > > > > > Regards, 
> > > > > > 
> > > > > > Romain
> > > > > 
> > > 
> > 
> > 
> 
> 



Re: Order of the cyclic group of hashed partitioners

Posted by aaron morton <aa...@thelastpickle.com>.
> I believe the question is why is the maximum 2**127 and not
> 0xffffffffffffffff
The maximum is the size of the digest created by MD5. 

Does that answer the question?


-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 3/09/2012, at 8:20 PM, Tim Wintle <ti...@gmail.com> wrote:

> On Tue, 2012-08-28 at 16:57 +1200, aaron morton wrote:
>> Sorry I don't understand your question. 
>> 
>> Can you explain it a bit more or maybe someone else knows.
> 
> I believe the question is why is the maximum 2**127 and not
> 0xffffffffffffffff
> 
> Tim
> 
>> 
>> Cheers
>> 
>> -----------------
>> Aaron Morton
>> Freelance Developer
>> @aaronmorton
>> http://www.thelastpickle.com
>> 
>> On 27/08/2012, at 7:16 PM, Romain HARDOUIN <ro...@urssaf.fr> wrote:
>> 
>>> 
>>> Thank you Aaron. 
>>> This limit was pushed down in RandomPartitioner but the question still exists... 
>>> 
>>> 
>>> aaron morton <aa...@thelastpickle.com> a écrit sur 26/08/2012 23:35:50 :
>>> 
>>>>> AbstractHashedPartitioner 
>>>> does not exist in the trunk. 
>>>> https://git-wip-us.apache.org/repos/asf?p=cassandra.git;
>>>> a=commitdiff;h=a89ef1ffd4cd2ee39a2751f37044dba3015d72f1
>>>> 
>>>> 
>>>> Cheers
>>>> 
>>>> -----------------
>>>> Aaron Morton
>>>> Freelance Developer
>>>> @aaronmorton
>>>> http://www.thelastpickle.com
>>>> 
>>>> On 24/08/2012, at 10:51 PM, Romain HARDOUIN <ro...@urssaf.fr> wrote:
>>>> 
>>>>> 
>>>>> Hi, 
>>>>> 
>>>>> AbstractHashedPartitioner defines a maximum of 2**127 hence an 
>>>> order of (2**127)+1. 
>>>>> I'd say that tokens of such partitioners are intented to be 
>>>> distributed in Z/(127), hence a maximum of (2**127)-1. 
>>>>> Could there be a mix up between maximum and order? 
>>>>> This is a detail but could someone confirm/invalidate? 
>>>>> 
>>>>> Regards, 
>>>>> 
>>>>> Romain
>>>> 
>> 
> 
>