Posted to dev@pirk.apache.org by Tim Ellison <t....@gmail.com> on 2016/08/10 16:23:25 UTC

KeyedHash#hash(String, int, String)

Why does
  org.apache.pirk.utils.KeyedHash#hash(String key, int bitSize, String input)

have
  int fullHash = Math.abs(concat.hashCode());

What is the value of performing the "Math.abs" in computing this hash?

p.s. I now realize I'm entering the toddler phase of Pirk development,
where I wander around constantly asking "why?" :-)

Regards,
Tim


Re: KeyedHash#hash(String, int, String)

Posted by Tim Ellison <t....@gmail.com>.
On 15/08/16 23:09, Ellison Anne Williams wrote:
> I recall this being a quick fix for a hash collision issue. Moreover, the
> hash (aside from the embedded selector hash) is used as an 'index' into the
> 'encryption matrix' -- it doesn't make any sense to have a negative row
> index, but the matrix representation is just that in this case, a
> representation, and thus we don't really have to play by matrix
> conventions.
> 
> When I removed the absolute value from the hash and ran it through the
> tests (in memory and distributed), everything passed. We can remove it, if
> desired.

Ok, I'll include it in an incoming PR.  Removing the abs() now makes it
symmetrical with the message digest implementation -- which makes me
happy :-)

Regards,
Tim

> On Mon, Aug 15, 2016 at 11:18 AM, Tim Ellison <t....@gmail.com> wrote:
> 
>> On 12/08/16 16:37, Ellison Anne Williams wrote:
>>> Thought that I responded to this -- oops - sorry!
>>>
>>> As hashes can be (naturally) negative, taking the absolute value avoids
>>> two's complement issues when we are only taking certain bits of the hash
>>> (instead of the whole thing). I would have to go back through to see if
>>> we could alleviate these issues in another way - it's possible (as I
>>> recall, this was a quick fix).
>>
>> Not sure that I understand that, as the code is dependent upon the
>> return value of String#hashCode (i.e. it is an 'unknown' 32-bit value).
>>
>> You are just as likely to get a hash collision by 'folding' the hash
>> value range using abs() as you are by taking the lower bits and ignoring
>> the sign bit, aren't you?
>>
>> Regards,
>> Tim
>>
>>> On Fri, Aug 12, 2016 at 11:06 AM, Tim Ellison <t....@gmail.com> wrote:
>>>
>>>> Anyone got thoughts on this?
>>>>
>>>> On 10/08/16 17:23, Tim Ellison wrote:
>>>>> Why does
>>>>>   org.apache.pirk.utils.KeyedHash#hash(String key, int bitSize, String
>>>>> input)
>>>>>
>>>>> have
>>>>>   int fullHash = Math.abs(concat.hashCode());
>>>>>
>>>>> What is the value of performing the "Math.abs" in computing this hash?
>>>>>
>>>>> p.s. I now realize I'm entering the toddler phase of Pirk development,
>>>>> where I wander around constantly asking "why?" :-)
>>>>>
>>>>> Regards,
>>>>> Tim
>>>>>
>>>>
>>>
>>
> 

Re: KeyedHash#hash(String, int, String)

Posted by Ellison Anne Williams <ea...@gmail.com>.
I recall this being a quick fix for a hash collision issue. Moreover, the
hash (aside from the embedded selector hash) is used as an 'index' into the
'encryption matrix' -- it doesn't make any sense to have a negative row
index, but the matrix representation is just that in this case, a
representation, and thus we don't really have to play by matrix
conventions.

When I removed the absolute value from the hash and ran it through the
tests (in memory and distributed), everything passed. We can remove it, if
desired.
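
A minimal sketch of why a negative "row index" need not be a problem when
rows are looked up by key rather than by array position. Everything here is
hypothetical and for illustration only: the class, the groupByRow helper, the
TreeMap, and the assumption that the hashed string is key + input are not
taken from the Pirk code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class SignedRowIndexDemo {

  // Hypothetical stand-in for a sparse "matrix": row index -> row contents.
  // Rows are addressed by key, so a negative index is as valid as a positive one.
  static Map<Integer, List<String>> groupByRow(String key, String[] inputs) {
    Map<Integer, List<String>> rows = new TreeMap<>();
    for (String input : inputs) {
      // Assumption: the hashed string is key + input. No Math.abs is applied,
      // so the row index may be negative.
      int row = (key + input).hashCode();
      rows.computeIfAbsent(row, r -> new ArrayList<>()).add(input);
    }
    return rows;
  }

  public static void main(String[] args) {
    String[] inputs = {"alpha", "beta", "gamma", "delta"};
    // Print each row index (possibly negative) with the inputs that landed in it.
    groupByRow("someKey", inputs)
        .forEach((row, members) -> System.out.println(row + " -> " + members));
  }
}
```

An array-backed matrix would be a different matter, since Java arrays reject
negative indices; the sketch only covers the keyed lookup case.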

Re: KeyedHash#hash(String, int, String)

Posted by Tim Ellison <t....@gmail.com>.
On 12/08/16 16:37, Ellison Anne Williams wrote:
> Thought that I responded to this -- oops - sorry!
> 
> As hashes can be (naturally) negative, taking the absolute value avoids two's
> complement issues when we are only taking certain bits of the hash (instead
> of the whole thing). I would have to go back through to see if we could
> alleviate these issues in another way - it's possible (as I recall, this
> was a quick fix).

Not sure that I understand that, as the code is dependent upon the
return value of String#hashCode (i.e. it is an 'unknown' 32-bit value).

You are just as likely to get a hash collision by 'folding' the hash
value range using abs() as you are by taking the lower bits and ignoring
the sign bit, aren't you?
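
To make that concrete, here is a small sketch that reduces every value in
[-128, 127] to 4 bits, once after folding with Math.abs and once by masking
alone, and counts how many inputs land in each bucket. The class and method
names are made up for the illustration, and the masking expression is one
common way to take the low bits, not necessarily what KeyedHash does.

```java
import java.util.Arrays;

final class AbsVersusMask {

  // Keep only the low bitSize bits of h (assumes 0 < bitSize <= 32).
  // The unsigned shift builds a mask with exactly bitSize one-bits.
  static int lowBits(int h, int bitSize) {
    return bitSize < 32 ? h & (0xFFFFFFFF >>> (32 - bitSize)) : h;
  }

  // Bucket sizes when every value in [-128, 127] is reduced to 4 bits,
  // optionally folding the value with Math.abs first.
  static int[] bucketCounts(boolean foldWithAbs) {
    int[] counts = new int[16];
    for (int h = -128; h <= 127; h++) {
      int folded = foldWithAbs ? Math.abs(h) : h;
      counts[lowBits(folded, 4)]++;
    }
    return counts;
  }

  public static void main(String[] args) {
    System.out.println("abs then mask: " + Arrays.toString(bucketCounts(true)));
    System.out.println("mask only:     " + Arrays.toString(bucketCounts(false)));
  }
}
```

Both runs put 16 of the 256 inputs in each of the 16 buckets, so over this
range the abs() step neither adds nor removes collisions.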

Regards,
Tim

Re: KeyedHash#hash(String, int, String)

Posted by Ellison Anne Williams <ea...@gmail.com>.
Thought that I responded to this -- oops - sorry!

As hashes can be (naturally) negative, taking the absolute value avoids two's
complement issues when we are only taking certain bits of the hash (instead
of the whole thing). I would have to go back through to see if we could
alleviate these issues in another way - it's possible (as I recall, this
was a quick fix).
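
The message does not say which two's complement issues were seen, so the
following is only a guess at the kind of thing that can go wrong when taking
the low bits of a negative hash in Java. The class and method names are
hypothetical. It contrasts a mask built with an unsigned shift (>>>) against
one built with a signed shift (>>), and also shows that Math.abs itself has
one input for which the result is still negative.

```java
final class LowBitsOfNegativeHash {

  // Low bitSize bits of h, with the mask built by an unsigned shift.
  // The result is never negative for bitSize < 32.
  static int lowBits(int h, int bitSize) {
    return bitSize < 32 ? h & (0xFFFFFFFF >>> (32 - bitSize)) : h;
  }

  // Same shape, but with a signed shift: 0xFFFFFFFF is -1, and -1 >> n is
  // still -1, so the "mask" keeps every bit, including the sign bit.
  static int lowBitsSignedShift(int h, int bitSize) {
    return bitSize < 32 ? h & (0xFFFFFFFF >> (32 - bitSize)) : h;
  }

  public static void main(String[] args) {
    int h = -5; // bit pattern 0xFFFFFFFB
    System.out.println(lowBits(h, 8));               // 251 (0xFB)
    System.out.println(lowBitsSignedShift(h, 8));    // -5, nothing was masked off
    System.out.println(Math.abs(Integer.MIN_VALUE)); // -2147483648
  }
}
```

With the unsigned-shift mask, a negative input already yields a non-negative
result, which is one way the sign can be handled without Math.abs.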

Re: KeyedHash#hash(String, int, String)

Posted by Tim Ellison <t....@gmail.com>.
Anyone got thoughts on this?
