You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@lucy.apache.org by "Nick Wellnhofer (JIRA)" <ji...@apache.org> on 2016/11/25 15:15:58 UTC

[lucy-issues] [jira] [Created] (CLOWNFISH-110) Exposure of Hash iteration order allows for O(n²) blowup

Nick Wellnhofer created CLOWNFISH-110:
-----------------------------------------

             Summary: Exposure of Hash iteration order allows for O(n²) blowup
                 Key: CLOWNFISH-110
                 URL: https://issues.apache.org/jira/browse/CLOWNFISH-110
             Project: Apache Lucy-Clownfish
          Issue Type: Bug
          Components: Runtime
            Reporter: Nick Wellnhofer


Our Hash implementation uses open addressing with linear probing, so we're susceptible to the same performance bug that was recently discovered in Rust:

https://github.com/rust-lang/rust/issues/36481
http://accidentallyquadratic.tumblr.com/post/153545455987/rust-hash-iteration-reinsertion

Short explanation: When copying entries from one hash table to another using HashIterator to read the source table, the entries are inserted in in-memory order. If the target hash table wasn't resized to the final capacity yet, this can result in quadratic behavior.

We have a maximum fill factor of 2/3 compared to 9/10 in the Rust implementation, so this problem should be much less pronounced.

A possible fix is to randomize the hash function (or iteration order) for each table which is closely related to CLOWNFISH-48.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)