You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@commons.apache.org by "Chen (Jira)" <ji...@apache.org> on 2019/09/19 09:01:00 UTC

[jira] [Commented] (COLLECTIONS-714) PatriciaTrie ignores trailing null characters in keys

    [ https://issues.apache.org/jira/browse/COLLECTIONS-714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16933188#comment-16933188 ] 

Chen commented on COLLECTIONS-714:
----------------------------------

Hi, [~rohanpadhye] [~ggregory]
This problem is case by StringKeyAnalyzer#bitIndex don't support \u0000.
Because k=0=f=\u0000, so there should not use k!=f.
I think when index1>=endIndex1, we should set some sign instead of setting k=0, then use the sign to instead of k!=f.
{code:java}
            if (index1 >= endIndex1) {
                k = 0;
            } else {
                k = key.charAt(index1);
            }
            
            if (other == null || index2 >= endIndex2) {
                f = 0;
                fflag = true;
            } else {
                f = other.charAt(index2);
            }

            if ( (k != f ) {
               final int x = k ^ f;
               return i * LENGTH + Integer.numberOfLeadingZeros(x) - LENGTH;
            }
{code}


> PatriciaTrie ignores trailing null characters in keys
> -----------------------------------------------------
>
>                 Key: COLLECTIONS-714
>                 URL: https://issues.apache.org/jira/browse/COLLECTIONS-714
>             Project: Commons Collections
>          Issue Type: Bug
>          Components: Collection, Map
>    Affects Versions: 4.3
>            Reporter: Rohan Padhye
>            Priority: Critical
>
> In Java, strings are not null terminated. The string "x" (of length = 1 char) is different from the string "x\u0000" (of length = 2 chars). However, PatriciaTrie does not seem to distinguish between these strings.
> To reproduce: 
> {code:java}
> public void testNullTerminatedKey1() {
>     Map<String, Integer> map = new HashMap<>();
>     map.put("x", 0);         // key of length 1
>     map.put("x\u0000", 1);   // key of length 2
>     map.put("x\u0000y", 2);  // key of length 3
>     Assert.assertEquals(3, map.size());  // ok, 3 distinct keys
>     PatriciaTrie<Integer> trie = new PatriciaTrie<>(map);
>     Assert.assertEquals(3, trie.size());  // fail; actual=2
> }{code}
> In the above example, the resulting trie has only two keys: "x\u0000" and "x\u0000y". The key "x" gets overwritten. Here is another way to repro the bug: 
> {code:java}
> public void testNullTerminatedKey2() {
>     PatriciaTrie<Integer> trie = new PatriciaTrie<>();
>     trie.put("x", 0);
>     Assert.assertTrue(trie.containsKey("x")); // ok
>     trie.put("x\u0000", 1);
>     Assert.assertTrue(trie.containsKey("x")); // fail
> }
> {code}
> In the above example, the key "x" suddenly disappears when an entry with key "x\u0000" is inserted.
> The PatriciaTrie docs do not mention anything about null-terminated strings. In general, I believe this also breaks the JDK Map contract since the keys "x".equals("x\u0000") is false. 
> This bug was found automatically using JQF: [https://github.com/rohanpadhye/jqf].
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)