You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@lucene.apache.org by "Shai Erera (JIRA)" <ji...@apache.org> on 2008/12/29 12:46:44 UTC

[jira] Updated: (LUCENE-1502) CharArraySet behaves inconsistently in add(Object) and contains(Object)

     [ https://issues.apache.org/jira/browse/LUCENE-1502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shai Erera updated LUCENE-1502:
-------------------------------

    Attachment: LUCENE-1502.patch

Patch fixes CharArraySet and adds a test to TestCharArraySet.

> CharArraySet behaves inconsistently in add(Object) and contains(Object)
> -----------------------------------------------------------------------
>
>                 Key: LUCENE-1502
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1502
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: Analysis
>            Reporter: Shai Erera
>             Fix For: 2.4.1, 2.9
>
>         Attachments: LUCENE-1502.patch
>
>
> CharArraySet's add(Object) method looks like this:
>     if (o instanceof char[]) {
>       return add((char[])o);
>     } else if (o instanceof String) {
>       return add((String)o);
>     } else if (o instanceof CharSequence) {
>       return add((CharSequence)o);
>     } else {
>       return add(o.toString());
>     }
> You'll notice that in the case of an Object (for example, Integer), the o.toString() is added. However, its contains(Object) method looks like this:
>     if (o instanceof char[]) {
>       char[] text = (char[])o;
>       return contains(text, 0, text.length);
>     } else if (o instanceof CharSequence) {
>       return contains((CharSequence)o);
>     }
>     return false;
> In case of contains(Integer), it always returns false. I've added a simple test to TestCharArraySet, which reproduces the problem:
>   public void testObjectContains() {
>     CharArraySet set = new CharArraySet(10, true);
>     Integer val = new Integer(1);
>     set.add(val);
>     assertTrue(set.contains(val));
>     assertTrue(set.contains(new Integer(1)));
>   }
> Changing contains(Object) to this, solves the problem:
>     if (o instanceof char[]) {
>       char[] text = (char[])o;
>       return contains(text, 0, text.length);
>     } 
>     return contains(o.toString());
> The patch also includes few minor improvements (which were discussed on the mailing list) such as the removal of the following dead code from getHashCode(CharSequence):
>       if (false && text instanceof String) {
>         code = text.hashCode();
> and simplifying add(Object):
>     if (o instanceof char[]) {
>       return add((char[])o);
>     }
>     return add(o.toString());
> (which also aligns with the equivalent contains() method).
> One thing that's still left open is whether we can avoid the calls to Character.toLowerCase calls in all the char[] array methods, by first converting the char[] to lowercase, and then passing it through the equals() and getHashCode() methods. It works for add(), but fails for contains(char[]) since it modifies the input array.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org