You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@lucene.apache.org by "Shai Erera (JIRA)" <ji...@apache.org> on 2008/12/29 12:46:44 UTC
[jira] Updated: (LUCENE-1502) CharArraySet behaves inconsistently
in add(Object) and contains(Object)
[ https://issues.apache.org/jira/browse/LUCENE-1502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Shai Erera updated LUCENE-1502:
-------------------------------
Attachment: LUCENE-1502.patch
Patch fixes CharArraySet and adds a test to TestCharArraySet.
> CharArraySet behaves inconsistently in add(Object) and contains(Object)
> -----------------------------------------------------------------------
>
> Key: LUCENE-1502
> URL: https://issues.apache.org/jira/browse/LUCENE-1502
> Project: Lucene - Java
> Issue Type: Bug
> Components: Analysis
> Reporter: Shai Erera
> Fix For: 2.4.1, 2.9
>
> Attachments: LUCENE-1502.patch
>
>
> CharArraySet's add(Object) method looks like this:
> if (o instanceof char[]) {
> return add((char[])o);
> } else if (o instanceof String) {
> return add((String)o);
> } else if (o instanceof CharSequence) {
> return add((CharSequence)o);
> } else {
> return add(o.toString());
> }
> You'll notice that in the case of an Object (for example, Integer), the o.toString() is added. However, its contains(Object) method looks like this:
> if (o instanceof char[]) {
> char[] text = (char[])o;
> return contains(text, 0, text.length);
> } else if (o instanceof CharSequence) {
> return contains((CharSequence)o);
> }
> return false;
> In case of contains(Integer), it always returns false. I've added a simple test to TestCharArraySet, which reproduces the problem:
> public void testObjectContains() {
> CharArraySet set = new CharArraySet(10, true);
> Integer val = new Integer(1);
> set.add(val);
> assertTrue(set.contains(val));
> assertTrue(set.contains(new Integer(1)));
> }
> Changing contains(Object) to this, solves the problem:
> if (o instanceof char[]) {
> char[] text = (char[])o;
> return contains(text, 0, text.length);
> }
> return contains(o.toString());
> The patch also includes few minor improvements (which were discussed on the mailing list) such as the removal of the following dead code from getHashCode(CharSequence):
> if (false && text instanceof String) {
> code = text.hashCode();
> and simplifying add(Object):
> if (o instanceof char[]) {
> return add((char[])o);
> }
> return add(o.toString());
> (which also aligns with the equivalent contains() method).
> One thing that's still left open is whether we can avoid the calls to Character.toLowerCase calls in all the char[] array methods, by first converting the char[] to lowercase, and then passing it through the equals() and getHashCode() methods. It works for add(), but fails for contains(char[]) since it modifies the input array.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org