Posted to dev@lucene.apache.org by "Neil Prosser (JIRA)" <ji...@apache.org> on 2011/08/18 01:04:27 UTC

[jira] [Created] (SOLR-2716) QueryResultKey hashCode() and equals() is dependent on filter order

QueryResultKey hashCode() and equals() is dependent on filter order
-------------------------------------------------------------------

                 Key: SOLR-2716
                 URL: https://issues.apache.org/jira/browse/SOLR-2716
             Project: Solr
          Issue Type: Improvement
          Components: search
    Affects Versions: 3.3
            Reporter: Neil Prosser
            Priority: Minor


The hashCode() and equals() methods of a QueryResultKey are dependent on the order of the filters meaning that potentially identical result sets are missed when cached.

{{Query query = new TermQuery(new Term("field1", "value1"));
Query filter1 = new TermQuery(new Term("field2", "value2"));
Query filter2 = new TermQuery(new Term("field3", "value3"));

List<Query> filters1 = new ArrayList<Query>();
filters1.add(filter1);
filters1.add(filter2);

List<Query> filters2 = new ArrayList<Query>();
filters2.add(filter2);
filters2.add(filter1);

QueryResultKey key1 = new QueryResultKey(query, filters1, null, 0);
QueryResultKey key2 = new QueryResultKey(query, filters2, null, 0);

// Both the following assertions fail
assert key1.equals(key2);
assert key1.hashCode() == key2.hashCode();}}
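
The same effect can be reproduced without Solr on the classpath. The following is only a sketch: plain strings stand in for the Query filters, the class and method names are made up for illustration, and it assumes the key compares its filter lists positionally, the way java.util.List.equals() does.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;

// Stand-alone illustration; strings stand in for the Query filters.
final class FilterOrderDemo {

    // Positional comparison: the same elements in a different order are not equal.
    static boolean sameAsLists(List<String> a, List<String> b) {
        return a.equals(b);
    }

    // Set comparison ignores order (and, note, also ignores duplicates).
    static boolean sameAsSets(List<String> a, List<String> b) {
        return new HashSet<String>(a).equals(new HashSet<String>(b));
    }

    public static void main(String[] args) {
        List<String> filters1 = Arrays.asList("field2:value2", "field3:value3");
        List<String> filters2 = Arrays.asList("field3:value3", "field2:value2");

        System.out.println(sameAsLists(filters1, filters2)); // false
        System.out.println(sameAsSets(filters1, filters2));  // true
    }
}
```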

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Commented] (SOLR-2716) QueryResultKey hashCode() and equals() is dependent on filter order

Posted by "Hoss Man (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13091569#comment-13091569 ] 

Hoss Man commented on SOLR-2716:
--------------------------------

> I wonder if the increased cost in equals() would be worth it... the complete Solr request (including filters) is normally generated by another system, not a user, so one will normally see filters in the same order anyway.

yeah ... that was the concern i had when this came up on the list. initially i was thinking of it as a "sort" cost, but the same general perf concern still applies: if Set.equals is generally slower than List.equals, then it's better to tell the client "to maximize cache hit rates, send your filters in a deterministic order"

if perf tests say that the Set equality is just as fast as List equality - go with the set
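
For the "deterministic order" option, the work moves to the client. A minimal sketch of what that could look like, assuming the client holds its filter queries as strings before building the request; the class name, method name and field names are hypothetical and not part of any Solr client API:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical client-side helper, not part of Solr or SolrJ.
final class DeterministicFilters {

    // Returns a sorted copy, so the same group of filters is always
    // sent in the same order no matter how the caller assembled it.
    static List<String> canonicalOrder(List<String> filterQueries) {
        List<String> sorted = new ArrayList<String>(filterQueries);
        Collections.sort(sorted);
        return sorted;
    }
}
```

With this, canonicalOrder(Arrays.asList("type:book", "inStock:true")) and canonicalOrder(Arrays.asList("inStock:true", "type:book")) both return [inStock:true, type:book], so the server sees one filter order for both requests.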



[jira] [Updated] (SOLR-2716) QueryResultKey hashCode() and equals() is dependent on filter order

Posted by "Neil Prosser (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Neil Prosser updated SOLR-2716:
-------------------------------

    Attachment: SOLR-2716.patch



[jira] [Commented] (SOLR-2716) QueryResultKey hashCode() and equals() is dependent on filter order

Posted by "Mike Sokolov (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13091948#comment-13091948 ] 

Mike Sokolov commented on SOLR-2716:
------------------------------------

Also, if Set.equals() is slower than List.equals() and it seems worth the trouble, one could maybe use a SortedMap keyed by the filter's hashCode. This would have the effect of eliminating dups though, which could be bad in some weird case. So maybe a Bag?
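
The JDK has no Bag type, so as a rough sketch of the idea a map from element to occurrence count can play that role. The names below are illustrative only, and the elements are assumed to have consistent equals() and hashCode():

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative multiset ("bag") comparison: order is ignored, duplicates still count.
final class BagEquality {

    // Counts how many times each element occurs.
    static <T> Map<T, Integer> toBag(List<T> items) {
        Map<T, Integer> counts = new HashMap<T, Integer>();
        for (T item : items) {
            Integer n = counts.get(item);
            counts.put(item, n == null ? 1 : n + 1);
        }
        return counts;
    }

    // True when both lists hold the same elements the same number of times.
    static <T> boolean sameBag(List<T> a, List<T> b) {
        // Cheap size check first, then compare the occurrence counts.
        return a.size() == b.size() && toBag(a).equals(toBag(b));
    }
}
```

Unlike a plain Set, this keeps [a, a, b] distinct from [a, b, b], at the price of building two maps per comparison.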



[jira] [Commented] (SOLR-2716) QueryResultKey hashCode() and equals() is dependent on filter order

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13091069#comment-13091069 ] 

Yonik Seeley commented on SOLR-2716:
------------------------------------

Although it does make sense, I wonder if the increased cost in equals() would be worth it... the complete Solr request (including filters) is normally generated by another system, not a user, so one will normally see filters in the same order anyway.

Fixing hashCode for no performance impact is easy, and for equals() perhaps we can put the comparison of filters last so it's often only executed when other components already match, and optimize a number of cases (check length first; if length==1 then the comparison is simple).  The problem is when length!=1 - I don't see a simple way to quickly compare w/o adding more state to the QueryResultKey or doing a fair bit more CPU work.
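
To make that concrete, here is a rough sketch of such a key. It is not the QueryResultKey code: strings stand in for queries and filters, the class name is invented, and the slow path (sorting copies) is just one way to spend the extra CPU work, which only works here because strings are Comparable.

```java
import java.util.Arrays;
import java.util.List;

// Sketch only; strings stand in for Query objects, and filters must be non-null.
final class OrderInsensitiveKey {
    private final String query;
    private final List<String> filters;
    private final int hash;

    OrderInsensitiveKey(String query, List<String> filters) {
        this.query = query;
        this.filters = filters;
        int h = query.hashCode();
        // Addition is commutative, so filter order does not change the hash.
        for (String f : filters) {
            h += f.hashCode();
        }
        this.hash = h;
    }

    @Override
    public int hashCode() {
        return hash;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof OrderInsensitiveKey)) return false;
        OrderInsensitiveKey other = (OrderInsensitiveKey) o;
        // Cheap checks first; the filters are compared last.
        if (hash != other.hash) return false;
        if (!query.equals(other.query)) return false;
        if (filters.size() != other.filters.size()) return false;
        if (filters.size() == 1) return filters.get(0).equals(other.filters.get(0));
        // Common case: the filters arrive in the same order.
        if (filters.equals(other.filters)) return true;
        // Slow path: compare sorted copies, which keeps duplicates significant.
        String[] a = filters.toArray(new String[0]);
        String[] b = other.filters.toArray(new String[0]);
        Arrays.sort(a);
        Arrays.sort(b);
        return Arrays.equals(a, b);
    }
}
```

The slow path is only reached when the hash, the query and the filter count all match but the order differs, so requests that already send filters in a stable order pay nothing beyond the ordinary list comparison.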



[jira] [Updated] (SOLR-2716) QueryResultKey hashCode() and equals() is dependent on filter order

Posted by "Neil Prosser (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Neil Prosser updated SOLR-2716:
-------------------------------

    Description: 
The hashCode() and equals() methods of a QueryResultKey are dependent on the order of the filters meaning that potentially identical result sets are missed when cached.

Query query = new TermQuery(new Term("field1", "value1"));
Query filter1 = new TermQuery(new Term("field2", "value2"));
Query filter2 = new TermQuery(new Term("field3", "value3"));

List<Query> filters1 = new ArrayList<Query>();
filters1.add(filter1);
filters1.add(filter2);

List<Query> filters2 = new ArrayList<Query>();
filters2.add(filter2);
filters2.add(filter1);

QueryResultKey key1 = new QueryResultKey(query, filters1, null, 0);
QueryResultKey key2 = new QueryResultKey(query, filters2, null, 0);

// Both the following assertions fail
assert key1.equals(key2);
assert key1.hashCode() == key2.hashCode();

  was:
The hashCode() and equals() methods of a QueryResultKey are dependent on the order of the filters meaning that potentially identical result sets are missed when cached.

{{Query query = new TermQuery(new Term("field1", "value1"));
Query filter1 = new TermQuery(new Term("field2", "value2"));
Query filter2 = new TermQuery(new Term("field3", "value3"));

List<Query> filters1 = new ArrayList<Query>();
filters1.add(filter1);
filters1.add(filter2);

List<Query> filters2 = new ArrayList<Query>();
filters2.add(filter2);
filters2.add(filter1);

QueryResultKey key1 = new QueryResultKey(query, filters1, null, 0);
QueryResultKey key2 = new QueryResultKey(query, filters2, null, 0);

// Both the following assertions fail
assert key1.equals(key2);
assert key1.hashCode() == key2.hashCode();}}




[jira] [Commented] (SOLR-2716) QueryResultKey hashCode() and equals() is dependent on filter order

Posted by "Simon Willnauer (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13090892#comment-13090892 ] 

Simon Willnauer commented on SOLR-2716:
---------------------------------------

At first glance this makes total sense to me. I just wonder if we should take a Set<Filter> instead of a List<Filter> from the beginning, and maybe make the member final and use Collections.emptySet() instead of null? Somehow I recall that we had a different issue where the order matters, and I recall that there were reasons to keep it like it is, but I can't find it right now. Maybe somebody else can find the issue if there is one.
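
Roughly what I have in mind, as a sketch only: strings stand in for the filters, the class name is invented, and this is not the existing QueryResultKey.

```java
import java.util.Collection;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Sketch of a key that stores its filters as a final, never-null Set.
final class SetBackedKey {
    private final String query;
    private final Set<String> filters;

    SetBackedKey(String query, Collection<String> filters) {
        this.query = query;
        // Normalise "no filters" to the shared empty set instead of null.
        this.filters = (filters == null || filters.isEmpty())
                ? Collections.<String>emptySet()
                : Collections.unmodifiableSet(new HashSet<String>(filters));
    }

    @Override
    public int hashCode() {
        // Set.hashCode() is the sum of the element hashes, so order is irrelevant.
        return 31 * query.hashCode() + filters.hashCode();
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof SetBackedKey)) return false;
        SetBackedKey other = (SetBackedKey) o;
        return query.equals(other.query) && filters.equals(other.filters);
    }
}
```

Note that a set silently drops duplicate filters, and copying into a HashSet on every key creation has its own cost.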



[jira] [Commented] (SOLR-2716) QueryResultKey hashCode() and equals() is dependent on filter order

Posted by "Neil Prosser (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13090876#comment-13090876 ] 

Neil Prosser commented on SOLR-2716:
------------------------------------

Half-baked and my first Solr patch so hopefully I've done what's needed. The above example is included as a unit test and I've tried to keep my changes as local as possible to the QueryResultKey class.

I understand that there's some creation of HashSets/ArrayLists which can hopefully be removed. But I wanted to get people's opinion on the change before I went too far down the rabbit-hole and changed a load of files.
