Posted to dev@lucene.apache.org by "Shalin Shekhar Mangar (JIRA)" <ji...@apache.org> on 2015/02/25 21:41:04 UTC

[jira] [Comment Edited] (SOLR-7128) Two phase distributed search is fetching extra fields in GET_TOP_IDS phase

    [ https://issues.apache.org/jira/browse/SOLR-7128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14337134#comment-14337134 ] 

Shalin Shekhar Mangar edited comment on SOLR-7128 at 2/25/15 8:40 PM:
----------------------------------------------------------------------

This patch fixes the bug and modifies the DistributedQueryComponentOptimizationTest to use the TrackingShardHandlerFactory introduced in SOLR-7147. I removed the TestTwoPhaseDistributedQuery test that I had introduced earlier.

This test now asserts that every distrib.singlePass query:
# Makes exactly 'numSlices' number of shard requests
# Makes no GET_FIELDS requests
# Must request the unique key field from shards
# Must request the score if 'fl' has score or sort by score is requested
# Requests all fields that are present in 'fl' param

It also asserts that every regular two phase distributed search:
# Makes at most 2 * 'numSlices' number of shard requests
# Must request the unique key field from shards
# Must request the score if 'fl' has score or sort by score is requested
# Requests no fields other than id and score in GET_TOP_IDS request
# Requests exactly the fields that are present in 'fl' param in GET_FIELDS request and no others

and also asserts that:
# Each query which requests id or score or both behaves exactly like a single pass query
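
The rules above can be sketched roughly as a standalone check over tracked shard requests. This is only an illustrative sketch, not the code in the attached patch: ShardCall, Purpose, setOf and violations are hypothetical names, no Solr classes are used, and it assumes that a GET_FIELDS request may carry the unique key in addition to the fields in 'fl'.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class ShardRequestChecks {
    // Hypothetical stand-ins for the two shard request purposes named above.
    enum Purpose { GET_TOP_IDS, GET_FIELDS }

    // Hypothetical record of one tracked shard request and the fields it asked for.
    static final class ShardCall {
        final Purpose purpose;
        final Set<String> fields;
        ShardCall(Purpose purpose, String... fields) {
            this.purpose = purpose;
            this.fields = new HashSet<>(Arrays.asList(fields));
        }
    }

    static Set<String> setOf(String... values) {
        return new HashSet<>(Arrays.asList(values));
    }

    // Returns one message per broken rule; an empty list means every rule held.
    static List<String> violations(List<ShardCall> calls, int numSlices, String uniqueKey,
                                   Set<String> fl, boolean sortByScore, boolean singlePass) {
        List<String> problems = new ArrayList<>();
        Set<String> idAndScore = setOf(uniqueKey, "score");
        boolean needScore = sortByScore || fl.contains("score");
        // A query asking only for id and/or score should behave like a single pass query.
        boolean onePass = singlePass || idAndScore.containsAll(fl);

        // Single pass: exactly numSlices requests. Two phase: at most 2 * numSlices.
        boolean badCount = onePass ? calls.size() != numSlices : calls.size() > 2 * numSlices;
        if (badCount) problems.add("unexpected number of shard requests: " + calls.size());

        for (ShardCall call : calls) {
            if (!call.fields.contains(uniqueKey)) {
                problems.add(call.purpose + " omits the unique key");
            }
            if (call.purpose == Purpose.GET_FIELDS) {
                if (onePass) problems.add("GET_FIELDS issued for a single pass query");
                // Assumption: 'fl' plus the unique key is the full set allowed here.
                Set<String> allowed = new HashSet<>(fl);
                allowed.add(uniqueKey);
                if (!call.fields.containsAll(fl) || !allowed.containsAll(call.fields)) {
                    problems.add("GET_FIELDS fields differ from fl");
                }
            } else {
                if (needScore && !call.fields.contains("score")) {
                    problems.add("GET_TOP_IDS omits score");
                }
                if (onePass) {
                    if (!call.fields.containsAll(fl)) problems.add("single pass request omits fl fields");
                } else if (!idAndScore.containsAll(call.fields)) {
                    // This is the bug reported in this issue.
                    problems.add("GET_TOP_IDS requests fields other than id and score");
                }
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        // The reported case: fl=id,title on two slices, with title fetched in GET_TOP_IDS.
        List<ShardCall> buggy = Arrays.asList(
            new ShardCall(Purpose.GET_TOP_IDS, "id", "title"),
            new ShardCall(Purpose.GET_TOP_IDS, "id", "title"),
            new ShardCall(Purpose.GET_FIELDS, "id", "title"),
            new ShardCall(Purpose.GET_FIELDS, "id", "title"));
        for (String p : violations(buggy, 2, "id", setOf("id", "title"), false, false)) {
            System.out.println(p);
        }
    }
}
```

The actual test inspects the requests recorded by TrackingShardHandlerFactory; the sketch only shows how the per-request field rules fit together.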


was (Author: shalinmangar):
This patch fixes the bug and modifies the DistributedQueryComponentOptimizationTest to use the TrackingShardHandlerFactory introduced in SOLR-7147.

This test now asserts that every distrib.singlePass query:
# Makes exactly 'numSlices' number of shard requests
# Makes no GET_FIELDS requests
# Must request the unique key field from shards
# Must request the score if 'fl' has score or sort by score is requested
# Requests all fields that are present in 'fl' param

It also asserts that every regular two phase distributed search:
# Makes at most 2 * 'numSlices' number of shard requests
# Must request the unique key field from shards
# Must request the score if 'fl' has score or sort by score is requested
# Requests no fields other than id and score in GET_TOP_IDS request
# Requests exactly the fields that are present in 'fl' param in GET_FIELDS request and no others

and also asserts that:
# Each query which requests id or score or both behaves exactly like a single pass query

> Two phase distributed search is fetching extra fields in GET_TOP_IDS phase
> --------------------------------------------------------------------------
>
>                 Key: SOLR-7128
>                 URL: https://issues.apache.org/jira/browse/SOLR-7128
>             Project: Solr
>          Issue Type: Bug
>          Components: search
>    Affects Versions: 4.10.2, 4.10.3
>            Reporter: Shalin Shekhar Mangar
>            Assignee: Shalin Shekhar Mangar
>             Fix For: Trunk, 5.1
>
>         Attachments: SOLR-7128.patch, SOLR-7128.patch
>
>
> [~pqueixalos] reported this to me privately so I am creating this issue on his behalf.
> {quote}
> We found an issue in versions 4.10.+ (4.10.2 and 4.10.3 for sure).
> When processing a two phase distributed query with an explicit fl parameter, both phases run, but the GET_TOP_IDS phase retrieves the matching documents' fields even though a GET_FIELDS shard request is executed right after.
> /solr/someCollectionCore?collection=someOtherCollection&q=*:*&debug=true&fl=id,title
> => id is retrieved during GET_TOP_IDS phase, that's ok: it's our uniqueKeyField
> => title is also retrieved during GET_TOP_IDS phase, that's not ok.
> {quote}
> I'm able to reproduce this. This is a pretty bad performance bug that was introduced in SOLR-5768 or its subsequent related issues. I plan to fix this bug and add substantial tests to assert such things.


