
Posted to dev@lucene.apache.org by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2016/10/19 20:59:58 UTC

[jira] [Comment Edited] (SOLR-7580) Number of ScoreDoc instances equals rows parameter, not actual number of matches

    [ https://issues.apache.org/jira/browse/SOLR-7580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15589845#comment-15589845 ] 

Markus Jelsma edited comment on SOLR-7580 at 10/19/16 8:59 PM:
---------------------------------------------------------------

Ah, I was almost led to believe this issue was addressed :)

The work-around is fine, but with so much analysis and processing happening inside Solr nowadays, I'd expect more users to run into this. I, at least, have no idea how to address this and post a patch.


was (Author: markus17):
Ah, I was almost led to believe this issue was addressed :)

The work-around is fine, but with the whole analysis and processing inside Solr, I'd expect more users could run into this. I, at least, have no idea how to address this and post a patch.

> Number of ScoreDoc instances equals rows parameter, not actual number of matches
> --------------------------------------------------------------------------------
>
>                 Key: SOLR-7580
>                 URL: https://issues.apache.org/jira/browse/SOLR-7580
>             Project: Solr
>          Issue Type: Improvement
>    Affects Versions: 5.1
>            Reporter: Markus Jelsma
>             Fix For: 5.5, 6.0
>
>
> We have several batch jobs that use StreamingResponseCallback to collect all records matching a specific query. For each record, we execute a new query and need all results without paging through them. Because we do not know how many matches to expect, we call setRows(Integer.MAX_VALUE). According to the VisualVM samples, this creates a huge number of ScoreDoc instances, making the query unreasonably slow.
> The current work-around is to execute the same query with setRows(0), read numResults, and then reissue the query with setRows(numResults). This is fast, almost as fast as one would expect.
> This is, however, a very dirty work-around. I am unsure whether this is a Solr or a Lucene issue; SolrIndexSearcher is a beast to debug ;)
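
The two-pass work-around from the description can be sketched roughly as below. This is only an illustration under stated assumptions: Searcher, Page, fetchAll and inMemory are hypothetical names made up for this sketch, not SolrJ types, and the in-memory fake stands in for a real Solr server. With SolrJ, the two steps would presumably map to SolrQuery.setRows(0) followed by reading getResults().getNumFound() from the response, then setRows with that count.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class TwoPassFetch {

    /** Hypothetical stand-in for a search backend; not a SolrJ type. */
    interface Searcher {
        /** Returns at most `rows` matching documents plus the total match count. */
        Page search(String query, int rows);
    }

    /** One response: total number of matches and the documents actually returned. */
    static final class Page {
        final long numFound;
        final List<String> docs;

        Page(long numFound, List<String> docs) {
            this.numFound = numFound;
            this.docs = docs;
        }
    }

    /** Fetches all matches with a count-only query followed by an exactly sized query. */
    static List<String> fetchAll(Searcher searcher, String query) {
        // Pass 1: rows=0 returns no documents, only the total match count.
        long numFound = searcher.search(query, 0).numFound;
        if (numFound == 0) {
            return new ArrayList<>();
        }
        // Pass 2: request exactly as many rows as there are matches,
        // instead of Integer.MAX_VALUE.
        int rows = (int) Math.min(numFound, Integer.MAX_VALUE);
        return searcher.search(query, rows).docs;
    }

    /** In-memory fake: a document matches if it contains the query string. */
    static Searcher inMemory(List<String> corpus) {
        return (query, rows) -> {
            List<String> hits = new ArrayList<>();
            long total = 0;
            for (String doc : corpus) {
                if (doc.contains(query)) {
                    total++;
                    if (hits.size() < rows) {
                        hits.add(doc);
                    }
                }
            }
            return new Page(total, hits);
        };
    }

    public static void main(String[] args) {
        Searcher searcher = inMemory(Arrays.asList("solr rocks", "lucene rocks", "jira"));
        System.out.println(fetchAll(searcher, "rocks"));
    }
}
```

Note that the two queries are not atomic: if the index changes between the count query and the sized query, the second query may return fewer documents than actually match at that point.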


