Posted to dev@lucene.apache.org by "Tom Winch (JIRA)" <ji...@apache.org> on 2015/11/04 14:50:27 UTC

[jira] [Updated] (SOLR-8236) Federated Search (new) - NumFound

     [ https://issues.apache.org/jira/browse/SOLR-8236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tom Winch updated SOLR-8236:
----------------------------
    Description: 
This issue describes a search component for estimating numFound in federated search, that is, distributed search over documents stored in separate instances of Solr (for example, one server per continent), where a single document (identified by an agreed, common unique id) may be stored on more than one server instance, possibly with differing fields and data.

When documents are present on more than one server, which is normally the case in federated search, the numFound reported by the search is incorrect, because a document stored on several servers is counted once per server. For small result sets we may return all the matching document ids from each server in order to compute an exact numFound. For large result sets this is impractical, and numFound may instead be estimated using statistical techniques.
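
As an illustration of the exact computation for small result sets, here is a minimal Python sketch (not the component's actual code). It assumes each server has already returned the list of unique ids matching the query; the function name exact_num_found and the sample ids are hypothetical.

```python
def exact_num_found(ids_per_shard):
    """Count distinct document ids across all shards.

    ids_per_shard: one iterable of matching unique ids per server
    (assumed to have been fetched already).
    """
    seen = set()
    for ids in ids_per_shard:
        seen.update(ids)  # duplicates across shards collapse here
    return len(seen)


# Hypothetical example: three servers with overlapping documents.
shards = [["a", "b", "c"], ["b", "c", "d"], ["a", "d", "e"]]
naive = sum(len(ids) for ids in shards)  # 9: each copy counted separately
exact = exact_num_found(shards)          # 5: each document counted once
```

The cost grows with the number of ids transferred from every server, which is why this approach is only suggested for small result sets.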

Statistical techniques may be driven by the following heuristic: if two shards always return the same numFound for queries, then they likely contain the same document ids, and the combined numFound is the same as for each shard alone. On the other hand, if two shards always return different numFounds for queries, then they likely contain independent document ids, and the numFounds should be summed.
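
One possible way to turn this heuristic into an estimate for two shards is sketched below in Python. The issue does not specify a formula, so everything here is an assumption made for illustration: the agreement score (the fraction of past queries on which both shards reported the same numFound), the linear interpolation between the two extremes, and the names agreement and estimate_num_found.

```python
def agreement(history):
    """Fraction of past queries where both shards reported the same numFound.

    history: list of (num_found_a, num_found_b) pairs from earlier queries.
    """
    if not history:
        return 0.0  # assumption: with no evidence, treat shards as independent
    same = sum(1 for a, b in history if a == b)
    return same / len(history)


def estimate_num_found(a, b, history):
    """Interpolate between 'identical shards' and 'independent shards'.

    agreement == 1.0 -> max(a, b)  (same documents on both shards)
    agreement == 0.0 -> a + b      (independent documents, so sum)
    """
    s = agreement(history)
    return round(max(a, b) + (1.0 - s) * min(a, b))


# Hypothetical usage with made-up query histories.
print(estimate_num_found(100, 100, [(5, 5), (7, 7)]))  # 100
print(estimate_num_found(100, 40, [(5, 3), (7, 1)]))   # 140
print(estimate_num_found(100, 40, [(5, 5), (7, 1)]))   # 120
```

The two limiting cases match the heuristic exactly; the behaviour in between is only one plausible choice, and a real estimator might weight recent queries or handle more than two shards.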

This issue combines with others to provide full federated search support. See also SOLR-8234 and SOLR-8235.

–

Note that this is part of a new implementation of federated search, as opposed to the older issues SOLR-3799 through SOLR-3805.

> Federated Search (new) - NumFound
> ---------------------------------
>
>                 Key: SOLR-8236
>                 URL: https://issues.apache.org/jira/browse/SOLR-8236
>             Project: Solr
>          Issue Type: New Feature
>            Reporter: Tom Winch
>            Priority: Minor


