Posted to dev@lucene.apache.org by "David Smiley (JIRA)" <ji...@apache.org> on 2013/08/05 16:24:48 UTC

[jira] [Commented] (SOLR-5093) Rewrite field:* to use the filter cache

    [ https://issues.apache.org/jira/browse/SOLR-5093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13729536#comment-13729536 ] 

David Smiley commented on SOLR-5093:
------------------------------------

I guess I've changed my mind; the veto arguments are good.

Mikhail, I like your idea of making a sub-clause filter-cacheable.  But I don't think it should be a separate query parser, because caching is orthogonal to how the query is parsed.  Perhaps a special local-param, filterCache=true, would do.  Your example would become:

{noformat}
  q=bee:blah OR {! filterCache=true}foo:bar OR {! filterCache=true}foo:bar
{noformat}

A key thing to document would be not only that this clause is cached in the filter cache, but also that it is constant-scoring.
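
To make those two documented behaviours concrete, here is a minimal toy sketch in Python. It is not Solr code and uses no Solr API; ToyFilterCache, cached_clause_scores, and the sample documents are all hypothetical names invented for illustration. It only models the idea that a cached clause is evaluated once and that every matching document then receives the same score.

```python
# Toy model only: nothing here is Solr code or Solr's API.
from typing import Callable, Dict, FrozenSet, List


class ToyFilterCache:
    """Hypothetical cache mapping a clause string to its matching doc ids."""

    def __init__(self) -> None:
        self._entries: Dict[str, FrozenSet[int]] = {}
        self.misses = 0  # how many times the underlying clause was evaluated

    def get(self, clause: str, compute: Callable[[], FrozenSet[int]]) -> FrozenSet[int]:
        if clause not in self._entries:
            self.misses += 1
            self._entries[clause] = compute()  # the slow path runs only once
        return self._entries[clause]


def cached_clause_scores(
    cache: ToyFilterCache,
    clause: str,
    docs: List[dict],
    matches: Callable[[dict], bool],
    constant: float = 1.0,
) -> Dict[int, float]:
    """Score a cached clause: every matching doc gets the same constant score."""
    doc_ids = cache.get(
        clause,
        lambda: frozenset(i for i, d in enumerate(docs) if matches(d)),
    )
    return {doc_id: constant for doc_id in doc_ids}


docs = [{"foo": "bar"}, {"foo": "baz"}, {"foo": "bar", "bee": "blah"}]
cache = ToyFilterCache()
is_foo_bar = lambda d: d.get("foo") == "bar"

first = cached_clause_scores(cache, "foo:bar", docs, is_foo_bar)
second = cached_clause_scores(cache, "foo:bar", docs, is_foo_bar)  # served from cache
```

The second call never re-evaluates the clause, and relevance information such as term frequency plays no part in the scores; that is the trade-off users would need to be told about.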
                
> Rewrite field:* to use the filter cache
> ---------------------------------------
>
>                 Key: SOLR-5093
>                 URL: https://issues.apache.org/jira/browse/SOLR-5093
>             Project: Solr
>          Issue Type: New Feature
>          Components: query parsers
>            Reporter: David Smiley
>
> Sometimes people write a query including something like {{field:*}}, which matches all documents that have an indexed value in that field.  That can be particularly expensive for tokenized text, numeric, and spatial fields.  The expert advice is to index a separate boolean field that is used in place of these query clauses, but that's annoying to do and it can take users a while to realize that's what they need to do.
> I propose that Solr's query parser rewrite such queries to return a query backed by Solr's filter cache.  The underlying query runs once (slowly), and its result is then cached, after which it's super-fast to reuse.  Unfortunately Solr's filter cache is currently index-global, not per-segment; that's being handled in a separate issue.
> Related to this, it may be worth considering whether Solr should, behind the scenes, index a field that records which fields have indexed values; it could then use this indexed data to power these queries so they are always fast to execute.  Likewise, {{\[\* TO \*\]}} open-ended range queries could use this.
> For an example of how a user bumped into this, see:
> http://lucene.472066.n3.nabble.com/Performance-question-on-Spatial-Search-tt4081150.html
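
As a rough illustration of the "field that records which fields have indexed values" idea from the quoted description, here is a small Python sketch. It is not Solr code; the marker field name field_names and both helper functions are assumptions made up for this example. At index time each document gains a list of its non-empty field names, and an existence query such as field:* or field:[* TO *] is rewritten into a cheap single-term lookup on that marker field.

```python
# Toy sketch only: "field_names" and both helpers are hypothetical, not Solr's API.
from typing import Any, Dict

EXISTENCE_PATTERNS = ("*", "[* TO *]")


def with_field_names(doc: Dict[str, Any], marker_field: str = "field_names") -> Dict[str, Any]:
    """Return a copy of doc with a marker field listing its non-empty fields."""
    names = sorted(k for k, v in doc.items() if v not in (None, "", []))
    enriched = dict(doc)
    enriched[marker_field] = names
    return enriched


def rewrite_existence_query(query: str, marker_field: str = "field_names") -> str:
    """Rewrite 'field:*' or 'field:[* TO *]' into a term query on the marker field.

    Any other query, including the match-all '*:*', is returned unchanged.
    """
    field, sep, value = query.partition(":")
    if sep and field and field != "*" and value in EXISTENCE_PATTERNS:
        return f"{marker_field}:{field}"
    return query


indexed = with_field_names({"id": "1", "geo": "10,20", "title": ""})
```

For example, rewrite_existence_query("geo:*") yields a lookup on the marker field, while an ordinary clause such as foo:bar passes through untouched.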
