Posted to dev@lucene.apache.org by "Ahmet Arslan (JIRA)" <ji...@apache.org> on 2016/04/06 09:38:25 UTC

[jira] [Comment Edited] (LUCENE-7148) Support boolean subset matching

    [ https://issues.apache.org/jira/browse/LUCENE-7148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15227887#comment-15227887 ] 

Ahmet Arslan edited comment on LUCENE-7148 at 4/6/16 7:37 AM:
--------------------------------------------------------------

bq. Perhaps you mean something like Solr's frange that filters based on the value? 
Exactly. Given q=john smith, let's assume we have a field titleLength that stores the number of words in the title field. We could even extract that information from norm doc values later on. Something like:
{noformat}
fq={!frange l=0 u=0 cache=false cost=200} sub(titleLength, sum(termfreq(title,'smith'), termfreq(title,'john')))
{noformat}
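
The arithmetic behind that filter can be sketched in plain Java, outside of Solr. This is only an illustration under assumptions: the class and method names (FrangeSketch, termFreq, frangeValue, passes) are hypothetical, the title is modelled as a list of already-analyzed tokens, and the query terms are assumed to be distinct. A value of 0 means every token in the title is one of the query terms.

```java
import java.util.Arrays;
import java.util.List;

class FrangeSketch {
    // Stand-in for termfreq(title, term): occurrences of term in the title tokens.
    static int termFreq(List<String> titleTokens, String term) {
        int count = 0;
        for (String token : titleTokens) {
            if (token.equals(term)) {
                count++;
            }
        }
        return count;
    }

    // Stand-in for sub(titleLength, sum(termfreq(...), ...)).
    // Assumes queryTerms holds distinct terms; a repeated term would be counted twice.
    static int frangeValue(List<String> titleTokens, List<String> queryTerms) {
        int matched = 0;
        for (String term : queryTerms) {
            matched += termFreq(titleTokens, term);
        }
        return titleTokens.size() - matched;
    }

    // Stand-in for {!frange l=0 u=0}: keep the document only when the value is exactly 0.
    static boolean passes(List<String> titleTokens, List<String> queryTerms) {
        return frangeValue(titleTokens, queryTerms) == 0;
    }

    public static void main(String[] args) {
        List<String> query = Arrays.asList("john", "smith");
        System.out.println(passes(Arrays.asList("john", "smith"), query));       // true
        System.out.println(passes(Arrays.asList("john"), query));                // true
        System.out.println(passes(Arrays.asList("john", "smith", "jr"), query)); // false
    }
}
```

The second case shows why this covers the subset requirement: a title containing only "john" still yields 0, while any token outside the query makes the value positive.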

bq. That would be O(docs) as it evaluates per doc.
Can't we make this filter query execute last, with cache=false cost=150?


was (Author: iorixxx):
bq. Perhaps you mean something like Solr's frange that filters based on the value? 
Exactly. Given q=john smith, let's assume we have a field titleLength that stores the number of words in the title field. We could even extract that information from norm doc values later on. Something like fq={!frange l=0 u=0} sub(titleLength, sum(termfreq(title,'smith'), termfreq(title,'john')))

bq. That would be O(docs) as it evaluates per doc.
Can't we make this filter query execute last, with cache=false cost=150?

> Support boolean subset matching
> -------------------------------
>
>                 Key: LUCENE-7148
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7148
>             Project: Lucene - Core
>          Issue Type: New Feature
>          Components: core/search
>    Affects Versions: 5.x
>            Reporter: Otmar Caduff
>              Labels: newbie
>
> In Lucene, I know of the possibility of Occur.SHOULD, Occur.MUST and the “minimum should match” setting on the boolean query.
> Now, when querying, I want to
> - (1)  match the documents which either contain all the terms of the query (Occur.MUST for all terms would do that) or,
> - (2)  if all terms for a given field of a document are a subset of the query terms, that document should match as well.
> Example:
> Document d has field f with terms A, B, C
> Query with the following terms should match that document:
> A
> B
> A B
> A B C
> A B C D
> Query with the following terms should not match:
> D
> A B D
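
The matching rule requested in the issue can be written down as a small set-based check. This is a minimal sketch of the requested semantics only, not a Lucene query implementation; the class name SubsetMatchSketch and the method matches are hypothetical, and terms are modelled as plain strings.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class SubsetMatchSketch {
    // Rule (1): the document contains all query terms, or
    // rule (2): all document terms are contained in the query terms.
    static boolean matches(Set<String> docTerms, Set<String> queryTerms) {
        return docTerms.containsAll(queryTerms) || queryTerms.containsAll(docTerms);
    }

    // Convenience helper to build a term set from a list of strings.
    static Set<String> terms(String... values) {
        return new HashSet<>(Arrays.asList(values));
    }

    public static void main(String[] args) {
        Set<String> doc = terms("A", "B", "C"); // field f of document d

        System.out.println(matches(doc, terms("A")));                // true
        System.out.println(matches(doc, terms("A", "B", "C", "D"))); // true
        System.out.println(matches(doc, terms("D")));                // false
        System.out.println(matches(doc, terms("A", "B", "D")));      // false
    }
}
```

The results agree with the example above: A and A B C D match, while D and A B D do not.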


