Posted to java-user@lucene.apache.org by Fa...@emc.com on 2008/04/21 11:34:03 UTC

how to query against payload

Hi,

I want to use payloads to store an object id, which is an arbitrary byte
array, for better performance. But I also need some way of searching
against the payload value.

 

Also, once the hits are available, how do I get the payload of a specific
term from a document without setting the field as stored? Currently the
only available interface I have found is IndexReader.termPositions(Term).
It looks like we need to search again.

I've seen that there will be per-document payloads. When will they be ready?

 

Thanks,

Fang, Li

 


Re: how to query against payload

Posted by Grant Ingersoll <gs...@apache.org>.
Hmmm, sounds like you need a new Query.  I _think_ it could be
something as simple as a MultiplicativeTermQuery or something like that,
whereby instead of adding the score of the payload callback, you would
multiply.  That way, if the document with the term does not have the
payload of interest, you multiply by 0.  Or you wait until you have seen
all payloads for that document and, if any are the right one,
return a score of 1, otherwise return 0.  I think this would be
relatively easy to do using the BoostingTermQuery as a template.
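
A minimal sketch of the multiplicative idea described above, in plain Java
with no Lucene dependency. The class and method names (PayloadScoreSketch,
payloadFactor, multiplicativeScore, additiveScore) are hypothetical and are
not Lucene APIs; a real query would read the payloads from the index rather
than from a List.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical illustration only; none of these names are Lucene classes. */
final class PayloadScoreSketch {

    /** Returns 1 if any payload seen for the document equals the wanted bytes, otherwise 0. */
    static float payloadFactor(List<byte[]> payloadsInDoc, byte[] wanted) {
        for (byte[] payload : payloadsInDoc) {
            if (Arrays.equals(payload, wanted)) {
                return 1.0f;
            }
        }
        return 0.0f;
    }

    /** Multiplicative combination: a factor of 0 zeroes the score of the document. */
    static float multiplicativeScore(float termScore, float payloadFactor) {
        return termScore * payloadFactor;
    }

    /** Additive combination, for contrast: a factor of 0 leaves the term score untouched. */
    static float additiveScore(float termScore, float payloadFactor) {
        return termScore + payloadFactor;
    }

    public static void main(String[] args) {
        byte[] wanted = {0x01, 0x02};
        List<byte[]> docA = Arrays.asList(new byte[] {0x09}, new byte[] {0x01, 0x02});
        List<byte[]> docB = Collections.singletonList(new byte[] {0x09});

        System.out.println(multiplicativeScore(0.5f, payloadFactor(docA, wanted))); // 0.5
        System.out.println(multiplicativeScore(0.5f, payloadFactor(docB, wanted))); // 0.0
        System.out.println(additiveScore(0.5f, payloadFactor(docB, wanted)));       // 0.5
    }
}
```

With addition, docB still gets a non-zero score even though none of its
payloads match; with multiplication its score drops to 0, which is the
filtering effect the reply is after.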

Other thought:  Could you put the "payload" as a regular term at the  
same position and do a no-slop phrase query?  Or a slightly modified  
phrase query that requires both terms to be at the same position?
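
To illustrate the "same position" variant, here is a rough sketch in plain
Java of a toy positional index and a check that two terms occur at the same
position in a document. Everything here (SamePositionSketch, add,
docsWithBothAtSamePosition, the term "oid_ab12") is made up for the
example and is not a Lucene API. In Lucene itself the second token would
presumably be emitted with a position increment of 0 so that it shares the
position of the first; that part is not shown.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical toy index: term -> (docId -> positions). Not a Lucene class. */
final class SamePositionSketch {
    private final Map<String, Map<Integer, Set<Integer>>> postings = new HashMap<>();

    /** Records that the term occurs in the document at the given position. */
    void add(int docId, int position, String term) {
        postings.computeIfAbsent(term, t -> new TreeMap<>())
                .computeIfAbsent(docId, d -> new TreeSet<>())
                .add(position);
    }

    /** Returns, in ascending order, the documents where both terms share at least one position. */
    List<Integer> docsWithBothAtSamePosition(String termA, String termB) {
        Map<Integer, Set<Integer>> a = postings.getOrDefault(termA, Collections.emptyMap());
        Map<Integer, Set<Integer>> b = postings.getOrDefault(termB, Collections.emptyMap());
        List<Integer> result = new ArrayList<>();
        for (Map.Entry<Integer, Set<Integer>> entry : a.entrySet()) {
            Set<Integer> otherPositions = b.get(entry.getKey());
            if (otherPositions == null) {
                continue; // termB does not occur in this document at all
            }
            for (Integer position : entry.getValue()) {
                if (otherPositions.contains(position)) {
                    result.add(entry.getKey());
                    break; // one shared position is enough
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        SamePositionSketch index = new SamePositionSketch();
        index.add(1, 3, "invoice");
        index.add(1, 3, "oid_ab12"); // same position as "invoice"
        index.add(2, 3, "invoice");
        index.add(2, 4, "oid_ab12"); // adjacent position, not the same one
        index.add(3, 0, "invoice");  // "oid_ab12" absent
        System.out.println(index.docsWithBothAtSamePosition("invoice", "oid_ab12")); // [1]
    }
}
```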

Just thinking out loud,
Grant

On Apr 22, 2008, at 2:51 AM, Fang_Li@emc.com wrote:

> Hi Grant,
> 	Thanks for your help.
>
> BoostingTermQuery uses reader.termPositions(term) to get the term
> positions. The Term itself cannot carry a payload value, so it cannot be
> used to select the result documents. What I want is to
>
> find all documents which have a specific payload value on a specific
> term. We do not care about the value of the term.
>
> The reason is that we don't want to store the binary information as a
> value; we hope we can speed up queries by using payloads instead. I am
> not sure this is a good reason to do it this way.
>
> Thanks,
>
> Li

--------------------------
Grant Ingersoll

Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ







---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


RE: how to query against payload

Posted by Fa...@emc.com.
Hi Grant,
	Thanks for your help.

BoostingTermQuery uses reader.termPositions(term) to get the term
positions. The Term itself cannot carry a payload value, so it cannot be
used to select the result documents. What I want is to

find all documents which have a specific payload value on a specific
term. We do not care about the value of the term.

The reason is that we don't want to store the binary information as a
value; we hope we can speed up queries by using payloads instead. I am
not sure this is a good reason to do it this way.
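
As a rough illustration of what such a lookup involves, the sketch below
scans every occurrence of one term and keeps the documents whose payload
equals the wanted bytes. It is plain Java over an in-memory list; the names
PayloadScanSketch, Posting and docsWithPayload are hypothetical and stand
in for whatever the index actually exposes. Note that in this sketch the
scan touches every posting of the term, so it does not by itself show
whether payloads would be faster than indexing the id as a term.

```java
import java.util.Arrays;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

/** Hypothetical illustration only; not a Lucene class. */
final class PayloadScanSketch {

    /** One occurrence of the term: document, position, and the payload stored there (may be null). */
    static final class Posting {
        final int docId;
        final int position;
        final byte[] payload;

        Posting(int docId, int position, byte[] payload) {
            this.docId = docId;
            this.position = position;
            this.payload = payload;
        }
    }

    /** Linear scan over the postings of a single term, collecting documents with a matching payload. */
    static SortedSet<Integer> docsWithPayload(List<Posting> postingsForTerm, byte[] wanted) {
        SortedSet<Integer> docs = new TreeSet<>();
        for (Posting posting : postingsForTerm) {
            if (posting.payload != null && Arrays.equals(posting.payload, wanted)) {
                docs.add(posting.docId);
            }
        }
        return docs;
    }

    public static void main(String[] args) {
        byte[] wanted = {0x0B};
        List<Posting> postings = Arrays.asList(
                new Posting(1, 0, new byte[] {0x0A}),
                new Posting(1, 5, new byte[] {0x0B}),
                new Posting(2, 2, new byte[] {0x0B}),
                new Posting(3, 1, null),
                new Posting(4, 7, new byte[] {0x0B}));
        System.out.println(docsWithPayload(postings, wanted)); // [1, 2, 4]
    }
}
```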

Thanks,

Li




Re: how to query against payload

Posted by Grant Ingersoll <gs...@apache.org>.
On Apr 21, 2008, at 5:34 AM, Fang_Li@emc.com wrote:

> Hi,
>
> I want to use payloads to store an object id, which is an arbitrary byte
> array, for better performance. But I also need some way of searching
> against the payload value.
>

Have a look at the BoostingTermQuery.  If you need more than that, you  
could create some new queries using that as a model.

>
>
> Also, once the hits are available, how do I get the payload of a specific
> term from a document without setting the field as stored? Currently the
> only available interface I have found is IndexReader.termPositions(Term).
> It looks like we need to search again.

See https://issues.apache.org/jira/browse/LUCENE-1001 for this.  Note,
however, that the patch there is not going to work.  If you can help out
on it, that would be great.


>
>
> I've seen there will be per document payload. When will it be ready?
>
>
>
> Thanks,
>
> Fang, Li
>
>
>

--------------------------
Grant Ingersoll
