Posted to dev@lucene.apache.org by "Grant Ingersoll (JIRA)" <ji...@apache.org> on 2007/03/17 22:42:09 UTC

[jira] Updated: (LUCENE-834) Payload Queries

     [ https://issues.apache.org/jira/browse/LUCENE-834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Grant Ingersoll updated LUCENE-834:
-----------------------------------

    Attachment: boosting.term.query.patch

First draft of a BoostingTermQuery, which is based on SpanTermQuery and can be used to boost the score of a term based on what is in the payload (for example, weighting terms higher according to their font size or part of speech).

A couple of classes that were previously package-private are now public; they have been marked as public for derivational purposes only.
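
As a rough illustration of the idea (this is not the code in the attached patch), the sketch below multiplies a term's base score by a boost decoded from the payload stored at each matching position. The class name PayloadBoostSketch, the one-byte part-of-speech encoding, the boost values, and the choice to average the boosts over positions are all assumptions made for the example.

```java
/** Hypothetical sketch of payload-based term boosting; not the Lucene API. */
final class PayloadBoostSketch {

    // Assumed encoding: one payload byte per term position naming a part of speech.
    static final byte OTHER = 0;
    static final byte NOUN = 1;
    static final byte VERB = 2;

    /** Maps a payload to a boost factor; a missing or empty payload is neutral (1.0). */
    static float boostFor(byte[] payload) {
        if (payload == null || payload.length == 0) {
            return 1.0f;
        }
        switch (payload[0]) {
            case NOUN: return 2.0f;   // assumed weight for nouns
            case VERB: return 1.5f;   // assumed weight for verbs
            default:   return 1.0f;
        }
    }

    /**
     * Scales the base term score by the average payload boost over all
     * positions where the term matched in the document.
     */
    static float score(float baseScore, byte[][] payloadsAtPositions) {
        if (payloadsAtPositions.length == 0) {
            return baseScore;
        }
        float sum = 0f;
        for (byte[] payload : payloadsAtPositions) {
            sum += boostFor(payload);
        }
        return baseScore * (sum / payloadsAtPositions.length);
    }

    public static void main(String[] args) {
        // The term occurs twice: once tagged as a noun, once untagged.
        byte[][] payloads = { { NOUN }, { OTHER } };
        System.out.println(score(2.0f, payloads));
    }
}
```

Averaging is only one possible way to combine per-position boosts; taking the maximum or the sum would be equally easy to plug into score().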


See CHANGES.txt for some more details.

I believe all tests still pass.

> Payload Queries
> ---------------
>
>                 Key: LUCENE-834
>                 URL: https://issues.apache.org/jira/browse/LUCENE-834
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: Search
>            Reporter: Grant Ingersoll
>         Assigned To: Grant Ingersoll
>            Priority: Minor
>         Attachments: boosting.term.query.patch
>
>
> Now that payloads have been implemented, it would be good to make them searchable via one or more Query mechanisms.  See http://wiki.apache.org/lucene-java/Payload_Planning for some background information and https://issues.apache.org/jira/browse/LUCENE-755 for the issue that started it all.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


Re: [jira] Updated: (LUCENE-834) Payload Queries

Posted by Marvin Humphrey <ma...@rectangular.com>.
On Mar 17, 2007, at 3:02 PM, Andrzej Bialecki wrote:

> We had a discussion recently in Nutch about changing the way  
> typical Nutch queries are translated into Lucene queries, and  
> performance implications there.

This thread describes precisely one of the motivations behind the  
"flexible indexing" design.  It's already implemented in KinoSearch,  
using a unified postings file.  There is no "boosting term query"  
class -- TermScorer adjusts based on whether the postings format  
specifies boost per-position or per-document.
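
A minimal sketch of that design idea, written in Java for consistency with the rest of the thread rather than taken from KinoSearch: a single scorer consults the postings format to decide whether to apply a per-document boost or per-position boosts. Every name here (UnifiedScorerSketch, BoostGranularity, Posting) is hypothetical, and averaging the position boosts is an assumption.

```java
/** Hypothetical sketch of one scorer handling both boost granularities. */
final class UnifiedScorerSketch {

    /** What the (assumed) postings format says about where boosts are stored. */
    enum BoostGranularity { PER_DOCUMENT, PER_POSITION }

    /** One posting: a document-level boost plus optional boosts per position. */
    static final class Posting {
        final float docBoost;
        final float[] positionBoosts; // only consulted for PER_POSITION formats

        Posting(float docBoost, float[] positionBoosts) {
            this.docBoost = docBoost;
            this.positionBoosts = positionBoosts;
        }
    }

    /** Adjusts the base score according to the granularity the format declares. */
    static float score(float baseScore, Posting posting, BoostGranularity granularity) {
        if (granularity == BoostGranularity.PER_DOCUMENT
                || posting.positionBoosts.length == 0) {
            return baseScore * posting.docBoost;
        }
        float sum = 0f;
        for (float boost : posting.positionBoosts) {
            sum += boost;
        }
        // Assumption: combine per-position boosts by averaging them.
        return baseScore * (sum / posting.positionBoosts.length);
    }
}
```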

Marvin Humphrey
Rectangular Research
http://www.rectangular.com/




Re: [jira] Updated: (LUCENE-834) Payload Queries

Posted by Andrzej Bialecki <ab...@getopt.org>.
Grant Ingersoll wrote:
> You know, the Nutch Dev mailing list was my last holdout for 
> subscriptions to the Lucene mailing lists!  :-)  I can barely keep up 
> with Lucene Java!
>
> I will try to have a read soon, but can't promise I can add anything 
> meaningful.

Yes, I know what you mean ... I'd be grateful if you could just take a 
look at the thread I indicated. It doesn't require any Nutch knowledge, 
because it's really about how to simplify complex Lucene boolean and 
phrase queries over several fields - and this is a real-life case that 
could perhaps serve as a good illustration in your talk ;)

Andrzej.


Re: [jira] Updated: (LUCENE-834) Payload Queries

Posted by Grant Ingersoll <gs...@apache.org>.
On Mar 17, 2007, at 6:02 PM, Andrzej Bialecki wrote:

> Grant Ingersoll (JIRA) wrote:
>
> Grant,
>
> This is great stuff! I know quite a few projects that will love  
> this - specifically to boost terms differently based on a POS tag.
>

Michael B. did a great job implementing the underlying storage  
mechanisms, so most kudos should go to him.

I/we hope to add several other types of Queries (see
http://wiki.apache.org/lucene-java/Payload_Planning and add your own thoughts).

POS, font weights, information from NLP applications, XPath,
cross-references.  It's all good!

I am planning to have a few slides in my ApacheCon talk come May on  
the subject.

> We had a discussion recently in Nutch about changing the way  
> typical Nutch queries are translated into Lucene queries, and  
> performance implications there. If you're looking for a  
> challenge ;) could you perhaps take a look at this discussion and  
> see if you could contribute something? ;)

You know, the Nutch Dev mailing list was my last holdout for  
subscriptions to the Lucene mailing lists!  :-)  I can barely keep up  
with Lucene Java!

I will try to have a read soon, but can't promise I can add anything  
meaningful.

Cheers,
Grant




Re: [jira] Updated: (LUCENE-834) Payload Queries

Posted by Andrzej Bialecki <ab...@getopt.org>.
Grant Ingersoll (JIRA) wrote:
>      [ https://issues.apache.org/jira/browse/LUCENE-834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
>
> Grant Ingersoll updated LUCENE-834:
> -----------------------------------
>
>     Attachment: boosting.term.query.patch
>
> First draft of a BoostingTermQuery, which is based on SpanTermQuery and can be used to boost the score of a term based on what is in the payload (for example, weighting terms higher according to their font size or part of speech).
>
> A couple of classes that were previously package-private are now public; they have been marked as public for derivational purposes only.
>
>
> See CHANGES.txt for some more details.
>
> I believe all tests still pass.
>
>   

Grant,

This is great stuff! I know quite a few projects that will love this - 
specifically to boost terms differently based on a POS tag.

We had a discussion recently in Nutch about changing the way typical 
Nutch queries are translated into Lucene queries, and performance 
implications there. If you're looking for a challenge ;) could you 
perhaps take a look at this discussion and see if you could contribute 
something? ;)

http://www.nabble.com/Performance-optimization-for-Nutch-index---query-tf3276316.html

Thanks in advance!

-- 
Best regards,
Andrzej Bialecki     <><
___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com



