Posted to dev@lucene.apache.org by "Ron Mayer (JIRA)" <ji...@apache.org> on 2010/08/19 13:58:16 UTC
[jira] Created: (SOLR-2058) Adds optional "slop" to "pf2", "pf3"
and "pf" parameters
Adds optional "slop" to "pf2", "pf3" and "pf" parameters
---------------------------------------------------------
Key: SOLR-2058
URL: https://issues.apache.org/jira/browse/SOLR-2058
Project: Solr
Issue Type: Improvement
Components: SearchComponents - other
Affects Versions: 3.1, 4.0
Environment: n/a
Reporter: Ron Mayer
Priority: Minor
http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201008.mbox/%3C4C659119.2010007@0ape.com%3E
{quote}
From Ron Mayer <r....@0ape.com>
my results might
be even better if I had a couple of different "pf2"s with different "ps" values
at the same time.
In particular, one with ps=0 to put a high boost on documents that have
the right ordering of words. For example, ensuring that the query:
"red hat black jacket"
boosts only documents with "red hat" and not "black hat".
And another pf2 with a more modest boost and ps=5 or so, so that
the query above also boosts docs with "red baseball hat".
{quote}
[http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201008.mbox/%3CAANLkTimd+V3g6d_MnHP+JYkKd+dej8FVMvF_1LQoiLBU@mail.gmail.com%3E]
{quote}
From Yonik Seeley <yo...@lucidimagination.com>
Perhaps fold it into the pf/pf2 syntax?
pf=text^2 // current syntax... makes phrases with a boost of 2
pf=text~1^2 // proposed syntax... makes phrases with a slop of 1 and a boost of 2
That actually seems pretty natural given the lucene query syntax - an
actual boosted sloppy phrase query already looks like
text:"foo bar"~1^2
-Yonik
{quote}
[http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201008.mbox/%3Calpine.DEB.1.10.1008161300510.6335@radix.cryptio.net%3E]
{quote}
From Chris Hostetter <ho...@fucit.org>
Big +1 to this idea ... the existing "ps" param can stick around as the
default for any field that doesn't specify its own slop in the pf/pf2/pf3
fields using the "~" syntax.
-Hoss
{quote}
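As a rough illustration of the syntax discussed above, here is a minimal Java sketch that splits one pf/pf2/pf3 entry of the form field~slop^boost into its parts and falls back to the global "ps" value when no "~" is given. The class name PhraseFieldSpec, its methods, and the default boost of 1.0 are assumptions made for this example; this is not the code from the attached patch.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical holder for one parsed pf/pf2/pf3 entry; not Solr's actual class. */
final class PhraseFieldSpec {
    final String field;
    final int slop;
    final float boost;

    PhraseFieldSpec(String field, int slop, float boost) {
        this.field = field;
        this.slop = slop;
        this.boost = boost;
    }

    /** Parses one entry such as "text", "text^2", "text~1" or "text~1^2". */
    static PhraseFieldSpec parse(String entry, int defaultSlop) {
        String rest = entry.trim();
        float boost = 1.0f;              // assumed default when no ^boost is given
        int caret = rest.indexOf('^');
        if (caret >= 0) {
            boost = Float.parseFloat(rest.substring(caret + 1));
            rest = rest.substring(0, caret);
        }
        int slop = defaultSlop;          // fall back to the global "ps" value
        int tilde = rest.indexOf('~');
        if (tilde >= 0) {
            slop = Integer.parseInt(rest.substring(tilde + 1));
            rest = rest.substring(0, tilde);
        }
        return new PhraseFieldSpec(rest, slop, boost);
    }

    /** Parses a whitespace-separated parameter value such as "name^12 name~10". */
    static List<PhraseFieldSpec> parseAll(String paramValue, int defaultSlop) {
        List<PhraseFieldSpec> specs = new ArrayList<>();
        for (String entry : paramValue.trim().split("\\s+")) {
            if (!entry.isEmpty()) {
                specs.add(parse(entry, defaultSlop));
            }
        }
        return specs;
    }

    public static void main(String[] args) {
        PhraseFieldSpec spec = parse("text~1^2", 5);
        System.out.println(spec.field + " slop=" + spec.slop + " boost=" + spec.boost);
    }
}
```

With this shape, an entry without "~" (for example text^2) simply inherits whatever was passed as defaultSlop, which matches the suggestion that "ps" remain the default.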
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org
[jira] Commented: (SOLR-2058) Adds optional "phrase slop" to
edismax "pf2", "pf3" and "pf" parameters with field~slop^boost syntax
Posted by "Ron Mayer (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/SOLR-2058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904352#action_12904352 ]
Ron Mayer commented on SOLR-2058:
---------------------------------
Also wanted to note: I've been using this on a QA machine with 4 million documents, and it has been working extremely well for me with multiple simultaneous phrase slops.
In particular, if I use:
* a high boost (500) on pf with slop of 0
* a moderate boost (50) on pf with a slop of 50
* a moderate boost (50) on pf2 with a slop of 0
* a low boost (10) on pf2 with a slop of 10
it's doing a *great* job of getting the most relevant document in the #1 spot, and a very good job at getting the entire first page of results filled with highly relevant documents.
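For illustration only, the four tiers listed above could be sent as request parameters along these lines. The sketch assumes the phrase field is called "text" (the comment does not name the field), that each tier is passed by repeating the pf or pf2 parameter, and that the helper names are made up for this example.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class TieredPhraseBoostRequest {
    /** Formats one entry in the field~slop^boost syntax proposed in this issue. */
    static String entry(String field, int slop, int boost) {
        return field + "~" + slop + "^" + boost;
    }

    /** Builds the query string; "text" as the field name is an assumption. */
    static String buildQueryString(String userQuery) {
        List<String[]> params = new ArrayList<>();
        params.add(new String[] {"defType", "edismax"});
        params.add(new String[] {"q", userQuery});
        params.add(new String[] {"qf", "text"});
        params.add(new String[] {"pf", entry("text", 0, 500)});   // high boost, exact phrase
        params.add(new String[] {"pf", entry("text", 50, 50)});   // moderate boost, loose phrase
        params.add(new String[] {"pf2", entry("text", 0, 50)});   // moderate boost, exact pairs
        params.add(new String[] {"pf2", entry("text", 10, 10)});  // low boost, loose pairs

        StringBuilder sb = new StringBuilder();
        for (String[] p : params) {
            if (sb.length() > 0) {
                sb.append('&');
            }
            // URLEncoder percent-encodes '~' and '^' and turns spaces into '+'.
            sb.append(p[0]).append('=')
              .append(URLEncoder.encode(p[1], StandardCharsets.UTF_8));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(buildQueryString("red hat black jacket"));
    }
}
```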
[jira] Updated: (SOLR-2058) Adds optional "slop" to "pf2", "pf3"
and "pf" parameters
Posted by "Ron Mayer (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/SOLR-2058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ron Mayer updated SOLR-2058:
----------------------------
Attachment: pf2_with_slop.patch
This patch is my first draft at implementing this feature.
Any feedback would be appreciated.
It seems to happily turn a query like
[http://localhost:8983/solr/select?defType=edismax&fl=id,text,score&q=enterprise+search+foobar&ps=5&qf=text&debugQuery=true&pf2=name~0^5555&ps=7&pf2=name^12+name~10]
into what I believe is the desired parsed query:
+((text:enterpris) (text:search) (text:foobar)) ((name:"enterprise search"~5^12.0) (name:"search foobar"~5^12.0)) ((name:"enterprise search"^5555.0) (name:"search foobar"^5555.0)) ((name:"enterprise search"~10) (name:"search foobar"~10))
which looks like it should give a high boost to docs where both words appear right next to each other, but still substantial boosts to docs where the pairs of words are a few words apart.
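The parsed query above contains one phrase clause per adjacent word pair for each pf2 entry. A minimal sketch of that expansion follows; it only builds strings that resemble the debug output and does not use Lucene's query classes. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class PhraseShingles {
    /** Returns every run of n adjacent words; n = 2 yields the pairs that pf2 uses. */
    static List<String> shingles(List<String> words, int n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be at least 1");
        }
        List<String> out = new ArrayList<>();
        for (int i = 0; i + n <= words.size(); i++) {
            out.add(String.join(" ", words.subList(i, i + n)));
        }
        return out;
    }

    /** Renders a phrase roughly the way the debug output prints it. */
    static String render(String field, String phrase, int slop, float boost) {
        StringBuilder sb = new StringBuilder();
        sb.append(field).append(":\"").append(phrase).append('"');
        if (slop != 0) {
            sb.append('~').append(slop);   // a slop of 0 is left implicit
        }
        if (boost != 1.0f) {
            sb.append('^').append(boost);  // a boost of 1 is left implicit
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("enterprise", "search", "foobar");
        for (String pair : shingles(words, 2)) {
            System.out.println(render("name", pair, 0, 5555.0f));
            System.out.println(render("name", pair, 5, 12.0f));
            System.out.println(render("name", pair, 10, 1.0f));
        }
    }
}
```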
[jira] Updated: (SOLR-2058) Adds optional "phrase slop" to "pf2",
"pf3" and "pf" parameters with field~slop^boost syntax
Posted by "Ron Mayer (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/SOLR-2058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ron Mayer updated SOLR-2058:
----------------------------
Original Estimate: (was: 168h)
Remaining Estimate: (was: 168h)
[jira] Commented: (SOLR-2058) Adds optional "phrase slop" to
edismax "pf2", "pf3" and "pf" parameters with field~slop^boost syntax
Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/SOLR-2058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12900849#action_12900849 ]
Yonik Seeley commented on SOLR-2058:
------------------------------------
Map<Integer,Map<String,Float>> is an "interesting" way to encode the slop and the boost :-)
But perhaps we should make a FieldParams class?
We could keep the separate pf,pf2,pf3 maps... or encode the number of terms to make phrases out of in the FieldParams class:
{code}
class FieldParams {
  int wordGrams; // make bigrams if 2, trigrams if 3, or all if Integer.MAX_VALUE
  int slop;      // phrase slop for this field
  float boost;   // boost applied to the generated phrase queries
}
{code}
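To make the second option concrete, here is a hedged sketch of keeping all pf/pf2/pf3 entries in one flat list and selecting them by wordGrams. A flat list also lets the same field appear more than once with different slops, which a Map keyed by field name cannot express. The class FieldParamsSketch, the added field member, and the helper method are assumptions for illustration, not the class that ended up in the patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical variant of the class sketched above, with the field name added. */
final class FieldParamsSketch {
    final String field;
    final int wordGrams; // 2 = bigrams, 3 = trigrams, Integer.MAX_VALUE = whole phrase
    final int slop;
    final float boost;

    FieldParamsSketch(String field, int wordGrams, int slop, float boost) {
        this.field = field;
        this.wordGrams = wordGrams;
        this.slop = slop;
        this.boost = boost;
    }

    /** Selects the entries that apply to phrases of the given size. */
    static List<FieldParamsSketch> forWordGrams(List<FieldParamsSketch> all, int wordGrams) {
        List<FieldParamsSketch> out = new ArrayList<>();
        for (FieldParamsSketch p : all) {
            if (p.wordGrams == wordGrams) {
                out.add(p);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<FieldParamsSketch> all = Arrays.asList(
                new FieldParamsSketch("name", 2, 0, 5555f),                 // pf2=name~0^5555
                new FieldParamsSketch("name", 2, 10, 1f),                   // pf2=name~10
                new FieldParamsSketch("name", Integer.MAX_VALUE, 5, 12f));  // pf=name~5^12
        System.out.println(forWordGrams(all, 2).size() + " bigram entries");
    }
}
```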
[jira] Updated: (SOLR-2058) Adds optional "phrase slop" to edismax
"pf2", "pf3" and "pf" parameters with field~slop^boost syntax
Posted by "Ron Mayer (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/SOLR-2058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ron Mayer updated SOLR-2058:
----------------------------
Summary: Adds optional "phrase slop" to edismax "pf2", "pf3" and "pf" parameters with field~slop^boost syntax (was: Adds optional "phrase slop" to "pf2", "pf3" and "pf" parameters with field~slop^boost syntax)
[jira] Commented: (SOLR-2058) Adds optional "phrase slop" to
edismax "pf2", "pf3" and "pf" parameters with field~slop^boost syntax
Posted by "Ron Mayer (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/SOLR-2058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904351#action_12904351 ]
Ron Mayer commented on SOLR-2058:
---------------------------------
Totally agree that the way I encoded the boost was bizarre.
I did that in my first draft just to minimize the impact on the rest of the code (where some functions were expecting the "Map<String,Float>" pieces).
I'll post an updated patch with a more sane class like the one you described. I'm new enough to the code that I'm not sure where such a class should reside. Any opinions?
[jira] Issue Comment Edited: (SOLR-2058) Adds optional "phrase
slop" to edismax "pf2", "pf3" and "pf" parameters with field~slop^boost
syntax
Posted by "Ron Mayer (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/SOLR-2058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904390#action_12904390 ]
Ron Mayer edited comment on SOLR-2058 at 8/30/10 6:40 PM:
----------------------------------------------------------
Submitted an updated patch to use a more sane FieldParams class to pass fields, boosts, and phrase slops instead of the bizarre Map<Integer,Map<String,Float>> I was using before.
was (Author: ramayer):
Updated to use more sane FieldParams class to pass fields, boosts, and phrase slops instead of the bizarre Map<Integer,Map<String,Float>> I was using before.
[jira] Updated: (SOLR-2058) Adds optional "phrase slop" to edismax
"pf2", "pf3" and "pf" parameters with field~slop^boost syntax
Posted by "Ron Mayer (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/SOLR-2058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ron Mayer updated SOLR-2058:
----------------------------
Attachment: edismax_pf_with_slop_v2.patch
Updated to use more sane FieldParams class to pass fields, boosts, and phrase slops instead of the bizarre Map<Integer,Map<String,Float>> I was using before.
[jira] Updated: (SOLR-2058) Adds optional "phrase slop" to edismax
"pf2", "pf3" and "pf" parameters with field~slop^boost syntax
Posted by "Ron Mayer (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/SOLR-2058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ron Mayer updated SOLR-2058:
----------------------------
Attachment: edismax_pf_with_slop_v2.1.patch
Removed a couple of unnecessary lines compared to the last version.
[jira] Updated: (SOLR-2058) Adds optional "phrase slop" to "pf2",
"pf3" and "pf" parameters with field~slop^boost syntax
Posted by "Ron Mayer (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/SOLR-2058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ron Mayer updated SOLR-2058:
----------------------------
Summary: Adds optional "phrase slop" to "pf2", "pf3" and "pf" parameters with field~slop^boost syntax (was: Adds optional "slop" to "pf2", "pf3" and "pf" parameters )
[jira] Issue Comment Edited: (SOLR-2058) Adds optional "slop" to
"pf2", "pf3" and "pf" parameters
Posted by "Ron Mayer (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/SOLR-2058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12900260#action_12900260 ]
Ron Mayer edited comment on SOLR-2058 at 8/19/10 8:04 AM:
----------------------------------------------------------
This patch is my first draft at implementing this feature.
Any feedback would be appreciated.
It seems to happily turn a query like
[http://localhost:8983/solr/select?defType=edismax&fl=id,text,score&q=enterprise+search+foobar&ps=5&qf=text&debugQuery=true&pf2=name~0^5555&pf2=name^12+name~10]
into what I believe is the desired parsed query:
+((text:enterpris) (text:search) (text:foobar)) ((name:"enterprise search"~5^12.0) (name:"search foobar"~5^12.0)) ((name:"enterprise search"^5555.0) (name:"search foobar"^5555.0)) ((name:"enterprise search"~10) (name:"search foobar"~10))
which looks like it should give a high boost to docs where both words appear right next to each other, but still substantial boosts to docs where the pairs of words are a few words apart.
was (Author: ramayer):
This patch is my first draft at implementing this feature.
Any feedback would be appreciated.
It seems to happily turn a query like
[http://localhost:8983/solr/select?defType=edismax&fl=id,text,score&q=enterprise+search+foobar&ps=5&qf=text&debugQuery=true&pf2=name~0^5555&ps=7&pf2=name^12+name~10]
into what I believe is the desired parsed query:
+((text:enterpris) (text:search) (text:foobar)) ((name:"enterprise search"~5^12.0) (name:"search foobar"~5^12.0)) ((name:"enterprise search"^5555.0) (name:"search foobar"^5555.0)) ((name:"enterprise search"~10) (name:"search foobar"~10))
which looks like it should give a high boost to docs where both words appear right next to each other, but still substantial boosts to docs where the pairs of words are a few words apart.
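The shape of the parsed query above can be reproduced with a small sketch. This is not the code in pf2_with_slop.patch; it is a hypothetical illustration that assumes each pf2 entry is a (field, slop, boost) triple, that a missing slop falls back to ps, and that a slop of 0 is printed without the "~" suffix, as in the debug output. The name pf2_clauses is made up for the example.

```python
from typing import List, Optional, Sequence, Tuple

# (field, slop, boost); slop None means "use ps", boost None means "no boost".
# This triple layout is an assumption for illustration only.
Spec = Tuple[str, Optional[int], Optional[float]]


def pf2_clauses(terms: Sequence[str], specs: Sequence[Spec], ps: int) -> List[str]:
    """Build one group of word-pair phrase clauses per pf2 entry."""
    clauses = []
    for field, slop, boost in specs:
        # Per-field slop wins; otherwise fall back to the global ps value.
        slop = ps if slop is None else slop
        parts = []
        # Pair each term with its successor: (t0, t1), (t1, t2), ...
        for first, second in zip(terms, terms[1:]):
            phrase = f'{field}:"{first} {second}"'
            if slop > 0:
                phrase += f"~{slop}"
            if boost is not None:
                phrase += f"^{float(boost)}"
            parts.append(f"({phrase})")
        clauses.append("(" + " ".join(parts) + ")")
    return clauses


if __name__ == "__main__":
    specs = [("name", 0, 5555.0), ("name", None, 12.0), ("name", 10, None)]
    for clause in pf2_clauses(["enterprise", "search", "foobar"], specs, ps=5):
        print(clause)
```

The clauses come out in the order the specs are given, which need not match the order Solr prints them in.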
[jira] Updated: (SOLR-2058) Adds optional "phrase slop" to edismax
"pf2", "pf3" and "pf" parameters with field~slop^boost syntax
Posted by "Ron Mayer (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/SOLR-2058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ron Mayer updated SOLR-2058:
----------------------------
Description:
http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201008.mbox/%3C4C659119.2010007@0ape.com%3E
{quote}
From Ron Mayer <r....@0ape.com>
... my results might be even better if I had a couple of different "pf2"s with different "ps"'s at the same time. In particular, one with ps=0 to put a high boost on the ones that have the right ordering of words. For example, ensuring that [the query]:
"red hat black jacket"
boosts only documents with "red hats" and not "black hats". And another pf2 with a more modest boost with ps=5 or so, so that the query above also boosts docs with
"red baseball hat".
{quote}
[http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201008.mbox/%3CAANLkTimd+V3g6d_MnHP+JYkKd+dej8FVMvF_1LQoiLBU@mail.gmail.com%3E]
{quote}
From Yonik Seeley <yo...@lucidimagination.com>
Perhaps fold it into the pf/pf2 syntax?
pf=text^2 // current syntax... makes phrases with a boost of 2
pf=text~1^2 // proposed syntax... makes phrases with a slop of 1 and a boost of 2
That actually seems pretty natural given the lucene query syntax - an
actual boosted sloppy phrase query already looks like
{{text:"foo bar"~1^2}}
-Yonik
{quote}
[http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201008.mbox/%3Calpine.DEB.1.10.1008161300510.6335@radix.cryptio.net%3E]
{quote}
From Chris Hostetter <ho...@fucit.org>
Big +1 to this idea ... the existing "ps" param can stick around as the
default for any field that doesn't specify its own slop in the pf/pf2/pf3
fields using the "~" syntax.
-Hoss
{quote}
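The proposed field~slop^boost syntax, with "ps" as the fallback slop, could be parsed along these lines. This is a minimal sketch and not Solr's actual parser: the grammar (whitespace-separated entries, integer slop, decimal boost) and the names PhraseField, parse_phrase_fields and effective_slop are assumptions made for the example.

```python
import re
from typing import List, NamedTuple, Optional


class PhraseField(NamedTuple):
    field: str
    slop: Optional[int]     # None: no "~slop" given, fall back to ps
    boost: Optional[float]  # None: no "^boost" given


# field, then optional ~slop, then optional ^boost (assumed grammar).
_SPEC = re.compile(r"([^~^\s]+)(?:~(\d+))?(?:\^(\d+(?:\.\d+)?))?")


def parse_phrase_fields(param: str) -> List[PhraseField]:
    """Parse a pf/pf2/pf3 value such as 'name^12 name~10'."""
    parsed = []
    for token in param.split():
        match = _SPEC.fullmatch(token)
        if match is None:
            raise ValueError(f"cannot parse phrase field spec: {token!r}")
        field, slop, boost = match.groups()
        parsed.append(PhraseField(
            field,
            int(slop) if slop is not None else None,
            float(boost) if boost is not None else None,
        ))
    return parsed


def effective_slop(spec: PhraseField, ps: int) -> int:
    """Per-field slop if given, otherwise the global ps default."""
    return ps if spec.slop is None else spec.slop


if __name__ == "__main__":
    for spec in parse_phrase_fields("text^2 text~1^2"):
        print(spec, "-> slop", effective_slop(spec, ps=0))
```

Keeping the slop as None until lookup time, rather than substituting ps during parsing, lets the same parsed entries be reused with a different default.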