Posted to dev@lucene.apache.org by "Ron Mayer (JIRA)" <ji...@apache.org> on 2010/08/19 13:58:16 UTC

[jira] Created: (SOLR-2058) Adds optional "slop" to "pf2", "pf3" and "pf" parameters

Adds optional "slop" to "pf2", "pf3" and "pf" parameters 
---------------------------------------------------------

                 Key: SOLR-2058
                 URL: https://issues.apache.org/jira/browse/SOLR-2058
             Project: Solr
          Issue Type: Improvement
          Components: SearchComponents - other
    Affects Versions: 3.1, 4.0
         Environment: n/a
            Reporter: Ron Mayer
            Priority: Minor


http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201008.mbox/%3C4C659119.2010007@0ape.com%3E
{quote}
From	Ron Mayer <r....@0ape.com>
 my results might
 be even better if I had a couple different "pf2"s with different "ps"'s
 at the same time.

 In particular, one with ps=0 to put a high boost on ones that have
 the right ordering of words.  For example, ensuring that:
  "red hat black jacket"
 boosts only red hats and not black hats.

 And another pf2 with a more modest boost with ps=5 or so to handle
 the query above also boosting docs with "red baseball hat".
{quote}

[http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201008.mbox/%3CAANLkTimd+V3g6d_MnHP+JYkKd+dej8FVMvF_1LQoiLBU@mail.gmail.com%3E]
{quote}
From	Yonik Seeley <yo...@lucidimagination.com>

Perhaps fold it into the pf/pf2 syntax?

pf=text^2    // current syntax... makes phrases with a boost of 2
pf=text~1^2  // proposed syntax... makes phrases with a slop of 1 and a boost of 2

That actually seems pretty natural given the lucene query syntax - an
actual boosted sloppy phrase query already looks like
text:"foo bar"~1^2

-Yonik
http
{quote}

[http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201008.mbox/%3Calpine.DEB.1.10.1008161300510.6335@radix.cryptio.net%3E]
{quote}
From	Chris Hostetter <ho...@fucit.org>

Big +1 to this idea ... the existing "ps" param can stick around as the 
default for any field that doesn't specify its own slop in the pf/pf2/pf3 
fields using the "~" syntax.

-Hoss
{quote}
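
To make the proposal concrete, here is a minimal sketch of parsing a single field~slop^boost entry, with the existing ps value used when no slop is given. It is not taken from the attached patch; the class name PhraseFieldSpec, the regular expression, and the default boost of 1.0 are assumptions made for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical holder for one parsed pf/pf2/pf3 entry; not a class from Solr.
final class PhraseFieldSpec {
    final String field;
    final int slop;
    final float boost;

    // A field name, then an optional ~slop, then an optional ^boost.
    private static final Pattern SPEC =
        Pattern.compile("([^~^\\s]+)(?:~(\\d+))?(?:\\^(\\d+(?:\\.\\d+)?))?");

    PhraseFieldSpec(String field, int slop, float boost) {
        this.field = field;
        this.slop = slop;
        this.boost = boost;
    }

    /**
     * Parses one entry such as "text", "text^2", "text~1" or "text~1^2".
     * defaultSlop plays the role of the existing "ps" parameter.
     */
    static PhraseFieldSpec parse(String spec, int defaultSlop) {
        Matcher m = SPEC.matcher(spec.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Unrecognized field spec: " + spec);
        }
        // Fall back to the default slop when no "~n" part is present.
        int slop = m.group(2) != null ? Integer.parseInt(m.group(2)) : defaultSlop;
        // Assumed default: a boost of 1.0 when no "^n" part is present.
        float boost = m.group(3) != null ? Float.parseFloat(m.group(3)) : 1.0f;
        return new PhraseFieldSpec(m.group(1), slop, boost);
    }
}
```

With ps=5, parsing "text^2" in this sketch yields a slop of 5 and a boost of 2, while "text~1^2" yields a slop of 1 and a boost of 2.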

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Commented: (SOLR-2058) Adds optional "phrase slop" to edismax "pf2", "pf3" and "pf" parameters with field~slop^boost syntax

Posted by "Ron Mayer (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904352#action_12904352 ] 

Ron Mayer commented on SOLR-2058:
---------------------------------

Also wanted to note: I've been using this on a QA machine with 4 million documents, and it has been working extremely well for me with multiple simultaneous phrase slops.

In particular, if I use:
    * a high boost (500)  on pf with slop of 0
    * a moderate boost (50) on pf with a slop of 50
    * a moderate boost (50) on pf2 with a slop of 0
    * a low boost (10) on pf2 with a slop of 10

it's doing a *great* job of getting the most relevant document in the #1 spot, and a very good job of getting the entire first page of results filled with highly relevant documents.
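
As a rough illustration, the four tiers listed above could be written as request parameters in the proposed field~slop^boost form. The field name "text" and the class TieredPhraseBoosts are placeholders, since the comment does not say which field was used, and the values are left un-encoded for readability.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that spells out the four boost tiers as pf/pf2 parameters.
final class TieredPhraseBoosts {
    // Renders one entry in the proposed field~slop^boost form.
    static String entry(String field, int slop, int boost) {
        return field + "~" + slop + "^" + boost;
    }

    // The field name is supplied by the caller; the comment above does not name one.
    static List<String> params(String field) {
        List<String> p = new ArrayList<>();
        p.add("pf=" + entry(field, 0, 500));   // high boost, exact phrase
        p.add("pf=" + entry(field, 50, 50));   // moderate boost, sloppy phrase
        p.add("pf2=" + entry(field, 0, 50));   // moderate boost, exact bigrams
        p.add("pf2=" + entry(field, 10, 10));  // low boost, sloppy bigrams
        return p;
    }

    public static void main(String[] args) {
        // Values are not URL-encoded here; a real request would need to encode them.
        System.out.println(String.join("&", params("text")));
    }
}
```

For the placeholder field this prints pf=text~0^500&pf=text~50^50&pf2=text~0^50&pf2=text~10^10.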






[jira] Updated: (SOLR-2058) Adds optional "slop" to "pf2", "pf3" and "pf" parameters

Posted by "Ron Mayer (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ron Mayer updated SOLR-2058:
----------------------------

    Attachment: pf2_with_slop.patch

This patch is my first draft at implementing this feature.

Any feedback would be appreciated.

It seems to happily turn a query like
[http://localhost:8983/solr/select?defType=edismax&fl=id,text,score&q=enterprise+search+foobar&ps=5&qf=text&debugQuery=true&pf2=name~0^5555&ps=7&pf2=name^12+name~10]

into what I believe is the desired parsed query:

+((text:enterpris) (text:search) (text:foobar)) ((name:"enterprise search"~5^12.0) (name:"search foobar"~5^12.0)) ((name:"enterprise search"^5555.0) (name:"search foobar"^5555.0)) ((name:"enterprise search"~10) (name:"search foobar"~10))

which looks like it should give a high boost to docs where both words appear right next to each other, but still substantial boosts to docs where the pairs of words are a few words apart.
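
The way each pf2 entry turns into a group of two-word phrase clauses can be sketched as follows. This only imitates the text form of the debug output above and does not build real Lucene query objects; the class and method names are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: renders the bigram phrase clauses for one pf2 entry.
final class BigramPhraseSketch {
    // Renders (field:"a b"~slop^boost), leaving out a slop of 0 and a boost
    // of 1, which is how they appear in the parsed query shown above.
    static String clause(String field, String first, String second, int slop, float boost) {
        StringBuilder sb = new StringBuilder();
        sb.append('(').append(field).append(":\"")
          .append(first).append(' ').append(second).append('"');
        if (slop != 0) {
            sb.append('~').append(slop);
        }
        if (boost != 1.0f) {
            sb.append('^').append(boost);
        }
        return sb.append(')').toString();
    }

    // Pairs each term with its successor and wraps the clauses in one group.
    static String bigrams(String field, List<String> terms, int slop, float boost) {
        List<String> clauses = new ArrayList<>();
        for (int i = 0; i + 1 < terms.size(); i++) {
            clauses.add(clause(field, terms.get(i), terms.get(i + 1), slop, boost));
        }
        return "(" + String.join(" ", clauses) + ")";
    }
}
```

For the terms enterprise, search, foobar with a slop of 10 and a boost of 1, this renders ((name:"enterprise search"~10) (name:"search foobar"~10)), the same shape as the last group in the parsed query.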




[jira] Updated: (SOLR-2058) Adds optional "phrase slop" to "pf2", "pf3" and "pf" parameters with field~slop^boost syntax

Posted by "Ron Mayer (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ron Mayer updated SOLR-2058:
----------------------------

     Original Estimate:     (was: 168h)
    Remaining Estimate:     (was: 168h)




[jira] Commented: (SOLR-2058) Adds optional "phrase slop" to edismax "pf2", "pf3" and "pf" parameters with field~slop^boost syntax

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12900849#action_12900849 ] 

Yonik Seeley commented on SOLR-2058:
------------------------------------

Map<Integer,Map<String,Float>> is an "interesting" way to encode the slop and the boost :-)
But perhaps we should make a FieldParams class?

We could keep the separate pf,pf2,pf3 maps... or encode the number of terms to make phrases out of in the FieldParams class:
{code}
class FieldParams {
  int wordGrams;  // make bigrams if 2, trigrams if 3, or all if Integer.MAX_VALUE
  int slop;
  float boost;
}
{code}
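
A sketch of how the nested map could be flattened into a list of such objects is shown below. It assumes the outer map key is the slop and the inner map goes from field name to boost, and it adds a field member that the class above does not have; both are assumptions for illustration, not the contents of the patch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of replacing Map<Integer,Map<String,Float>> with a flat list.
final class FieldParamsSketch {
    // Variation on the suggested class, with a field name added (an assumption).
    static final class FieldParams {
        final int wordGrams;  // 2 for bigrams, 3 for trigrams, Integer.MAX_VALUE for all
        final int slop;
        final String field;
        final float boost;

        FieldParams(int wordGrams, int slop, String field, float boost) {
            this.wordGrams = wordGrams;
            this.slop = slop;
            this.field = field;
            this.boost = boost;
        }
    }

    // Flattens slop -> (field -> boost) into one FieldParams per inner entry.
    static List<FieldParams> flatten(Map<Integer, Map<String, Float>> bySlop, int wordGrams) {
        List<FieldParams> out = new ArrayList<>();
        for (Map.Entry<Integer, Map<String, Float>> slopEntry : bySlop.entrySet()) {
            for (Map.Entry<String, Float> fieldEntry : slopEntry.getValue().entrySet()) {
                out.add(new FieldParams(wordGrams, slopEntry.getKey(),
                                        fieldEntry.getKey(), fieldEntry.getValue()));
            }
        }
        return out;
    }
}
```

Keeping the slop, boost, and gram size together in one object means a single list can hold the pf, pf2, and pf3 entries, which is the second option described above.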




[jira] Updated: (SOLR-2058) Adds optional "phrase slop" to edismax "pf2", "pf3" and "pf" parameters with field~slop^boost syntax

Posted by "Ron Mayer (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ron Mayer updated SOLR-2058:
----------------------------

    Summary: Adds optional "phrase slop" to edismax "pf2", "pf3" and "pf" parameters with field~slop^boost syntax  (was: Adds optional "phrase slop" to "pf2", "pf3" and "pf" parameters with field~slop^boost syntax)




[jira] Commented: (SOLR-2058) Adds optional "phrase slop" to edismax "pf2", "pf3" and "pf" parameters with field~slop^boost syntax

Posted by "Ron Mayer (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904351#action_12904351 ] 

Ron Mayer commented on SOLR-2058:
---------------------------------

Totally agree that the way I encoded the boost was bizarre.

I did that in my first draft just to minimize the impact on the rest of the code (where some functions were expecting the "Map<String,Float>" pieces).


I'll post an updated patch with a more sane class like the one you described. I'm new enough to the code that I'm not sure where such a class should reside. Any opinions?





[jira] Issue Comment Edited: (SOLR-2058) Adds optional "phrase slop" to edismax "pf2", "pf3" and "pf" parameters with field~slop^boost syntax

Posted by "Ron Mayer (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904390#action_12904390 ] 

Ron Mayer edited comment on SOLR-2058 at 8/30/10 6:40 PM:
----------------------------------------------------------

Submitted an updated patch to use a more sane FieldParams class to pass fields, boosts, and phrase slops instead of the bizarre Map<Integer,Map<String,Float>> I was using before.

      was (Author: ramayer):
    Updated to use more sane FieldParams class to pass fields, boosts, and phrase slops instead of the bizarre Map<Integer,Map<String,Float>> I was using before.
  



[jira] Updated: (SOLR-2058) Adds optional "phrase slop" to edismax "pf2", "pf3" and "pf" parameters with field~slop^boost syntax

Posted by "Ron Mayer (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ron Mayer updated SOLR-2058:
----------------------------

    Attachment: edismax_pf_with_slop_v2.patch

Updated to use a more sane FieldParams class to pass fields, boosts, and phrase slops instead of the bizarre Map<Integer,Map<String,Float>> I was using before.




[jira] Updated: (SOLR-2058) Adds optional "phrase slop" to edismax "pf2", "pf3" and "pf" parameters with field~slop^boost syntax

Posted by "Ron Mayer (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ron Mayer updated SOLR-2058:
----------------------------

    Attachment: edismax_pf_with_slop_v2.1.patch

Removed a couple of unnecessary lines compared to the last version.




[jira] Updated: (SOLR-2058) Adds optional "phrase slop" to "pf2", "pf3" and "pf" parameters with field~slop^boost syntax

Posted by "Ron Mayer (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ron Mayer updated SOLR-2058:
----------------------------

    Summary: Adds optional "phrase slop" to "pf2", "pf3" and "pf" parameters with field~slop^boost syntax  (was: Adds optional "slop" to "pf2", "pf3" and "pf" parameters )

> Adds optional "phrase slop" to "pf2", "pf3" and "pf" parameters with field~slop^boost syntax
> --------------------------------------------------------------------------------------------
>
>                 Key: SOLR-2058
>                 URL: https://issues.apache.org/jira/browse/SOLR-2058
>             Project: Solr
>          Issue Type: Improvement
>          Components: SearchComponents - other
>    Affects Versions: 3.1, 4.0
>         Environment: n/a
>            Reporter: Ron Mayer
>            Priority: Minor
>         Attachments: pf2_with_slop.patch
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201008.mbox/%3C4C659119.2010007@0ape.com%3E
> {quote}
> From	Ron Mayer <r....@0ape.com>
>  my results might
>  be even better if I had a couple different "pf2"s with different "ps"'s
>  at the same time.
>  In particular, one with ps=0 to put a high boost on ones that have
>  the right ordering of words.  For example, ensuring that:
>   "red hat black jacket"
>  boosts only red hats and not black hats.
>  And another pf2 with a more modest boost with ps=5 or so to handle
>  the query above also boosting docs with "red baseball hat".
> {quote}
> [http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201008.mbox/%3CAANLkTimd+V3g6d_MnHP+JYkKd+dej8FVMvF_1LQoiLBU@mail.gmail.com%3E]
> {quote}
> From	Yonik Seeley <yo...@lucidimagination.com>
> Perhaps fold it into the pf/pf2 syntax?
> pf=text^2    // current syntax... makes phrases with a boost of 2
> pf=text~1^2  // proposed syntax... makes phrases with a slop of 1 and
> a boost of 2
> That actually seems pretty natural given the lucene query syntax - an
> actual boosted sloppy phrase query already looks like
> text:"foo bar"~1^2
> -Yonik
> {quote}
> [http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201008.mbox/%3Calpine.DEB.1.10.1008161300510.6335@radix.cryptio.net%3E]
> {quote}
> From	Chris Hostetter <ho...@fucit.org>
> Big +1 to this idea ... the existing "ps" param can stick around as the 
> default for any field that doesn't specify its own slop in the pf/pf2/pf3 
> fields using the "~" syntax.
> -Hoss
> {quote}



[jira] Issue Comment Edited: (SOLR-2058) Adds optional "slop" to "pf2", "pf3" and "pf" parameters

Posted by "Ron Mayer (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12900260#action_12900260 ] 

Ron Mayer edited comment on SOLR-2058 at 8/19/10 8:04 AM:
----------------------------------------------------------

This patch is my first draft at implementing this feature.

Any feedback would be appreciated.

It seems to happily turn a query like
[http://localhost:8983/solr/select?defType=edismax&fl=id,text,score&q=enterprise+search+foobar&ps=5&qf=text&debugQuery=true&pf2=name~0^5555&pf2=name^12+name~10]

into what I believe is the desired parsed query:

+((text:enterpris) (text:search) (text:foobar)) ((name:"enterprise search"~5^12.0) (name:"search foobar"~5^12.0)) ((name:"enterprise search"^5555.0) (name:"search foobar"^5555.0)) ((name:"enterprise search"~10) (name:"search foobar"~10))

which looks like it should give a high boost to docs where both words appear right next to each other, but still substantial boosts to docs where the pairs of words are a few words apart.
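
To make the intended behaviour concrete, here is a minimal Python sketch of the field~slop^boost handling. It is not the attached patch and not Solr's code; the names `PhraseField`, `parse_phrase_fields` and `bigram_clauses` are hypothetical, and the assumption is simply that a field without an explicit "~slop" falls back to "ps", as suggested in the issue description.

```python
import re
from typing import List, NamedTuple, Optional


class PhraseField(NamedTuple):
    """One entry of a pf/pf2/pf3 parameter (hypothetical helper type)."""
    field: str
    slop: Optional[int]     # None means "fall back to the ps parameter"
    boost: Optional[float]  # None means "no explicit boost"


# field, then optional ~slop, then optional ^boost
_SPEC = re.compile(
    r"^(?P<field>[^~^\s]+)(?:~(?P<slop>\d+))?(?:\^(?P<boost>\d+(?:\.\d+)?))?$"
)


def parse_phrase_fields(param: str) -> List[PhraseField]:
    """Parse a whitespace-separated value such as 'name^12 name~10'."""
    fields = []
    for token in param.split():
        m = _SPEC.match(token)
        if m is None:
            raise ValueError(f"cannot parse phrase field spec: {token!r}")
        slop = int(m["slop"]) if m["slop"] is not None else None
        boost = float(m["boost"]) if m["boost"] is not None else None
        fields.append(PhraseField(m["field"], slop, boost))
    return fields


def bigram_clauses(terms: List[str], spec: PhraseField, default_slop: int) -> List[str]:
    """Build pf2-style clauses: one sloppy phrase per adjacent word pair."""
    slop = spec.slop if spec.slop is not None else default_slop
    clauses = []
    for first, second in zip(terms, terms[1:]):
        clause = f'{spec.field}:"{first} {second}"'
        if slop > 0:
            clause += f"~{slop}"
        if spec.boost is not None:
            clause += f"^{spec.boost}"
        clauses.append(clause)
    return clauses


if __name__ == "__main__":
    terms = ["enterprise", "search", "foobar"]
    # Mirrors pf2=name~0^5555 and pf2=name^12+name~10 with ps=5
    # ("+" in the URL decodes to a space).
    for value in ("name~0^5555", "name^12 name~10"):
        for spec in parse_phrase_fields(value):
            print(" ".join(bigram_clauses(terms, spec, default_slop=5)))
```

Run as a script, this prints clauses of the same shape as the phrase parts of the parsed query above; the stemming applied to the main "text" clauses is not modelled.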

      was (Author: ramayer):
    This patch is my first draft at implementing this feature.

Any feedback would be appreciated.

It seems to happily turn a query like
[http://localhost:8983/solr/select?defType=edismax&fl=id,text,score&q=enterprise+search+foobar&ps=5&qf=text&debugQuery=true&pf2=name~0^5555&ps=7&pf2=name^12+name~10]

into what I believe is the desired parsed query:

+((text:enterpris) (text:search) (text:foobar)) ((name:"enterprise search"~5^12.0) (name:"search foobar"~5^12.0)) ((name:"enterprise search"^5555.0) (name:"search foobar"^5555.0)) ((name:"enterprise search"~10) (name:"search foobar"~10))

which looks like it should give a high boost to docs where both words appear right next to each other, but still substantial boosts to docs where the pairs of words are a few words apart.
  
> Adds optional "slop" to "pf2", "pf3" and "pf" parameters 
> ---------------------------------------------------------
>
>                 Key: SOLR-2058
>                 URL: https://issues.apache.org/jira/browse/SOLR-2058
>             Project: Solr
>          Issue Type: Improvement
>          Components: SearchComponents - other
>    Affects Versions: 3.1, 4.0
>         Environment: n/a
>            Reporter: Ron Mayer
>            Priority: Minor
>         Attachments: pf2_with_slop.patch
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201008.mbox/%3C4C659119.2010007@0ape.com%3E
> {quote}
> From	Ron Mayer <r....@0ape.com>
>  my results might
>  be even better if I had a couple different "pf2"s with different "ps"'s
>  at the same time.
>  In particular, one with ps=0 to put a high boost on ones that have
>  the right ordering of words.  For example, ensuring that:
>   "red hat black jacket"
>  boosts only red hats and not black hats.
>  And another pf2 with a more modest boost with ps=5 or so to handle
>  the query above also boosting docs with "red baseball hat".
> {quote}
> [http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201008.mbox/%3CAANLkTimd+V3g6d_MnHP+JYkKd+dej8FVMvF_1LQoiLBU@mail.gmail.com%3E]
> {quote}
> From	Yonik Seeley <yo...@lucidimagination.com>
> Perhaps fold it into the pf/pf2 syntax?
> pf=text^2    // current syntax... makes phrases with a boost of 2
> pf=text~1^2  // proposed syntax... makes phrases with a slop of 1 and
> a boost of 2
> That actually seems pretty natural given the lucene query syntax - an
> actual boosted sloppy phrase query already looks like
> text:"foo bar"~1^2
> -Yonik
> {quote}
> [http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201008.mbox/%3Calpine.DEB.1.10.1008161300510.6335@radix.cryptio.net%3E]
> {quote}
> From	Chris Hostetter <ho...@fucit.org>
> Big +1 to this idea ... the existing "ps" param can stick around as the 
> default for any field that doesn't specify its own slop in the pf/pf2/pf3 
> fields using the "~" syntax.
> -Hoss
> {quote}



[jira] Updated: (SOLR-2058) Adds optional "phrase slop" to edismax "pf2", "pf3" and "pf" parameters with field~slop^boost syntax

Posted by "Ron Mayer (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ron Mayer updated SOLR-2058:
----------------------------

    Description: 
http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201008.mbox/%3C4C659119.2010007@0ape.com%3E
{quote}
From	Ron Mayer <r....@0ape.com>
... my results might be even better if I had a couple different "pf2"s with different "ps"'s at the same time.   In particular, one with ps=0 to put a high boost on ones that have the right ordering of words.  For example, ensuring that [the query]:
  "red hat black jacket"
 boosts only documents with "red hats" and not "black hats".   And another pf2 with a more modest boost with ps=5 or so to handle the query above also boosting docs with 
  "red baseball hat".
{quote}

[http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201008.mbox/%3CAANLkTimd+V3g6d_MnHP+JYkKd+dej8FVMvF_1LQoiLBU@mail.gmail.com%3E]
{quote}
From	Yonik Seeley <yo...@lucidimagination.com>
Perhaps fold it into the pf/pf2 syntax?

pf=text^2    // current syntax... makes phrases with a boost of 2
pf=text~1^2  // proposed syntax... makes phrases with a slop of 1 and
a boost of 2

That actually seems pretty natural given the lucene query syntax - an
actual boosted sloppy phrase query already looks like
{{text:"foo bar"~1^2}}

-Yonik
{quote}

[http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201008.mbox/%3Calpine.DEB.1.10.1008161300510.6335@radix.cryptio.net%3E]
{quote}
From	Chris Hostetter <ho...@fucit.org>

Big +1 to this idea ... the existing "ps" param can stick around as the 
default for any field that doesn't specify its own slop in the pf/pf2/pf3 
fields using the "~" syntax.

-Hoss
{quote}

  was:
http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201008.mbox/%3C4C659119.2010007@0ape.com%3E
{quote}
From	Ron Mayer <r....@0ape.com>
 my results might
 be even better if I had a couple different "pf2"s with different "ps"'s
 at the same time.

 In particular.   One with ps=0 to put a high boost on ones the have
 the right ordering of words.  For example insuring that:
  "red hat black jacket"
 boosts only red hats and not black hats.

 And another pf2 with a more modest boost with ps=5 or so to handle
 the query above also boosting docs with "red baseball hat".
{quote}

[http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201008.mbox/%3CAANLkTimd+V3g6d_MnHP+JYkKd+dej8FVMvF_1LQoiLBU@mail.gmail.com%3E]
{quote}
From	Yonik Seeley <yo...@lucidimagination.com>

Perhaps fold it into the pf/pf2 syntax?

pf=text^2    // current syntax... makes phrases with a boost of 2
pf=text~1^2  // proposed syntax... makes phrases with a slop of 1 and
a boost of 2

That actually seems pretty natural given the lucene query syntax - an
actual boosted sloppy phrase query already looks like
text:"foo bar"~1^2

-Yonik
http
{quote}

[http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201008.mbox/%3Calpine.DEB.1.10.1008161300510.6335@radix.cryptio.net%3E]
{quote}
From	Chris Hostetter <ho...@fucit.org>

Big +1 to this idea ... the existing "ps" param can stick arround as the 
default for any field that doesn't specify it's own slop in the pf/pf2/pf3 
fields using the "~" syntax.

-Hoss
{quote}


> Adds optional "phrase slop" to edismax "pf2", "pf3" and "pf" parameters with field~slop^boost syntax
> ----------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-2058
>                 URL: https://issues.apache.org/jira/browse/SOLR-2058
>             Project: Solr
>          Issue Type: Improvement
>          Components: SearchComponents - other
>    Affects Versions: 3.1, 4.0
>         Environment: n/a
>            Reporter: Ron Mayer
>            Priority: Minor
>         Attachments: pf2_with_slop.patch
>
>
> http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201008.mbox/%3C4C659119.2010007@0ape.com%3E
> {quote}
> From	Ron Mayer <r....@0ape.com>
> ... my results might be even better if I had a couple different "pf2"s with different "ps"'s at the same time.   In particular, one with ps=0 to put a high boost on ones that have the right ordering of words.  For example, ensuring that [the query]:
>   "red hat black jacket"
>  boosts only documents with "red hats" and not "black hats".   And another pf2 with a more modest boost with ps=5 or so to handle the query above also boosting docs with 
>   "red baseball hat".
> {quote}
> [http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201008.mbox/%3CAANLkTimd+V3g6d_MnHP+JYkKd+dej8FVMvF_1LQoiLBU@mail.gmail.com%3E]
> {quote}
> From	Yonik Seeley <yo...@lucidimagination.com>
> Perhaps fold it into the pf/pf2 syntax?
> pf=text^2    // current syntax... makes phrases with a boost of 2
> pf=text~1^2  // proposed syntax... makes phrases with a slop of 1 and
> a boost of 2
> That actually seems pretty natural given the lucene query syntax - an
> actual boosted sloppy phrase query already looks like
> {{text:"foo bar"~1^2}}
> -Yonik
> {quote}
> [http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201008.mbox/%3Calpine.DEB.1.10.1008161300510.6335@radix.cryptio.net%3E]
> {quote}
> From	Chris Hostetter <ho...@fucit.org>
> Big +1 to this idea ... the existing "ps" param can stick around as the 
> default for any field that doesn't specify its own slop in the pf/pf2/pf3 
> fields using the "~" syntax.
> -Hoss
> {quote}
