Posted to solr-user@lucene.apache.org by Thomas Michael Engelke <th...@posteo.de> on 2014/07/30 13:38:55 UTC

Ranking based on match position in field

Hi,

here is an example. We have two records with this data in the same field
(description):

1: Lufthutze vor Kühler Bj 62-65, DS
2: Kühler HY im Austausch, Altteilpfand 250 Euro

A search with the parameters 'description:Kühler' produces this debug
output:

2.3234584 = (MATCH) weight(description:kühler in 4053) [DefaultSimilarity], result of:
  2.3234584 = fieldWeight in 4053, product of:
    1.0 = tf(freq=1.0), with freq of:
      1.0 = termFreq=1.0
    6.195889 = idf(docFreq=69, maxDocs=12637)
    0.375 = fieldNorm(doc=4053)
</str>
<str name="16946">
2.3234584 = (MATCH) weight(description:kühler in 5729) [DefaultSimilarity], result of:
  2.3234584 = fieldWeight in 5729, product of:
    1.0 = tf(freq=1.0), with freq of:
      1.0 = termFreq=1.0
    6.195889 = idf(docFreq=69, maxDocs=12637)
    0.375 = fieldNorm(doc=5729)
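
For reference, the fieldWeight figure in this output can be reproduced by hand. The sketch below is plain Python, not Solr code; the variable names are made up for illustration, and it assumes the classic TF-IDF behaviour where tf is the square root of the term frequency. Nothing in the product depends on where the term sits in the field, which is why both documents end up with the same number.

```python
import math

# Values copied from the explain output for doc 4053 (doc 5729 is identical).
term_freq = 1.0
idf = 6.195889
field_norm = 0.375

# Assumption: tf(freq) = sqrt(freq), as in the classic TF-IDF similarity.
tf = math.sqrt(term_freq)

# fieldWeight is the product of the three factors listed in the explain.
field_weight = tf * idf * field_norm

print(round(field_weight, 6))
```

Solr computes with 32-bit floats, so the last digits printed here can differ slightly from the explain output.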

As you can see, both get the exact same score. However, we would like to
rank the second document higher, on the basis that the search term
occurs further to the left in the field.

Is there a component/setting that can do that?
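
To make the desired ordering concrete, here is a minimal client-side sketch in Python. It is not a Solr component or setting: it assumes the hits have already been fetched with their scores, and the function names, the dictionary layout and the simple word tokenizer are all hypothetical. It keeps the Solr score as the primary sort key and only breaks ties by the position of the first matching token.

```python
import re


def first_match_position(text, term):
    """Return the index of the first token equal to term, or None if absent."""
    # Hypothetical tokenizer: lowercase, then split on non-word characters.
    tokens = re.findall(r"\w+", text.lower())
    wanted = term.lower()
    for index, token in enumerate(tokens):
        if token == wanted:
            return index
    return None


def rerank(hits, term):
    """Sort by score (descending), breaking ties by earliest match position."""
    def sort_key(hit):
        position = first_match_position(hit["description"], term)
        # Documents without a match sort after all matching ones.
        return (-hit["score"], float("inf") if position is None else position)
    return sorted(hits, key=sort_key)


# The two records from the example, labelled 1 and 2 as in the message.
hits = [
    {"id": 1, "score": 2.3234584,
     "description": "Lufthutze vor Kühler Bj 62-65, DS"},
    {"id": 2, "score": 2.3234584,
     "description": "Kühler HY im Austausch, Altteilpfand 250 Euro"},
]

ranked = rerank(hits, "Kühler")
print([hit["id"] for hit in ranked])
```

This only reorders documents whose scores are exactly equal, so it would not help once other query clauses make the scores differ.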

Re: Ranking based on match position in field

Posted by Ahmet Arslan <io...@yahoo.com.INVALID>.

Hi Thomas,

Sorry for the confusion. That link points to an open issue, which means the functionality has been proposed and is desired, but it has not been included in the code base yet.

You could:

* ping the author through JIRA and ask for the patch to be brought to trunk
* vote for the issue
* check whether the patch works with the current version

http://wiki.apache.org/solr/HowToContribute#Working_With_Patches

Ahmet


On Thursday, July 31, 2014 9:55 AM, Thomas Michael Engelke <th...@posteo.de> wrote:
> [...]
> Am I using this feature wrong?

Re: Ranking based on match position in field

Posted by Thomas Michael Engelke <th...@posteo.de>.

Hi,

thanks for the link. I've upgraded from 4.7, which we were using, to the
recent 4.9 version. I've tried to use the new feature with this query in
the admin interface using edismax:

description:Kühler^~1^5

However, the result seems to stay the same:

<lst name="debug">
 <str name="rawquerystring">description:Kühler~1^5</str>
 <str name="querystring">description:Kühler~1^5</str>
 <str name="parsedquery">(+description:kühler~1^5.0)/no_coord</str>
 <str name="parsedquery_toString">+description:kühler~1^5.0</str>
 <lst name="explain">
  <str name="17411">
2.334934 = (MATCH) weight(description:kühler^5.0 in 4080) [DefaultSimilarity], result of:
  2.334934 = score(doc=4080,freq=1.0 = termFreq=1.0
), product of:
    0.99999994 = queryWeight, product of:
      5.0 = boost
      6.226491 = idf(docFreq=64, maxDocs=12099)
      0.03212082 = queryNorm
    2.3349342 = fieldWeight in 4080, product of:
      1.0 = tf(freq=1.0), with freq of:
        1.0 = termFreq=1.0
      6.226491 = idf(docFreq=64, maxDocs=12099)
      0.375 = fieldNorm(doc=4080)
</str>
  <str name="19085">
2.334934 = (MATCH) weight(description:kühler^5.0 in 5754) [DefaultSimilarity], result of:
  2.334934 = score(doc=5754,freq=1.0 = termFreq=1.0
), product of:
    0.99999994 = queryWeight, product of:
      5.0 = boost
      6.226491 = idf(docFreq=64, maxDocs=12099)
      0.03212082 = queryNorm
    2.3349342 = fieldWeight in 5754, product of:
      1.0 = tf(freq=1.0), with freq of:
        1.0 = termFreq=1.0
      6.226491 = idf(docFreq=64, maxDocs=12099)
      0.375 = fieldNorm(doc=5754)
</str>
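
One possible reading of these numbers, offered as an assumption rather than a confirmed diagnosis: for a query with a single term, the classic TF-IDF queryNorm is 1 / sqrt((boost * idf)^2), so the ^5 boost is normalized away and queryWeight comes out at about 1.0. The Python sketch below only replays the arithmetic from the explain output; the variable names are made up for illustration.

```python
import math

# Values copied from the explain output for doc 4080 (doc 5754 is identical).
boost = 5.0
idf = 6.226491
tf = 1.0
field_norm = 0.375

# Assumption: single-term query, so the sum of squared weights is (boost * idf)^2.
query_norm = 1.0 / math.sqrt((boost * idf) ** 2)

# queryWeight = boost * idf * queryNorm; the boost cancels against queryNorm.
query_weight = boost * idf * query_norm

# fieldWeight = tf * idf * fieldNorm, the same product as in the unboosted query.
field_weight = tf * idf * field_norm

score = query_weight * field_weight

print(round(query_norm, 8), round(query_weight, 6), round(score, 6))
```

Since tf, idf and fieldNorm are the same for both documents, any boost applied to the whole term scales both scores equally and cannot separate them.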

Am I using this feature wrong?

On 30.07.2014 14:48, Ahmet Arslan wrote:

> Hi,
>
> Please see: https://issues.apache.org/jira/browse/SOLR-3925 [1]
>
> Ahmet

Links:
------
[1] https://issues.apache.org/jira/browse/SOLR-3925

Re: Ranking based on match position in field

Posted by Ahmet Arslan <io...@yahoo.com.INVALID>.

Hi,

Please see: https://issues.apache.org/jira/browse/SOLR-3925

Ahmet



On Wednesday, July 30, 2014 2:39 PM, Thomas Michael Engelke <th...@posteo.de> wrote:
> [...]
> Is there a component/setting that can do that?