Posted to dev@lucene.apache.org by "Yasoja Seneviratne (JIRA)" <ji...@apache.org> on 2007/12/10 19:55:43 UTC

[jira] Created: (LUCENE-1087) Explain shows incorrect docFreq number when used for documents in different indices searched via MultiSearcher

Explain shows incorrect docFreq number when used for documents in different indices searched via MultiSearcher
--------------------------------------------------------------------------------------------------------------

                 Key: LUCENE-1087
                 URL: https://issues.apache.org/jira/browse/LUCENE-1087
             Project: Lucene - Java
          Issue Type: Bug
          Components: Query/Scoring
    Affects Versions: 2.2
         Environment: No special hardware required to reproduce the issue.
            Reporter: Yasoja Seneviratne
            Priority: Minor


I created 2 different indexes, searched each individually and printed the score details, then compared that to searching both indexes with MultiSearcher and printing the score details.
 
The "docFreq" value printed isn't correct - the values it prints are as if each index was searched individually.

Code is like:
MultiSearcher multi = new MultiSearcher(searchables);
Hits hits = multi.search(query);
for(int i=0; i<hits.length(); i++)
{
  Explanation expl = multi.explain(query, hits.id(i));
  System.out.println(expl.toString());
}


I raised this in the Lucene user mailing list and was advised to log a bug, email thread given below.

-----Original Message-----
From: Chris Hostetter [mailto:hossman_lucene@fucit.org] 
Sent: Friday, December 07, 2007 10:30 PM
To: java-user@lucene.apache.org
Subject: Re: does the MultiSearcher class calculate IDF properly?


a quick glance at the code seems to indicate that MultiSearcher has code 
for calculating the docFreq across all of the Searchables when searching 
(or when the docFreq method is explicitly called) but that the explain method 
just delegates to the Searchable that the specific docid came from.

if you compare the Explanation score you got with the score returned by 
a HitCollector (or TopDocs) they probably won't match.

So i would say "yes, MultiSearcher calculates IDF properly, but 
MultiSearcher.explain is broken".  Please file a bug about this, i can't 
think of an easy way to fix it, but it certainly seems broken to me.


: Subject: does the MultiSearcher class calculate IDF properly?
: 
: I tried the following: create 2 different indexes, search each
: individually and print score details, then compare that to searching both
: indexes with MultiSearcher and printing score details.
: 
: The "docFreq" value printed doesn't seem right - is this just a problem
: with using Explain together with the MultiSearcher?
: 
: 
: Code is like:
: MultiSearcher multi = new MultiSearcher(searchables);
: Hits hits = multi.search(query);
: for(int i=0; i<hits.length(); i++)
: {
:   Explanation expl = multi.explain(query, hits.id(i));
:   System.out.println(expl.toString());
: }
: 
: 
: Output:
: id = 14 score = 0.071
: 0.07073946 = (MATCH) fieldWeight(contents:climate in 2), product of:
:   1.0 = tf(termFreq(contents:climate)=1)
:   1.8109303 = idf(docFreq=1)
:   0.0390625 = fieldNorm(field=contents, doc=2)
: 
: ---------------------------------------------------------------------
: To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
: For additional commands, e-mail: java-user-help@lucene.apache.org
: 



-Hoss
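
As a rough cross-check of the Output quoted above: the three factors (tf, idf, fieldNorm) multiply to the reported score, so the explanation is at least internally consistent. The sketch below also assumes the classic DefaultSimilarity formula, idf = ln(numDocs / (docFreq + 1)) + 1, which the thread itself does not quote. Under that assumption, 1.8109303 corresponds to a ratio of 2.25 (for example numDocs=9, docFreq=3), while docFreq=1 cannot produce it for any whole number of documents. The numDocs/docFreq pairs are hypothetical and are not taken from the reporter's indexes; the class name IdfCheck is made up for illustration.

```java
// Sketch only: re-derives the numbers shown in the quoted explain() output.
// Assumption: idf = ln(numDocs / (docFreq + 1)) + 1 (classic DefaultSimilarity).
// The (docFreq, numDocs) pairs below are hypothetical illustration values.
public class IdfCheck {

    /** Classic idf formula, assumed here; not quoted anywhere in the thread. */
    public static float idf(int docFreq, int numDocs) {
        return (float) (Math.log(numDocs / (double) (docFreq + 1)) + 1.0);
    }

    public static void main(String[] args) {
        // Factors copied from the explain() output in the report.
        float tf = 1.0f;
        float idfShown = 1.8109303f;
        float fieldNorm = 0.0390625f;

        // Product of the three factors; compare with the reported 0.07073946.
        System.out.println("tf * idf * fieldNorm = " + (tf * idfShown * fieldNorm));

        // A ratio numDocs / (docFreq + 1) of 2.25 reproduces the idf shown.
        System.out.println("idf(docFreq=3, numDocs=9) = " + idf(3, 9));

        // With docFreq=1 the nearest whole-number index sizes miss it on both sides.
        System.out.println("idf(docFreq=1, numDocs=4) = " + idf(1, 4));
        System.out.println("idf(docFreq=1, numDocs=5) = " + idf(1, 5));
    }
}
```

If the assumption holds, this would be consistent with the idf value and the docFreq label in the explanation coming from different statistics, which matches the behaviour described in the reply above. That is an inference from the numbers, not something the thread confirms.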





-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Commented: (LUCENE-1087) MultiSearcher.explain returns incorrect score/explanation relating to docFreq

Posted by "Mark Miller (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12745983#action_12745983 ] 

Mark Miller commented on LUCENE-1087:
-------------------------------------

bq. as it is, it looks like explain on MultiSearcher is going to get left out in the cold - MultiSearcher works with Searchables, but Searchable#explain is deprecated with the replacement on Searcher. That's trouble for MultiSearcher#explain.

Looks like I left it out in the cold! This was a mistake when reverting from QueryWeight to Weight - I missed removing this deprecation.

Fixed and committed in r806561.




[jira] Updated: (LUCENE-1087) MultiSearcher.explain returns incorrect score/explanation relating to docFreq

Posted by "Mark Miller (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-1087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mark Miller updated LUCENE-1087:
--------------------------------

    Fix Version/s: 2.9
         Assignee: Mark Miller




[jira] Updated: (LUCENE-1087) MultiSearcher.explain returns incorrect score/explanation relating to docFreq

Posted by "Hoss Man (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-1087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hoss Man updated LUCENE-1087:
-----------------------------

    Description: 
I created 2 different indexes, searched each individually and printed the score details, then compared that to searching both indexes with MultiSearcher and printing the score details.
 
The "docFreq" value printed isn't correct - the values it prints are as if each index was searched individually.

Code is like:
{code}
MultiSearcher multi = new MultiSearcher(searchables);
Hits hits = multi.search(query);
for(int i=0; i<hits.length(); i++)
{
  Explanation expl = multi.explain(query, hits.id(i));
  System.out.println(expl.toString());
}
{code}

I raised this in the Lucene user mailing list and was advised to log a bug, email thread given below.

{noformat} 
-----Original Message-----
From: Chris Hostetter  
Sent: Friday, December 07, 2007 10:30 PM
To: java-user
Subject: Re: does the MultiSearcher class calculate IDF properly?


a quick glance at the code seems to indicate that MultiSearcher has code 
for calculating the docFreq across all of the Searchables when searching 
(or when the docFreq method is explicitly called) but that the explain method 
just delegates to the Searchable that the specific docid came from.

if you compare the Explanation score you got with the score returned by 
a HitCollector (or TopDocs) they probably won't match.

So i would say "yes, MultiSearcher calculates IDF properly, but 
MultiSearcher.explain is broken".  Please file a bug about this, i can't 
think of an easy way to fix it, but it certainly seems broken to me.


: Subject: does the MultiSearcher class calculate IDF properly?
: 
: I tried the following: create 2 different indexes, search each
: individually and print score details, then compare that to searching both
: indexes with MultiSearcher and printing score details.
: 
: The "docFreq" value printed doesn't seem right - is this just a problem
: with using Explain together with the MultiSearcher?
: 
: 
: Code is like:
: MultiSearcher multi = new MultiSearcher(searchables);
: Hits hits = multi.search(query);
: for(int i=0; i<hits.length(); i++)
: {
:   Explanation expl = multi.explain(query, hits.id(i));
:   System.out.println(expl.toString());
: }
: 
: 
: Output:
: id = 14 score = 0.071
: 0.07073946 = (MATCH) fieldWeight(contents:climate in 2), product of:
:   1.0 = tf(termFreq(contents:climate)=1)
:   1.8109303 = idf(docFreq=1)
:   0.0390625 = fieldNorm(field=contents, doc=2)
{noformat} 


        Summary: MultiSearcher.explain returns incorrect score/explanation relating to docFreq  (was: Explain shows incorrect docFreq number when used for documents in different indices searched via MultiSearcher)

clarifying summary, and cleaning up description (formatting and removing spam bait)




[jira] Resolved: (LUCENE-1087) MultiSearcher.explain returns incorrect score/explanation relating to docFreq

Posted by "Mark Miller (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-1087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mark Miller resolved LUCENE-1087.
---------------------------------

    Resolution: Fixed

Thanks for the report, Yasoja! This has been fixed in the linked issue.




[jira] Commented: (LUCENE-1087) MultiSearcher.explain returns incorrect score/explanation relating to docFreq

Posted by "Mark Miller (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12741636#action_12741636 ] 

Mark Miller commented on LUCENE-1087:
-------------------------------------

Not sure how to fix this, but I think LUCENE-1771 will make it a bit easier - if we can just get the multi searcher passed as the searcher to explain(searcher, reader, doc), it would pull the right numbers.

Not sure how to do that though - as it is, it looks like explain on MultiSearcher is going to get left out in the cold - MultiSearcher works with Searchables, but Searchable#explain is deprecated with the replacement on Searcher. That's trouble for MultiSearcher#explain.
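
To illustrate the idea in the comment above (hand explain the aggregated searcher so it pulls the right numbers), here is a minimal sketch that does not use Lucene at all. Everything in it is hypothetical: the Stats interface stands in for a searcher, index() and multi() stand in for a single-index searcher and a MultiSearcher, and the term counts are invented. It also assumes the classic idf = ln(numDocs / (docFreq + 1)) + 1 formula.

```java
import java.util.List;
import java.util.Map;

// Hypothetical model, not Lucene code: shows how the statistics source that is
// handed to explain determines the docFreq and idf it reports.
public class ExplainSketch {

    /** Stand-in for a searcher: anything that can report term statistics. */
    interface Stats {
        int docFreq(String term);
        int maxDoc();
    }

    /** A single index, described by per-term document counts and its size. */
    static Stats index(Map<String, Integer> docFreqs, int maxDoc) {
        return new Stats() {
            public int docFreq(String term) { return docFreqs.getOrDefault(term, 0); }
            public int maxDoc() { return maxDoc; }
        };
    }

    /** Sums the statistics of all sub-indexes, as a multi searcher would. */
    static Stats multi(List<Stats> subs) {
        return new Stats() {
            public int docFreq(String term) {
                int sum = 0;
                for (Stats s : subs) sum += s.docFreq(term);
                return sum;
            }
            public int maxDoc() {
                int sum = 0;
                for (Stats s : subs) sum += s.maxDoc();
                return sum;
            }
        };
    }

    /** Builds the idf line of an explanation from whichever source it is given. */
    static String explainIdf(Stats stats, String term) {
        int df = stats.docFreq(term);
        float idf = (float) (Math.log(stats.maxDoc() / (double) (df + 1)) + 1.0);
        return idf + " = idf(docFreq=" + df + ")";
    }

    public static void main(String[] args) {
        Stats a = index(Map.of("climate", 1), 4);   // invented counts
        Stats b = index(Map.of("climate", 2), 5);   // invented counts
        Stats both = multi(List.of(a, b));

        // Delegating to the sub-index reports that index's local docFreq.
        System.out.println(explainIdf(a, "climate"));
        // Passing the aggregated source reports the docFreq across both indexes.
        System.out.println(explainIdf(both, "climate"));
    }
}
```

The point of the design is only that explainIdf never decides where the numbers come from; the caller does, which is why passing the multi searcher rather than the sub-searcher would change what gets printed.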

