You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@lucene.apache.org by "Michael McCandless (JIRA)" <ji...@apache.org> on 2010/08/08 16:09:15 UTC

[jira] Created: (LUCENE-2590) Enable access to the freq information in a Query's sub-scorers

Enable access to the freq information in a Query's sub-scorers
--------------------------------------------------------------

                 Key: LUCENE-2590
                 URL: https://issues.apache.org/jira/browse/LUCENE-2590
             Project: Lucene - Java
          Issue Type: Improvement
          Components: Search
            Reporter: Michael McCandless
            Assignee: Michael McCandless


The ability to gather more details than just the score, of how a given
doc matches the current query, has come up a number of times on the
user's lists.  (most recently in the thread "Query Match Count" by
Ryan McV on java-user).

EG if you have a simple TermQuery "foo", on each hit you'd like to
know how many times "foo" occurred in that doc; or a BooleanQuery +foo
+bar, being able to separately see the freq of foo and bar for the
current hit.

Lucene doesn't make this possible today, which is a shame because
Lucene in fact does compute exactly this information; it's just not
accessible from the Collector.


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Commented: (LUCENE-2590) Enable access to the freq information in a Query's sub-scorers

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12903952#action_12903952 ] 

Michael McCandless commented on LUCENE-2590:
--------------------------------------------

{quote}
One way of doing it would be adding the scorers Weight as a protected final member since Weight already has a #getQuery() method we can easily access it or throw an UnsupportedOperationException if the weight is null (force it via ctor and have a default one which sets it to null).
{quote}

+1

This would then make a default base impl work well for all leaf queries.

> Enable access to the freq information in a Query's sub-scorers
> --------------------------------------------------------------
>
>                 Key: LUCENE-2590
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2590
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Michael McCandless
>            Assignee: Simon Willnauer
>         Attachments: LUCENE-2590.patch, LUCENE-2590.patch, LUCENE-2590.patch, LUCENE-2590.patch
>
>
> The ability to gather more details than just the score, of how a given
> doc matches the current query, has come up a number of times on the
> user's lists.  (most recently in the thread "Query Match Count" by
> Ryan McV on java-user).
> EG if you have a simple TermQuery "foo", on each hit you'd like to
> know how many times "foo" occurred in that doc; or a BooleanQuery +foo
> +bar, being able to separately see the freq of foo and bar for the
> current hit.
> Lucene doesn't make this possible today, which is a shame because
> Lucene in fact does compute exactly this information; it's just not
> accessible from the Collector.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Commented: (LUCENE-2590) Enable access to the freq information in a Query's sub-scorers

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12898854#action_12898854 ] 

Michael McCandless commented on LUCENE-2590:
--------------------------------------------

OK I like the 3 methods instead of the single method with BooleanClause.OCCUR param.

The heavy generics would be useful for apps that eg have a fixed structure to all Queries right?  Eg if you know your app only ever makes BooleanQuery w/ TermQuery sub-classes, you'd parameterize the instantiation as <BooleanQuery,TermQuery,TermScrorer> (hmm though TermScorer isn't public; maybe just leave S as Scorer)?  I think this makes sense.

I'll work on these but it'll be a little while -- I'm out on "vacation".

> Enable access to the freq information in a Query's sub-scorers
> --------------------------------------------------------------
>
>                 Key: LUCENE-2590
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2590
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>         Attachments: LUCENE-2590.patch, LUCENE-2590.patch
>
>
> The ability to gather more details than just the score, of how a given
> doc matches the current query, has come up a number of times on the
> user's lists.  (most recently in the thread "Query Match Count" by
> Ryan McV on java-user).
> EG if you have a simple TermQuery "foo", on each hit you'd like to
> know how many times "foo" occurred in that doc; or a BooleanQuery +foo
> +bar, being able to separately see the freq of foo and bar for the
> current hit.
> Lucene doesn't make this possible today, which is a shame because
> Lucene in fact does compute exactly this information; it's just not
> accessible from the Collector.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Resolved: (LUCENE-2590) Enable access to the freq information in a Query's sub-scorers

Posted by "Simon Willnauer (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-2590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simon Willnauer resolved LUCENE-2590.
-------------------------------------

    Fix Version/s: 4.0
       Resolution: Fixed

Committed revision 991310 

Thanks Mike


> Enable access to the freq information in a Query's sub-scorers
> --------------------------------------------------------------
>
>                 Key: LUCENE-2590
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2590
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Michael McCandless
>            Assignee: Simon Willnauer
>             Fix For: 4.0
>
>         Attachments: LUCENE-2590.patch, LUCENE-2590.patch, LUCENE-2590.patch, LUCENE-2590.patch, LUCENE-2590.patch, LUCENE-2590.patch
>
>
> The ability to gather more details than just the score, of how a given
> doc matches the current query, has come up a number of times on the
> user's lists.  (most recently in the thread "Query Match Count" by
> Ryan McV on java-user).
> EG if you have a simple TermQuery "foo", on each hit you'd like to
> know how many times "foo" occurred in that doc; or a BooleanQuery +foo
> +bar, being able to separately see the freq of foo and bar for the
> current hit.
> Lucene doesn't make this possible today, which is a shame because
> Lucene in fact does compute exactly this information; it's just not
> accessible from the Collector.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Commented: (LUCENE-2590) Enable access to the freq information in a Query's sub-scorers

Posted by "Simon Willnauer (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12902379#action_12902379 ] 

Simon Willnauer commented on LUCENE-2590:
-----------------------------------------

bq. Oh I see we can't quite have Scorer impl this because it doesn't know the query. But maybe we can factor out a common method, that the subclass passed the query to?
I had the same idea in a previous iteration but since Scorer doesn't know about the Query the scorer concerns I can not do the call. One way of doing it would be adding the scorers {{Weight}} as a protected final member since {{Weight}} already has a {{#getQuery()}} method we can easily access it or throw an UnsupportedOperationException if the weight is null (force it via ctor and have a default one which sets it to null).

Since the most of the scorers know their {{Weight}} anyway and would need to call the visitor we can also factor it out.


bq. Also, we are missing some scorers (SpanScorer, ConstantScoreQuery.ConstantScorer, probably others), but if we do the super approach, we'd get these "for free" (I think?).
most of them would then be for free though!

Thoughts?

> Enable access to the freq information in a Query's sub-scorers
> --------------------------------------------------------------
>
>                 Key: LUCENE-2590
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2590
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Michael McCandless
>            Assignee: Simon Willnauer
>         Attachments: LUCENE-2590.patch, LUCENE-2590.patch, LUCENE-2590.patch, LUCENE-2590.patch
>
>
> The ability to gather more details than just the score, of how a given
> doc matches the current query, has come up a number of times on the
> user's lists.  (most recently in the thread "Query Match Count" by
> Ryan McV on java-user).
> EG if you have a simple TermQuery "foo", on each hit you'd like to
> know how many times "foo" occurred in that doc; or a BooleanQuery +foo
> +bar, being able to separately see the freq of foo and bar for the
> current hit.
> Lucene doesn't make this possible today, which is a shame because
> Lucene in fact does compute exactly this information; it's just not
> accessible from the Collector.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Updated: (LUCENE-2590) Enable access to the freq information in a Query's sub-scorers

Posted by "Simon Willnauer (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-2590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simon Willnauer updated LUCENE-2590:
------------------------------------

    Attachment: LUCENE-2590.patch

Attaching a slightly new version of the patch incorporating roughly the idea of 3-distinct methods. Yet, having 3 methods instead of one was kind of ugly to implement and I didn't want to add switch case statements to visitSubScorer methods just to call a different method.
 
Anyway, I changed the 3  parameter methods to take a single argument holding all required information. This is way more extensible and flexible since we don't need to change a method signature just to add a new param. 
I am not sure about the 3 different callbacks but I thought I would leave it in the patch for now.

Parent, Child and Scorer became generic arguments.

simon 

> Enable access to the freq information in a Query's sub-scorers
> --------------------------------------------------------------
>
>                 Key: LUCENE-2590
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2590
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>         Attachments: LUCENE-2590.patch, LUCENE-2590.patch, LUCENE-2590.patch
>
>
> The ability to gather more details than just the score, of how a given
> doc matches the current query, has come up a number of times on the
> user's lists.  (most recently in the thread "Query Match Count" by
> Ryan McV on java-user).
> EG if you have a simple TermQuery "foo", on each hit you'd like to
> know how many times "foo" occurred in that doc; or a BooleanQuery +foo
> +bar, being able to separately see the freq of foo and bar for the
> current hit.
> Lucene doesn't make this possible today, which is a shame because
> Lucene in fact does compute exactly this information; it's just not
> accessible from the Collector.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Updated: (LUCENE-2590) Enable access to the freq information in a Query's sub-scorers

Posted by "Simon Willnauer (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-2590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simon Willnauer updated LUCENE-2590:
------------------------------------

    Attachment: LUCENE-2590.patch

another iteration moving back to the originally discussed approach. The switch case block shouldn't be any bottleneck since the visitor should not run for each scored doc anyway.



> Enable access to the freq information in a Query's sub-scorers
> --------------------------------------------------------------
>
>                 Key: LUCENE-2590
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2590
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>         Attachments: LUCENE-2590.patch, LUCENE-2590.patch, LUCENE-2590.patch, LUCENE-2590.patch
>
>
> The ability to gather more details than just the score, of how a given
> doc matches the current query, has come up a number of times on the
> user's lists.  (most recently in the thread "Query Match Count" by
> Ryan McV on java-user).
> EG if you have a simple TermQuery "foo", on each hit you'd like to
> know how many times "foo" occurred in that doc; or a BooleanQuery +foo
> +bar, being able to separately see the freq of foo and bar for the
> current hit.
> Lucene doesn't make this possible today, which is a shame because
> Lucene in fact does compute exactly this information; it's just not
> accessible from the Collector.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Commented: (LUCENE-2590) Enable access to the freq information in a Query's sub-scorers

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12901843#action_12901843 ] 

Michael McCandless commented on LUCENE-2590:
--------------------------------------------

Patch looks good!  Thanks, Simon.

Can we factor out that switch statement into super's visitSubScorers?  This way the base (Scorer) impl would be responsible for visiting "this", and then the subclass would override only if it had subs (ie BS, BS2)?

Also, we are missing some scorers (SpanScorer, ConstantScoreQuery.ConstantScorer, probably others), but if we do the super approach, we'd get these "for free" (I think?).

Oh I see we can't quite have Scorer impl this because it doesn't know the query.   But maybe we can factor out a common method, that the subclass passed the query to?

> Enable access to the freq information in a Query's sub-scorers
> --------------------------------------------------------------
>
>                 Key: LUCENE-2590
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2590
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>         Attachments: LUCENE-2590.patch, LUCENE-2590.patch, LUCENE-2590.patch, LUCENE-2590.patch
>
>
> The ability to gather more details than just the score, of how a given
> doc matches the current query, has come up a number of times on the
> user's lists.  (most recently in the thread "Query Match Count" by
> Ryan McV on java-user).
> EG if you have a simple TermQuery "foo", on each hit you'd like to
> know how many times "foo" occurred in that doc; or a BooleanQuery +foo
> +bar, being able to separately see the freq of foo and bar for the
> current hit.
> Lucene doesn't make this possible today, which is a shame because
> Lucene in fact does compute exactly this information; it's just not
> accessible from the Collector.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Commented: (LUCENE-2590) Enable access to the freq information in a Query's sub-scorers

Posted by "Simon Willnauer (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12901349#action_12901349 ] 

Simon Willnauer commented on LUCENE-2590:
-----------------------------------------

{quote}
Can't we simplify this? I liked your original proposal, breaking out
explicit visitRequired, visitOptional, etc., but keeping this as a
single class. Or we can go back to the original patch (just passing
an arg expressing the relationship)?
{quote}

The problem leading to this quiet heavy change was that I wanted to call a specific function for each Boolean.Occur relationship. That turned out to be ugly since I had to either add a switch / case statement to each of the visitSubScorers methods or add  a visitSubScorers method for each relationship. I didn't like that at all so I moved forward to specify a callback for each relationship. I see your point of being to heavy though. I guess we should go back to the original approach since I don't want to decide it on a switch case basis.

{quote}
 also don't like the "context" approach, setting attrs on a shared
instance. This is basically setting up arguments to pass to the
callback - why not simply pass these arguments (on the stack)
instead?
{quote}
Passing it on the stack has one limitation which is that if I want to pass more information around in the future I need to change the interface while I only add a member to the "context" if I pass it that way. Another reason was that if I want to pass custom info  from a custom query scorer to another I can not do that since there is not context. 

One other solution would be to reduce the interface back to:
{code}
public abstract class ScorerVisitor<P extends Query, C extends Query, S extends Scorer>{
   public Occur relationship;
   public abstract void visit(P parent, C child, S childScorer);
}
{code}

but hold the relationship as an attribute that way we don't  have it in the method signature which I don't like though.


{quote}
I don't like the "accept" name - it's very generic - can we put this
back to visitSubScorers or something that makes it clear you're
visiting the full sub-tree (visitScorers? visitScorerTree?)?
{quote}

I'm ok with visitScorers


> Enable access to the freq information in a Query's sub-scorers
> --------------------------------------------------------------
>
>                 Key: LUCENE-2590
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2590
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>         Attachments: LUCENE-2590.patch, LUCENE-2590.patch, LUCENE-2590.patch
>
>
> The ability to gather more details than just the score, of how a given
> doc matches the current query, has come up a number of times on the
> user's lists.  (most recently in the thread "Query Match Count" by
> Ryan McV on java-user).
> EG if you have a simple TermQuery "foo", on each hit you'd like to
> know how many times "foo" occurred in that doc; or a BooleanQuery +foo
> +bar, being able to separately see the freq of foo and bar for the
> current hit.
> Lucene doesn't make this possible today, which is a shame because
> Lucene in fact does compute exactly this information; it's just not
> accessible from the Collector.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Commented: (LUCENE-2590) Enable access to the freq information in a Query's sub-scorers

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904751#action_12904751 ] 

Michael McCandless commented on LUCENE-2590:
--------------------------------------------

Thank you!

I think we should also backport to 3.x?


> Enable access to the freq information in a Query's sub-scorers
> --------------------------------------------------------------
>
>                 Key: LUCENE-2590
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2590
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Michael McCandless
>            Assignee: Simon Willnauer
>             Fix For: 4.0
>
>         Attachments: LUCENE-2590.patch, LUCENE-2590.patch, LUCENE-2590.patch, LUCENE-2590.patch, LUCENE-2590.patch, LUCENE-2590.patch
>
>
> The ability to gather more details than just the score, of how a given
> doc matches the current query, has come up a number of times on the
> user's lists.  (most recently in the thread "Query Match Count" by
> Ryan McV on java-user).
> EG if you have a simple TermQuery "foo", on each hit you'd like to
> know how many times "foo" occurred in that doc; or a BooleanQuery +foo
> +bar, being able to separately see the freq of foo and bar for the
> current hit.
> Lucene doesn't make this possible today, which is a shame because
> Lucene in fact does compute exactly this information; it's just not
> accessible from the Collector.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Resolved: (LUCENE-2590) Enable access to the freq information in a Query's sub-scorers

Posted by "Simon Willnauer (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-2590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simon Willnauer resolved LUCENE-2590.
-------------------------------------

    Fix Version/s: 3.1
       Resolution: Fixed

Ported back to 3.x

Committed revision 991537.

> Enable access to the freq information in a Query's sub-scorers
> --------------------------------------------------------------
>
>                 Key: LUCENE-2590
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2590
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Michael McCandless
>            Assignee: Simon Willnauer
>             Fix For: 3.1, 4.0
>
>         Attachments: LUCENE-2590.patch, LUCENE-2590.patch, LUCENE-2590.patch, LUCENE-2590.patch, LUCENE-2590.patch, LUCENE-2590.patch
>
>
> The ability to gather more details than just the score, of how a given
> doc matches the current query, has come up a number of times on the
> user's lists.  (most recently in the thread "Query Match Count" by
> Ryan McV on java-user).
> EG if you have a simple TermQuery "foo", on each hit you'd like to
> know how many times "foo" occurred in that doc; or a BooleanQuery +foo
> +bar, being able to separately see the freq of foo and bar for the
> current hit.
> Lucene doesn't make this possible today, which is a shame because
> Lucene in fact does compute exactly this information; it's just not
> accessible from the Collector.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Assigned: (LUCENE-2590) Enable access to the freq information in a Query's sub-scorers

Posted by "Simon Willnauer (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-2590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simon Willnauer reassigned LUCENE-2590:
---------------------------------------

    Assignee: Simon Willnauer  (was: Michael McCandless)

> Enable access to the freq information in a Query's sub-scorers
> --------------------------------------------------------------
>
>                 Key: LUCENE-2590
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2590
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Michael McCandless
>            Assignee: Simon Willnauer
>         Attachments: LUCENE-2590.patch, LUCENE-2590.patch, LUCENE-2590.patch, LUCENE-2590.patch
>
>
> The ability to gather more details than just the score, of how a given
> doc matches the current query, has come up a number of times on the
> user's lists.  (most recently in the thread "Query Match Count" by
> Ryan McV on java-user).
> EG if you have a simple TermQuery "foo", on each hit you'd like to
> know how many times "foo" occurred in that doc; or a BooleanQuery +foo
> +bar, being able to separately see the freq of foo and bar for the
> current hit.
> Lucene doesn't make this possible today, which is a shame because
> Lucene in fact does compute exactly this information; it's just not
> accessible from the Collector.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Commented: (LUCENE-2590) Enable access to the freq information in a Query's sub-scorers

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904250#action_12904250 ] 

Michael McCandless commented on LUCENE-2590:
--------------------------------------------

Looks great!

Though, can't BS and BS2 just call super.visitSubScorers first, and then visit their subs?  (Ie right now they dup super's code right?).

> Enable access to the freq information in a Query's sub-scorers
> --------------------------------------------------------------
>
>                 Key: LUCENE-2590
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2590
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Michael McCandless
>            Assignee: Simon Willnauer
>         Attachments: LUCENE-2590.patch, LUCENE-2590.patch, LUCENE-2590.patch, LUCENE-2590.patch, LUCENE-2590.patch
>
>
> The ability to gather more details than just the score, of how a given
> doc matches the current query, has come up a number of times on the
> user's lists.  (most recently in the thread "Query Match Count" by
> Ryan McV on java-user).
> EG if you have a simple TermQuery "foo", on each hit you'd like to
> know how many times "foo" occurred in that doc; or a BooleanQuery +foo
> +bar, being able to separately see the freq of foo and bar for the
> current hit.
> Lucene doesn't make this possible today, which is a shame because
> Lucene in fact does compute exactly this information; it's just not
> accessible from the Collector.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Commented: (LUCENE-2590) Enable access to the freq information in a Query's sub-scorers

Posted by "Simon Willnauer (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904313#action_12904313 ] 

Simon Willnauer commented on LUCENE-2590:
-----------------------------------------

bq. Though, can't BS and BS2 just call super.visitSubScorers first, and then visit their subs? (Ie right now they dup super's code right?).
Nah, good point mike :) I missed that, nice code reuse though! - will fix that soon.

> Enable access to the freq information in a Query's sub-scorers
> --------------------------------------------------------------
>
>                 Key: LUCENE-2590
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2590
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Michael McCandless
>            Assignee: Simon Willnauer
>         Attachments: LUCENE-2590.patch, LUCENE-2590.patch, LUCENE-2590.patch, LUCENE-2590.patch, LUCENE-2590.patch
>
>
> The ability to gather more details than just the score, of how a given
> doc matches the current query, has come up a number of times on the
> user's lists.  (most recently in the thread "Query Match Count" by
> Ryan McV on java-user).
> EG if you have a simple TermQuery "foo", on each hit you'd like to
> know how many times "foo" occurred in that doc; or a BooleanQuery +foo
> +bar, being able to separately see the freq of foo and bar for the
> current hit.
> Lucene doesn't make this possible today, which is a shame because
> Lucene in fact does compute exactly this information; it's just not
> accessible from the Collector.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Updated: (LUCENE-2590) Enable access to the freq information in a Query's sub-scorers

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-2590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless updated LUCENE-2590:
---------------------------------------

    Attachment: LUCENE-2590.patch

Another iter of this patch; I think it's ready to commit.  I commented out BooleanScorer's handling of required=true since we never use it (but left TODOs in there to re-enable this code if/when we go back to sending required clauses to BooleanScorer).  I also added CHANGES entry.

> Enable access to the freq information in a Query's sub-scorers
> --------------------------------------------------------------
>
>                 Key: LUCENE-2590
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2590
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>         Attachments: LUCENE-2590.patch, LUCENE-2590.patch
>
>
> The ability to gather more details than just the score, of how a given
> doc matches the current query, has come up a number of times on the
> user's lists.  (most recently in the thread "Query Match Count" by
> Ryan McV on java-user).
> EG if you have a simple TermQuery "foo", on each hit you'd like to
> know how many times "foo" occurred in that doc; or a BooleanQuery +foo
> +bar, being able to separately see the freq of foo and bar for the
> current hit.
> Lucene doesn't make this possible today, which is a shame because
> Lucene in fact does compute exactly this information; it's just not
> accessible from the Collector.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Commented: (LUCENE-2590) Enable access to the freq information in a Query's sub-scorers

Posted by "Simon Willnauer (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904753#action_12904753 ] 

Simon Willnauer commented on LUCENE-2590:
-----------------------------------------

bq. I think we should also backport to 3.x?
Yeah we should I guess - I will look into this tomorrow. 

> Enable access to the freq information in a Query's sub-scorers
> --------------------------------------------------------------
>
>                 Key: LUCENE-2590
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2590
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Michael McCandless
>            Assignee: Simon Willnauer
>             Fix For: 4.0
>
>         Attachments: LUCENE-2590.patch, LUCENE-2590.patch, LUCENE-2590.patch, LUCENE-2590.patch, LUCENE-2590.patch, LUCENE-2590.patch
>
>
> The ability to gather more details than just the score, of how a given
> doc matches the current query, has come up a number of times on the
> user's lists.  (most recently in the thread "Query Match Count" by
> Ryan McV on java-user).
> EG if you have a simple TermQuery "foo", on each hit you'd like to
> know how many times "foo" occurred in that doc; or a BooleanQuery +foo
> +bar, being able to separately see the freq of foo and bar for the
> current hit.
> Lucene doesn't make this possible today, which is a shame because
> Lucene in fact does compute exactly this information; it's just not
> accessible from the Collector.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Reopened: (LUCENE-2590) Enable access to the freq information in a Query's sub-scorers

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-2590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless reopened LUCENE-2590:
----------------------------------------


OK sounds good...

> Enable access to the freq information in a Query's sub-scorers
> --------------------------------------------------------------
>
>                 Key: LUCENE-2590
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2590
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Michael McCandless
>            Assignee: Simon Willnauer
>             Fix For: 4.0
>
>         Attachments: LUCENE-2590.patch, LUCENE-2590.patch, LUCENE-2590.patch, LUCENE-2590.patch, LUCENE-2590.patch, LUCENE-2590.patch
>
>
> The ability to gather more details than just the score, of how a given
> doc matches the current query, has come up a number of times on the
> user's lists.  (most recently in the thread "Query Match Count" by
> Ryan McV on java-user).
> EG if you have a simple TermQuery "foo", on each hit you'd like to
> know how many times "foo" occurred in that doc; or a BooleanQuery +foo
> +bar, being able to separately see the freq of foo and bar for the
> current hit.
> Lucene doesn't make this possible today, which is a shame because
> Lucene in fact does compute exactly this information; it's just not
> accessible from the Collector.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Commented: (LUCENE-2590) Enable access to the freq information in a Query's sub-scorers

Posted by "Simon Willnauer (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12899000#action_12899000 ] 

Simon Willnauer commented on LUCENE-2590:
-----------------------------------------

{quote}
The heavy generics would be useful for apps that eg have a fixed structure to all Queries right? Eg if you know your app only ever makes BooleanQuery w/ TermQuery sub-classes, you'd parameterize the instantiation as <BooleanQuery,TermQuery,TermScrorer> (hmm though TermScorer isn't public; maybe just leave S as Scorer)? I think this makes sense.
{quote}

jep, this is exactly what I was aiming at. Scorer might be an exception but in general this it what you would do. A special case where I could think of is FuzzyQuery where you can tell from the rewrite method what kind of query will be constructed and where this information could be useful (think of spellchecking of similar usecases)

bq. I'll work on these but it'll be a little while - I'm out on "vacation".
Good to hear, enjoy! I might update you patch accordingly before you come back though!

simon

> Enable access to the freq information in a Query's sub-scorers
> --------------------------------------------------------------
>
>                 Key: LUCENE-2590
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2590
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>         Attachments: LUCENE-2590.patch, LUCENE-2590.patch
>
>
> The ability to gather more details than just the score, of how a given
> doc matches the current query, has come up a number of times on the
> user's lists.  (most recently in the thread "Query Match Count" by
> Ryan McV on java-user).
> EG if you have a simple TermQuery "foo", on each hit you'd like to
> know how many times "foo" occurred in that doc; or a BooleanQuery +foo
> +bar, being able to separately see the freq of foo and bar for the
> current hit.
> Lucene doesn't make this possible today, which is a shame because
> Lucene in fact does compute exactly this information; it's just not
> accessible from the Collector.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Commented: (LUCENE-2590) Enable access to the freq information in a Query's sub-scorers

Posted by "Simon Willnauer (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12898638#action_12898638 ] 

Simon Willnauer commented on LUCENE-2590:
-----------------------------------------

Mike, eventually we are getting this feature - I guess many are waiting for this!

I wonder if we really need to pass the BooleanClause.Occur to the visit method of if we rather defined a visit method for each boolean relationship. We could make it an abstract class and if you are not interested in Prohibited or Optional scorers you can simply leave the method empty. JIT compiler might be able to optimize this method calls away. if we go this way I'd make {{VisitSubScorer}} abstract  
{code}
public abstract class VisitSubScorer {
   public void visitRequired(Query parent, Query child, Scorer childScorer){}
   public void visitOptional(Query parent, Query child, Scorer childScorer){}
   public void visitProhibited(Query parent, Query child, Scorer childScorer){}
}
{code}

I we could also make parent and child <P extends Query, C extends Query, S extends Scorer> in certain (I have to admit rather rare) situations this could make users code simpler as they might know how their queries look like and the cast is done implicitly. For all others nothing really changes though. Its just an idea and I thought I'd mention it.



> Enable access to the freq information in a Query's sub-scorers
> --------------------------------------------------------------
>
>                 Key: LUCENE-2590
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2590
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>         Attachments: LUCENE-2590.patch, LUCENE-2590.patch
>
>
> The ability to gather more details than just the score, of how a given
> doc matches the current query, has come up a number of times on the
> user's lists.  (most recently in the thread "Query Match Count" by
> Ryan McV on java-user).
> EG if you have a simple TermQuery "foo", on each hit you'd like to
> know how many times "foo" occurred in that doc; or a BooleanQuery +foo
> +bar, being able to separately see the freq of foo and bar for the
> current hit.
> Lucene doesn't make this possible today, which is a shame because
> Lucene in fact does compute exactly this information; it's just not
> accessible from the Collector.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Commented: (LUCENE-2590) Enable access to the freq information in a Query's sub-scorers

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12899370#action_12899370 ] 

Michael McCandless commented on LUCENE-2590:
--------------------------------------------

bq. Good to hear, enjoy! I might update you patch accordingly before you come back though!

Please do!!  Feel free :)

> Enable access to the freq information in a Query's sub-scorers
> --------------------------------------------------------------
>
>                 Key: LUCENE-2590
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2590
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>         Attachments: LUCENE-2590.patch, LUCENE-2590.patch
>
>
> The ability to gather more details than just the score, of how a given
> doc matches the current query, has come up a number of times on the
> user's lists.  (most recently in the thread "Query Match Count" by
> Ryan McV on java-user).
> EG if you have a simple TermQuery "foo", on each hit you'd like to
> know how many times "foo" occurred in that doc; or a BooleanQuery +foo
> +bar, being able to separately see the freq of foo and bar for the
> current hit.
> Lucene doesn't make this possible today, which is a shame because
> Lucene in fact does compute exactly this information; it's just not
> accessible from the Collector.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Updated: (LUCENE-2590) Enable access to the freq information in a Query's sub-scorers

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-2590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless updated LUCENE-2590:
---------------------------------------

    Attachment: LUCENE-2590.patch

Attached patch.

There were two things to fix.  First, I added Scorer.freq() so that
you could get the freq of the current match.  Second, I added a way to
visit all sub-scorers for a given Scorer.  The functionality is
definitely expert level -- you'll have to make a custom Collector that
gathers these stats & saves them away in your PQ, to use this "for
real".  But it's a good start I think; there's a simple test case
showing how it works.


> Enable access to the freq information in a Query's sub-scorers
> --------------------------------------------------------------
>
>                 Key: LUCENE-2590
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2590
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>         Attachments: LUCENE-2590.patch
>
>
> The ability to gather more details than just the score, of how a given
> doc matches the current query, has come up a number of times on the
> user's lists.  (most recently in the thread "Query Match Count" by
> Ryan McV on java-user).
> EG if you have a simple TermQuery "foo", on each hit you'd like to
> know how many times "foo" occurred in that doc; or a BooleanQuery +foo
> +bar, being able to separately see the freq of foo and bar for the
> current hit.
> Lucene doesn't make this possible today, which is a shame because
> Lucene in fact does compute exactly this information; it's just not
> accessible from the Collector.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Updated: (LUCENE-2590) Enable access to the freq information in a Query's sub-scorers

Posted by "Simon Willnauer (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-2590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simon Willnauer updated LUCENE-2590:
------------------------------------

    Attachment: LUCENE-2590.patch

new patch - fixed the code dup

> Enable access to the freq information in a Query's sub-scorers
> --------------------------------------------------------------
>
>                 Key: LUCENE-2590
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2590
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Michael McCandless
>            Assignee: Simon Willnauer
>         Attachments: LUCENE-2590.patch, LUCENE-2590.patch, LUCENE-2590.patch, LUCENE-2590.patch, LUCENE-2590.patch, LUCENE-2590.patch
>
>
> The ability to gather more details than just the score, of how a given
> doc matches the current query, has come up a number of times on the
> user's lists.  (most recently in the thread "Query Match Count" by
> Ryan McV on java-user).
> EG if you have a simple TermQuery "foo", on each hit you'd like to
> know how many times "foo" occurred in that doc; or a BooleanQuery +foo
> +bar, being able to separately see the freq of foo and bar for the
> current hit.
> Lucene doesn't make this possible today, which is a shame because
> Lucene in fact does compute exactly this information; it's just not
> accessible from the Collector.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Commented: (LUCENE-2590) Enable access to the freq information in a Query's sub-scorers

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12901344#action_12901344 ] 

Michael McCandless commented on LUCENE-2590:
--------------------------------------------

I think the API changes are too complex/heavyweight here!

With this new patch we've added Scorer.ScorerVisitor,
Scorer.SubScorerCallback, Scorer.ScorerContext, and two new
Scorer.accept methods.

The original patch added only a 1-method interface
"VisitSubScorer".

Can't we simplify this?  I liked your original proposal, breaking out
explicit visitRequired, visitOptional, etc., but keeping this as a
single class.  Or we can go back to the original patch (just passing
an arg expressing the relationship)?

I also don't like the "context" approach, setting attrs on a shared
instance.  This is basically setting up arguments to pass to the
callback -- why not simply pass these arguments (on the stack)
instead?

I don't like the "accept" name -- it's very generic -- can we put this
back to visitSubScorers or something that makes it clear you're
visiting the full sub-tree (visitScorers?  visitScorerTree?)?


> Enable access to the freq information in a Query's sub-scorers
> --------------------------------------------------------------
>
>                 Key: LUCENE-2590
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2590
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>         Attachments: LUCENE-2590.patch, LUCENE-2590.patch, LUCENE-2590.patch
>
>
> The ability to gather more details than just the score, of how a given
> doc matches the current query, has come up a number of times on the
> user's lists.  (most recently in the thread "Query Match Count" by
> Ryan McV on java-user).
> EG if you have a simple TermQuery "foo", on each hit you'd like to
> know how many times "foo" occurred in that doc; or a BooleanQuery +foo
> +bar, being able to separately see the freq of foo and bar for the
> current hit.
> Lucene doesn't make this possible today, which is a shame because
> Lucene in fact does compute exactly this information; it's just not
> accessible from the Collector.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Updated: (LUCENE-2590) Enable access to the freq information in a Query's sub-scorers

Posted by "Simon Willnauer (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-2590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simon Willnauer updated LUCENE-2590:
------------------------------------

    Attachment: LUCENE-2590.patch

Another iteration with Weight as a protected member or Scorer. All scorers I looked at had the weight already as a member so this change makes things way simpler though. I think this is close to commit.

> Enable access to the freq information in a Query's sub-scorers
> --------------------------------------------------------------
>
>                 Key: LUCENE-2590
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2590
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Michael McCandless
>            Assignee: Simon Willnauer
>         Attachments: LUCENE-2590.patch, LUCENE-2590.patch, LUCENE-2590.patch, LUCENE-2590.patch, LUCENE-2590.patch
>
>
> The ability to gather more details than just the score, of how a given
> doc matches the current query, has come up a number of times on the
> user's lists.  (most recently in the thread "Query Match Count" by
> Ryan McV on java-user).
> EG if you have a simple TermQuery "foo", on each hit you'd like to
> know how many times "foo" occurred in that doc; or a BooleanQuery +foo
> +bar, being able to separately see the freq of foo and bar for the
> current hit.
> Lucene doesn't make this possible today, which is a shame because
> Lucene in fact does compute exactly this information; it's just not
> accessible from the Collector.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Commented: (LUCENE-2590) Enable access to the freq information in a Query's sub-scorers

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904729#action_12904729 ] 

Michael McCandless commented on LUCENE-2590:
--------------------------------------------

Patch looks great Simon, thanks!  +1 to commit.

> Enable access to the freq information in a Query's sub-scorers
> --------------------------------------------------------------
>
>                 Key: LUCENE-2590
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2590
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Michael McCandless
>            Assignee: Simon Willnauer
>         Attachments: LUCENE-2590.patch, LUCENE-2590.patch, LUCENE-2590.patch, LUCENE-2590.patch, LUCENE-2590.patch, LUCENE-2590.patch
>
>
> The ability to gather more details than just the score, of how a given
> doc matches the current query, has come up a number of times on the
> user's lists.  (most recently in the thread "Query Match Count" by
> Ryan McV on java-user).
> EG if you have a simple TermQuery "foo", on each hit you'd like to
> know how many times "foo" occurred in that doc; or a BooleanQuery +foo
> +bar, being able to separately see the freq of foo and bar for the
> current hit.
> Lucene doesn't make this possible today, which is a shame because
> Lucene in fact does compute exactly this information; it's just not
> accessible from the Collector.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org