Posted to java-user@lucene.apache.org by Ross Woolf <ro...@rosswoolf.com> on 2013/08/12 16:43:25 UTC

How to retrieve value of NumericDocValuesField in similarity

The JavaDocs for NumericDocValuesField indicate that this field value can
be used for scoring.  The example shows how to store the field, but I am
unclear as to how to retrieve the value of the field while in a similarity
to use it when scoring a document.  Can someone point me to an example or
give me one that demonstrates how I can fetch the value associated with the
document being scored while in a similarity class that I have created?

Re: How to retrieve value of NumericDocValuesField in similarity

Posted by Shai Erera <se...@gmail.com>.
OK, that makes sense.

Shai



Re: How to retrieve value of NumericDocValuesField in similarity

Posted by Robert Muir <rc...@gmail.com>.
On Mon, Aug 12, 2013 at 11:06 AM, Shai Erera <se...@gmail.com> wrote:
>
> Or, you'd like to keep FieldCache API for sort of back-compat with existing
> features, and let the app control the "caching" by using an explicit
> RamDVFormat?
>

Yes. In the future, ideally FieldCache goes away and becomes an
UninvertingFilterReader or something like that, which exposes the DV APIs.

Then things can just use the DV APIs... but to get things started
we did it this way in the interim.

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: How to retrieve value of NumericDocValuesField in similarity

Posted by Shai Erera <se...@gmail.com>.
Rob, when DiskDV becomes the default DVFormat, would it not make sense to
load the values into the cache if someone uses the FieldCache API? Whereas
if someone calls the DV API directly, they use whatever the default Codec
is, or the one they plug in.

That's what I would expect from a 'cache'. So it's ok that currently all
FieldCache does is delegate the call to DV API, but perhaps we'd want to
change that so that in the DiskDV case, it actually caches things?

Or, you'd like to keep FieldCache API for sort of back-compat with existing
features, and let the app control the "caching" by using an explicit
RamDVFormat?

Shai



Re: How to retrieve value of NumericDocValuesField in similarity

Posted by Ross Woolf <ro...@rosswoolf.com>.
Yes, I will open an issue.



Re: How to retrieve value of NumericDocValuesField in similarity

Posted by Robert Muir <rc...@gmail.com>.
On Mon, Aug 12, 2013 at 8:48 AM, Ross Woolf <ro...@rosswoolf.com> wrote:
> Okay, just for clarity's sake, what you are saying is that if I make the
> FieldCache call it won't actually create the FieldCache and impose its
> loading time, but rather just use the NumericDocValuesField instead.  Is this
> correct?

Yes, exactly. It's a little confusing, but it's a tradeoff to make docvalues
work transparently with lots of existing code built on top of fieldcache
(sorting/grouping/joins/faceting/...) without having to have two
separate implementations of what is the same thing. So it's like
"docvalues is a fieldcache you already built at index-time".

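The pass-through behaviour described above can be pictured with a small
sketch. This is only an illustration under stated assumptions:
ValueCacheFacade, its fields, and the "uninverter" function are made-up
stand-ins, not the Lucene FieldCache implementation. Values written at
index time are returned directly; only fields without them take the slow
path and get cached.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Stand-in for illustration only; NOT the Lucene FieldCache implementation.
final class ValueCacheFacade {
    // Per-field values that were written at index time (the "docvalues" role).
    private final Map<String, long[]> prebuilt;
    // Slow path: derives per-document values at search time (hypothetical).
    private final Function<String, long[]> uninverter;
    // Holds only the values produced by the slow path.
    private final Map<String, long[]> cache = new HashMap<>();
    // Counts how often the slow path ran, to make the behaviour observable.
    int slowLoads = 0;

    ValueCacheFacade(Map<String, long[]> prebuilt,
                     Function<String, long[]> uninverter) {
        this.prebuilt = prebuilt;
        this.uninverter = uninverter;
    }

    long[] get(String field) {
        long[] indexed = prebuilt.get(field);
        if (indexed != null) {
            return indexed;              // pass through: no load, nothing cached
        }
        long[] cached = cache.get(field);
        if (cached == null) {
            cached = uninverter.apply(field);   // pay the loading cost once
            cache.put(field, cached);
            slowLoads++;
        }
        return cached;
    }
}
```

With this shape, a caller written against the facade works the same way
whether or not the field had values built at index time, which is the
tradeoff the reply describes.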
>
> Also, my similarity was extending SimilarityBase, and I can't see how to
> get a docId, as it is not passed in the score method "score(BasicStats
> stats, float freq, float docLen)".  Will I need to extend Similarity
> instead of SimilarityBase, or is there a way to get the docId using
> SimilarityBase?

Maybe we should just add an 'int doc' parameter to the
SimilarityBase.score() method? Do you want to open a JIRA issue for
this?
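
As a rough sketch of what such a change would enable, assuming a
hypothetical hook that receives the doc id: DocAwareFormula and
BoostedFormula below are invented stand-ins, not Lucene's SimilarityBase,
and the BasicStats argument of the real signature is left out for brevity.
The formula itself is arbitrary and only there to have something to boost.

```java
// Stand-in types for illustration only; NOT Lucene's SimilarityBase.
abstract class DocAwareFormula {
    // Hypothetical shape of the scoring hook with the proposed 'int doc'
    // parameter added in front of the existing freq/docLen arguments.
    public abstract float score(int doc, float freq, float docLen);
}

final class BoostedFormula extends DocAwareFormula {
    // Per-document values, indexed by doc id (the doc-values role).
    private final long[] boostByDoc;

    BoostedFormula(long[] boostByDoc) {
        this.boostByDoc = boostByDoc;
    }

    @Override
    public float score(int doc, float freq, float docLen) {
        // Arbitrary base formula, chosen only for the example.
        float base = freq / (freq + docLen);
        // The doc id is what makes the per-document lookup possible.
        return base * boostByDoc[doc];
    }
}
```

Without the doc id, a formula-only hook of this kind has no key with which
to look up a per-document value, which is the limitation raised in the
question.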



Re: How to retrieve value of NumericDocValuesField in similarity

Posted by Ross Woolf <ro...@rosswoolf.com>.
Okay, just for clarity's sake, what you are saying is that if I make the
FieldCache call it won't actually create the FieldCache and impose its
loading time, but rather just use the NumericDocValuesField instead.  Is this
correct?

Also, my similarity was extending SimilarityBase, and I can't see how to
get a docId, as it is not passed in the score method "score(BasicStats
stats, float freq, float docLen)".  Will I need to extend Similarity
instead of SimilarityBase, or is there a way to get the docId using
SimilarityBase?



Re: How to retrieve value of NumericDocValuesField in similarity

Posted by Robert Muir <rc...@gmail.com>.
Hello:

This call just "passes thru" to docvalues:

   FieldCache.DEFAULT.getFloats(context.reader(), boostField, false)

If you want to call context.reader().getNumericDocValues... you could
do that too, but that's all it's doing in this case.
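
To make the lookup pattern concrete, here is a minimal sketch of a scorer
that multiplies a delegate's score by a per-document value fetched by doc
id. PerDocValues, DocScorer and BoostingScorer are stand-in names invented
for this illustration, not Lucene classes. In real code the values would
come from one of the two calls mentioned above, and the exact signatures
should be checked against the Lucene version in use.

```java
// Stand-in types for illustration only; these are NOT Lucene classes.
interface PerDocValues {
    // Plays the role of a per-segment doc-values view: one long per doc id.
    long get(int docId);
}

interface DocScorer {
    // Plays the role of a per-segment scorer that is handed the doc id.
    float score(int docId, float freq);
}

final class BoostingScorer implements DocScorer {
    private final DocScorer delegate;   // the underlying scoring formula
    private final PerDocValues boosts;  // obtained once per segment, not per doc

    BoostingScorer(DocScorer delegate, PerDocValues boosts) {
        this.delegate = delegate;
        this.boosts = boosts;
    }

    @Override
    public float score(int docId, float freq) {
        // Look up the stored value for this document and apply it as a boost.
        return boosts.get(docId) * delegate.score(docId, freq);
    }
}
```

The point of the shape is that the values are resolved once when the
scorer is created for a segment, and each call to score only does an
indexed lookup.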




Re: How to retrieve value of NumericDocValuesField in similarity

Posted by Ross Woolf <ro...@rosswoolf.com>.
That example shows using fieldcache, but I do not want to use the
fieldcache.  I want to use the newer NumericDocValuesField.  Any direction
or examples of how to retrieve a value from the created
NumericDocValuesField in the most efficient way would be appreciated.



Re: How to retrieve value of NumericDocValuesField in similarity

Posted by Robert Muir <rc...@gmail.com>.
There is a unit test demonstrating this at a very basic level here:

http://svn.apache.org/repos/asf/lucene/dev/branches/branch_4x/lucene/core/src/test/org/apache/lucene/search/TestDocValuesScoring.java

