Posted to solr-user@lucene.apache.org by Parvesh Garg <pa...@zettata.com> on 2015/05/13 07:41:07 UTC

utility methods to get field values from index

Hi All,

Was wondering if there is any class in Solr that provides utility methods
to fetch indexed field values for documents using docId. Something simple
like

getMultiLong(String field, int docId)

getLong(String field, int docId)

We have written a Solr component to return group-level stats like avg
score, max score etc. over a large number of documents (say 5000+) against a
query executed using edismax. We need the group id field's value to do
that; this is a single-valued long field.
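
For illustration, the group-level aggregation described above might look
roughly like the sketch below. It assumes the group ids and scores of the
matched documents have already been collected into two parallel arrays; the
names GroupStats, Stats and compute are hypothetical and are not taken from
the actual component.

```java
import java.util.HashMap;
import java.util.Map;

public class GroupStats {

  /** Running statistics for one group (hypothetical helper type). */
  public static final class Stats {
    private int count;
    private double sum;
    private float max = Float.NEGATIVE_INFINITY;

    void add(float score) {
      count++;
      sum += score;
      if (score > max) {
        max = score;
      }
    }

    public int count() { return count; }
    public float max() { return max; }
    public double avg() { return count == 0 ? 0.0 : sum / count; }
  }

  /**
   * Aggregates scores per group id. groupIds[i] and scores[i] are assumed to
   * describe the same matched document.
   */
  public static Map<Long, Stats> compute(long[] groupIds, float[] scores) {
    if (groupIds.length != scores.length) {
      throw new IllegalArgumentException("groupIds and scores must have equal length");
    }
    Map<Long, Stats> byGroup = new HashMap<>();
    for (int i = 0; i < groupIds.length; i++) {
      Stats stats = byGroup.get(groupIds[i]);
      if (stats == null) {
        stats = new Stats();
        byGroup.put(groupIds[i], stats);
      }
      stats.add(scores[i]);
    }
    return byGroup;
  }
}
```

The only per-document input besides the score is the group id, which is why
the cost of reading that one long value per document matters here.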

This component also looks at one more field, a multivalued long
field, for each document and computes a score based on frequency + document
score for each value.
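
A minimal sketch of that per-value scoring follows. The exact formula is not
spelled out here, so the sketch assumes the simplest reading: each value's
score is the number of matched documents containing it plus the sum of those
documents' scores. ValueScores and score are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;

public class ValueScores {

  /**
   * valuesPerDoc[i] holds the multivalued field's values for the i-th matched
   * document and docScores[i] is that document's score.
   * Assumed formula: score(value) = frequency + sum of document scores.
   */
  public static Map<Long, Double> score(long[][] valuesPerDoc, float[] docScores) {
    if (valuesPerDoc.length != docScores.length) {
      throw new IllegalArgumentException("valuesPerDoc and docScores must have equal length");
    }
    Map<Long, Double> totals = new HashMap<>();
    for (int i = 0; i < valuesPerDoc.length; i++) {
      for (long value : valuesPerDoc[i]) {
        Double current = totals.get(value);
        // Each occurrence contributes 1 (frequency) plus the document's score.
        double contribution = 1.0 + docScores[i];
        totals.put(value, current == null ? contribution : current + contribution);
      }
    }
    return totals;
  }
}
```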

Currently we are using stored fields and were wondering whether reading the
indexed values instead would be faster.

Apologies if this is too much to ask for.

Parvesh Garg,

Re: utility methods to get field values from index

Posted by Parvesh Garg <pa...@zettata.com>.
Hi Shalin,

Thanks for your answer. I forgot to mention that we are using Solr 4.10.
Also, I tried using docValues and the performance was worse than getting the
values from stored fields. Retrieving data for 2000 docs and 2 fields took
120 ms with stored fields vs 230 ms with docValues.

Maybe there is something wrong in my code.

The code used for retrieving docValues is:

  public static long getSingleLong(SolrIndexSearcher searcher, int docId,
      String field) throws IOException {
    NumericDocValues sdv = DocValues.getNumeric(searcher.getAtomicReader(),
        field);
    return sdv.get(docId);
  }

and

  public static List<Long> getMultiLong(SolrIndexSearcher searcher,
      int docId, String field) throws IOException {
    SortedSetDocValues ssdv = DocValues.getSortedSet(
        searcher.getAtomicReader(), field);
    ssdv.setDocument(docId);
    long l;
    List<Long> retval = new ArrayList<Long>(40);

    while ((l = ssdv.nextOrd()) != SortedSetDocValues.NO_MORE_ORDS) {
      BytesRef bytes = ssdv.lookupOrd(l);
      retval.add(NumericUtils.prefixCodedToLong(bytes));
    }

    return retval;
  }
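
A possible factor in these numbers (an assumption, not something confirmed in
this thread): searcher.getAtomicReader() exposes the whole index as a single
reader, so every call has to resolve the top-level docId to a segment again.
Lucene's ReaderUtil.subIndex does that resolution over a reader's leaves. The
sketch below models the same idea in plain Java, with int[] docBases and
long[][] columns standing in for the segments and their per-segment values.
SegmentLookup and its methods are hypothetical names, not Lucene or Solr API.

```java
public class SegmentLookup {

  /**
   * Returns the index of the segment containing docId, i.e. the last entry in
   * docBases that is <= docId. docBases must be sorted ascending and start at 0.
   */
  public static int subIndex(int docId, int[] docBases) {
    int lo = 0;
    int hi = docBases.length - 1;
    while (lo < hi) {
      int mid = (lo + hi + 1) >>> 1; // round up so the loop always makes progress
      if (docBases[mid] <= docId) {
        lo = mid;
      } else {
        hi = mid - 1;
      }
    }
    return lo;
  }

  /**
   * Reads a value for a top-level docId from per-segment columns by converting
   * it to a segment-local id first (docId minus the segment's docBase).
   */
  public static long getLong(long[][] perSegmentValues, int[] docBases, int docId) {
    int segment = subIndex(docId, docBases);
    return perSegmentValues[segment][docId - docBases[segment]];
  }
}
```

If the documents are processed in increasing docId order, the segment and its
values only need to be looked up once per segment rather than once per
document.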



Parvesh Garg

On Wed, May 13, 2015 at 11:36 AM, Shalin Shekhar Mangar <
shalinmangar@gmail.com> wrote:

> In Solr 5.0+ you can use Lucene's DocValues API to read the indexed
> information. This is a unifying API over field cache and doc values so it
> can be used on all indexed fields.
>
> e.g. for single-valued field use
> searcher.getLeafReader().getSortedDocValues(fieldName);
> and for multi-valued fields
> use searcher.getLeafReader().getSortedSetDocValues(fieldName);
>
> --
> Regards,
> Shalin Shekhar Mangar.
>

Re: utility methods to get field values from index

Posted by Shalin Shekhar Mangar <sh...@gmail.com>.
In Solr 5.0+ you can use Lucene's DocValues API to read the indexed
information. This is a unifying API over field cache and doc values so it
can be used on all indexed fields.

e.g. for single-valued field use
searcher.getLeafReader().getSortedDocValues(fieldName);
and for multi-valued fields
use searcher.getLeafReader().getSortedSetDocValues(fieldName);



-- 
Regards,
Shalin Shekhar Mangar.