Posted to solr-user@lucene.apache.org by Ryan Josal <rj...@gmail.com> on 2015/03/21 02:07:07 UTC

DocTransformer#setContext

Hey guys, I wanted to ask if I'm using the DocTransformer API as intended.
There is a setContext( TransformContext c ) method which is called by the
TextResponseWriter before it calls transform on any docs.  That context
object contains a DocIterator reference.  I want to use a DocTransformer to
add info from DynamoDB based on the unique keys of docs, so I figured this
would be the way to go to get all needed data from DDB in a batch before
transform.

It turns out that if you call nextDoc on that iterator, that doc will not be
transformed, because the iterator is not reset or regenerated in any way
before the transform calls start.  In some cases, if the Collector
collected extra docs, the DocSlice will have more docids to return even
after hasNext returns false, and the code doesn't check that, so it will
transform those.  Then eventually it may throw an
IndexOutOfBoundsException.  My gut says this is not intended.  Why not give
the DocList in the TransformContext?
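
To illustrate the iterator problem, here is a minimal sketch in plain Java.
It does not use any Solr classes: PrefetchingTransformer, writeDocs and the
two run methods are hypothetical stand-ins I made up for the transformer
and the response writer loop.  Unlike the real writer loop described above,
the stand-in loop does check hasNext, so it only shows the skipped docs, not
the exception.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class SharedIteratorDemo {

    /** Hypothetical stand-in for a DocTransformer; not a Solr class. */
    static class PrefetchingTransformer {
        final List<Integer> prefetched = new ArrayList<>();

        // Mimics reading ahead in setContext through the shared iterator.
        // Every id read here is consumed and never reaches the writer loop.
        void setContextWithIterator(Iterator<Integer> shared) {
            while (shared.hasNext()) {
                prefetched.add(shared.next());
            }
        }

        // Assumed alternative: the context hands over the list itself,
        // so reading ahead does not touch the writer's iterator.
        void setContextWithList(List<Integer> docIds) {
            prefetched.addAll(docIds);
        }
    }

    // Stand-in for the writer loop: "transforms" whatever ids are left.
    static List<Integer> writeDocs(Iterator<Integer> docs) {
        List<Integer> transformed = new ArrayList<>();
        while (docs.hasNext()) {
            transformed.add(docs.next());
        }
        return transformed;
    }

    static List<Integer> runWithSharedIterator(List<Integer> docIds) {
        Iterator<Integer> shared = docIds.iterator();
        new PrefetchingTransformer().setContextWithIterator(shared);
        return writeDocs(shared); // iterator is already exhausted
    }

    static List<Integer> runWithList(List<Integer> docIds) {
        new PrefetchingTransformer().setContextWithList(docIds);
        return writeDocs(docIds.iterator()); // fresh iterator for the writer
    }

    public static void main(String[] args) {
        List<Integer> docIds = Arrays.asList(3, 7, 11);
        System.out.println(runWithSharedIterator(docIds)); // prints []
        System.out.println(runWithList(docIds));           // prints [3, 7, 11]
    }
}
```

With the shared iterator, nothing is left to transform after the read-ahead;
with the list, the read-ahead and the writer loop each get their own pass.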

The example solrconfig, I think, suggests using DocTransformers to get data
from external DBs.  Has anyone done this, and if so, how do you make a
single batch request instead of one request per transform call?

Ryan