Posted to user@hadoop.apache.org by Darren Lee <dl...@amplience.com> on 2013/06/11 18:27:52 UTC

Slice MapWritable on Map

Hello,

I am working on a Hadoop-based Solr indexing system. We are using Hadoop because we need to prepare the data (compute values and add them to the Solr documents).

For a full index, I read in the records and output a MapWritable containing all the fields I want to index. Other Hadoop jobs then use this output as their input and contribute new computed fields to each document at reduce time.

This feels wrong, because each map task reads in the full document when it may only need one or two fields from the MapWritable to add its computed field.

Is it possible in Hadoop to request a slice of the MapWritable, and would I even want to? Or is there perhaps a better way to structure this?
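
To make "slice" concrete, here is a minimal sketch of the projection I have in mind. It is illustrative only: a plain java.util.Map stands in for MapWritable (which implements Map<Writable, Writable>), and the class name FieldSlice, the slice() helper and the field names are all made up for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Illustrative only: a plain Map stands in for MapWritable, and the
// field names below are hypothetical.
class FieldSlice {

    // Returns a new map holding only the requested fields that exist in doc.
    static Map<String, String> slice(Map<String, String> doc, Set<String> wanted) {
        Map<String, String> out = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : doc.entrySet()) {
            if (wanted.contains(e.getKey())) {
                out.put(e.getKey(), e.getValue());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> doc = new LinkedHashMap<>();
        doc.put("id", "42");
        doc.put("title", "Example");
        doc.put("price", "9.99");
        doc.put("body", "long text a computed-field job does not need");

        // A job that only needs two fields would see just these entries.
        System.out.println(slice(doc, Set.of("id", "price")));
        // prints {id=42, price=9.99}
    }
}
```

Note that filtering like this inside the mapper happens after the whole record has already been read and deserialized, so on its own it would not avoid the cost I am worried about; what I am really asking is whether the slice can be applied before that point.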


Thanks for any help,
Darren