Posted to mapreduce-user@hadoop.apache.org by Nishanth S <ch...@gmail.com> on 2015/07/21 00:56:00 UTC

Fwd: Avro Map Reduce for Multiple Schemas

Hello,

I need to write multiple Avro files with different schemas as the output
of a MapReduce job. Currently I achieve this by building a union of all
the schemas in the driver and then using AvroMultipleOutputs to
output two files.


// Register one named output per schema (null key, Avro record value).
AvroMultipleOutputs.addNamedOutput(job, "a", AvroKeyValueOutputFormat.class,
        Schema.create(Schema.Type.NULL), A.getClassSchema());
AvroMultipleOutputs.addNamedOutput(job, "b", AvroKeyValueOutputFormat.class,
        Schema.create(Schema.Type.NULL), B.getClassSchema());

// Set the job's output value schema to a union of the record schemas.
List<Schema> schemas = new ArrayList<Schema>();
schemas.add(C.getClassSchema());
schemas.add(D.getClassSchema());
AvroJob.setOutputKeySchema(job, Schema.create(Schema.Type.NULL));
AvroJob.setOutputValueSchema(job, Schema.createUnion(schemas));

Is there a better way to do this? Any help would be appreciated.

Thanks,
Nishan