Posted to mapreduce-user@hadoop.apache.org by Nishanth S <ch...@gmail.com> on 2015/07/21 00:56:00 UTC
Fwd: Avro Map Reduce for Multiple Schemas
Hello,
I have to write multiple Avro files with different schemas as the output
of a MapReduce job. Currently I am achieving this by building a union of all
the schemas in the driver and then using AvroMultipleOutputs to
write two files.
// Named outputs "a" and "b": each has a null key and its own value schema.
AvroMultipleOutputs.addNamedOutput(job, "a",
    AvroKeyValueOutputFormat.class,
    Schema.create(Schema.Type.NULL), A.getClassSchema());
AvroMultipleOutputs.addNamedOutput(job, "b",
    AvroKeyValueOutputFormat.class,
    Schema.create(Schema.Type.NULL), B.getClassSchema());
List<Schema> schemas = new ArrayList<Schema>();
schemas.add(C.getClassSchema());
schemas.add(D.getClassSchema());
AvroJob.setOutputKeySchema(job, Schema.create(Schema.Type.NULL));
AvroJob.setOutputValueSchema(job, Schema.createUnion(schemas));
Is there a better way to do this? Any help would be appreciated.
Thanks,
Nishan