Posted to common-user@hadoop.apache.org by Nishanth S <ch...@gmail.com> on 2015/06/25 00:28:56 UTC

Avro MultipleOutputs in Mapreduce

Hello All,

We are using Avro 1.7.7 and Hadoop 2.5.1 in our project. We need to process
a mixed-mode binary file with MapReduce and write the output as multiple
Avro files, each with a different Avro schema. I looked at the
AvroMultipleOutputs class but did not completely understand what needs to
be done in the driver class. This is a map-only job whose output should be
4 different Avro files (each with a different Avro schema) written to
different HDFS directories.

Do we need to set all key and value Avro schemas on AvroJob in the driver
class?

AvroJob.setOutputKeySchema(job, Schema.create(Schema.Type.NULL));
AvroJob.setOutputValueSchema(job, A.getClassSchema());



Now, if I have schemas B, C and D, how would these be set on AvroJob?
Thanks for your help.


Thanks,
Nishan
