Posted to mapreduce-user@hadoop.apache.org by Nishanth S <ch...@gmail.com> on 2015/07/08 23:10:06 UTC

Different outputformats in avro map reduce job

Hi All,

I have a MapReduce job which reads a binary file and needs to output
multiple Avro files and a text-format file. I was able to output multiple
Avro files using AvroMultipleOutputs. How would I modify the job to output
a text-format file as well, along with these Avro files? Is it possible?

Thanks,
Nishanth

Re: Different outputformats in avro map reduce job

Posted by Harsh J <ha...@cloudera.com>.
You can write out the text file by using a direct HDFS file writer.
The only concern with this approach is using the proper target
directories, which is documented at
http://wiki.apache.org/hadoop/FAQ#Can_I_write_create.2Fwrite-to_hdfs_files_directly_from_map.2Freduce_tasks.3F.
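The directory concern can be illustrated with a minimal sketch that uses the local filesystem (java.nio) as a stand-in for HDFS. The class name, the `_temporary/<attemptId>` layout, and the `side-` file prefix are assumptions made for illustration, not Hadoop code. In a real task the work directory would normally come from Hadoop itself (for example `FileOutputFormat.getWorkOutputPath(context)`), the stream would be opened with `FileSystem.create`, and the output committer, not the task, would promote the file.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.List;

/** Local-filesystem illustration of an attempt-private side file; not Hadoop code. */
public class SideFileSketch {

    /**
     * Writes the lines under a directory private to one task attempt, then
     * moves the finished file into the job output directory.
     */
    public static Path writeSideFile(Path outputDir, String attemptId, Iterable<String> lines)
            throws IOException {
        // Hypothetical layout: <output>/_temporary/<attemptId>/side-<attemptId>.txt
        // A failed or duplicate (speculative) attempt only ever touches its own directory.
        Path workDir = outputDir.resolve("_temporary").resolve(attemptId);
        Files.createDirectories(workDir);
        Path workFile = workDir.resolve("side-" + attemptId + ".txt");

        try (BufferedWriter out = Files.newBufferedWriter(workFile, StandardCharsets.UTF_8)) {
            for (String line : lines) {
                out.write(line);
                out.newLine();
            }
        }

        // Stand-in for the commit step: only a successful attempt reaches this point,
        // so partial files never appear in the final output directory.
        Path finalFile = outputDir.resolve(workFile.getFileName());
        Files.move(workFile, finalFile, StandardCopyOption.REPLACE_EXISTING);
        return finalFile;
    }

    public static void main(String[] args) throws IOException {
        Path outputDir = Files.createTempDirectory("job-output");
        Path written = writeSideFile(outputDir, "attempt_0001_m_000000_0",
                List.of("first record", "second record"));
        System.out.println("Side file committed to " + written);
    }
}
```

Including the attempt ID in both the directory and the file name is what keeps two attempts of the same task from colliding.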

On Thu, Jul 9, 2015 at 2:40 AM, Nishanth S <ch...@gmail.com> wrote:
> Hi All,
>
> I have a MapReduce job which reads a binary file and needs to output
> multiple Avro files and a text-format file. I was able to output multiple
> Avro files using AvroMultipleOutputs. How would I modify the job to output
> a text-format file as well, along with these Avro files? Is it possible?
>
> Thanks,
> Nishanth



-- 
Harsh J

Re: Different outputformats in avro map reduce job

Posted by Nishanth S <ch...@gmail.com>.
Thank you. I will probably try writing to HDFS directly.

-Nishanth

On Thu, Jul 9, 2015 at 8:23 AM, Marshall Bockrath-Vandegrift <
llasram@damballa.com> wrote:

> Nishanth S <ch...@gmail.com> writes:
>
> > I have a MapReduce job which reads a binary file and needs to output
> > multiple Avro files and a text-format file. I was able to output
> > multiple Avro files using AvroMultipleOutputs. How would I modify the
> > job to output a text-format file as well, along with these Avro files?
> > Is it possible?
>
> I’m not aware of a general solution to this problem in raw Java
> MapReduce.  But in Parkour (Clojure MapReduce wrapper) I’ve implemented
> a “de-multiplexing” output format which ties multiple outputs to
> arbitrary isolated job sub-configurations, allowing each output to
> specify a separate output format and any output format configuration.
> This both solves your problem and avoids the need for special purpose
> multiple-output classes like Avro’s.
>
> It should be fairly straightforward to implement the same thing in Java,
> or if you’re feeling adventurous it should be possible to use the
> Parkour de-multiplexing output configuration from a Java job.
>
> --
> Marshall Bockrath-Vandegrift <ll...@damballa.com>
> Principal Software Engineer, Damballa R&D
>
>

Re: Different outputformats in avro map reduce job

Posted by Marshall Bockrath-Vandegrift <ll...@damballa.com>.
Nishanth S <ch...@gmail.com> writes:

> I have a MapReduce job which reads a binary file and needs to output
> multiple Avro files and a text-format file. I was able to output
> multiple Avro files using AvroMultipleOutputs. How would I modify the
> job to output a text-format file as well, along with these Avro files?
> Is it possible?

I’m not aware of a general solution to this problem in raw Java
MapReduce.  But in Parkour (Clojure MapReduce wrapper) I’ve implemented
a “de-multiplexing” output format which ties multiple outputs to
arbitrary isolated job sub-configurations, allowing each output to
specify a separate output format and any output format configuration.
This both solves your problem and avoids the need for special purpose
multiple-output classes like Avro’s.

It should be fairly straightforward to implement the same thing in Java,
or if you’re feeling adventurous it should be possible to use the
Parkour de-multiplexing output configuration from a Java job.
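As a rough illustration of the de-multiplexing idea in plain Java, the sketch below routes each record to a named output that carries its own formatting. It is not Parkour's implementation and does not use the Hadoop OutputFormat API. The class, the `Sink` interface, and both formats are hypothetical stand-ins, with two simple text formats taking the place of Avro and text outputs.

```java
import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.Map;

/** Routes records to named outputs, each with its own format; illustration only. */
public class DemuxOutputSketch {

    /** Stands in for a per-output record writer with its own configuration. */
    public interface Sink {
        void write(String key, String value) throws IOException;
    }

    private final Map<String, Sink> sinks = new LinkedHashMap<>();

    /** Ties an output name to an independently configured sink. */
    public void register(String name, Sink sink) {
        sinks.put(name, sink);
    }

    /** Sends one record to the named output, rejecting unknown names. */
    public void write(String name, String key, String value) throws IOException {
        Sink sink = sinks.get(name);
        if (sink == null) {
            throw new IllegalArgumentException("Unknown output: " + name);
        }
        sink.write(key, value);
    }

    /** Hypothetical format 1: tab-separated lines. */
    public static Sink tabSeparated(Appendable dest) {
        return (key, value) -> dest.append(key).append('\t').append(value).append('\n');
    }

    /** Hypothetical format 2: key=value lines. */
    public static Sink keyEquals(Appendable dest) {
        return (key, value) -> dest.append(key).append('=').append(value).append('\n');
    }

    public static void main(String[] args) throws IOException {
        StringBuilder tabOut = new StringBuilder();
        StringBuilder kvOut = new StringBuilder();

        DemuxOutputSketch demux = new DemuxOutputSketch();
        demux.register("text", tabSeparated(tabOut));
        demux.register("props", keyEquals(kvOut));

        // The same record goes to two outputs, each applying its own format.
        demux.write("text", "id1", "hello");
        demux.write("props", "id1", "hello");

        System.out.print(tabOut);
        System.out.print(kvOut);
    }
}
```

In a real job each sink would presumably wrap a record writer obtained from a different output format, configured from its own copy of the job configuration.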

-- 
Marshall Bockrath-Vandegrift <ll...@damballa.com>
Principal Software Engineer, Damballa R&D