Posted to mapreduce-user@hadoop.apache.org by Nishanth S <ch...@gmail.com> on 2015/07/08 23:10:06 UTC
Different outputformats in avro map reduce job
Hi All,
I have a map reduce job which reads a binary file and needs to output
multiple Avro files and a text-format file. I was able to output multiple
Avro files using AvroMultipleOutputs. How would I modify the job to output
text format as well, along with these Avro files? Is it possible?
Thanks,
Nishanth
Re: Different outputformats in avro map reduce job
Posted by Harsh J <ha...@cloudera.com>.
You can write out the text file by using a direct HDFS file writer.
Your only concern with this approach would be the use of proper
target directories, which is documented at
http://wiki.apache.org/hadoop/FAQ#Can_I_write_create.2Fwrite-to_hdfs_files_directly_from_map.2Freduce_tasks.3F.
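A minimal sketch of why the target directory matters, using only the Java standard library on the local filesystem rather than HDFS. Each task attempt writes its text file into an attempt-private directory, and only a successful attempt has its files promoted into the job output directory, so a failed or retried attempt cannot leave partial files behind. The class name SideFileAttempt, the "_temporary" layout, and the file names are hypothetical and chosen for illustration only. In a real job the attempt-private directory would presumably be the path returned by FileOutputFormat.getWorkOutputPath(context), the stream would come from FileSystem.create, and the promotion would be handled by the job's output committer rather than by hand.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

/** Local-filesystem stand-in for one task attempt that writes text side files. */
final class SideFileAttempt {
    private final Path jobOutputDir; // shared by all attempts of the job
    private final Path attemptDir;   // private to this attempt

    SideFileAttempt(Path jobOutputDir, String attemptId) {
        this.jobOutputDir = jobOutputDir;
        // Hypothetical layout; Hadoop chooses its own temporary directory names.
        this.attemptDir = jobOutputDir.resolve("_temporary").resolve(attemptId);
    }

    /** Writes the lines into the attempt-private directory, never the final one. */
    void writeText(String fileName, List<String> lines) {
        try {
            Files.createDirectories(attemptDir);
            try (BufferedWriter w = Files.newBufferedWriter(
                    attemptDir.resolve(fileName), StandardCharsets.UTF_8)) {
                for (String line : lines) {
                    w.write(line);
                    w.write('\n');
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Promotes the files of a successful attempt into the job output directory. */
    void commit() {
        try {
            for (Path file : listAttemptFiles()) {
                Files.move(file, jobOutputDir.resolve(file.getFileName()));
            }
            Files.deleteIfExists(attemptDir);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Discards everything a failed or killed attempt has written. */
    void abort() {
        try {
            for (Path file : listAttemptFiles()) {
                Files.delete(file);
            }
            Files.deleteIfExists(attemptDir);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private List<Path> listAttemptFiles() throws IOException {
        List<Path> files = new ArrayList<>();
        if (!Files.isDirectory(attemptDir)) {
            return files;
        }
        // Collect first so the directory is not modified while it is being listed.
        try (Stream<Path> s = Files.list(attemptDir)) {
            s.forEach(files::add);
        }
        return files;
    }

    /** A failed attempt is discarded, a retry commits; returns the final file's lines. */
    static List<String> demo() {
        try {
            Path out = Files.createTempDirectory("job-output");
            SideFileAttempt failed = new SideFileAttempt(out, "attempt_0");
            failed.writeText("report.txt", Arrays.asList("partial"));
            failed.abort();

            SideFileAttempt retry = new SideFileAttempt(out, "attempt_1");
            retry.writeText("report.txt", Arrays.asList("record-1", "record-2"));
            retry.commit();
            return Files.readAllLines(out.resolve("report.txt"), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Writing straight into the final output directory instead would let two attempts of the same task (a retry, or a speculative duplicate) collide on the same file name, which is the situation the FAQ entry warns about.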
On Thu, Jul 9, 2015 at 2:40 AM, Nishanth S <ch...@gmail.com> wrote:
> Hi All,
>
> I have a map reduce job which reads a binary file and needs to output
> multiple Avro files and a text-format file. I was able to output multiple
> Avro files using AvroMultipleOutputs. How would I modify the job to output
> text format as well, along with these Avro files? Is it possible?
>
> Thanks,
> Nishanth
--
Harsh J
Re: Different outputformats in avro map reduce job
Posted by Nishanth S <ch...@gmail.com>.
Thank you. I will probably try writing to HDFS directly.
-Nishanth
On Thu, Jul 9, 2015 at 8:23 AM, Marshall Bockrath-Vandegrift <
llasram@damballa.com> wrote:
> Nishanth S <ch...@gmail.com> writes:
>
> > I have a map reduce job which reads a binary file and needs to output
> > multiple Avro files and a text-format file. I was able to output
> > multiple Avro files using AvroMultipleOutputs. How would I modify the
> > job to output text format as well, along with these Avro files? Is it
> > possible?
>
> I’m not aware of a general solution to this problem in raw Java
> MapReduce. But in Parkour (Clojure MapReduce wrapper) I’ve implemented
> a “de-multiplexing” output format which ties multiple outputs to
> arbitrary isolated job sub-configurations, allowing each output to
> specify a separate output format and any output format configuration.
> This both solves your problem and avoids the need for special purpose
> multiple-output classes like Avro’s.
>
> It should be fairly straightforward to implement the same thing in Java,
> or if you’re feeling adventurous it should be possible to use the
> Parkour de-multiplexing output configuration from a Java job.
>
> --
> Marshall Bockrath-Vandegrift <ll...@damballa.com>
> Principal Software Engineer, Damballa R&D
>
>
Re: Different outputformats in avro map reduce job
Posted by Marshall Bockrath-Vandegrift <ll...@damballa.com>.
Nishanth S <ch...@gmail.com> writes:
> I have a map reduce job which reads a binary file and needs to output
> multiple Avro files and a text-format file. I was able to output
> multiple Avro files using AvroMultipleOutputs. How would I modify the
> job to output text format as well, along with these Avro files? Is it
> possible?
I’m not aware of a general solution to this problem in raw Java
MapReduce. But in Parkour (Clojure MapReduce wrapper) I’ve implemented
a “de-multiplexing” output format which ties multiple outputs to
arbitrary isolated job sub-configurations, allowing each output to
specify a separate output format and any output format configuration.
This both solves your problem and avoids the need for special purpose
multiple-output classes like Avro’s.
It should be fairly straightforward to implement the same thing in Java,
or if you’re feeling adventurous it should be possible to use the
Parkour de-multiplexing output configuration from a Java job.
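To make the de-multiplexing idea concrete, here is a toy sketch in plain Java. It is not Parkour's implementation and not a Hadoop OutputFormat; the names RecordFormat and DemuxWriter are hypothetical, and the in-memory StringBuilder sinks stand in for real files. The point it illustrates is only the shape: each named output carries its own isolated format, and the writer routes every record to the channel it was addressed to. A real Java version would presumably have each channel wrap its own OutputFormat and RecordWriter, built from a per-output copy of the job configuration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** How one named output turns a key/value pair into a line of text. */
interface RecordFormat {
    String encode(String key, String value);
}

/** Routes records to named outputs, each with its own independent format. */
final class DemuxWriter {
    /** One named output: its private format plus its private sink. */
    private static final class Channel {
        final RecordFormat format;
        final StringBuilder sink = new StringBuilder();

        Channel(RecordFormat format) {
            this.format = format;
        }
    }

    private final Map<String, Channel> channels = new LinkedHashMap<>();

    /** Registers a named output together with the format only it will use. */
    void addOutput(String name, RecordFormat format) {
        if (channels.containsKey(name)) {
            throw new IllegalArgumentException("duplicate output: " + name);
        }
        channels.put(name, new Channel(format));
    }

    /** Encodes the record with the named output's format and appends it there. */
    void write(String name, String key, String value) {
        Channel c = channel(name);
        c.sink.append(c.format.encode(key, value)).append('\n');
    }

    /** Returns everything written to the named output so far. */
    String contents(String name) {
        return channel(name).sink.toString();
    }

    private Channel channel(String name) {
        Channel c = channels.get(name);
        if (c == null) {
            throw new IllegalArgumentException("unknown output: " + name);
        }
        return c;
    }

    public static void main(String[] args) {
        DemuxWriter out = new DemuxWriter();
        out.addOutput("text", (k, v) -> k + "\t" + v); // tab-separated lines
        out.addOutput("csv", (k, v) -> k + "," + v);   // comma-separated lines
        out.write("text", "id-1", "ok");
        out.write("csv", "id-1", "ok");
        System.out.print(out.contents("text"));
        System.out.print(out.contents("csv"));
    }
}
```

Because no channel knows about any other, adding a new kind of output means registering one more name and format rather than writing another special-purpose multiple-output class.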
--
Marshall Bockrath-Vandegrift <ll...@damballa.com>
Principal Software Engineer, Damballa R&D