Posted to user@hadoop.apache.org by "S. Zhou" <my...@yahoo.com> on 2013/10/28 20:11:03 UTC
stop generating these "part-XXXX" empty files when using MultipleOutputs in mapreduce job
I use MultipleOutputs, so my output data is no longer written to the "part-XXXX" files. Those files are still generated, though they are empty. Is it possible to stop generating them when running an MR job? (BTW, my MR job only has a mapper.) Thanks
Senqiang
Re: stop generating these "part-XXXX" empty files when using
MultipleOutputs in mapreduce job
Posted by Niels Basjes <Ni...@basjes.nl>.
Use the LazyOutputFormat.
Have a look at this:
http://hadoop.apache.org/docs/current/api/org/apache/hadoop/mapreduce/lib/output/LazyOutputFormat.html
and
http://stackoverflow.com/questions/6137139/how-to-save-only-non-empty-reducers-output-in-hdfs
Niels Basjes
On Mon, Oct 28, 2013 at 8:11 PM, S. Zhou <my...@yahoo.com> wrote:
> I use MultipleOutputs so the output data are no longer stored in files
> "part-XXX". But they are still generated (though empty). Is it possible to
> stop generating these files when running MR job? (BTW, my MR job only has
> mapper). Thanks
>
> Senqiang
>
>
--
Best regards / Met vriendelijke groeten,
Niels Basjes
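
For readers landing on this thread, the LazyOutputFormat suggestion above could be sketched as the following job configuration fragment. This is only an illustration, not the poster's actual driver: the class name LazyOutputDriver, the job name, and the use of args[0]/args[1] as input and output paths are assumptions, Job.getInstance assumes a Hadoop 2.x or later client (older releases construct the Job directly), and the mapper and MultipleOutputs setup are left out.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.LazyOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

// Hypothetical driver; names and paths are placeholders for illustration.
public class LazyOutputDriver {
    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "lazy-output-example");
        job.setJarByClass(LazyOutputDriver.class);

        // Map-only job, as in the original question.
        // The mapper class and any MultipleOutputs named outputs would be
        // configured here; they are omitted from this sketch.
        job.setNumReduceTasks(0);

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        // Instead of job.setOutputFormatClass(TextOutputFormat.class),
        // wrap the real output format in LazyOutputFormat. The default
        // part-* file is then created only when a record is first written
        // through context.write(), so tasks that write exclusively through
        // MultipleOutputs should leave no empty part-* files behind.
        LazyOutputFormat.setOutputFormatClass(job, TextOutputFormat.class);

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

The only line that matters here is the LazyOutputFormat.setOutputFormatClass call; TextOutputFormat stands in for whatever output format the job already uses.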