Posted to user@hadoop.apache.org by "S. Zhou" <my...@yahoo.com> on 2013/10/28 20:11:03 UTC

stop generating these "part-XXXX" empty files when using MultipleOutputs in mapreduce job

I use MultipleOutputs, so my output data is no longer written to the "part-XXXX" files. Those files are still generated, though (they are just empty). Is it possible to stop them from being created when the MR job runs? (BTW, my MR job is map-only.) Thanks

Senqiang

Re: stop generating these "part-XXXX" empty files when using MultipleOutputs in mapreduce job

Posted by Niels Basjes <Ni...@basjes.nl>.
Use the LazyOutputFormat.

Have a look at this:
http://hadoop.apache.org/docs/current/api/org/apache/hadoop/mapreduce/lib/output/LazyOutputFormat.html
and
http://stackoverflow.com/questions/6137139/how-to-save-only-non-empty-reducers-output-in-hdfs

Niels Basjes
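
For reference, the suggestion above could be wired into a map-only job roughly as follows. This is a minimal sketch, not code from the thread: the class names LazyOutputDriver and SplitMapper, the named output "records", and the use of args[0]/args[1] as input and output paths are all hypothetical, and it assumes the new (org.apache.hadoop.mapreduce) API with TextOutputFormat as the underlying format. The relevant line is LazyOutputFormat.setOutputClass: it wraps the real output format so that the default part file is only created on the first write through context.write, which this mapper never calls.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.LazyOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.MultipleOutputs;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

// Hypothetical driver: a map-only job that writes only through MultipleOutputs.
public class LazyOutputDriver {

    // Hypothetical mapper: every record goes to the named output "records".
    public static class SplitMapper
            extends Mapper<LongWritable, Text, NullWritable, Text> {

        private MultipleOutputs<NullWritable, Text> outputs;

        @Override
        protected void setup(Context context) {
            outputs = new MultipleOutputs<>(context);
        }

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            // No context.write() here, so the default output stays untouched.
            outputs.write("records", NullWritable.get(), value);
        }

        @Override
        protected void cleanup(Context context)
                throws IOException, InterruptedException {
            // MultipleOutputs must be closed, or its files may be incomplete.
            outputs.close();
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "lazy-output-example");
        job.setJarByClass(LazyOutputDriver.class);
        job.setMapperClass(SplitMapper.class);
        job.setNumReduceTasks(0); // map-only, as in the question
        job.setOutputKeyClass(NullWritable.class);
        job.setOutputValueClass(Text.class);

        FileInputFormat.addInputPath(job, new Path(args[0]));   // assumed input path
        FileOutputFormat.setOutputPath(job, new Path(args[1])); // assumed output path

        MultipleOutputs.addNamedOutput(job, "records",
                TextOutputFormat.class, NullWritable.class, Text.class);

        // Use this instead of job.setOutputFormatClass(TextOutputFormat.class):
        // the default part file is created lazily, on the first record written.
        LazyOutputFormat.setOutputClass(job, TextOutputFormat.class);

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

With this setup the output directory should contain only the files produced by the named output, since no task ever writes a record to the default output.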


On Mon, Oct 28, 2013 at 8:11 PM, S. Zhou <my...@yahoo.com> wrote:

> I use MultipleOutputs so the output data are no longer stored in files
> "part-XXX". But they are still generated (though empty). Is it possible to
> stop generating these files when running MR job? (BTW, my MR job only has
> mapper). Thanks
>
> Senqiang
>
>


-- 
Best regards / Met vriendelijke groeten,

Niels Basjes


