Posted to mapreduce-user@hadoop.apache.org by ilyal levin <ni...@gmail.com> on 2011/09/11 13:52:34 UTC

Stop chained mapreduce.

Hi,
I created a chained MapReduce program where each job writes its output as a SequenceFile.
My stopping condition is simply to check whether the last output file (a SequenceFile) is empty.
To do that I need to use SequenceFile.Reader, and for it to read the data I need the path of the
output file. The problem is that I don't know the name of the file; it usually depends on the
number of the reducer. What can I do to solve this?
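
For context, roughly what I have in mind for the driver is sketched below (configureJob and
isOutputEmpty are placeholders, not my actual code):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;

public abstract class ChainDriver {

  // Placeholders: the per-step job setup and the emptiness check this thread is about.
  protected abstract Job configureJob(Configuration conf, Path input, Path output)
      throws Exception;

  protected abstract boolean isOutputEmpty(Configuration conf, Path output)
      throws Exception;

  public void runChain(Configuration conf) throws Exception {
    Path input = new Path("/data/chain/input");   // hypothetical starting input
    int iteration = 0;

    while (true) {
      Path output = new Path("/data/chain/output-" + iteration);

      // configureJob would set the mapper, reducer, SequenceFileOutputFormat,
      // and the input/output paths for this step.
      Job job = configureJob(conf, input, output);
      if (!job.waitForCompletion(true)) {
        throw new RuntimeException("Chain failed at iteration " + iteration);
      }

      // Stopping condition: stop once the latest SequenceFile output is empty.
      if (isOutputEmpty(conf, output)) {
        break;
      }

      input = output;   // this job's output feeds the next job
      iteration++;
    }
  }
}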

Thanks.

Re: Stop chained mapreduce.

Posted by Alejandro Abdelnur <tu...@cloudera.com>.
Ilyal,

The MR output file names follow the pattern part-#### and you'll have as many files as your
job had reducers.

Since you know the output directory, you can do an fs.listStatus() on it and check all the
part-* files.
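
Not from this thread, but here is a minimal sketch of that check in Java, assuming the output
SequenceFiles use Writable keys (the names OutputChecker and isOutputEmpty are illustrative):

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.PathFilter;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.util.ReflectionUtils;

public class OutputChecker {

  // Returns true only if no part-* SequenceFile under outputDir contains a record.
  public static boolean isOutputEmpty(Configuration conf, Path outputDir) throws IOException {
    FileSystem fs = outputDir.getFileSystem(conf);

    // Look only at the reducer output files, skipping _SUCCESS, _logs, etc.
    FileStatus[] parts = fs.listStatus(outputDir, new PathFilter() {
      public boolean accept(Path p) {
        return p.getName().startsWith("part-");
      }
    });

    for (FileStatus part : parts) {
      SequenceFile.Reader reader = new SequenceFile.Reader(fs, part.getPath(), conf);
      try {
        // Instantiate a key of whatever type the file was written with and
        // try to read a single record from it.
        Writable key =
            (Writable) ReflectionUtils.newInstance(reader.getKeyClass(), conf);
        if (reader.next(key)) {
          return false;   // found at least one record, so the output is not empty
        }
      } finally {
        reader.close();
      }
    }
    return true;          // every part file was empty
  }
}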

Hope this helps.

Thanks.

Alejandro
