Posted to mapreduce-user@hadoop.apache.org by Ferdy Galema <fe...@kalooga.com> on 2011/01/10 17:37:56 UTC

how to read map_0.out

I stopped a job that was running very slowly while it was in its 
reduce part (phase: reduce). However, I still want its output and I 
cannot run the job again, so I have to work with the intermediate files.

I have a 30GB file map_0.out (found in the reducer jobcache) and I want to 
read its contents using an InputFormat. It is not a SequenceFile; I 
already tried that. How do I read this file? I presume it is some 
sort of sorted map of Writable keys with corresponding Writable values. 
(After all, this file was being used directly by the reduce function.)

Any help will be greatly appreciated.

Re: how to read map_0.out

Posted by Ferdy Galema <fe...@kalooga.com>.

Thanks. I successfully created an InputFormat that uses an IFile.Reader. 
The fact that the files are concatenated did not seem to matter much; I 
could use a single IFile.Reader to read the entire map_0.out file.

Ferdy.

Owen O'Malley wrote:
> The intermediate files are called IFiles. The format is trivial and 
> you can read the code to see it. The only tricky bit is that you 
> effectively have N IFiles concatenated together (one per reduce).
>
> -- Owen

Re: how to read map_0.out

Posted by Owen O'Malley <om...@apache.org>.

The intermediate files are called IFiles. The format is trivial and you can
read the code to see it. The only tricky bit is that you effectively have N
IFiles concatenated together (one per reduce).

-- Owen
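
For readers who cannot use IFile.Reader directly, the layout Owen calls 
trivial can be sketched in plain Java. This is an illustration under 
assumptions, not Hadoop's own code: it assumes an uncompressed file in which 
each record is a key length and a value length (both in the variable-length 
int encoding of WritableUtils.writeVInt) followed by the raw key and value 
bytes, and in which each segment ends with the length pair (-1, -1). Whether 
a checksum trailer follows each segment, and how long it is, depends on the 
Hadoop version, so the trailer size is a parameter here. The class and 
method names (IFileSketch, readSegment, readAll) are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;
import java.nio.charset.StandardCharsets;
import java.util.function.BiConsumer;

/** Hypothetical helper, not part of Hadoop: walks uncompressed IFile-style segments. */
class IFileSketch {
    static final int EOF_MARKER = -1; // assumed end-of-segment marker

    /** Decodes an int stored in the variable-length form of WritableUtils.writeVInt. */
    static int readVInt(DataInputStream in) throws IOException {
        byte first = in.readByte();
        if (first >= -112) {
            return first; // values -112..127 occupy a single byte
        }
        boolean negative = first < -120;
        int extraBytes = negative ? -(first + 120) : -(first + 112);
        long value = 0;
        for (int i = 0; i < extraBytes; i++) {
            value = (value << 8) | (in.readByte() & 0xFF);
        }
        return (int) (negative ? ~value : value);
    }

    /** Passes raw key/value bytes to the sink until the (-1, -1) marker is read. */
    static void readSegment(DataInputStream in, BiConsumer<byte[], byte[]> sink)
            throws IOException {
        while (true) {
            int keyLength = readVInt(in);
            int valueLength = readVInt(in);
            if (keyLength == EOF_MARKER && valueLength == EOF_MARKER) {
                return;
            }
            if (keyLength < 0 || valueLength < 0) {
                throw new IOException("Unexpected record lengths: "
                        + keyLength + ", " + valueLength);
            }
            byte[] key = new byte[keyLength];
            byte[] value = new byte[valueLength];
            in.readFully(key);
            in.readFully(value);
            sink.accept(key, value);
        }
    }

    /**
     * Reads segment after segment until the stream ends. trailerBytes is the
     * assumed number of bytes (for example a checksum) following each end marker.
     */
    static void readAll(InputStream raw, int trailerBytes,
                        BiConsumer<byte[], byte[]> sink) throws IOException {
        PushbackInputStream pushback = new PushbackInputStream(raw);
        DataInputStream in = new DataInputStream(pushback);
        int next;
        while ((next = pushback.read()) != -1) {
            pushback.unread(next);                // peeked only to detect end of stream
            readSegment(in, sink);
            in.readFully(new byte[trailerBytes]); // skip the assumed trailer
        }
    }

    public static void main(String[] args) throws IOException {
        // Hand-built sample: key "k", value "vw", end marker, 4 placeholder trailer bytes.
        byte[] sample = {1, 2, 'k', 'v', 'w', -1, -1, 0, 0, 0, 0};
        readAll(new ByteArrayInputStream(sample), 4, (k, v) ->
                System.out.println(new String(k, StandardCharsets.US_ASCII)
                        + " -> " + new String(v, StandardCharsets.US_ASCII)));
    }
}
```

The key and value bytes handed to the sink are still serialized; turning 
them back into objects means calling readFields on instances of the job's 
own Writable key and value classes. For a file as large as the 30GB one 
above, wrapping the FileInputStream in a BufferedInputStream before passing 
it to readAll would likely help, since the sketch reads lengths byte by byte.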