Posted to common-user@hadoop.apache.org by Ian Soboroff <ia...@nist.gov> on 2009/06/04 03:43:45 UTC

Task files in _temporary not getting promoted out

Ok, help.  I am trying to create local task outputs in my reduce job,  
and they get created, then go poof when the job's done.

My first take was to use FileOutputFormat.getWorkOutputPath, and  
create directories in there for my outputs (which are Lucene  
indexes).  Exasperated, I then wrote a small OutputFormat/RecordWriter  
pair to write the indexes.  In each case, I can see directories being  
created in attempt_foo/_temporary, but when the task is over they're  
gone.

I've stared at TextOutputFormat and I can't figure out why its files
survive and mine don't.  Help!  Again, this is 0.18.3.

Thanks,
Ian


Re: Task files in _temporary not getting promoted out

Posted by Ian Soboroff <ia...@nist.gov>.
No, they were completing successfully.

In the end, I got it to work by manually making a local path (via
JobConf), and then moving the output to HDFS in close().
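
For readers who want to see the shape of that workaround, here is a minimal sketch. It is not the code from this thread: the class name LocalThenPromoteWriter is hypothetical, and plain java.nio.file calls stand in for both the JobConf-provided local path and the HDFS destination. A real job would presumably obtain the scratch directory from JobConf and do the final copy through Hadoop's FileSystem API.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical writer: builds output in local scratch space, moves it to its final home on close(). */
final class LocalThenPromoteWriter implements AutoCloseable {
    private final Path localDir; // assumption: stand-in for a local path obtained via JobConf
    private final Path finalDir; // assumption: stand-in for the destination directory on HDFS

    LocalThenPromoteWriter(Path localDir, Path finalDir) throws IOException {
        this.localDir = Files.createDirectories(localDir);
        this.finalDir = finalDir;
    }

    /** Writes one file under the local scratch directory (a real job would write Lucene index files here). */
    void write(String name, String contents) throws IOException {
        Path target = localDir.resolve(name);
        Files.createDirectories(target.getParent());
        Files.write(target, contents.getBytes(StandardCharsets.UTF_8));
    }

    /** Copies the finished tree to finalDir, then deletes the local copy. */
    @Override
    public void close() throws IOException {
        List<Path> sources;
        try (Stream<Path> walk = Files.walk(localDir)) {
            // Files.walk lists a directory before its contents, so parents are created first.
            sources = walk.collect(Collectors.toList());
        }
        for (Path src : sources) {
            Path dst = finalDir.resolve(localDir.relativize(src).toString());
            if (Files.isDirectory(src)) {
                Files.createDirectories(dst);
            } else {
                Files.copy(src, dst);
            }
        }
        // Delete children before parents by visiting paths in reverse sorted order.
        sources.sort(Comparator.reverseOrder());
        for (Path src : sources) {
            Files.delete(src);
        }
    }
}
```

Nothing appears at the destination until close() runs, so the final location never holds a half-written index.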

Ian

jason hadoop <ja...@gmail.com> writes:

> Are your tasks failing or completing successfully? Failed tasks have the
> output directory wiped; only successfully completed tasks have the files
> moved up.
>
> I don't recall if the FileOutputCommitter class appeared in 0.18.

Re: Task files in _temporary not getting promoted out

Posted by jason hadoop <ja...@gmail.com>.
Are your tasks failing or completing successfully? Failed tasks have the
output directory wiped; only successfully completed tasks have the files
moved up.
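
The rule described above can be sketched with plain java.nio.file calls. This is only an illustration of the "move up on success, wipe on failure" behaviour, not Hadoop's actual implementation; the class TaskOutputPromotion and its methods are hypothetical names.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical illustration of promoting or discarding a task attempt's output. */
final class TaskOutputPromotion {

    /** On success, moves each top-level entry of attemptDir into outputDir; in both cases removes attemptDir. */
    static void finish(Path attemptDir, Path outputDir, boolean succeeded) throws IOException {
        if (succeeded) {
            Files.createDirectories(outputDir);
            List<Path> entries;
            try (Stream<Path> listing = Files.list(attemptDir)) {
                entries = listing.collect(Collectors.toList());
            }
            for (Path entry : entries) {
                // Fails if the target already exists; this sketch does not overwrite.
                Files.move(entry, outputDir.resolve(entry.getFileName()));
            }
        }
        deleteTree(attemptDir); // whatever is left (everything, for a failed task) is wiped
    }

    /** Deletes root and everything under it, children before parents. */
    static void deleteTree(Path root) throws IOException {
        List<Path> paths;
        try (Stream<Path> walk = Files.walk(root)) {
            paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path p : paths) {
            Files.delete(p);
        }
    }
}
```

Calling finish(attemptDir, outputDir, true) leaves the attempt's files directly under outputDir, while passing false leaves outputDir untouched and removes the attempt directory.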

I don't recall if the FileOutputCommitter class appeared in 0.18.



