Posted to common-user@hadoop.apache.org by Aaron Kimball <aa...@cloudera.com> on 2009/01/23 10:29:27 UTC
_temporary directory getting deleted mid-job?
I saw some puzzling behavior tonight when running a MapReduce program I
wrote.
It would perform the mapping just fine, and would begin to shuffle. It got
to 33% complete reduce (end of shuffle) and then the task failed, claiming
that <output_dir>/_temporary had been deleted.
I didn't touch HDFS while this was going on.
I reran the job several more times, and the same failure occurred twice more.
Puzzlingly, I was doing bin/hadoop fs -ls <output_dir> periodically in
another window. The _temporary directory got created just fine, but at some
point after shuffling began, it was removed.
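The periodic listing described above could be scripted so that the moment of removal is timestamped. This is only a minimal sketch: watch_temporary, OUTPUT_DIR, LIST_CMD and INTERVAL are hypothetical names invented for illustration, not part of Hadoop, and the path is a placeholder to fill in. It assumes bin/hadoop fs -ls is run from the Hadoop install directory, as in the manual check.

```shell
#!/bin/sh
# Hypothetical helper: poll the job output directory and report when
# _temporary stops appearing in the listing.
# OUTPUT_DIR is a placeholder; LIST_CMD defaults to the same listing
# command that was being run by hand.
OUTPUT_DIR="${OUTPUT_DIR:-/path/to/output_dir}"
LIST_CMD="${LIST_CMD:-bin/hadoop fs -ls}"
INTERVAL="${INTERVAL:-5}"

watch_temporary() {
    # Keep polling while the listing still mentions _temporary.
    # Note: the loop also ends at once if the directory was never
    # created or the listing command itself fails.
    while $LIST_CMD "$OUTPUT_DIR" 2>/dev/null | grep -q '_temporary'; do
        sleep "$INTERVAL"
    done
    echo "$(date '+%Y-%m-%d %H:%M:%S') _temporary no longer listed under $OUTPUT_DIR"
}
```

Calling watch_temporary once the job has created its output directory would print a timestamp that can be lined up against the JobTracker and TaskTracker logs.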
I tried to see if I could manually race this, so I did a mkdir _temporary,
and the job proceeded just fine. Even more bizarre, the removal of the
_temporary directory did not occur on any subsequent MR jobs (executions of
the same, unmodified program). So I can't reproduce the bug.
This is on 0.18.2.
It went away, so I'm not *too* concerned, but I'd rather not deal with
heisenbugs if at all possible.
So: has anyone seen this behavior? Have you figured out how to reproduce it,
or even better, prevent it?
Thanks,
- Aaron