Posted to mapreduce-user@hadoop.apache.org by Ivan Leonardi <iv...@gmail.com> on 2010/11/27 15:45:06 UTC
Non-deterministic execution: Shuffle Error, Exceeded MAX_FAILED_UNIQUE_FETCHES
Hi,
I'm trying to split a single file through a map-reduce job. My input is
a sequence file where each entry represents a graph node together with
its neighbors, and I would like to split it into several files.
A typical invocation is, for example:
hadoop jar bin/kshell1.jar jm.job.GraphPartitioner graphStructure
graphPartitions 5 93 1
where
- graphStructure is the input folder, containing just one file
- graphPartitions is the output folder
- 5 is the number of partitions
- 93 is the number of graph nodes
- 1 is a flag for a "range mode" (i.e. nodes are split into the ranges
0-18, 19-37, 38-55, 56-74 and 75-92)
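For reference, those range boundaries are consistent with assigning each node via integer arithmetic as partition = node * k / n. The sketch below is hypothetical (class and method names are mine, not the actual GraphPartitioner code) and just reproduces the listed ranges for n = 93, k = 5:

```java
// Hypothetical sketch, not the poster's GraphPartitioner: one integer-division
// scheme that yields the ranges 0-18, 19-37, 38-55, 56-74, 75-92 for n=93, k=5.
public class RangePartitionSketch {

    /** Returns the partition index (0..k-1) for a node ID in 0..n-1. */
    static int partitionFor(int node, int numPartitions, int numNodes) {
        return (int) ((long) node * numPartitions / numNodes);
    }

    public static void main(String[] args) {
        int n = 93, k = 5;
        int start = 0;
        // Walk the node IDs and print each contiguous run that shares a partition.
        for (int node = 1; node <= n; node++) {
            if (node == n || partitionFor(node, k, n) != partitionFor(start, k, n)) {
                System.out.println("partition " + partitionFor(start, k, n)
                        + ": nodes " + start + "-" + (node - 1));
                start = node;
            }
        }
        // prints partition 0: nodes 0-18 ... partition 4: nodes 75-92
    }
}
```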
Running the *exact same command* twice in a row does not produce the
same behavior. How is that possible? (log attached)
I'm running Hadoop in distributed mode on 5 machines with no special
configuration.
Thank you all!
Ivan