Posted to mapreduce-dev@hadoop.apache.org by "YangLai (JIRA)" <ji...@apache.org> on 2009/12/05 14:07:20 UTC
[jira] Created: (MAPREDUCE-1269) Failed on write sequence files in mapper.
Failed on write sequence files in mapper.
-----------------------------------------
Key: MAPREDUCE-1269
URL: https://issues.apache.org/jira/browse/MAPREDUCE-1269
Project: Hadoop Map/Reduce
Issue Type: Bug
Affects Versions: 0.20.1
Environment: Hadoop 0.20.1
Compiled by oom on Tue Sep 1 20:55:56 UTC 2009
Linux version 2.6.18-128.el5 (mockbuild@builder10.centos.org) (gcc version 4.1.2 20080704 (Red Hat 4.1.2-44)) #1 SMP Wed Jan 21 10:41:14 EST 2009
Reporter: YangLai
Priority: Critical
Because my job does not need the sort phase, I want to write only the values into sequence files, one file per key. So I keep a HashMap in the mapper:
private HashMap<String, Writer> hm;
and I look up the matching org.apache.hadoop.io.SequenceFile.Writer in the HashMap:
Writer seqWriter = hm.get(skey);
if (seqWriter == null) {
    // No writer for this key yet: open a new sequence file named after the key.
    try {
        seqWriter = new SequenceFile.Writer(new JobClient(job).getFs(),
                job, new Path(pPathOut, skey), VLongWritable.class, ByteWritable.class);
    } catch (IOException e) {
        e.printStackTrace();
    }
    if (seqWriter != null) {
        hm.put(skey, seqWriter);
    } else {
        // The writer could not be created, so skip this record.
        return;
    }
}
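The lookup-or-create pattern above can be sketched in isolation. This is only a minimal illustration in plain Java, not the code from my job: KeyedWriters and WriterFactory are hypothetical names, and java.io.Writer stands in for SequenceFile.Writer so that the example runs without Hadoop. It differs from the snippet above in two ways: an IOException from opening a writer is propagated rather than printed, and all cached writers are closed in one place.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper: caches one writer per key and closes them all at the end. */
class KeyedWriters implements Closeable {

    /** Hypothetical factory; in a real job this is where a SequenceFile.Writer would be opened. */
    interface WriterFactory {
        Writer open(String key) throws IOException;
    }

    private final Map<String, Writer> writers = new HashMap<>();
    private final WriterFactory factory;

    KeyedWriters(WriterFactory factory) {
        this.factory = factory;
    }

    /** Returns the cached writer for the key, opening it on first use. */
    Writer get(String key) throws IOException {
        Writer w = writers.get(key);
        if (w == null) {
            // Let the IOException propagate so the caller sees the failure.
            w = factory.open(key);
            writers.put(key, w);
        }
        return w;
    }

    /** Number of writers currently held open. */
    int openCount() {
        return writers.size();
    }

    /** Closes every cached writer; rethrows the first failure after trying all of them. */
    @Override
    public void close() throws IOException {
        IOException first = null;
        for (Writer w : writers.values()) {
            try {
                w.close();
            } catch (IOException e) {
                if (first == null) {
                    first = e;
                }
            }
        }
        writers.clear();
        if (first != null) {
            throw first;
        }
    }
}
```

In a mapper written against the old org.apache.hadoop.mapred API, the natural place to call such a close() would presumably be the mapper's own close() method; the snippet above does not show where the writers are closed.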
The file names are obtained from job.get("mapred.task.id"), which ensures that no duplicate file names exist.
The system always outputs:
java.io.IOException: Could not obtain block: blk_-5398274085876111743_1021 file=/YangLai/ranNum1GB/part-00015
at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.chooseDataNode(DFSClient.java:1787)
at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:1615)
at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1742)
at java.io.DataInputStream.readFully(Unknown Source)
at java.io.DataInputStream.readFully(Unknown Source)
at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1450)
at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1428)
at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1417)
at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1412)
at org.apache.hadoop.mapred.SequenceFileRecordReader.<init>(SequenceFileRecordReader.java:43)
at org.apache.hadoop.mapred.SequenceFileInputFormat.getRecordReader(SequenceFileInputFormat.java:63)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:338)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
at org.apache.hadoop.mapred.Child.main(Child.java:170)
In fact, each mapper writes only 16 sequence files, which should not overload the Hadoop system.