Posted to mapreduce-user@hadoop.apache.org by Pedro Costa <ps...@gmail.com> on 2010/11/12 22:42:51 UTC

Problems in running teragen example

Hi,

I'm trying to run teragen to generate the input data for the terasort
example, but the job fails with the errors below. (If I understand the
teragen arguments correctly, 10000000000 is the number of 100-byte rows,
so this is roughly a 1 TB run.) Does anyone have a clue about what's
going wrong? My mapred-site.xml is included at the end.

[code]
root@s11:~/orig/hadoop-0.20.2-dev# bin/hadoop jar hadoop-*-examples.jar teragen 10000000000 in-dir
Generating 10000000000 using 2 maps with step of 5000000000
10/11/12 21:25:42 INFO mapred.JobClient: Running job: job_201011122120_0001
10/11/12 21:25:43 INFO mapred.JobClient:  map 0% reduce 0%
10/11/12 21:28:44 INFO mapred.JobClient:  map 1% reduce 0%
10/11/12 21:31:26 INFO mapred.JobClient:  map 2% reduce 0%
10/11/12 21:34:17 INFO mapred.JobClient:  map 3% reduce 0%
10/11/12 21:36:37 INFO mapred.JobClient: Task Id : attempt_201011122120_0001_m_000001_0, Status : FAILED
java.io.IOException: All datanodes 127.0.0.1:50010 are bad. Aborting...
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2523)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$1600(DFSClient.java:2076)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2239)

10/11/12 21:36:37 WARN mapred.JobClient: Error reading task outputhttp://s11.wan:50060/tasklog?plaintext=true&taskid=attempt_201011122120_0001_m_000001_0&filter=stdout
10/11/12 21:36:37 WARN mapred.JobClient: Error reading task outputhttp://s11.wan:50060/tasklog?plaintext=true&taskid=attempt_201011122120_0001_m_000001_0&filter=stderr
10/11/12 21:36:38 INFO mapred.JobClient:  map 1% reduce 0%
10/11/12 21:36:40 INFO mapred.JobClient: Task Id : attempt_201011122120_0001_m_000000_0, Status : FAILED
java.io.IOException: All datanodes 127.0.0.1:50010 are bad. Aborting...
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2523)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$1600(DFSClient.java:2076)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2239)

10/11/12 21:36:41 INFO mapred.JobClient:  map 0% reduce 0%
10/11/12 21:36:43 INFO mapred.JobClient: Task Id : attempt_201011122120_0001_m_000001_1, Status : FAILED
org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /user/root/in-dir/_temporary/_attempt_201011122120_0001_m_000001_1/part-00001 could only be replicated to 0 nodes, instead of 1
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1267)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:422)
        at sun.reflect.GeneratedMethodAccessor27.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:616)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:416)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)

        at org.apache.hadoop.ipc.Client.call(Client.java:739)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
        at $Proxy1.addBlock(Unknown Source)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:616)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
        at $Proxy1.addBlock(Unknown Source)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:2904)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2786)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2076)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2262)

10/11/12 21:36:46 INFO mapred.JobClient: Task Id : attempt_201011122120_0001_m_000000_1, Status : FAILED
org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /user/root/in-dir/_temporary/_attempt_201011122120_0001_m_000000_1/part-00000 could only be replicated to 0 nodes, instead of 1
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1267)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:422)
        at sun.reflect.GeneratedMethodAccessor27.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:616)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:416)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)

        at org.apache.hadoop.ipc.Client.call(Client.java:739)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
        at $Proxy1.addBlock(Unknown Source)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:616)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
        at $Proxy1.addBlock(Unknown Source)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:2904)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2786)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2076)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2262)

10/11/12 21:36:49 INFO mapred.JobClient: Task Id : attempt_201011122120_0001_m_000001_2, Status : FAILED
org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /user/root/in-dir/_temporary/_attempt_201011122120_0001_m_000001_2/part-00001 could only be replicated to 0 nodes, instead of 1
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1267)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:422)
        at sun.reflect.GeneratedMethodAccessor27.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:616)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:416)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)

        at org.apache.hadoop.ipc.Client.call(Client.java:739)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
        at $Proxy1.addBlock(Unknown Source)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:616)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
        at $Proxy1.addBlock(Unknown Source)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:2904)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2786)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2076)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2262)

10/11/12 21:36:52 INFO mapred.JobClient: Task Id : attempt_201011122120_0001_m_000000_2, Status : FAILED
org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /user/root/in-dir/_temporary/_attempt_201011122120_0001_m_000000_2/part-00000 could only be replicated to 0 nodes, instead of 1
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1267)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:422)
        at sun.reflect.GeneratedMethodAccessor27.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:616)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:416)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)

        at org.apache.hadoop.ipc.Client.call(Client.java:739)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
        at $Proxy1.addBlock(Unknown Source)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:616)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
        at $Proxy1.addBlock(Unknown Source)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:2904)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2786)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2076)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2262)

10/11/12 21:36:58 INFO mapred.JobClient: Job complete: job_201011122120_0001
10/11/12 21:36:58 INFO mapred.JobClient: Counters: 2
10/11/12 21:36:58 INFO mapred.JobClient:   Job Counters
10/11/12 21:36:58 INFO mapred.JobClient:     Launched map tasks=8
10/11/12 21:36:58 INFO mapred.JobClient:     Failed map tasks=1
java.io.IOException: Job failed!
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1252)
        at org.apache.hadoop.examples.terasort.TeraGen.run(TeraGen.java:352)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.hadoop.examples.terasort.TeraGen.main(TeraGen.java:357)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:616)
        at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
        at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
        at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:64)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:616)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
root@s11:~/orig/hadoop-0.20.2-dev#
[/code]
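
The first failure says "All datanodes 127.0.0.1:50010 are bad" and every
retry then fails with "could only be replicated to 0 nodes, instead of 1",
which I read as the namenode being unable to find a single healthy datanode
to write to. Since roughly 1 TB is being generated on one machine, could
the datanode simply have run out of disk space? Is this the right way to
check (assuming the standard 0.20 commands; the df path is the parent of
the hadoop.tmp.dir set below)?

[code]
# Ask the namenode for total/used capacity and the state of each datanode
bin/hadoop dfsadmin -report

# Check the health and replication of the files already in HDFS
bin/hadoop fsck /

# Check free space on the local disk backing hadoop.tmp.dir
df -h /tmp/dir
[/code]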

Here's my mapred-site.xml:

[code]

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>

<property>
  <name>hadoop.tmp.dir</name>
  <value>/tmp/dir/hadoop-${user.name}</value>
  <description>A base for other temporary directories.</description>
</property>

<property>
  <name>fs.default.name</name>
  <value>hdfs://localhost:54310</value>
  <description>The name of the default file system.  A URI whose
  scheme and authority determine the FileSystem implementation.  The
  uri's scheme determines the config property (fs.SCHEME.impl) naming
  the FileSystem implementation class.  The uri's authority is used to
  determine the host, port, etc. for a FileSystem.</description>
</property>


<property>
  <name>mapred.job.tracker</name>
  <value>localhost:9001</value>
  <description>The host and port that the MapReduce job tracker runs
  at.  If "local", then jobs are run in-process as a single map
  and reduce task.
  </description>
</property>

<property>
  <name>dfs.replication</name>
  <value>1</value>
  <description>Default block replication.
  The actual number of replications can be specified when the file is created.
  The default is used if replication is not specified in create time.
  </description>
</property>

<property>
        <name>mapred.reduce.tasks</name>
        <value>2</value>
 </property>

<property>
        <name>mapred.local.dir</name>
        <value>/root/temp</value>
</property>

<property>
        <name>mapreduce.map.log.level</name>
        <value>ALL</value>
</property>

<property>
        <name>mapreduce.reduce.log.level</name>
        <value>ALL</value>
</property>

</configuration>
[/code]
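
One thing I'm unsure about: I put every property in mapred-site.xml, but
since 0.20 the configuration is split into core-site.xml, hdfs-site.xml
and mapred-site.xml, and I believe the HDFS daemons only read the first
two. Should fs.default.name and hadoop.tmp.dir move to core-site.xml, and
dfs.replication to hdfs-site.xml? Something like this (just a sketch of
the split, with the values copied from above):

[code]
<!-- core-site.xml -->
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/tmp/dir/hadoop-${user.name}</value>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:54310</value>
  </property>
</configuration>

<!-- hdfs-site.xml -->
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
[/code]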


-- 
Pedro