Posted to hdfs-user@hadoop.apache.org by "kira.wang" <ki...@xiaoi.com> on 2013/01/21 14:31:04 UTC

Re: Can't find the Job Status in WEB UI

Of course.

 

mapred-site.xml

 

<configuration>

  <!-- kira 2013-01-18: without the property below, jobs fail with an error -->
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>

</configuration>

 

yarn-site.xml

<?xml version="1.0"?>

<configuration>

  <!-- Site specific YARN configuration properties -->
  <!-- kira 2013-01-18 ref: http://www.cnblogs.com/scotoma/archive/2012/09/18/2689902.html -->

  <property>
    <name>yarn.resourcemanager.address</name>
    <value>master2:18040</value>
  </property>

  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>master2:18030</value>
  </property>

  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>0.0.0.0:18088</value>
  </property>

  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>master2:18025</value>
  </property>

  <property>
    <name>yarn.resourcemanager.admin.address</name>
    <value>master2:18141</value>
  </property>

  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce.shuffle</value>
  </property>

  <property>
    <name>yarn.nodemanager.local-dirs</name>
    <value>/hadoop/tmp/nm-local-dir</value>
    <description>the local directories used by the nodemanager</description>
  </property>

  <property>
    <name>yarn.nodemanager.log-dirs</name>
    <value>/hadoop/tmp/log</value>
    <description>the directories used by the nodemanagers as log directories</description>
  </property>

</configuration>

 

core-site.xml

<configuration>

  <property>
    <name>fs.default.name</name>
    <value>hdfs://master2:9000</value>
    <final>true</final>
  </property>

  <property>
    <name>hadoop.tmp.dir</name>
    <value>/hadoop/tmp</value>
  </property>

  <!-- freepose, 2013/1/17: create the directory /hadoop/tmp first -->
  <!-- If a DataNode will not start, try deleting the hadoop.tmp.dir directory on that DataNode -->

</configuration>
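
What that last comment means in practice is roughly the following, assuming the DataNode keeps its data under hadoop.tmp.dir and that data can be thrown away (a sketch, to be run on the DataNode that will not start):

  # stop the datanode, wipe its hadoop.tmp.dir (this also deletes its local block data), restart it
  sbin/hadoop-daemon.sh stop datanode
  rm -rf /hadoop/tmp
  sbin/hadoop-daemon.sh start datanode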

 

hdfs-site.xml

<configuration>

  <!-- kira: dfs.replication is the number of data replicas; the default is 3, and fewer than 3 slaves causes errors -->
  <property>
    <name>dfs.replication</name>
    <value>1</value>
    <description>Default block replication. The actual number of replications can be specified when the file is created. The default (3 replications -- by kira) is used if replication is not specified at create time.</description>
  </property>

  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/hadoop/tmp/dfs/data,file:/opt/tmp/dfs/data</value>
    <description>Determines where on the local filesystem the DFS name node should store the name table (fsimage). If this is a comma-delimited list of directories then the name table is replicated in all of the directories, for redundancy.</description>
    <final>true</final>
  </property>

  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/hadoop/tmp/dfs/data,/opt/tmp/dfs/data</value>
    <description>Determines where on the local filesystem a DFS data node should store its blocks. If this is a comma-delimited list of directories, then data will be stored in all named directories, typically on different devices. Directories that do not exist are ignored.</description>
    <final>true</final>
  </property>

  <property>
    <name>dfs.block.access.key.update.interval</name>
    <value>600</value>
    <description>Interval in minutes at which the namenode updates its access keys.</description>
  </property>

  <property>
    <name>dfs.block.access.token.lifetime</name>
    <value>600</value>
    <description>The lifetime of access tokens in minutes.</description>
  </property>

  <!-- kira 2012-01-19: YARN runs failed with "File does not exist hdfs://...", so permission checking is turned off here -->
  <property>
    <name>dfs.permissions.enabled</name>
    <value>false</value>
    <description>If "true", enable permission checking in HDFS. If "false", permission checking is turned off, but all other behavior is unchanged. Switching from one parameter value to the other does not change the mode, owner or group of files or directories.</description>
  </property>

</configuration>

 

 

 

From: Mohammad Tariq [mailto:dontariq@gmail.com]
Sent: 21 January 2013 17:25
To: user@hadoop.apache.org
Subject: Re: Can't find the Job Status in WEB UI

 

Could you share your config files with us?




Warm Regards,

Tariq

https://mtariq.jux.com/

cloudfront.blogspot.com

 

On Mon, Jan 21, 2013 at 2:49 PM, kira.wang <ki...@xiaoi.com> wrote:

1.      Actually, the job in the picture in the last email was running locally, because I had deleted mapred-site.xml from $HADOOP_HOME/etc/hadoop and then started the resourcemanager.

2.      But when I configured mapred-site.xml as below:

  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>

 

It does not work and produces the following errors:

 

13/01/21 16:53:16 INFO mapreduce.Job:  map 0% reduce 0%

13/01/21 16:53:16 INFO mapreduce.Job: Job job_1358758352533_0001 failed with state FAILED due to: Application application_1358758352533_0001 failed 1 times due to AM Container for appattempt_1358758352533_0001_000001 exited with exitCode: 1 due to:
.Failing this attempt.. Failing the application.

13/01/21 16:53:16 INFO mapreduce.Job: Counters: 0

Job Finished in 6.192 seconds

java.io.FileNotFoundException: File does not exist: hdfs://master2:9000/user/root/QuasiMonteCarlo_TMP_3_141592654/out/reduce-out
        at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:736)
        at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1685)
        at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1709)
        at org.apache.hadoop.examples.QuasiMonteCarlo.estimatePi(QuasiMonteCarlo.java:314)
        at org.apache.hadoop.examples.QuasiMonteCarlo.run(QuasiMonteCarlo.java:351)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at org.apache.hadoop.examples.QuasiMonteCarlo.main(QuasiMonteCarlo.java:360)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:601)
        at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72)
        at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:144)
        at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:68)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:601)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:208)

 

I checked the logs: the application state changes from ACCEPTED to FAILED suddenly:

 

2013-01-21 16:53:13,310 INFO org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Done launching container Container: [ContainerId: container_1358758352533_0001_01_000001, NodeId: xiaoi-115:50782, NodeHttpAddress: xiaoi-115:8042, Resource: memory: 1536, Priority: org.apache.hadoop.yarn.api.records.impl.pb.PriorityPBImpl@1f, State: NEW, Token: null, Status: container_id {, app_attempt_id {, application_id {, id: 1, cluster_timestamp: 1358758352533, }, attemptId: 1, }, id: 1, }, state: C_NEW, ] for AM appattempt_1358758352533_0001_000001

2013-01-21 16:53:13,311 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1358758352533_0001_000001 State change from ALLOCATED to LAUNCHED

2013-01-21 16:53:13,693 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_1358758352533_0001_01_000001 Container Transitioned from ACQUIRED to RUNNING

2013-01-21 16:53:15,703 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_1358758352533_0001_01_000001 Container Transitioned from RUNNING to COMPLETED

2013-01-21 16:53:15,703 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApp: Completed container: container_1358758352533_0001_01_000001 in state: COMPLETED event:FINISHED

2013-01-21 16:53:15,703 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=root OPERATION=AM Released Container TARGET=SchedulerApp RESULT=SUCCESS APPID=application_1358758352533_0001 CONTAINERID=container_1358758352533_0001_01_000001

2013-01-21 16:53:15,703 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerNode: Released container container_1358758352533_0001_01_000001 of capacity memory: 1536 on host xiaoi-115:50782, which currently has 0 containers, memory: 0 used and memory: 8192 available, release resources=true

2013-01-21 16:53:15,704 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler: Application appattempt_1358758352533_0001_000001 released container container_1358758352533_0001_01_000001 on node: host: xiaoi-115:50782 #containers=0 available=8192 used=0 with event: FINISHED

2013-01-21 16:53:15,705 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1358758352533_0001_000001 State change from LAUNCHED to FAILED

2013-01-21 16:53:15,705 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: Application application_1358758352533_0001 failed 1 times due to AM Container for appattempt_1358758352533_0001_000001 exited with exitCode: 1 due to:
.Failing this attempt.. Failing the application.

2013-01-21 16:53:15,706 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1358758352533_0001 State change from ACCEPTED to FAILED

2013-01-21 16:53:15,707 WARN org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=root OPERATION=Application Finished - Failed TARGET=RMAppManager RESULT=FAILURE DESCRIPTION=App failed with state: FAILED PERMISSIONS=Application application_1358758352533_0001 failed 1 times due to AM Container for appattempt_1358758352533_0001_000001 exited with exitCode: 1 due to:
.Failing this attempt.. Failing the application. APPID=application_1358758352533_0001

2013-01-21 16:53:15,708 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo: Application application_1358758352533_0001 requests cleared

2013-01-21 16:53:15,709 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAppManager$ApplicationSummary: appId=application_1358758352533_0001,name=QuasiMonteCarlo,user=root,queue=default,state=FAILED,trackingUrl=master2:18088/proxy/application_1358758352533_0001/,appMasterHost=N/A,startTime=1358758392410,finishTime=1358758395706
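
      (The ResourceManager log above only records the state transitions; I suppose the actual reason for "exitCode: 1" is in the AM container's own stdout/stderr on the node that launched it, xiaoi-115. Assuming log aggregation is off and given yarn.nodemanager.log-dirs=/hadoop/tmp/log above, that would be something like:

      # on xiaoi-115; the paths are a guess based on the yarn.nodemanager.log-dirs setting
      ls /hadoop/tmp/log/application_1358758352533_0001/container_1358758352533_0001_01_000001/
      cat /hadoop/tmp/log/application_1358758352533_0001/container_1358758352533_0001_01_000001/stderr
      )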

 

      Where should I look to fix this problem?

      I am looking forward to your reply. Thanks.

 

 

 

 

From: Harsh J [mailto:harsh@cloudera.com]
Sent: 21 January 2013 16:05
To: <us...@hadoop.apache.org>
Subject: Re: Can't find the Job Status in WEB UI

 

Your jobs are running via the LocalJobRunner, which would mean that your
mapred-site.xml (mapreduce.framework.name) or yarn-site.xml (RM address
config) is not configured correctly. Your applications are running locally,
not on the cluster.
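
For example, the relevant pieces would look roughly like this (the hostname and port below are placeholders, not your actual values):

mapred-site.xml:

  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>

yarn-site.xml:

  <property>
    <!-- point clients at the ResourceManager; replace rm-host:8032 with your RM address -->
    <name>yarn.resourcemanager.address</name>
    <value>rm-host:8032</value>
  </property>

Both files need to be in the config directory ($HADOOP_CONF_DIR) on the machine you submit the job from.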

 

On Mon, Jan 21, 2013 at 12:15 PM, kira.wang <ki...@xiaoi.com> wrote:

Hi,

 

I am running a mapreduce job, but I can't find the job status in the web UI that the namenode (NN) serves.

As the picture shows below.

The Hadoop version is 2.0.0-alpha. Cluster: 1 NN, 3 datanodes (DNs). The NN has two NICs; each DN has only one NIC.

The datanodes can only access the LAN inside the cluster.

[screenshot of the web UI; inline image not available in the plain-text archive]

 





 

-- 
Harsh J