Posted to user@hadoop.apache.org by "Gao, Yunlong" <dg...@gmail.com> on 2016/08/18 07:05:52 UTC

Issue with Hadoop Job History Server

To whom it may concern,

I am using Hadoop 2.7.1.2.3.6.0-3796, with the Hortonworks distribution
HDP-2.3.6.0-3796. I have a question about the Hadoop Job History Server.

After I set everything up, the resource manager/name nodes/data nodes seem
to be running fine, but the job history server is not working correctly.
The issue is that the UI of the job history server does not show
any jobs, and all the REST calls to the job history server do not work
either. I also noticed that there are no logs in HDFS under the directory
configured as "mapreduce.jobhistory.done-dir".

I have tried different things, including restarting the job history
server and monitoring the log -- no errors or exceptions are observed. I also
renamed /hadoop/mapreduce/jhs/mr-jhs-state, which is used for the state
recovery of the job history server, and then restarted it again, but no
particular error appeared. I tried some other random things that I borrowed
from online blogs/documents, but had no luck.


Any help would be very much appreciated.

Thanks,
Yunlong

RE: Issue with Hadoop Job History Server

Posted by Benjamin Ross <br...@Lattice-Engines.com>.
Turns out we made a stupid mistake - our system was managing to mix configuration between an old cluster and a new cluster.  So, things are working now.

Thanks,
Ben

RE: Issue with Hadoop Job History Server

Posted by Benjamin Ross <br...@Lattice-Engines.com>.
Rohith,
Thanks - we're still having issues.  Can you help out with this?

How do you specify the done directory for an MR job?  The job history done dir is mapreduce.jobhistory.done-dir.  I specified the one for the job as mapreduce.jobtracker.jobhistory.location, as per the documentation here:
https://hadoop.apache.org/docs/r2.7.1/hadoop-mapreduce-client/hadoop-mapreduce-client-core/mapred-default.xml

They're both set to the same thing.  I did a recursive ls on HDFS and it doesn't seem like there are any directories called "done" with recent data in them.  All of the data in /mr-history is old.  Here's a summary of that ls:

drwx------   - yarn          hadoop          0 2016-07-14 16:39 /ats/done
drwxr-xr-x   - yarn          hadoop          0 2016-07-14 16:39 /ats/done/1468528507723
drwxr-xr-x   - yarn          hadoop          0 2016-07-14 16:39 /ats/done/1468528507723/0000
drwxr-xr-x   - yarn          hadoop          0 2016-07-25 20:10 /ats/done/1468528507723/0000/000
drwxrwxrwx   - mapred        hadoop          0 2016-07-19 14:47 /mr-history/done
drwxrwx---   - mapred        hadoop          0 2016-07-19 14:47 /mr-history/done/2016
drwxrwx---   - mapred        hadoop          0 2016-07-19 14:47 /mr-history/done/2016/07
drwxrwx---   - mapred        hadoop          0 2016-07-27 13:49 /mr-history/done/2016/07/19
drwxrwxrwt   - bross         hdfs            0 2016-08-15 22:39 /tmp/hadoop-yarn/staging/history/done_intermediate
       =========> lots of recent data in /tmp/hadoop-yarn/staging/history/done_intermediate

Here's our mapred-site.xml:

  <configuration>

    <property>
      <name>mapreduce.admin.map.child.java.opts</name>
      <value>-server -XX:NewRatio=8 -Djava.net.preferIPv4Stack=true -Dhdp.version=2.3.6.0-3796</value>
    </property>

    <property>
      <name>mapreduce.admin.reduce.child.java.opts</name>
      <value>-server -XX:NewRatio=8 -Djava.net.preferIPv4Stack=true -Dhdp.version=2.3.6.0-3796</value>
    </property>

    <property>
      <name>mapreduce.admin.user.env</name>
      <value>LD_LIBRARY_PATH=/usr/hdp/2.3.6.0-3796/hadoop/lib/native:/usr/hdp/2.3.6.0-3796/hadoop/lib/native/Linux-amd64-64</value>
    </property>

    <property>
      <name>mapreduce.am.max-attempts</name>
      <value>2</value>
    </property>

    <property>
      <name>mapreduce.application.classpath</name>
      <value>$PWD/mr-framework/hadoop/share/hadoop/mapreduce/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/*:$PWD/mr-framework/hadoop/share/hadoop/common/*:$PWD/mr-framework/hadoop/share/hadoop/common/lib/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/lib/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/*:$PWD/mr-framework/hadoop/share/hadoop/tools/lib/*:/usr/hdp/2.3.6.0-3796/hadoop/lib/hadoop-lzo-0.6.0.2.3.6.0-3796.jar:/etc/hadoop/conf/secure</value>
    </property>

    <property>
      <name>mapreduce.application.framework.path</name>
      <value>/hdp/apps/2.3.6.0-3796/mapreduce/mapreduce.tar.gz#mr-framework</value>
    </property>

    <property>
      <name>mapreduce.cluster.administrators</name>
      <value> hadoop</value>
    </property>

    <property>
      <name>mapreduce.framework.name</name>
      <value>yarn</value>
    </property>

    <property>
      <name>mapreduce.job.counters.max</name>
      <value>130</value>
    </property>

    <property>
      <name>mapreduce.job.emit-timeline-data</name>
      <value>false</value>
    </property>

    <property>
      <name>mapreduce.job.reduce.slowstart.completedmaps</name>
      <value>0.05</value>
    </property>

    <property>
      <name>mapreduce.job.user.classpath.first</name>
      <value>true</value>
    </property>

    <property>
      <name>mapreduce.jobhistory.address</name>
      <value>bodcdevhdp6.dev.lattice.local:10020</value>
    </property>

    <property>
      <name>mapreduce.jobhistory.bind-host</name>
      <value>0.0.0.0</value>
    </property>

    <property>
      <name>mapreduce.jobhistory.done-dir</name>
      <value>/mr-history/done</value>
    </property>

    <property>
      <name>mapreduce.jobhistory.intermediate-done-dir</name>
      <value>/mr-history/tmp</value>
    </property>

    <property>
      <name>mapreduce.jobhistory.recovery.enable</name>
      <value>true</value>
    </property>

    <property>
      <name>mapreduce.jobhistory.recovery.store.class</name>
      <value>org.apache.hadoop.mapreduce.v2.hs.HistoryServerLeveldbStateStoreService</value>
    </property>

    <property>
      <name>mapreduce.jobhistory.recovery.store.leveldb.path</name>
      <value>/hadoop/mapreduce/jhs</value>
    </property>

    <property>
      <name>mapreduce.jobhistory.webapp.address</name>
      <value>bodcdevhdp6.dev.lattice.local:19888</value>
    </property>

    <property>
      <name>mapreduce.jobtracker.jobhistory.completed.location</name>
      <value>/mr-history/done</value>
    </property>

    <property>
      <name>mapreduce.map.java.opts</name>
      <value>-Xmx4915m</value>
    </property>

    <property>
      <name>mapreduce.map.log.level</name>
      <value>INFO</value>
    </property>

    <property>
      <name>mapreduce.map.memory.mb</name>
      <value>6144</value>
    </property>

    <property>
      <name>mapreduce.map.output.compress</name>
      <value>false</value>
    </property>

    <property>
      <name>mapreduce.map.sort.spill.percent</name>
      <value>0.7</value>
    </property>

    <property>
      <name>mapreduce.map.speculative</name>
      <value>false</value>
    </property>

    <property>
      <name>mapreduce.output.fileoutputformat.compress</name>
      <value>false</value>
    </property>

    <property>
      <name>mapreduce.output.fileoutputformat.compress.type</name>
      <value>BLOCK</value>
    </property>

    <property>
      <name>mapreduce.reduce.input.buffer.percent</name>
      <value>0.0</value>
    </property>

    <property>
      <name>mapreduce.reduce.java.opts</name>
      <value>-Xmx9830m</value>
    </property>

    <property>
      <name>mapreduce.reduce.log.level</name>
      <value>INFO</value>
    </property>

    <property>
      <name>mapreduce.reduce.memory.mb</name>
      <value>12288</value>
    </property>

    <property>
      <name>mapreduce.reduce.shuffle.fetch.retry.enabled</name>
      <value>1</value>
    </property>

    <property>
      <name>mapreduce.reduce.shuffle.fetch.retry.interval-ms</name>
      <value>1000</value>
    </property>

    <property>
      <name>mapreduce.reduce.shuffle.fetch.retry.timeout-ms</name>
      <value>30000</value>
    </property>

    <property>
      <name>mapreduce.reduce.shuffle.input.buffer.percent</name>
      <value>0.7</value>
    </property>

    <property>
      <name>mapreduce.reduce.shuffle.merge.percent</name>
      <value>0.66</value>
    </property>

    <property>
      <name>mapreduce.reduce.shuffle.parallelcopies</name>
      <value>30</value>
    </property>

    <property>
      <name>mapreduce.reduce.speculative</name>
      <value>false</value>
    </property>

    <property>
      <name>mapreduce.shuffle.port</name>
      <value>13562</value>
    </property>

    <property>
      <name>mapreduce.task.io.sort.factor</name>
      <value>100</value>
    </property>

    <property>
      <name>mapreduce.task.io.sort.mb</name>
      <value>2047</value>
    </property>

    <property>
      <name>mapreduce.task.timeout</name>
      <value>300000</value>
    </property>

    <property>
      <name>yarn.app.mapreduce.am.admin-command-opts</name>
      <value>-Dhdp.version=2.3.6.0-3796</value>
    </property>

    <property>
      <name>yarn.app.mapreduce.am.command-opts</name>
      <value>-Xmx4915m -Dhdp.version=${hdp.version}</value>
    </property>

    <property>
      <name>yarn.app.mapreduce.am.log.level</name>
      <value>INFO</value>
    </property>

    <property>
      <name>yarn.app.mapreduce.am.resource.mb</name>
      <value>6144</value>
    </property>

    <property>
      <name>yarn.app.mapreduce.am.staging-dir</name>
      <value>/user</value>
    </property>

  </configuration>

Thanks,
Ben


Re: Issue with Hadoop Job History Server

Posted by Rohith Sharma K S <ks...@gmail.com>.
MR jobs and the JHS should have the same configuration for done-dir, if it is configured. Otherwise, staging-dir should be the same for both. Make sure both the job and the JHS have the same configuration values.

Usually what happens is that the MRApp writes the job file in one location and the HistoryServer tries to read from a different location. This causes the JHS to display no jobs.
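
To make that concrete, here is a minimal sketch of the history-related properties that need to agree between the mapred-site.xml used when submitting jobs and the one the JHS reads. The values are just the paths quoted elsewhere in this thread, used as placeholders (an assumption; substitute whatever your cluster uses). As far as I understand Hadoop 2.7, the MR ApplicationMaster moves finished job history files into the intermediate done dir, and the JHS scans that directory and moves them into the done dir, so a mismatch in either property leaves the JHS with nothing to show.

```xml
<!-- Sketch only: these entries should be identical in the configuration
     seen by job clients and in the one seen by the JobHistoryServer.
     The paths are placeholders taken from this thread. -->
<configuration>

  <!-- Where the MR ApplicationMaster leaves history files of finished jobs -->
  <property>
    <name>mapreduce.jobhistory.intermediate-done-dir</name>
    <value>/mr-history/tmp</value>
  </property>

  <!-- Where the JHS keeps history files after picking them up -->
  <property>
    <name>mapreduce.jobhistory.done-dir</name>
    <value>/mr-history/done</value>
  </property>

  <!-- Only matters when the two properties above are unset: their defaults
       are ${yarn.app.mapreduce.am.staging-dir}/history/done_intermediate
       and ${yarn.app.mapreduce.am.staging-dir}/history/done -->
  <property>
    <name>yarn.app.mapreduce.am.staging-dir</name>
    <value>/user</value>
  </property>

</configuration>
```

If recent history files turn up under /tmp/hadoop-yarn/staging/history/done_intermediate, that is the default location (the default staging-dir is /tmp/hadoop-yarn/staging), which would suggest the jobs are being submitted without these overrides.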

Thanks & Regards
Rohith Sharma K S
