Posted to mapreduce-user@hadoop.apache.org by Anfernee Xu <an...@gmail.com> on 2014/02/21 03:08:05 UTC

history server for 2 clusters

Hi,

I'm on the 2.2.0 release and I have one HDFS cluster shared by two YARN (MR)
clusters, with a single shared history server. From the history server UI I
can see the job summary for all jobs, and I can see task logs for jobs that
ran in one cluster. But when I try to view logs for jobs that ran in the
other cluster, I get the error below:

Logs not available for attempt_1392933787561_0024_m_000000_0. Aggregation
may not be complete, Check back later or try the nodemanager at
slc03jvt.mydomain.com:31303

Here's my configuration:

Note: my history server is running on the RM node of the MR cluster whose
logs I can see.


--------mapred-site.xml
<property>
  <name>mapreduce.jobhistory.address</name>
  <value>slc00dgd:10020</value>
  <description>MapReduce JobHistory Server IPC host:port</description>
</property>

<property>
  <name>mapreduce.jobhistory.webapp.address</name>
  <value>slc00dgd:19888</value>
  <description>MapReduce JobHistory Server Web UI host:port</description>
</property>

----------yarn-site.xml
<property>
  <name>yarn.log-aggregation-enable</name>
  <value>true</value>
</property>

<property>
  <name>yarn.nodemanager.remote-app-log-dir-suffix</name>
  <value>dc</value>
</property>

The above configuration is almost the same for both clusters; the only
difference is "yarn.nodemanager.remote-app-log-dir-suffix", which has a
different value in each cluster.



-- 
--Anfernee

Re: history server for 2 clusters

Posted by Vinod Kumar Vavilapalli <vi...@apache.org>.
Interesting use-case and setup. We never had this use-case in mind; so far we have assumed one history-server per YARN cluster. You may be running into issues where that assumption does not hold.

Why do you need two separate YARN clusters for the same underlying data on HDFS? And if that can't change, why can't you have two history-servers?
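
For illustration, a second history server for the other cluster might be configured roughly as below. This is only a sketch under assumptions: the host name jhs2.example.com and the suffix value dc2 are hypothetical placeholders, not values from this thread. The reasoning behind it is that aggregated logs are written under <remote-app-log-dir>/<user>/<suffix>/<application id>, and a history server looks them up using the suffix from its own configuration, so a server configured with one cluster's suffix would likely not find logs written under the other's.

```xml
<!-- mapred-site.xml used by jobs submitted to the second cluster.
     jhs2.example.com is a hypothetical host name. -->
<property>
  <name>mapreduce.jobhistory.address</name>
  <value>jhs2.example.com:10020</value>
  <description>MapReduce JobHistory Server IPC host:port</description>
</property>

<property>
  <name>mapreduce.jobhistory.webapp.address</name>
  <value>jhs2.example.com:19888</value>
  <description>MapReduce JobHistory Server Web UI host:port</description>
</property>

<!-- yarn-site.xml read by the second history server.
     dc2 is a hypothetical value; it should match whatever suffix the
     second cluster's NodeManagers actually use. -->
<property>
  <name>yarn.log-aggregation-enable</name>
  <value>true</value>
</property>

<property>
  <name>yarn.nodemanager.remote-app-log-dir-suffix</name>
  <value>dc2</value>
</property>
```

The point of the sketch is that each history server sees the same suffix as the NodeManagers whose logs it is expected to serve.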

+Vinod

On Feb 20, 2014, at 6:08 PM, Anfernee Xu <an...@gmail.com> wrote:

> Hi,
> 
> I'm at 2.2.0 release and I have a HDFS cluster which is shared by 2 YARN(MR) cluster, also I have a single shared history server, what I'm seeing is I can see all job summary for all jobs from history server UI, I also can see task log for jobs running in one cluster, but if I want to see log for jobs running in another cluster, it showed me below error
> 
> Logs not available for attempt_1392933787561_0024_m_000000_0. Aggregation may not be complete, Check back later or try the nodemanager at slc03jvt.mydomain.com:31303 
> 
> Here's my configuration:
> 
> Note: my history server is running on RM node of the MR cluster where I can see the log.
> 
> 
> --------mapred-site.xml
> <property>
>   <name>mapreduce.jobhistory.address</name>
>   <value>slc00dgd:10020</value>
>   <description>MapReduce JobHistory Server IPC host:port</description>
> </property>
> 
> <property>
>   <name>mapreduce.jobhistory.webapp.address</name>
>   <value>slc00dgd:19888</value>
>   <description>MapReduce JobHistory Server Web UI host:port</description>
> </property>
> 
> ----------yarn-site.xml
>   <property>
>      <name>yarn.log-aggregation-enable</name>
>      <value>true</value>
>    </property>
> 
>    <property>
>      <name>yarn.nodemanager.remote-app-log-dir-suffix</name>
>      <value>dc</value>
>    </property>
> 
> Above configuration are almost same for both clusters, the only difference is "yarn.nodemanager.remote-app-log-dir-suffix", they have different suffix.
> 
> 
> 
> -- 
> --Anfernee


-- 
CONFIDENTIALITY NOTICE
NOTICE: This message is intended for the use of the individual or entity to 
which it is addressed and may contain information that is confidential, 
privileged and exempt from disclosure under applicable law. If the reader 
of this message is not the intended recipient, you are hereby notified that 
any printing, copying, dissemination, distribution, disclosure or 
forwarding of this communication is strictly prohibited. If you have 
received this communication in error, please contact the sender immediately 
and delete it from your system. Thank You.
