Posted to reviews@ambari.apache.org by Jonathan Hurley <jh...@hortonworks.com> on 2017/12/11 14:04:58 UTC

Review Request 64502: Downloaded client configs have invalid values for spark properties in yarn-site.xml

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64502/
-----------------------------------------------------------

Review request for Ambari, Dmitro Lisnichenko and Nate Cole.


Bugs: AMBARI-22628
    https://issues.apache.org/jira/browse/AMBARI-22628


Repository: ambari


Description
-------

Downloaded client configs have invalid values for spark properties in yarn-site.xml.

Issue: the spark_version variable is replaced by 'None' in the Spark-related config properties of yarn-site.xml in the downloaded client configs.

Attaching downloaded yarn-site.xml
 [^yarn-site.xml] 

Properties with issue:

```
<property>
  <name>yarn.nodemanager.aux-services.spark2_shuffle.classpath</name>
  <value>/usr/hdp/None/spark2/aux/*</value>
</property>

<property>
  <name>yarn.nodemanager.aux-services.spark_shuffle.classpath</name>
  <value>/usr/hdp/None/spark/aux/*</value>
</property>

<property>
  <name>yarn.timeline-service.entity-group-fs-store.group-id-plugin-classpath</name>
  <value>/usr/hdp/None/spark/hdpLib/*</value>
</property>
```

The cause is that YARN clients on hosts without daemons never receive a restart command after the initial {{yarn-site.xml}} is written, so they can never fill in the correct values.
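
The literal 'None' in these paths is what Python produces when a missing value is formatted into a string. A minimal sketch of that failure mode, assuming the classpath is rendered by substituting a version into a template; the template constant, both function names, and the example version string are hypothetical and are not taken from params_linux.py.

```python
# Hypothetical helpers for illustration; not the actual code in params_linux.py.
AUX_TEMPLATE = "/usr/hdp/{version}/spark2/aux/*"


def naive_classpath(version):
    # Formatting None does not fail; it silently yields the text "None".
    return AUX_TEMPLATE.format(version=version)


def guarded_classpath(version):
    # Refuse to render a path while the version is still unknown.
    if not version:
        raise ValueError("component version is not resolved yet")
    return AUX_TEMPLATE.format(version=version)


if __name__ == "__main__":
    print(naive_classpath(None))              # /usr/hdp/None/spark2/aux/*
    print(guarded_classpath("2.6.3.0-235"))   # made-up example version
```

Failing loudly is only one option; the point is that an unresolved version should never reach the template unnoticed.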


Diffs
-----

  ambari-server/src/main/resources/common-services/YARN/2.1.0.2.0/package/scripts/params_linux.py 98141456c7 


Diff: https://reviews.apache.org/r/64502/diff/1/


Testing
-------

Manual install via UI and Blueprint


Thanks,

Jonathan Hurley


Re: Review Request 64502: YARN Shuffle Service Can't Be Found On Client-Only Nodes After New Cluster Install

Posted by Jonathan Hurley <jh...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64502/#review193403
-----------------------------------------------------------




ambari-server/src/main/java/org/apache/ambari/server/state/cluster/ClusterImpl.java
Lines 2812-2841 (patched)
<https://reviews.apache.org/r/64502/#comment271958>

    There is a change to the logic here. We're only going to send down a component version if it advertises a version AND it's been resolved.
    
    There's no point in sending down a version if it's not trusted, as is the case with "latest" installations.
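
The rule described in this comment can be sketched as a simple filter. The real change is in ClusterImpl.java; the Python below only restates the condition, and the record type, its field names, and the "UNKNOWN" placeholder are assumptions made for illustration.

```python
from collections import namedtuple

# Hypothetical record of a component's version state; field names are assumptions.
Component = namedtuple(
    "Component", ["service", "name", "advertises_version", "version"]
)

# Assumed placeholder meaning "no trusted version has been reported yet".
UNKNOWN = "UNKNOWN"


def is_resolved(version):
    # A version counts as resolved only when it is present and not the placeholder.
    return version is not None and version != UNKNOWN


def build_version_map(components):
    """Return {service: {component: version}} containing trusted versions only."""
    result = {}
    for c in components:
        # Both conditions must hold: the component advertises a version
        # AND that version has been resolved.
        if c.advertises_version and is_resolved(c.version):
            result.setdefault(c.service, {})[c.name] = c.version
    return result


if __name__ == "__main__":
    sample = [
        Component("SPARK2", "SPARK2_CLIENT", True, "2.6.3.0-235"),  # made-up version
        Component("YARN", "YARN_CLIENT", True, None),
        Component("SPARK", "SPARK_CLIENT", True, UNKNOWN),
    ]
    print(build_version_map(sample))
```

Components that fail either condition are simply left out of the map rather than sent with a placeholder value.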


- Jonathan Hurley


On Dec. 11, 2017, 12:04 p.m., Jonathan Hurley wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64502/
> -----------------------------------------------------------
> 
> (Updated Dec. 11, 2017, 12:04 p.m.)
> 
> 
> Review request for Ambari, Dmitro Lisnichenko and Nate Cole.
> 
> 
> Bugs: AMBARI-22628
>     https://issues.apache.org/jira/browse/AMBARI-22628
> 
> 
> Repository: ambari
> 
> 
> Description
> -------
> 
> Installing a new cluster can create values in yarn-site.xml that have {{None}} in the Spark classpath:
> 
> ```
> <property>
>   <name>yarn.nodemanager.aux-services.spark2_shuffle.classpath</name>
>   <value>/usr/hdp/None/spark2/aux/*</value>
> </property>
> 
> <property>
>   <name>yarn.nodemanager.aux-services.spark_shuffle.classpath</name>
>   <value>/usr/hdp/None/spark/aux/*</value>
> </property>
> 
> <property>
>   <name>yarn.timeline-service.entity-group-fs-store.group-id-plugin-classpath</name>
>   <value>/usr/hdp/None/spark/hdpLib/*</value>
> </property>
> ```
> 
> The cause is that YARN clients on hosts without daemons never receive a restart command after the initial {{yarn-site.xml}} is written, so they can never fill in the correct values. This causes problems when jobs are run on these nodes:
> 
> ```
> 2017-12-04 10:16:41,789 INFO  service.AbstractService (AbstractService.java:noteFailure(272)) - Service org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices failed in state INITED; cause: java.lang.ClassNotFoundException: org.apache.spark.network.yarn.YarnShuffleService
> java.lang.ClassNotFoundException: org.apache.spark.network.yarn.YarnShuffleService
> ```
> 
> 
> Diffs
> -----
> 
>   ambari-server/src/main/java/org/apache/ambari/server/agent/ExecutionCommand.java 9d5e29ee8a 
>   ambari-server/src/main/java/org/apache/ambari/server/controller/internal/ClientConfigResourceProvider.java a7c712bd1a 
>   ambari-server/src/main/java/org/apache/ambari/server/state/Cluster.java 15efcd2173 
>   ambari-server/src/main/java/org/apache/ambari/server/state/cluster/ClusterImpl.java ce328f91ff 
>   ambari-server/src/main/resources/common-services/YARN/2.1.0.2.0/package/scripts/params_linux.py 98141456c7 
> 
> 
> Diff: https://reviews.apache.org/r/64502/diff/3/
> 
> 
> Testing
> -------
> 
> Manual install via UI and Blueprint
> 
> 
> Thanks,
> 
> Jonathan Hurley
> 
>


Re: Review Request 64502: YARN Shuffle Service Can't Be Found On Client-Only Nodes After New Cluster Install

Posted by Dmitro Lisnichenko <dl...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64502/#review193410
-----------------------------------------------------------


Ship it!

- Dmitro Lisnichenko




Re: Review Request 64502: YARN Shuffle Service Can't Be Found On Client-Only Nodes After New Cluster Install

Posted by Nate Cole <nc...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64502/#review193416
-----------------------------------------------------------


Ship it!

- Nate Cole




Re: Review Request 64502: YARN Shuffle Service Can't Be Found On Client-Only Nodes After New Cluster Install

Posted by Jonathan Hurley <jh...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64502/
-----------------------------------------------------------

(Updated Dec. 11, 2017, 12:04 p.m.)


Review request for Ambari, Dmitro Lisnichenko and Nate Cole.


Changes
-------

I realized that downloading configurations is also problematic since it's done on the Ambari server and not on the real cluster. As such, we should also pass down the component version structure in client config download commands.
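
As a rough sketch of the idea, assuming the download command is a plain dictionary and the version structure is a nested service-to-component map: every key name and both functions below are hypothetical and do not describe the actual ExecutionCommand or ClientConfigResourceProvider code.

```python
# Hypothetical command layout; the key names are assumptions for illustration.
def build_download_command(cluster_name, component, configurations, version_map):
    command = {
        "clusterName": cluster_name,
        "role": component,
        "configurations": configurations,
    }
    # Attach the version structure only when there is something trusted to send.
    if version_map:
        command["componentVersionMap"] = version_map
    return command


def lookup_version(command, service, component):
    # Scripts rendering the configs on the server can read the version from
    # the command instead of asking the local host, which is not a cluster node.
    return command.get("componentVersionMap", {}).get(service, {}).get(component)


if __name__ == "__main__":
    versions = {"SPARK2": {"SPARK2_CLIENT": "2.6.3.0-235"}}  # made-up version
    cmd = build_download_command("c1", "YARN_CLIENT", {"yarn-site": {}}, versions)
    print(lookup_version(cmd, "SPARK2", "SPARK2_CLIENT"))
    print(lookup_version(cmd, "SPARK", "SPARK_CLIENT"))
```

A lookup that finds nothing returns None, so the caller still has to decide what to do when no trusted version is available.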


Bugs: AMBARI-22628
    https://issues.apache.org/jira/browse/AMBARI-22628


Repository: ambari


Description
-------

Installing a new cluster can create values in yarn-site.xml that have {{None}} in the Spark classpath:

```
<property>
  <name>yarn.nodemanager.aux-services.spark2_shuffle.classpath</name>
  <value>/usr/hdp/None/spark2/aux/*</value>
</property>

<property>
  <name>yarn.nodemanager.aux-services.spark_shuffle.classpath</name>
  <value>/usr/hdp/None/spark/aux/*</value>
</property>

<property>
  <name>yarn.timeline-service.entity-group-fs-store.group-id-plugin-classpath</name>
  <value>/usr/hdp/None/spark/hdpLib/*</value>
</property>
```

The cause is that YARN clients on hosts without daemons never receive a restart command after the initial {{yarn-site.xml}} is written, so they can never fill in the correct values. This causes problems when jobs are run on these nodes:

```
2017-12-04 10:16:41,789 INFO  service.AbstractService (AbstractService.java:noteFailure(272)) - Service org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices failed in state INITED; cause: java.lang.ClassNotFoundException: org.apache.spark.network.yarn.YarnShuffleService
java.lang.ClassNotFoundException: org.apache.spark.network.yarn.YarnShuffleService
```
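
One way to catch this before a job fails is to scan a generated yarn-site.xml for values that contain a literal None path segment. The following is a minimal sketch using only the Python standard library; the function name and the embedded sample document are illustrative assumptions, not part of the patch.

```python
import xml.etree.ElementTree as ET

# Illustrative sample; a real check would read the generated yarn-site.xml.
SAMPLE = """<configuration>
  <property>
    <name>yarn.nodemanager.aux-services.spark2_shuffle.classpath</name>
    <value>/usr/hdp/None/spark2/aux/*</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle,spark2_shuffle</value>
  </property>
</configuration>"""


def find_unresolved(xml_text):
    """Return (name, value) pairs whose value has 'None' as a path segment."""
    root = ET.fromstring(xml_text)
    flagged = []
    for prop in root.iter("property"):
        name = prop.findtext("name", default="")
        value = prop.findtext("value", default="")
        # Compare whole path segments so names that merely contain "None" pass.
        if "None" in value.split("/"):
            flagged.append((name, value))
    return flagged


if __name__ == "__main__":
    for name, value in find_unresolved(SAMPLE):
        print(name, "=", value)
```

Matching on whole path segments keeps the check narrow; it flags /usr/hdp/None/... but not a value that happens to include the letters "None" inside a longer word.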


Diffs (updated)
-----

  ambari-server/src/main/java/org/apache/ambari/server/agent/ExecutionCommand.java 9d5e29ee8a 
  ambari-server/src/main/java/org/apache/ambari/server/controller/internal/ClientConfigResourceProvider.java a7c712bd1a 
  ambari-server/src/main/java/org/apache/ambari/server/state/Cluster.java 15efcd2173 
  ambari-server/src/main/java/org/apache/ambari/server/state/cluster/ClusterImpl.java ce328f91ff 
  ambari-server/src/main/resources/common-services/YARN/2.1.0.2.0/package/scripts/params_linux.py 98141456c7 


Diff: https://reviews.apache.org/r/64502/diff/3/

Changes: https://reviews.apache.org/r/64502/diff/2-3/


Testing
-------

Manual install via UI and Blueprint


Thanks,

Jonathan Hurley