Posted to dev@bigtop.apache.org by evans-ye <gi...@git.apache.org> on 2017/05/10 19:24:36 UTC

[GitHub] bigtop pull request #208: BIGTOP-2766. [Puppet] Spark worker startup failed ...

GitHub user evans-ye opened a pull request:

    https://github.com/apache/bigtop/pull/208

    BIGTOP-2766. [Puppet] Spark worker startup failed due to default master_url is yarn

    

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/evans-ye/bigtop BIGTOP-2766

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/bigtop/pull/208.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #208
    
----
commit 8e1d7b8f3e34f6f4e59508ad453ef8f3b30e924b
Author: Evans Ye <ev...@apache.org>
Date:   2017-05-10T19:23:19Z

    BIGTOP-2766. [Puppet] Spark worker startup failed due to default master_url is yarn

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] bigtop pull request #208: BIGTOP-2766. [Puppet] Spark worker startup failed ...

Posted by kwmonroe <gi...@git.apache.org>.
Github user kwmonroe commented on a diff in the pull request:

    https://github.com/apache/bigtop/pull/208#discussion_r116045104
  
    --- Diff: bigtop-deploy/puppet/modules/spark/templates/spark-defaults.conf ---
    @@ -13,7 +13,15 @@
     # See the License for the specific language governing permissions and
     # limitations under the License.
     
    +<% if @master_url -%>
     spark.master <%= @master_url %>
    +<% else -%>
    +<% if scope['deploy::roles'].include? 'spark-on-yarn' -%>
    +spark.master yarn
    +<% else -%>
    +spark.master spark://<%= @master_host %>:<%= @master_port %>
    --- End diff --
    
    Yup @evans-ye, I agree the `spark-master|worker` roles implied standalone, yet it was always configured for yarn by default.  Confusing for sure.
    
    I like your idea of keeping things backwards compatible and introducing a new `spark-standalone` component.  With that change, I imagine the template patch would change to something like this:
    
    ```diff
    +<% if @master_url -%>
     spark.master <%= @master_url %>
    +<% else -%>
    +<% if scope['deploy::roles'].include? 'spark-standalone' -%>
    +spark.master spark://<%= @master_host %>:<%= @master_port %>
    +<% else -%>
    +spark.master yarn
    ```



[GitHub] bigtop pull request #208: BIGTOP-2766. [Puppet] Spark worker startup failed ...

Posted by evans-ye <gi...@git.apache.org>.
Github user evans-ye commented on a diff in the pull request:

    https://github.com/apache/bigtop/pull/208#discussion_r116042957
  
    --- Diff: bigtop-deploy/puppet/modules/spark/templates/spark-defaults.conf ---
    @@ -13,7 +13,15 @@
     # See the License for the specific language governing permissions and
     # limitations under the License.
     
    +<% if @master_url -%>
     spark.master <%= @master_url %>
    +<% else -%>
    +<% if scope['deploy::roles'].include? 'spark-on-yarn' -%>
    +spark.master yarn
    +<% else -%>
    +spark.master spark://<%= @master_host %>:<%= @master_port %>
    --- End diff --
    
    I think this is worth discussing thoroughly, because backward compatibility is indeed important.
    
    I did it this way because deploying the spark component (spark-master, spark-worker) implies a standalone deployment, yet what actually gets configured is Spark on YARN.
    
    How about this: the `spark` component keeps deploying Spark on YARN, and I'll add another component called `spark-standalone`.
    The code change would look like this:
    
    ```
    spark => {
      worker => ["spark-on-yarn"],
      client => ["spark-client"],
      library => ["spark-yarn-slave"],
    },
    spark-standalone => {
      master => ["spark-master"],
      worker => ["spark-worker"],
    },
    ```




[GitHub] bigtop issue #208: BIGTOP-2766. [Puppet] Spark worker startup failed due to ...

Posted by evans-ye <gi...@git.apache.org>.
Github user evans-ye commented on the issue:

    https://github.com/apache/bigtop/pull/208
  
    No objections for two days, and I need to get things rolling to fix the Spark deployment in the provisioner, so I'll commit now. Thanks for your high-quality review, @kwmonroe.



[GitHub] bigtop issue #208: BIGTOP-2766. [Puppet] Spark worker startup failed due to ...

Posted by evans-ye <gi...@git.apache.org>.
Github user evans-ye commented on the issue:

    https://github.com/apache/bigtop/pull/208
  
    Thanks for mentioning this here. I intentionally left the include part out of this PR to avoid duplicating the work in BIGTOP-2764. Since I'll rebase anyway, I suggest you submit a patch for BIGTOP-2764 and I'll rebase on top of it.



[GitHub] bigtop pull request #208: BIGTOP-2766. [Puppet] Spark worker startup failed ...

Posted by kwmonroe <gi...@git.apache.org>.
Github user kwmonroe commented on a diff in the pull request:

    https://github.com/apache/bigtop/pull/208#discussion_r116011550
  
    --- Diff: bigtop-deploy/puppet/modules/spark/templates/spark-defaults.conf ---
    @@ -13,7 +13,15 @@
     # See the License for the specific language governing permissions and
     # limitations under the License.
     
    +<% if @master_url -%>
     spark.master <%= @master_url %>
    +<% else -%>
    +<% if scope['deploy::roles'].include? 'spark-on-yarn' -%>
    +spark.master yarn
    +<% else -%>
    +spark.master spark://<%= @master_host %>:<%= @master_port %>
    --- End diff --
    
    This block generally LGTM, but note that this is effectively making `spark://host:port` the default master.  Prior to this change, people deploying the `spark` component would get `yarn` as the master by default.
    
    So I'm +1 here, but it may be worth mentioning on-list or in the release notes that the default spark configuration is changing.



[GitHub] bigtop pull request #208: BIGTOP-2766. [Puppet] Spark worker startup failed ...

Posted by kwmonroe <gi...@git.apache.org>.
Github user kwmonroe commented on a diff in the pull request:

    https://github.com/apache/bigtop/pull/208#discussion_r116002658
  
    --- Diff: bigtop-deploy/puppet/manifests/cluster.pp ---
    @@ -72,6 +72,10 @@
         master => ["spark-master"],
         worker => ["spark-worker"],
       },
    +  spark-on-yarn => {
    +    worker => ["spark-on-yarn", "spark-slave"],
    +    client => ["spark-client"],
    +  },
    --- End diff --
    
    Small typo above.  `spark-slave` should be `spark-yarn-slave` (see [init.pp](https://github.com/apache/bigtop/blob/master/bigtop-deploy/puppet/modules/spark/manifests/init.pp#L27)).
    
    Also, while it's probably inconsequential, `spark-yarn-slave` isn't really a worker -- it just provides the yarn shuffle jar for spark.  So I think the above block should read:
    
    ```diff
    +  spark-on-yarn => {
    +    worker => ["spark-on-yarn"],
    +    client => ["spark-client"],
    +    library => ["spark-yarn-slave"],
    +  },
    ```



[GitHub] bigtop pull request #208: BIGTOP-2766. [Puppet] Spark worker startup failed ...

Posted by kwmonroe <gi...@git.apache.org>.
Github user kwmonroe commented on a diff in the pull request:

    https://github.com/apache/bigtop/pull/208#discussion_r116296932
  
    --- Diff: bigtop-deploy/puppet/modules/spark/templates/spark-defaults.conf ---
    @@ -13,7 +13,15 @@
     # See the License for the specific language governing permissions and
     # limitations under the License.
     
    +<% if @master_url -%>
     spark.master <%= @master_url %>
    +<% else -%>
    +<% if scope['deploy::roles'].include? 'spark-standalone' -%>
    --- End diff --
    
    @evans-ye sorry I didn't catch this earlier (I still get confused with roles vs components sometimes).  However, I think this condition isn't quite right.  The **component** is called `spark-standalone`, while the **role** that should be handled here is `spark-master` or `spark-worker`.
    
    I checked this a few different ways...
    
    - Using components on a single head node:
      - `['deploy::roles']` is `[spark-master, spark-worker]`
      - site.yaml:
    ```
    ...
    bigtop::hadoop_head_node: my.fqdn
    hadoop_cluster_node::cluster_components:
    - spark-standalone
    bigtop::roles_enabled: false
    ...
    ```
    
    - Using components where the worker is not on the head node
      - `['deploy::roles']` is `[spark-worker]`
      - site.yaml:
    ```
    ...
    bigtop::hadoop_head_node: NOT.my.fqdn
    hadoop_cluster_node::cluster_components:
    - spark-standalone
    bigtop::roles_enabled: false
    ...
    ```
    
    - Using roles
      - `['deploy::roles']` is `[spark-master, spark-worker]`
      - site.yaml:
    ```
    ...
    bigtop::roles:
    - spark-master
    - spark-worker
    bigtop::roles_enabled: true
    ...
    ```
    
    So I think the proper condition here (and in `spark-env.sh` below) is to check if `scope['deploy::roles']` includes either `spark-master` or `spark-worker` instead of checking for `spark-standalone`.
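    
    Roughly, the condition could end up looking something like this (just a sketch of the idea, not the exact patch):
    
    ```
    <% if @master_url -%>
    <%# an explicit master_url override always wins -%>
    spark.master <%= @master_url %>
    <% elsif ['spark-master', 'spark-worker'].any? { |r| scope['deploy::roles'].include? r } -%>
    <%# standalone roles deployed: point at the standalone master -%>
    spark.master spark://<%= @master_host %>:<%= @master_port %>
    <% else -%>
    <%# no override and no standalone roles: keep the yarn default -%>
    spark.master yarn
    <% end -%>
    ```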



[GitHub] bigtop issue #208: BIGTOP-2766. [Puppet] Spark worker startup failed due to ...

Posted by kwmonroe <gi...@git.apache.org>.
Github user kwmonroe commented on the issue:

    https://github.com/apache/bigtop/pull/208
  
    @evans-ye while you're updating spark's init.pp, would you consider replacing the `spark-datanucleus` package definition with an `include spark::datanucleus`?
    
    https://github.com/evans-ye/bigtop/blob/aeb72aa14ecc3363723dce542f631aea2faaba41/bigtop-deploy/puppet/modules/spark/manifests/init.pp#L158
    
    As you noted in BIGTOP-2764, including the class seems better than redefining the package.
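    
    That is, something roughly like this (the enclosing class name below is only illustrative, not taken from init.pp):
    
    ```
    # sketch: pull in the class that already manages spark-datanucleus
    # instead of declaring the package resource again here
    class spark::some_consumer {
      include spark::datanucleus
    }
    ```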



[GitHub] bigtop pull request #208: BIGTOP-2766. [Puppet] Spark worker startup failed ...

Posted by evans-ye <gi...@git.apache.org>.
Github user evans-ye commented on a diff in the pull request:

    https://github.com/apache/bigtop/pull/208#discussion_r116040559
  
    --- Diff: bigtop-deploy/puppet/manifests/cluster.pp ---
    @@ -72,6 +72,10 @@
         master => ["spark-master"],
         worker => ["spark-worker"],
       },
    +  spark-on-yarn => {
    +    worker => ["spark-on-yarn", "spark-slave"],
    +    client => ["spark-client"],
    +  },
    --- End diff --
    
    Very informative input. Thanks Kevin!



[GitHub] bigtop pull request #208: BIGTOP-2766. [Puppet] Spark worker startup failed ...

Posted by evans-ye <gi...@git.apache.org>.
Github user evans-ye commented on a diff in the pull request:

    https://github.com/apache/bigtop/pull/208#discussion_r116349526
  
    --- Diff: bigtop-deploy/puppet/modules/spark/templates/spark-defaults.conf ---
    @@ -13,7 +13,15 @@
     # See the License for the specific language governing permissions and
     # limitations under the License.
     
    +<% if @master_url -%>
     spark.master <%= @master_url %>
    +<% else -%>
    +<% if scope['deploy::roles'].include? 'spark-standalone' -%>
    --- End diff --
    
    Yep, you're right. I get confused as well ;)
    +1 to your suggestion. That way we keep the default-to-yarn strategy consistent all the way down to the config level.

