Posted to issues@bigtop.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2017/07/07 00:02:00 UTC

[jira] [Commented] (BIGTOP-2834) spark charm: refactor for restricted networks; lib cleanup

    [ https://issues.apache.org/jira/browse/BIGTOP-2834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16077377#comment-16077377 ] 

ASF GitHub Bot commented on BIGTOP-2834:
----------------------------------------

GitHub user kwmonroe opened a pull request:

    https://github.com/apache/bigtop/pull/246

    BIGTOP-2834: spark charm: refactor for restricted networks; lib cleanup

    See [BIGTOP-2834](https://issues.apache.org/jira/browse/BIGTOP-2834) for details.
    
    - Make the pagerank sample data a juju resource
    - Don't install Spark-Bench by default
    - Clean up the spark charm lib

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/juju-solutions/bigtop feature/BIGTOP-2834/spark-restricted-net

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/bigtop/pull/246.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #246
    
----
commit 7316ea6a3809b217acd370f83f231548ceaeaab3
Author: Kevin W Monroe <ke...@canonical.com>
Date:   2017-07-05T20:30:06Z

    sparkbench changes
    
    Do not install Spark-Bench by default for a couple of reasons:
    - it has not been tested with Spark 2.1.x.
    - restricted networks may not allow the charm to fetch from s3,
    which is where we host the SB tarball.
    
    Simplify config related to SB since the payload is not arch specific.
    
    Include a SPARK_HOME config in /etc/environment since apps like SB
    will look for default values there.

commit b1c8f795a55dd12926b267ec1bbc3ea34d5a6c6a
Author: Kevin W Monroe <ke...@canonical.com>
Date:   2017-07-05T22:44:02Z

    add a sample-data resource for pagerank now that sparkbench is not installed by default

commit c11b1299f0aa4932407d32652045149c49b2967e
Author: Kevin W Monroe <ke...@canonical.com>
Date:   2017-07-06T04:23:11Z

    need the add_dirs to create spark event dir (else history server fails to start)

commit 06c103df43e5abf2ef712f762ea6e76b8f2d974c
Author: Kevin W Monroe <ke...@canonical.com>
Date:   2017-07-06T04:23:36Z

    update pagerank action to use sample from new resource

commit 50d464ee5a46b9caa3d5567d4c9ab86c41c8afda
Author: Kevin W Monroe <ke...@canonical.com>
Date:   2017-07-06T22:55:03Z

    pagerank: copy the sample data to hdfs when spark is in yarn mode

commit 3f6389ce74b742bfcdda02791d93fbb4ac4154ee
Author: Kevin W Monroe <ke...@canonical.com>
Date:   2017-07-06T22:57:35Z

    trigger reinstall if sample data resource has changed

commit 0a0f2fefb38f86ec15963598d8fb4225ca4cfba6
Author: Kevin W Monroe <ke...@canonical.com>
Date:   2017-07-06T23:08:43Z

    update/simplify spark lib
    
    - don't bootstrap spark
      - packages handle our users/groups
      - we handle the dirs as needed based on the spark exe mode
      - remove setup method
    - rename methods to be more accurate
      - handle_sparkbench -> configure_sparkbench
      - install_demo -> configure_examples
      - setup_hdfs_logs -> configure_events_dir
    - configure_examples ensures sparkpi and sample-data are installed correctly
    - configure_events_dir creates dirs and sets perms based on spark exe mode
    - BUG
      - passing an empty hosts['namenode'] breaks when exe mode goes from yarn->non-yarn
      - no need to nullify hosts['namenode']

----
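
The last commit says configure_events_dir creates directories and sets permissions based on the Spark execution mode. As a rough illustration only, the mode-dependent choice might look like the sketch below. The function name plan_events_dir, the default path, and the mode strings are assumptions made for this example, not code taken from the pull request.

```python
def plan_events_dir(execution_mode, events_path="/var/log/spark/apps"):
    """Decide where Spark event logs should live for a given mode.

    Hypothetical helper: the name, the default path, and the mode
    strings ("yarn-client", "standalone", ...) are illustrative
    assumptions, not the charm's actual API.
    """
    if execution_mode.startswith("yarn"):
        # In yarn mode the history server reads events from HDFS.
        return {"filesystem": "hdfs", "url": "hdfs://" + events_path}
    # In any other mode, a directory on the local filesystem is enough.
    return {"filesystem": "local", "url": "file://" + events_path}
```

A caller would branch on the "filesystem" value to decide whether to create the directory in HDFS or on local disk before starting the history server.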


> spark charm: refactor for restricted networks; lib cleanup
> ----------------------------------------------------------
>
>                 Key: BIGTOP-2834
>                 URL: https://issues.apache.org/jira/browse/BIGTOP-2834
>             Project: Bigtop
>          Issue Type: Improvement
>          Components: deployment
>    Affects Versions: 1.2.0
>            Reporter: Kevin W Monroe
>            Assignee: Kevin W Monroe
>            Priority: Minor
>             Fix For: 1.3.0
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> The spark charm currently grabs Spark-Bench and sample data for the pagerank benchmark from s3. This is a problem for a couple of reasons:
> 1 - Spark-Bench hasn't been verified with Spark 2.1.0.
> 2 - Deployments might fail in an environment where the charm is not allowed to fetch things from s3.
> #1 is an easy fix; just change the default behavior.  SB is there if a user wants it, but we shouldn't install it by default.
> #2 is trickier because we *do* need the sample data for our pagerank benchmark.  Refactor this as a Juju Resource (vs s3 URL) so it comes from the charm store.
> While we're here, simplify the spark charm library to remove leftover cruft from pre-Bigtop days and break out the complex configure method into smaller, single-purpose methods.
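
One of the commits triggers a reinstall when the sample data resource has changed. A minimal sketch of one way to detect such a change is to compare a checksum of the resource file against a previously stored value. The helper names file_sha256 and resource_changed are hypothetical, and the pull request may well detect changes differently.

```python
import hashlib


def file_sha256(path, chunk_size=65536):
    """Return the SHA-256 hex digest of the file at path."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large sample files are not loaded at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def resource_changed(path, previous_digest):
    """Return (changed, current_digest) for a resource file.

    Hypothetical helper: previous_digest is whatever the caller stored
    after the last install, or None if nothing was stored yet.
    """
    current = file_sha256(path)
    return current != previous_digest, current
```

The caller would persist the returned digest after a successful install and pass it back in on the next hook run, so a reinstall only happens when the file content actually differs.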



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)