Posted to dev@bigtop.apache.org by kwmonroe <gi...@git.apache.org> on 2018/03/25 12:45:40 UTC

[GitHub] bigtop issue #348: BIGTOP-3015: hadoop-spark bundle correction

Github user kwmonroe commented on the issue:

    https://github.com/apache/bigtop/pull/348
  
    @jamesbeedy, the reason `hadoop-client` was in this bundle was to act as an endpoint for people who were accustomed to running jobs and managing HDFS from there.  In other words, muscle memory and scripts that do this will work on any of the `hadoop-x` bundles:
    
    ```
    juju run --unit hadoop-client/0 'hdfs dfs -cmd'
    ```
    
    It was also there in case someone started with `hadoop-spark`, switched the spark runtime from YARN to HA, then scaled spark to x units.  In that scenario, the single `hadoop-client` would be a recognizable endpoint to facilitate hdfs administration, versus making users do that on a random spark unit.
    
    That said, I'm coming around to your proposal to remove `hadoop-client`. Hulk smashing apps on a single unit has never been recommended and is only done to balance app density with resource cost.  As you already know, `spark` *is* a `hadoop-client` from the charm's perspective, so an additional explicit client isn't needed.
    
    I'm +1 to this removal if you'll also pull out the Client-related statements from the bundle README.md.
    


