You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by florianverhein <gi...@git.apache.org> on 2015/02/13 07:20:06 UTC

[GitHub] spark pull request: [SPARK-5641] [EC2] Allow spark_ec2.py to copy ...

GitHub user florianverhein opened a pull request:

    https://github.com/apache/spark/pull/4583

    [SPARK-5641] [EC2] Allow spark_ec2.py to copy arbitrary files to cluster

    Give users an easy way to rcp a directory structure to the master's / as part of the cluster launch, at a useful point in the workflow (before setup.sh is called on the master).
    
    This is an alternative approach to meeting requirements discussed in https://github.com/apache/spark/pull/4487  

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/florianverhein/spark master

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/4583.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #4583
    
----
commit 87d922cc7345d874b5e0214978d830099631555b
Author: Florian Verhein <fl...@gmail.com>
Date:   2015-02-13T06:14:12Z

    [SPARK-5641] [EC2] implement --deploy-root-dir

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-5641] [EC2] Allow spark_ec2.py to copy ...

Posted by shivaram <gi...@git.apache.org>.
Github user shivaram commented on the pull request:

    https://github.com/apache/spark/pull/4583#issuecomment-75397427
  
    Sorry I got caught up with some other stuff for the past couple of days. @florianverhein my recommendation is hopefully not arbitrary but just that if we take in a directory as an argument to `--deploy-root-dir` then its nicer to keep the directory at the destination as well. As you said this is the behavior that rsync supports without a trailing `/` and this is the behavior we support in spark-ec2's `copy-dir` as well. 
    
    FWIW I don't think there are any right or wrong choices here and this is mostly a question of behavior that will seem most natural to users. It'll be great to hear what other EC2 users like @nchammas or @JoshRosen feel about this.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-5641] [EC2] Allow spark_ec2.py to copy ...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the pull request:

    https://github.com/apache/spark/pull/4583#issuecomment-74234006
  
    ok to test


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-5641] [EC2] Allow spark_ec2.py to copy ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/4583#issuecomment-74242357
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/27438/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-5641] [EC2] Allow spark_ec2.py to copy ...

Posted by shivaram <gi...@git.apache.org>.
Github user shivaram commented on a diff in the pull request:

    https://github.com/apache/spark/pull/4583#discussion_r24718371
  
    --- Diff: ec2/spark_ec2.py ---
    @@ -931,6 +947,22 @@ def deploy_files(conn, root_dir, opts, master_nodes, slave_nodes, modules):
         shutil.rmtree(tmp_dir)
     
     
    +# Deploy files in a given local directory to a cluster, WITHOUT parameter substitution.
    +# Note that unlike deploy_files, this works for binary files.
    +# Files are only deployed to the first master instance in the cluster.
    +#
    +# root_dir should be an absolute path to the directory with the files we want to deploy.
    +def deploy_user_files(conn, root_dir, opts, master_nodes, slave_nodes, modules):
    --- End diff --
    
    Yes - we can remove them for now and add things as we need them


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-5641] [EC2] Allow spark_ec2.py to copy ...

Posted by shivaram <gi...@git.apache.org>.
Github user shivaram commented on the pull request:

    https://github.com/apache/spark/pull/4583#issuecomment-74931875
  
    @florianverhein - Sorry for the delay. I just tested this out and it seemed to work okay. One thing that I was confused by is that its not very clear where the files are ending up on the master. For example I had a directory `/home/shivaram/dotfiles` that I passed in as the argument. I think it would be good to rsync this to `/root/dotfiles` on the master ? Right now the behavior is that the files inside the directory (like say `.vimrc`) are put in `/root/` (i.e. I got `/root/.vimrc`)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-5641] [EC2] Allow spark_ec2.py to copy ...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/4583


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-5641] [EC2] Allow spark_ec2.py to copy ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/4583#issuecomment-74452766
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/27536/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-5641] [EC2] Allow spark_ec2.py to copy ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/4583#issuecomment-75487381
  
      [Test build #27846 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27846/consoleFull) for   PR 4583 at commit [`49dee88`](https://github.com/apache/spark/commit/49dee88fbfaae53ece27bc8eaf9c6c1bd3b86c8f).
     * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-5641] [EC2] Allow spark_ec2.py to copy ...

Posted by florianverhein <gi...@git.apache.org>.
Github user florianverhein commented on a diff in the pull request:

    https://github.com/apache/spark/pull/4583#discussion_r24718645
  
    --- Diff: ec2/spark_ec2.py ---
    @@ -931,6 +947,22 @@ def deploy_files(conn, root_dir, opts, master_nodes, slave_nodes, modules):
         shutil.rmtree(tmp_dir)
     
     
    +# Deploy files in a given local directory to a cluster, WITHOUT parameter substitution.
    +# Note that unlike deploy_files, this works for binary files.
    +# Files are only deployed to the first master instance in the cluster.
    +#
    +# root_dir should be an absolute path to the directory with the files we want to deploy.
    +def deploy_user_files(conn, root_dir, opts, master_nodes, slave_nodes, modules):
    --- End diff --
    
    :+1: 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-5641] [EC2] Allow spark_ec2.py to copy ...

Posted by florianverhein <gi...@git.apache.org>.
Github user florianverhein commented on the pull request:

    https://github.com/apache/spark/pull/4583#issuecomment-75659457
  
    Have now launched clusters with and without trailing / too -- looks good. 
    ping @shivaram 



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-5641] [EC2] Allow spark_ec2.py to copy ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/4583#issuecomment-74452762
  
      [Test build #27536 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27536/consoleFull) for   PR 4583 at commit [`7b8e3d8`](https://github.com/apache/spark/commit/7b8e3d8911d17fbf01ef60d62e802b875c752159).
     * This patch **passes all tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-5641] [EC2] Allow spark_ec2.py to copy ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/4583#issuecomment-74234099
  
      [Test build #27438 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27438/consoleFull) for   PR 4583 at commit [`87d922c`](https://github.com/apache/spark/commit/87d922cc7345d874b5e0214978d830099631555b).
     * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-5641] [EC2] Allow spark_ec2.py to copy ...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the pull request:

    https://github.com/apache/spark/pull/4583#issuecomment-74233845
  
    Jenkins, add to whitelist


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-5641] [EC2] Allow spark_ec2.py to copy ...

Posted by shivaram <gi...@git.apache.org>.
Github user shivaram commented on the pull request:

    https://github.com/apache/spark/pull/4583#issuecomment-74954531
  
    Hmm okay - My other concern was also that the directory itself wasn't maintained. i.e. it might be better to put the deploy-root-dir into `/` as a directory (`/dotfiles/.vimrc` instead of `/.vimrc`) ?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-5641] [EC2] Allow spark_ec2.py to copy ...

Posted by florianverhein <gi...@git.apache.org>.
Github user florianverhein commented on the pull request:

    https://github.com/apache/spark/pull/4583#issuecomment-75417014
  
    Sure, so let's support both - I'll remove the addition of the "/", then the
    user can choose the behaviour. If you're happy with the current option name
    in that case then we're all set. I might add a bit more info to the help
    too, so it's clear to users what will happen with/out the /.
    Does that sound good @shivaram?
    
    On Sunday, 22 February 2015, Shivaram Venkataraman <no...@github.com>
    wrote:
    
    > Sorry I got caught up with some other stuff for the past couple of days.
    > @florianverhein <https://github.com/florianverhein> my recommendation is
    > hopefully not arbitrary but just that if we take in a directory as an
    > argument to --deploy-root-dir then its nicer to keep the directory at the
    > destination as well. As you said this is the behavior that rsync supports
    > without a trailing / and this is the behavior we support in spark-ec2's
    > copy-dir as well.
    >
    > FWIW I don't think there are any right or wrong choices here and this is
    > mostly a question of behavior that will seem most natural to users. It'll
    > be great to hear what other EC2 users like @nchammas
    > <https://github.com/nchammas> or @JoshRosen <https://github.com/JoshRosen>
    > feel about this.
    >
    > —
    > Reply to this email directly or view it on GitHub
    > <https://github.com/apache/spark/pull/4583#issuecomment-75397427>.
    >



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-5641] [EC2] Allow spark_ec2.py to copy ...

Posted by florianverhein <gi...@git.apache.org>.
Github user florianverhein commented on the pull request:

    https://github.com/apache/spark/pull/4583#issuecomment-74210659
  
    ping @shivaram 
    I called it `deploy-root-dir` for clarity.
    I edited the original jira ticket rather than creating a new one.
    It works with a trailing / too.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-5641] [EC2] Allow spark_ec2.py to copy ...

Posted by florianverhein <gi...@git.apache.org>.
Github user florianverhein commented on the pull request:

    https://github.com/apache/spark/pull/4583#issuecomment-74461651
  
    Cool, thanks @shivaram. I've been using it to deploy rpms and jars :)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-5641] [EC2] Allow spark_ec2.py to copy ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/4583#issuecomment-74210541
  
    Can one of the admins verify this patch?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-5641] [EC2] Allow spark_ec2.py to copy ...

Posted by florianverhein <gi...@git.apache.org>.
Github user florianverhein commented on the pull request:

    https://github.com/apache/spark/pull/4583#issuecomment-74946543
  
    Thanks @shivaram. I'm not sure I follow 100%. With that argument they should have ended up eg /.vimrc (unless root is a subdirectory of dotfiles). The contents if '--deploy-root-dir' end up in /, not /root/ (ie as documented in the help). This is necessary because you may want to copy files elsewhere on the file system. Eg /opt. It's just unfortunate that the existence of /root/ means "root" is not unambiguous. Therefore I made sure to use / in the help.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-5641] [EC2] Allow spark_ec2.py to copy ...

Posted by florianverhein <gi...@git.apache.org>.
Github user florianverhein commented on the pull request:

    https://github.com/apache/spark/pull/4583#issuecomment-74982303
  
    This seems arbitrary. Also, how would you copy a file into /? The behaviour is exactly what rsync does (with a trailing /), so I don't think its a good idea to deviate from this. Now note that I add the trailing slash (so it's consistent with current deploy.files). I could remove that (and keep the isdirectory check), which would mean the user can choose the desired behaviour (leaving off the / should do exactly what you want I think). We should probably rename the option then, as it's no longer limited to "deploying the contents of a directory into /". But I don't see what this buys us. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-5641] [EC2] Allow spark_ec2.py to copy ...

Posted by shivaram <gi...@git.apache.org>.
Github user shivaram commented on the pull request:

    https://github.com/apache/spark/pull/4583#issuecomment-74386047
  
    Functionality looks good to me. Might be good for @nchammas or @JoshRosen to take a look at any Python style issues


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-5641] [EC2] Allow spark_ec2.py to copy ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/4583#issuecomment-75491457
  
      [Test build #27846 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27846/consoleFull) for   PR 4583 at commit [`49dee88`](https://github.com/apache/spark/commit/49dee88fbfaae53ece27bc8eaf9c6c1bd3b86c8f).
     * This patch **passes all tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-5641] [EC2] Allow spark_ec2.py to copy ...

Posted by shivaram <gi...@git.apache.org>.
Github user shivaram commented on a diff in the pull request:

    https://github.com/apache/spark/pull/4583#discussion_r24716159
  
    --- Diff: ec2/spark_ec2.py ---
    @@ -931,6 +947,22 @@ def deploy_files(conn, root_dir, opts, master_nodes, slave_nodes, modules):
         shutil.rmtree(tmp_dir)
     
     
    +# Deploy files in a given local directory to a cluster, WITHOUT parameter substitution.
    +# Note that unlike deploy_files, this works for binary files.
    +# Files are only deployed to the first master instance in the cluster.
    +#
    +# root_dir should be an absolute path to the directory with the files we want to deploy.
    +def deploy_user_files(conn, root_dir, opts, master_nodes, slave_nodes, modules):
    --- End diff --
    
    `slave_nodes` and `modules` are not used in this function ?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-5641] [EC2] Allow spark_ec2.py to copy ...

Posted by shivaram <gi...@git.apache.org>.
Github user shivaram commented on the pull request:

    https://github.com/apache/spark/pull/4583#issuecomment-75470850
  
    Yeah if we can write up documentation on how having `/` or not having '/' affects things it should be good.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-5641] [EC2] Allow spark_ec2.py to copy ...

Posted by shivaram <gi...@git.apache.org>.
Github user shivaram commented on the pull request:

    https://github.com/apache/spark/pull/4583#issuecomment-74453770
  
    Thanks @florianverhein -- Code change looks good to me -- I'd like to test this out on a cluster launch to verify it though. Will try to get to it tomorrow.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-5641] [EC2] Allow spark_ec2.py to copy ...

Posted by florianverhein <gi...@git.apache.org>.
Github user florianverhein commented on the pull request:

    https://github.com/apache/spark/pull/4583#issuecomment-75194906
  
    Any thoughts on this @shivaram? Perhaps I can add more info / an example to the help for this option?
    
    Also note that `--deploy-root-dir` supports a single directory, so consider what happens if you want to deploy say `dotfiles2` in addition to `dotfiles`. It's not possible if we use your suggestion. But currently it's easy to do something like this: 
    ```
    /some_path/dotfiles2
              /opt/something
              /root/dotfiles
                   /extra/some.jar
                          run.sh
                   /rpms/some.rpm
    ```
    and use `--deploy-root-dir /some_path`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-5641] [EC2] Allow spark_ec2.py to copy ...

Posted by florianverhein <gi...@git.apache.org>.
Github user florianverhein commented on a diff in the pull request:

    https://github.com/apache/spark/pull/4583#discussion_r24717863
  
    --- Diff: ec2/spark_ec2.py ---
    @@ -931,6 +947,22 @@ def deploy_files(conn, root_dir, opts, master_nodes, slave_nodes, modules):
         shutil.rmtree(tmp_dir)
     
     
    +# Deploy files in a given local directory to a cluster, WITHOUT parameter substitution.
    +# Note that unlike deploy_files, this works for binary files.
    +# Files are only deployed to the first master instance in the cluster.
    +#
    +# root_dir should be an absolute path to the directory with the files we want to deploy.
    +def deploy_user_files(conn, root_dir, opts, master_nodes, slave_nodes, modules):
    --- End diff --
    
    `conn` isn't either. I just kept the same signature as deploy_files (since we may add substitution capability later). But  that's a pretty weak reason.
    Want me to remove them @shivaram?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-5641] [EC2] Allow spark_ec2.py to copy ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/4583#issuecomment-74448214
  
      [Test build #27536 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27536/consoleFull) for   PR 4583 at commit [`7b8e3d8`](https://github.com/apache/spark/commit/7b8e3d8911d17fbf01ef60d62e802b875c752159).
     * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-5641] [EC2] Allow spark_ec2.py to copy ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/4583#issuecomment-74242347
  
      [Test build #27438 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27438/consoleFull) for   PR 4583 at commit [`87d922c`](https://github.com/apache/spark/commit/87d922cc7345d874b5e0214978d830099631555b).
     * This patch **passes all tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-5641] [EC2] Allow spark_ec2.py to copy ...

Posted by florianverhein <gi...@git.apache.org>.
Github user florianverhein commented on the pull request:

    https://github.com/apache/spark/pull/4583#issuecomment-76851833
  
    Hi @shivaram, have you had a chance to look at this?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-5641] [EC2] Allow spark_ec2.py to copy ...

Posted by shivaram <gi...@git.apache.org>.
Github user shivaram commented on the pull request:

    https://github.com/apache/spark/pull/4583#issuecomment-77617236
  
    Many apologies for the delay. I tried this out on my machine today and it works great. LGTM.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-5641] [EC2] Allow spark_ec2.py to copy ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/4583#issuecomment-75491462
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/27846/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org