Posted to reviews@spark.apache.org by mvj101 <gi...@git.apache.org> on 2014/10/21 08:29:50 UTC

[GitHub] spark pull request: [SPARK-3405] add subnet-id and vpc-id options ...

GitHub user mvj101 opened a pull request:

    https://github.com/apache/spark/pull/2872

    [SPARK-3405] add subnet-id and vpc-id options to spark_ec2.py

    Based on this gist:
    https://gist.github.com/amar-analytx/0b62543621e1f246c0a2
    
    We use security group ids instead of security group to get around this issue:
    https://github.com/boto/boto/issues/350
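
A minimal sketch of the idea described above, not the code from this patch: when launching into a subnet, the launch arguments carry security group ids rather than group names. The `Group` tuple and the `launch_kwargs` helper are hypothetical, and the keyword names `security_group_ids` and `subnet_id` are assumed to match the parameters of boto 2's `run_instances`.

```python
from collections import namedtuple

# Hypothetical stand-in for a security group object that exposes .id and .name.
Group = namedtuple("Group", ["id", "name"])


def launch_kwargs(groups, subnet_id=None):
    """Build launch keyword arguments that reference groups by id, not by name."""
    kwargs = {"security_group_ids": [g.id for g in groups]}
    if subnet_id is not None:
        # Only include the subnet when one was requested.
        kwargs["subnet_id"] = subnet_id
    return kwargs
```

The resulting dictionary could then be passed as `**kwargs` to whatever launch call is in use; the group ids and subnet id shown in any usage would be placeholders.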

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/mvj101/spark SPARK-3405

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/2872.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2872
    
----
commit 52aaeec7b03251f3fcb4d1cf892df7c592e03408
Author: Mike Jennings <mv...@gmail.com>
Date:   2014-10-21T06:05:09Z

    [SPARK-3405] add subnet-id and vpc-id options to spark_ec2.py

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3405] add subnet-id and vpc-id options ...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the pull request:

    https://github.com/apache/spark/pull/2872#issuecomment-67666538
  
    Couple of quick questions about this, just to confirm: do I always need to specify the `--subnet-id` option when using the `--vpc-id` option?  Do I have to run `./spark-ec2` from a node that's inside the VPC?




[GitHub] spark pull request: [SPARK-3405] add subnet-id and vpc-id options ...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the pull request:

    https://github.com/apache/spark/pull/2872#issuecomment-67223750
  
    Thanks for fixing up the style issue.  This looks good to me, so I'll merge this into `master`.  Thanks for your patience with the slow review!




[GitHub] spark pull request: [SPARK-3405] add subnet-id and vpc-id options ...

Posted by tylerprete <gi...@git.apache.org>.
Github user tylerprete commented on the pull request:

    https://github.com/apache/spark/pull/2872#issuecomment-66665398
  
    @jontg I'm using this patch with your modifications (private_ip_address), but I'm getting the following error when the script tries to start the master:
    
    SHUTDOWN_MSG: Shutting down NameNode at java.net.UnknownHostException: ip-10-0-2-213: ip-10-0-2-213
    
    10.0.2.213 is the master's ip in this case, but it looks like it's picking up ip-10-0-2-213 as the hostname and that isn't resolving. Did you run into anything like this, and if so, how'd you resolve it?
    
    Thanks!
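
A small diagnostic sketch for the symptom described above, assuming only what the error message shows: the host name is the private IP with dots replaced by dashes and an `ip-` prefix. The helper names `private_hostname` and `resolves` are hypothetical and are not part of `spark_ec2.py`.

```python
import socket


def private_hostname(private_ip):
    """Return the 'ip-a-b-c-d' style host name seen in the error above."""
    return "ip-" + private_ip.replace(".", "-")


def resolves(hostname):
    """Return True if the local resolver can map hostname to an address."""
    try:
        socket.gethostbyname(hostname)
        return True
    except OSError:
        # socket.gaierror (lookup failure) is a subclass of OSError.
        return False
```

Running `resolves(private_hostname("10.0.2.213"))` on the master would show whether the name from the `UnknownHostException` can be looked up there. A later comment in this thread reports that turning on DNS made the problem go away.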




[GitHub] spark pull request: [SPARK-3405] add subnet-id and vpc-id options ...

Posted by mvj101 <gi...@git.apache.org>.
Github user mvj101 commented on the pull request:

    https://github.com/apache/spark/pull/2872#issuecomment-67681180
  
    Haven't worked with this in a while and different versions of boto may alter things, but
    1. You might not need to specify a subnet if you have a default VPC configured. In general I think if you don't specify a subnet boto ignores the vpc argument. Since VPC/subnet can be tied to different regions it's a good practice to specify everything as a sanity check.
    2. Should be able to run ./spark-ec2 from any box connected to the internet - doesn't need to be run within ec2 or within a vpc/subnet.
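
The "specify everything as a sanity check" advice in point 1 can be sketched as a small option check. This is an illustration only: it uses `argparse` for brevity, the function names are hypothetical, and the warning text restates the uncertain claim above rather than confirmed boto behaviour. Only the `--subnet-id` and `--vpc-id` flag names come from this thread.

```python
import argparse


def parse_network_options(argv):
    """Parse only the two networking flags discussed in this thread."""
    parser = argparse.ArgumentParser(prog="spark-ec2-sketch")
    parser.add_argument("--subnet-id", default=None)
    parser.add_argument("--vpc-id", default=None)
    return parser.parse_args(argv)


def network_option_warnings(opts):
    """Return human-readable warnings for suspicious flag combinations."""
    warnings = []
    if opts.vpc_id and not opts.subnet_id:
        # Per the comment above, the VPC argument may be ignored without a subnet.
        warnings.append("--vpc-id given without --subnet-id; the VPC setting may be ignored")
    return warnings
```

For example, `network_option_warnings(parse_network_options(["--vpc-id", "vpc-123"]))` returns a single warning, while supplying both flags returns an empty list. The id values here are placeholders.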




[GitHub] spark pull request: [SPARK-3405] add subnet-id and vpc-id options ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2872#issuecomment-60013980
  
    Can one of the admins verify this patch?




[GitHub] spark pull request: [SPARK-3405] add subnet-id and vpc-id options ...

Posted by dreid93 <gi...@git.apache.org>.
Github user dreid93 commented on the pull request:

    https://github.com/apache/spark/pull/2872#issuecomment-67276691
  
    @jontg may I buy you a coffee for your work helping people with this issue? @ChangeTip 




[GitHub] spark pull request: [SPARK-3405] add subnet-id and vpc-id options ...

Posted by dreid93 <gi...@git.apache.org>.
Github user dreid93 commented on the pull request:

    https://github.com/apache/spark/pull/2872#issuecomment-67428485
  
    @changetip does not appear to be picking up my mentions and sending the appropriate tip. :/




[GitHub] spark pull request: [SPARK-3405] add subnet-id and vpc-id options ...

Posted by dreid93 <gi...@git.apache.org>.
Github user dreid93 commented on the pull request:

    https://github.com/apache/spark/pull/2872#issuecomment-67276753
  
    @JoshRosen thanks for merging this in. Here's a coffee @ChangeTip 




[GitHub] spark pull request: [SPARK-3405] add subnet-id and vpc-id options ...

Posted by dreid93 <gi...@git.apache.org>.
Github user dreid93 commented on the pull request:

    https://github.com/apache/spark/pull/2872#issuecomment-67276623
  
    @amar-analytx here's a coffee for making the gist that @mvj101 based his initial PR on. @ChangeTip




[GitHub] spark pull request: [SPARK-3405] add subnet-id and vpc-id options ...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/2872




[GitHub] spark pull request: [SPARK-3405] add subnet-id and vpc-id options ...

Posted by jontg <gi...@git.apache.org>.
Github user jontg commented on the pull request:

    https://github.com/apache/spark/pull/2872#issuecomment-64087720
  
    We had a couple of issues with this patch; in particular, the script depends on instances having a public DNS name or IP. I had to modify the script a little to get our cluster started, but of course we didn't invest in making it general. See https://github.com/relateiq/spark/commit/48ab2d1c8cccc00a5d26145b4d19a414c17f62c2
    
    Does boto have a best practice for handling VPC? Feels from the bugs I've read that boto+VPC is something everyone has their own workaround for...
    
    Jón Tómas Grétarsson
    
    On November 22, 2014 at 3:48 AM Thomas Fattal <no...@github.com> wrote:
    > I would be interested in merging this patch as well :)




[GitHub] spark pull request: [SPARK-3405] add subnet-id and vpc-id options ...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the pull request:

    https://github.com/apache/spark/pull/2872#issuecomment-67434318
  
    > I or someone else can address that corner case as time allows.
    
    Let's just update boto.  I'll submit a PR for this shortly.




[GitHub] spark pull request: [SPARK-3405] add subnet-id and vpc-id options ...

Posted by mvj101 <gi...@git.apache.org>.
Github user mvj101 commented on the pull request:

    https://github.com/apache/spark/pull/2872#issuecomment-67435661
  
    Ok, I'll send a PR to revert in a few minutes.
    
    Thanks,
    Mike




[GitHub] spark pull request: [SPARK-3405] add subnet-id and vpc-id options ...

Posted by dreid93 <gi...@git.apache.org>.
Github user dreid93 commented on the pull request:

    https://github.com/apache/spark/pull/2872#issuecomment-66823256
  
    Just a heads up / bump. I am buying everyone a coffee (in bitcoin ;) ) who contributes to getting this merged in!




[GitHub] spark pull request: [SPARK-3405] add subnet-id and vpc-id options ...

Posted by dreid93 <gi...@git.apache.org>.
Github user dreid93 commented on the pull request:

    https://github.com/apache/spark/pull/2872#issuecomment-67276649
  
    @mvj101 a coffee for you sir @ChangeTip 




[GitHub] spark pull request: [SPARK-3405] add subnet-id and vpc-id options ...

Posted by tylerprete <gi...@git.apache.org>.
Github user tylerprete commented on the pull request:

    https://github.com/apache/spark/pull/2872#issuecomment-66894914
  
    @jontg thanks for the help. Turned on dns and now everything is working.




[GitHub] spark pull request: [SPARK-3405] add subnet-id and vpc-id options ...

Posted by evanv <gi...@git.apache.org>.
Github user evanv commented on the pull request:

    https://github.com/apache/spark/pull/2872#issuecomment-65867098
  
    Is VPC support slated for the next maintenance release? Support for VPCs is definitely needed for a lot of us, and it'd be great if we didn't have to patch it ourselves. 




[GitHub] spark pull request: [SPARK-3405] add subnet-id and vpc-id options ...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the pull request:

    https://github.com/apache/spark/pull/2872#issuecomment-67123620
  
    Jenkins, this is ok to test.




[GitHub] spark pull request: [SPARK-3405] add subnet-id and vpc-id options ...

Posted by dreid93 <gi...@git.apache.org>.
Github user dreid93 commented on the pull request:

    https://github.com/apache/spark/pull/2872#issuecomment-59987812
  
    Awesome! I am glad to see that this was a priority to someone with the time. :+1: 




[GitHub] spark pull request: [SPARK-3405] add subnet-id and vpc-id options ...

Posted by changetip <gi...@git.apache.org>.
Github user changetip commented on the pull request:

    https://github.com/apache/spark/pull/2872#issuecomment-67276643
  
    Hi @amar-analytx, @dreid93 sent you a Bitcoin tip worth a coffee (4,526 bits/$1.50), and I'm here to deliver it ➔ **[collect your tip](https://www.changetip.com/collect/267614)**.
    
    
    **[Learn more about ChangeTip](https://www.changetip.com/tip-online/github)**




[GitHub] spark pull request: [SPARK-3405] add subnet-id and vpc-id options ...

Posted by mvj101 <gi...@git.apache.org>.
Github user mvj101 commented on the pull request:

    https://github.com/apache/spark/pull/2872#issuecomment-67432970
  
    Oops, apologies for this breakage. I haven't worked with spot instances. Feel free to revert this pull request and I or someone else can address that corner case as time allows.
    
    Thanks,
    
    Mike




[GitHub] spark pull request: [SPARK-3405] add subnet-id and vpc-id options ...

Posted by mvj101 <gi...@git.apache.org>.
Github user mvj101 commented on the pull request:

    https://github.com/apache/spark/pull/2872#issuecomment-67124150
  
    Fixing style issues now.




[GitHub] spark pull request: [SPARK-3405] add subnet-id and vpc-id options ...

Posted by dreid93 <gi...@git.apache.org>.
Github user dreid93 commented on the pull request:

    https://github.com/apache/spark/pull/2872#issuecomment-67428078
  
    @jontg a coffee for you sir @ChangeTip




[GitHub] spark pull request: [SPARK-3405] add subnet-id and vpc-id options ...

Posted by dreid93 <gi...@git.apache.org>.
Github user dreid93 commented on the pull request:

    https://github.com/apache/spark/pull/2872#issuecomment-65814791
  
    I'll buy anyone willing to take care of this merge lunch via @ChangeTip :)




[GitHub] spark pull request: [SPARK-3405] add subnet-id and vpc-id options ...

Posted by changetip <gi...@git.apache.org>.
Github user changetip commented on the pull request:

    https://github.com/apache/spark/pull/2872#issuecomment-65814859
  
    Hi mvj101, dreid93 sent you a Bitcoin tip worth 1 lunch (21,255 bits/$8.00), and I'm here to deliver it ➔ **[collect your tip at ChangeTip.com](https://www.changetip.com/collect/213620)**.
    
    
    **[Learn more about ChangeTip](https://www.changetip.com/tip-online/github)**




[GitHub] spark pull request: [SPARK-3405] add subnet-id and vpc-id options ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2872#issuecomment-67132172
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24488/
    Test PASSed.




[GitHub] spark pull request: [SPARK-3405] add subnet-id and vpc-id options ...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the pull request:

    https://github.com/apache/spark/pull/2872#issuecomment-67434905
  
    Ugh, this doesn't revert cleanly due to another patch that I merged.  I've got to go, so I'm just going to leave this for now.  Someone else can deal with this if it's urgent, otherwise I'll do it tomorrow.




[GitHub] spark pull request: [SPARK-3405] add subnet-id and vpc-id options ...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the pull request:

    https://github.com/apache/spark/pull/2872#issuecomment-67562466
  
    I've opened a PR to upgrade the Boto version, which fixes this issue: #3737




[GitHub] spark pull request: [SPARK-3405] add subnet-id and vpc-id options ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2872#issuecomment-67123940
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24486/
    Test FAILed.




[GitHub] spark pull request: [SPARK-3405] add subnet-id and vpc-id options ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2872#issuecomment-67123939
  
      [Test build #24486 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24486/consoleFull) for   PR 2872 at commit [`4dc6756`](https://github.com/apache/spark/commit/4dc6756844e0effeef36a62eefdd96bb63cb3db9).
     * This patch **fails Python style tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-3405] add subnet-id and vpc-id options ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2872#issuecomment-67123879
  
      [Test build #24486 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24486/consoleFull) for   PR 2872 at commit [`4dc6756`](https://github.com/apache/spark/commit/4dc6756844e0effeef36a62eefdd96bb63cb3db9).
     * This patch merges cleanly.




[GitHub] spark pull request: [SPARK-3405] add subnet-id and vpc-id options ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2872#issuecomment-67132162
  
      [Test build #24488 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24488/consoleFull) for   PR 2872 at commit [`be9cb43`](https://github.com/apache/spark/commit/be9cb43e48637d45b8a2936b2cdfb1971bf556a0).
     * This patch **passes all tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-3405] add subnet-id and vpc-id options ...

Posted by brdw <gi...@git.apache.org>.
Github user brdw commented on the pull request:

    https://github.com/apache/spark/pull/2872#issuecomment-65801064
  
    I'd love to see this as well. We have a strict VPC policy.




[GitHub] spark pull request: [SPARK-3405] add subnet-id and vpc-id options ...

Posted by Nypias <gi...@git.apache.org>.
Github user Nypias commented on the pull request:

    https://github.com/apache/spark/pull/2872#issuecomment-64077726
  
    I would be interested in merging this patch as well :)




[GitHub] spark pull request: [SPARK-3405] add subnet-id and vpc-id options ...

Posted by mvj101 <gi...@git.apache.org>.
Github user mvj101 commented on the pull request:

    https://github.com/apache/spark/pull/2872#issuecomment-67437100
  
    https://github.com/apache/spark/pull/3728




[GitHub] spark pull request: [SPARK-3405] add subnet-id and vpc-id options ...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the pull request:

    https://github.com/apache/spark/pull/2872#issuecomment-67118614
  
    Even though we don't have Jenkins tests for the EC2 scripts, I'm just going to have Jenkins run this so that I can avoid an inadvertent build break.
    
    Jenkins, this is ok to test.




[GitHub] spark pull request: [SPARK-3405] add subnet-id and vpc-id options ...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the pull request:

    https://github.com/apache/spark/pull/2872#issuecomment-67101983
  
    Overall, this looks good to me.  I left a couple of nitpicky comments, but apart from those and the missing documentation, I'd be happy to merge this.
    
    To address a question asked upthread:
    
    > Is VPC support slated for the next maintenance release? Support for VPCs is definitely needed for a lot of us, and it'd be great if we didn't have to patch it ourselves.
    
    This probably won't be merged into Spark 1.2.1/1.1.2 since our policy is to not add new features in maintenance releases.  However, newer versions of Spark EC2 are capable of launching clusters with older Spark versions, so you'd be able to use Spark 1.3.0's scripts to launch clusters in your VPC using, say, Spark 1.2.0.




[GitHub] spark pull request: [SPARK-3405] add subnet-id and vpc-id options ...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2872#discussion_r21873371
  
    --- Diff: ec2/spark_ec2.py ---
    @@ -303,12 +307,17 @@ def launch_cluster(conn, opts, cluster_name):
                 user_data_content = user_data_file.read()
     
         print "Setting up security groups..."
    -    master_group = get_or_make_group(conn, cluster_name + "-master")
    -    slave_group = get_or_make_group(conn, cluster_name + "-slaves")
    +    master_group = get_or_make_group(conn, cluster_name + "-master", opts.vpc_id)
    +    slave_group = get_or_make_group(conn, cluster_name + "-slaves", opts.vpc_id)
         authorized_address = opts.authorized_address
         if master_group.rules == []:  # Group was just now created
    -        master_group.authorize(src_group=master_group)
    -        master_group.authorize(src_group=slave_group)
    +        if opts.vpc_id == None:
    +            master_group.authorize(src_group=master_group)
    +            master_group.authorize(src_group=slave_group)
    +        else:
    +            master_group.authorize(ip_protocol='icmp', from_port=-1, to_port=-1, src_group=slave_group)
    --- End diff --
    
    I guess that we need to have separate logic for the VPC / non-VPC security group rule creation, since according to Amazon's [Differences Between Security Groups for EC2-Classic and EC2-VPC](http://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/VPC_SecurityGroups.html#VPC_Security_Group_Differences) guide, when using EC2-VPC:
    
    > When you add a rule to a security group, you must specify a protocol, and it can be any protocol with a standard protocol number, or all protocols (see Protocol Numbers).
    
    In the non-VPC case, I guess that the
    
    ```
    master_group.authorize(src_group=master_group)
    master_group.authorize(src_group=slave_group)
    ```
    
    lines are authorizing _all_ inbound traffic from instances belonging to the `master` and `slave` groups, regardless of the protocol / ports of that traffic.
    
    However, it looks like the VPC case here only authorizes TCP traffic.  I don't think that we rely on UDP traffic anywhere, but for consistency's sake it would be good if both the VPC and non-VPC branches here created equivalent rules.
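
A minimal sketch of what equivalent rules in both branches could look like. `RecordingGroup` and `authorize_all_from` are hypothetical names introduced for illustration: the class only records calls so the example runs without AWS access, and its `authorize` keyword arguments mirror the ones used in the diff above. The assumption that a bare `src_group` rule covers every protocol in the non-VPC case follows the guess made in the comment, and the explicit ICMP/TCP/UDP rules are one possible VPC equivalent, not necessarily what the patch ended up with.

```python
class RecordingGroup:
    """Hypothetical stand-in for a boto security group; it only records calls."""

    def __init__(self, name):
        self.name = name
        self.calls = []

    def authorize(self, ip_protocol=None, from_port=None, to_port=None,
                  cidr_ip=None, src_group=None):
        src = src_group.name if src_group is not None else None
        self.calls.append((ip_protocol, from_port, to_port, src))


def authorize_all_from(group, src_group, vpc_id=None):
    """Allow all traffic from src_group into group (hypothetical helper)."""
    if vpc_id is None:
        # Non-VPC: a bare src_group rule, as in the original code.
        group.authorize(src_group=src_group)
    else:
        # VPC: a protocol must be named, so spell out ICMP, TCP and UDP.
        group.authorize(ip_protocol='icmp', from_port=-1, to_port=-1,
                        src_group=src_group)
        group.authorize(ip_protocol='tcp', from_port=0, to_port=65535,
                        src_group=src_group)
        group.authorize(ip_protocol='udp', from_port=0, to_port=65535,
                        src_group=src_group)
```

Calling `authorize_all_from(master_group, slave_group, opts.vpc_id)` once per source group would keep the two branches side by side in one place, which makes it easier to check that they grant the same access.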




[GitHub] spark pull request: [SPARK-3405] add subnet-id and vpc-id options ...

Posted by mvj101 <gi...@git.apache.org>.
Github user mvj101 commented on the pull request:

    https://github.com/apache/spark/pull/2872#issuecomment-67115653
  
    Thanks, I believe I've updated the code according to your comments.
    
    Mike




[GitHub] spark pull request: [SPARK-3405] add subnet-id and vpc-id options ...

Posted by jeffsteinmetz <gi...@git.apache.org>.
Github user jeffsteinmetz commented on the pull request:

    https://github.com/apache/spark/pull/2872#issuecomment-66229322
  
    The EC2 docs could also be updated to include these new switches.




[GitHub] spark pull request: [SPARK-3405] add subnet-id and vpc-id options ...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the pull request:

    https://github.com/apache/spark/pull/2872#issuecomment-67101443
  
    Since this is an often-requested feature, we should mention this in the EC2 documentation page: https://github.com/apache/spark/blob/master/docs/ec2-scripts.md




[GitHub] spark pull request: [SPARK-3405] add subnet-id and vpc-id options ...

Posted by jontg <gi...@git.apache.org>.
Github user jontg commented on the pull request:

    https://github.com/apache/spark/pull/2872#issuecomment-66840618
  
    @tylerprete That might occur if your VPC is not set up to auto-assign DNS records.  If you can, that is where I would suggest beginning an investigation.




[GitHub] spark pull request: [SPARK-3405] add subnet-id and vpc-id options ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2872#issuecomment-59884935
  
    Can one of the admins verify this patch?




[GitHub] spark pull request: [SPARK-3405] add subnet-id and vpc-id options ...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the pull request:

    https://github.com/apache/spark/pull/2872#issuecomment-67429749
  
    It looks like this PR may have broken the ability to launch spot clusters:
    
    ```python
    Traceback (most recent call last):
      File "./spark_ec2.py", line 1147, in <module>
        main()
      File "./spark_ec2.py", line 1139, in main
        real_main()
      File "./spark_ec2.py", line 988, in real_main
        (master_nodes, slave_nodes) = launch_cluster(conn, opts, cluster_name)
      File "./spark_ec2.py", line 437, in launch_cluster
        user_data=user_data_content)
    TypeError: request_spot_instances() got an unexpected keyword argument 'security_group_ids'
    ```
    
    It looks like the latest version of Boto supports this argument (http://boto.readthedocs.org/en/latest/ref/ec2.html#boto.ec2.connection.EC2Connection.request_spot_instances), but not ours.  We're using boto 2.4.1, which was released on May 16, 2012, but this feature was only added in August 2012: https://github.com/boto/boto/commit/145a8994e516cd573854091d1a747a7e7b5a4498
    
    I might be able to fix this by just upgrading to a newer version of Boto.
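
The traceback above comes from passing a keyword that the bundled library version does not accept. One way to catch this before making the call is to inspect the function's signature. This is a minimal sketch, assuming Python 3 (`inspect.signature`), whereas the script in this thread is Python 2. `accepts_keyword` is a hypothetical helper, and the two `request_spot_instances` functions are simplified stand-ins, not boto's real signatures.

```python
import inspect


def accepts_keyword(func, name):
    """Return True if func can be called with the keyword argument `name`."""
    params = inspect.signature(func).parameters
    param = params.get(name)
    if param is not None and param.kind in (
            inspect.Parameter.POSITIONAL_OR_KEYWORD,
            inspect.Parameter.KEYWORD_ONLY):
        return True
    # A **kwargs parameter accepts any keyword.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD
               for p in params.values())


# Hypothetical stand-ins for an older and a newer library version.
def old_request_spot_instances(price, image_id, security_groups=None,
                               user_data=None):
    return "old"


def new_request_spot_instances(price, image_id, security_groups=None,
                               user_data=None, security_group_ids=None):
    return "new"
```

A launcher could use such a check to fail early with a clear "please upgrade" message instead of a `TypeError` deep inside `launch_cluster`.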




[GitHub] spark pull request: [SPARK-3405] add subnet-id and vpc-id options ...

Posted by jontg <gi...@git.apache.org>.
Github user jontg commented on the pull request:

    https://github.com/apache/spark/pull/2872#issuecomment-63700947
  
    +1




[GitHub] spark pull request: [SPARK-3405] add subnet-id and vpc-id options ...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2872#discussion_r21873877
  
    --- Diff: ec2/spark_ec2.py ---
    @@ -162,6 +162,10 @@ def parse_args():
         parser.add_option(
             "--copy-aws-credentials", action="store_true", default=False,
             help="Add AWS credentials to hadoop configuration to allow Spark to access S3")
    +    parser.add_option(
    +        "--subnet-id", default=None, help="VPC Subnet id where to launch instances")
    +    parser.add_option(
    +        "--vpc-id", default=None, help="VPC ID where to launch instances")
    --- End diff --
    
    Similarly, how about "VPC to launch instances in"?




[GitHub] spark pull request: [SPARK-3405] add subnet-id and vpc-id options ...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the pull request:

    https://github.com/apache/spark/pull/2872#issuecomment-67434751
  
    Actually, I'm going to revert this for now.  Looks like the `boto` update will take a bit more work than I thought.




[GitHub] spark pull request: [SPARK-3405] add subnet-id and vpc-id options ...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2872#discussion_r21873865
  
    --- Diff: ec2/spark_ec2.py ---
    @@ -162,6 +162,10 @@ def parse_args():
         parser.add_option(
             "--copy-aws-credentials", action="store_true", default=False,
             help="Add AWS credentials to hadoop configuration to allow Spark to access S3")
    +    parser.add_option(
    +        "--subnet-id", default=None, help="VPC Subnet id where to launch instances")
    --- End diff --
    
    Minor grammar / style nit, but what do you think about "VPC subnet to launch instances in" as the help text?
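
For reference, a minimal sketch of the two options with the help strings proposed in this review. It uses the standard-library `optparse` module, matching the `parser.add_option` calls in the diff; `build_parser` is a hypothetical wrapper added only so the example is self-contained.

```python
from optparse import OptionParser


def build_parser():
    """Build a parser with only the two VPC-related options (illustrative)."""
    parser = OptionParser()
    parser.add_option(
        "--subnet-id", default=None, help="VPC subnet to launch instances in")
    parser.add_option(
        "--vpc-id", default=None, help="VPC to launch instances in")
    return parser
```

`optparse` derives the attribute names from the long option names, so the parsed values are available as `opts.subnet_id` and `opts.vpc_id`, the latter being what the `launch_cluster` diff reads.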




[GitHub] spark pull request: [SPARK-3405] add subnet-id and vpc-id options ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2872#issuecomment-67124660
  
      [Test build #24488 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24488/consoleFull) for PR 2872 at commit [`be9cb43`](https://github.com/apache/spark/commit/be9cb43e48637d45b8a2936b2cdfb1971bf556a0).
     * This patch merges cleanly.




[GitHub] spark pull request: [SPARK-3405] add subnet-id and vpc-id options ...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2872#discussion_r21873430
  
    --- Diff: ec2/spark_ec2.py ---
    @@ -341,11 +355,11 @@ def launch_cluster(conn, opts, cluster_name):
         if opts.ami is None:
             opts.ami = get_spark_ami(opts)
     
    -    additional_groups = []
    +    additional_group_ids = []
         if opts.additional_security_group:
    -        additional_groups = [sg
    -                             for sg in conn.get_all_security_groups()
    -                             if opts.additional_security_group in (sg.name, sg.id)]
    +        additional_group_ids = [sg.id
    --- End diff --
    
    If this change from `sg` to `sg.id` is due to https://github.com/boto/boto/issues/350, can you add a comment referencing that issue?  Something like 
    
    ```
    # we use group ids to work around https://github.com/boto/boto/issues/350
    ```
    
    would be fine.
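
A minimal sketch of how the commented lookup could read. The diff above is cut off after `[sg.id`, so the rest of the comprehension here is an assumption that mirrors the removed lines. `SecurityGroup` is a hypothetical stand-in for the objects returned by `conn.get_all_security_groups()`, and `find_additional_group_ids` is a hypothetical helper name.

```python
from collections import namedtuple

# Hypothetical stand-in for a boto security group object.
SecurityGroup = namedtuple("SecurityGroup", ["name", "id"])


def find_additional_group_ids(all_groups, wanted):
    """Return the ids of groups whose name or id matches `wanted`."""
    # we use group ids to work around https://github.com/boto/boto/issues/350
    return [sg.id for sg in all_groups if wanted in (sg.name, sg.id)]
```

Matching on either the name or the id keeps the existing behaviour of `--additional-security-group`, while only ids are handed on to the launch call.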

