You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by obermeier <gi...@git.apache.org> on 2017/10/01 14:55:36 UTC

[GitHub] spark pull request #19408: [SPARK-22180][CORE] Allow IPv6

GitHub user obermeier opened a pull request:

    https://github.com/apache/spark/pull/19408

    [SPARK-22180][CORE] Allow IPv6

    External applications like Apache Cassandra are able to deal with IPv6 addresses. Libraries like spark-cassandra-connector combine Apache Cassandra with Apache Spark.
    This combination is very useful IMHO.
    
    One problem is that `org.apache.spark.util.Utils.parseHostPort(hostPort:` `String)` takes the last colon to sepperate the port from host path. This conflicts with literal IPv6 addresses.
    
    I think we can take `hostPort` as literal IPv6 address if it contains tow ore more colons. If IPv6 addresses are enclosed in square brackets port definition is still possible.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/obermeier/spark issue/SPARK-22180

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/19408.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #19408
    
----
commit 3783f85b18540ed8746c078ad9c2f12d7167be9d
Author: Stefan Obermeier <ma...@stefan-obermeier.de>
Date:   2017-10-01T14:28:58Z

    [SPARK-22180][CORE] Allow IPv6

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19408: [SPARK-22180][CORE] Allow IPv6 address in org.apache.spa...

Posted by obermeier <gi...@git.apache.org>.
Github user obermeier commented on the issue:

    https://github.com/apache/spark/pull/19408
  
    If Spark runs in YARN Cluster this issue still exists


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19408: [SPARK-22180][CORE] Allow IPv6 address in org.apa...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19408#discussion_r150787836
  
    --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
    @@ -982,7 +982,19 @@ private[spark] object Utils extends Logging {
           return cached
         }
     
    -    val indx: Int = hostPort.lastIndexOf(':')
    +    val indx: Int =
    +      // Interpret hostPort as literal IPv6 address if it contains multiple colons.
    +      // IPv6 addresses  enclosed in square brackets like [::1]:123 are not included.
    +      // scalastyle:off SingleSpaceBetweenRParenAndLCurlyBrace
    +      if (hostPort.matches("(([0-9a-fA-F]*):([0-9a-fA-F]*)){2,}")) {
    +      // scalastyle:on SingleSpaceBetweenRParenAndLCurlyBrace
    +        -1
    +      } else {
    +        // Else last colon defines start of port definition
    +        hostPort.lastIndexOf(':')
    +      }
    +
    +
         // This is potentially broken - when dealing with ipv6 addresses for example, sigh ...
         // but then hadoop does not support ipv6 right now.
         // For now, we assume that if port exists, then it is valid - not check if it is an int > 0
    --- End diff --
    
    does this comment still valid?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19408: [SPARK-22180][CORE] Allow IPv6 address in org.apa...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19408#discussion_r142038519
  
    --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
    @@ -981,7 +981,13 @@ private[spark] object Utils extends Logging {
           return cached
         }
     
    -    val indx: Int = hostPort.lastIndexOf(':')
    +    val indx: Int =
    +      // Interpret hostPort as literal IPv6 address if it contains two ore more colons
    --- End diff --
    
    Nit: ore
    You might note here that you're checking that you _don't_ have a `[::1]:123` IPv6 address here -- the braces are key.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19408: [SPARK-22180][CORE] Allow IPv6 address in org.apache.spa...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19408
  
    **[Test build #85062 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85062/testReport)** for PR 19408 at commit [`68c3221`](https://github.com/apache/spark/commit/68c322129305a35c9d3e04f8cacc011be5fbaec4).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19408: [SPARK-22180][CORE] Allow IPv6 address in org.apache.spa...

Posted by jiangxb1987 <gi...@git.apache.org>.
Github user jiangxb1987 commented on the issue:

    https://github.com/apache/spark/pull/19408
  
    Are you planning to fully address the IPv6 issues? If not, why do we choose to adapt this single function separately?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19408: [SPARK-22180][CORE] Allow IPv6 address in org.apache.spa...

Posted by obermeier <gi...@git.apache.org>.
Github user obermeier commented on the issue:

    https://github.com/apache/spark/pull/19408
  
    I total agree with you.
    What do you think about just adding a log message if the given string is obviously not a valid host name.
    Because the given _NumberFormatException_ much later after the parsing component was a little bit confusing.
    
    
    ```
    java.lang.NumberFormatException: For input string: "f904"
    	at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
    	at java.lang.Integer.parseInt(Integer.java:580)
    	at java.lang.Integer.parseInt(Integer.java:615)
    	at scala.collection.immutable.StringLike$class.toInt(StringLike.scala:272)
    	at scala.collection.immutable.StringOps.toInt(StringOps.scala:29)
    	at org.apache.spark.util.Utils$.parseHostPort(Utils.scala:935)
    	at org.apache.spark.scheduler.cluster.YarnScheduler.getRackForHost(YarnScheduler.scala:36)
    	at org.apache.spark.scheduler.TaskSetManager$$anonfun$org$apache$spark$scheduler$TaskSetManager$$addPendingTask$1.apply(TaskSetManager.scala:206)
    	at org.apache.spark.scheduler.TaskSetManager$$anonfun$org$apache$spark$scheduler$TaskSetManager$$addPendingTask$1.apply(TaskSetManager.scala:187)
    	at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
    	at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
    	at org.apache.spark.scheduler.TaskSetManager.org$apache$spark$scheduler$TaskSetManager$$addPendingTask(TaskSetManager.scala:187)
    	at org.apache.spark.scheduler.TaskSetManager$$anonfun$1.apply$mcVI$sp(TaskSetManager.scala:166)
    	at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:160)
    	at org.apache.spark.scheduler.TaskSetManager.<init>(TaskSetManager.scala:165)
    	at org.apache.spark.scheduler.TaskSchedulerImpl.createTaskSetManager(TaskSchedulerImpl.scala:205)
    	at org.apache.spark.scheduler.TaskSchedulerImpl.submitTasks(TaskSchedulerImpl.scala:169)
    	at org.apache.spark.scheduler.DAGScheduler.submitMissingTasks(DAGScheduler.scala:1058)
    	at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$submitStage(DAGScheduler.scala:933)
    	at org.apache.spark.scheduler.DAGScheduler.handleJobSubmitted(DAGScheduler.scala:873)
    	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1626)
    	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1618)
    	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1607)
    	at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
    ```


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19408: [SPARK-22180][CORE] Allow IPv6 address in org.apache.spa...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19408
  
    **[Test build #84982 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84982/testReport)** for PR 19408 at commit [`1400299`](https://github.com/apache/spark/commit/1400299808631da0196a61f3588ede786dd0b041).
     * This patch **fails Scala style tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19408: [SPARK-22180][CORE] Allow IPv6 address in org.apache.spa...

Posted by mridulm <gi...@git.apache.org>.
Github user mridulm commented on the issue:

    https://github.com/apache/spark/pull/19408
  
    To rephrase @jiangxb1987's question - supporting IPv6 is a much larger effort, which spark currently does not. We should be addressing that as the problem to solve, and fix this as part of IPv6 support : eliminating individual exceptions could simply result in spark platform going into inconsistent states, without any telemetry in the logs on why (because we removed/'fixed' the expected exceptions).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19408: [SPARK-22180][CORE] Allow IPv6 address in org.apache.spa...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19408
  
    Can one of the admins verify this patch?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19408: [SPARK-22180][CORE] Allow IPv6 address in org.apache.spa...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19408
  
    **[Test build #85062 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85062/testReport)** for PR 19408 at commit [`68c3221`](https://github.com/apache/spark/commit/68c322129305a35c9d3e04f8cacc011be5fbaec4).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19408: [SPARK-22180][CORE] Allow IPv6 address in org.apa...

Posted by obermeier <gi...@git.apache.org>.
Github user obermeier commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19408#discussion_r150798868
  
    --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
    @@ -982,7 +982,19 @@ private[spark] object Utils extends Logging {
           return cached
         }
     
    -    val indx: Int = hostPort.lastIndexOf(':')
    +    val indx: Int =
    +      // Interpret hostPort as literal IPv6 address if it contains multiple colons.
    +      // IPv6 addresses  enclosed in square brackets like [::1]:123 are not included.
    +      // scalastyle:off SingleSpaceBetweenRParenAndLCurlyBrace
    +      if (hostPort.matches("(([0-9a-fA-F]*):([0-9a-fA-F]*)){2,}")) {
    +      // scalastyle:on SingleSpaceBetweenRParenAndLCurlyBrace
    +        -1
    +      } else {
    +        // Else last colon defines start of port definition
    +        hostPort.lastIndexOf(':')
    +      }
    +
    +
         // This is potentially broken - when dealing with ipv6 addresses for example, sigh ...
         // but then hadoop does not support ipv6 right now.
         // For now, we assume that if port exists, then it is valid - not check if it is an int > 0
    --- End diff --
    
    I think yes because some checks are not implemented 
    E.g. no check how many colons are used,  double colon should appear only once ...


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19408: [SPARK-22180][CORE] Allow IPv6 address in org.apa...

Posted by obermeier <gi...@git.apache.org>.
Github user obermeier closed the pull request at:

    https://github.com/apache/spark/pull/19408


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19408: [SPARK-22180][CORE] Allow IPv6 address in org.apache.spa...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19408
  
    Can one of the admins verify this patch?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19408: [SPARK-22180][CORE] Allow IPv6 address in org.apa...

Posted by obermeier <gi...@git.apache.org>.
Github user obermeier commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19408#discussion_r150909772
  
    --- Diff: core/src/test/scala/org/apache/spark/util/UtilsSuite.scala ---
    @@ -1146,6 +1146,20 @@ class UtilsSuite extends SparkFunSuite with ResetSystemProperties with Logging {
         }
       }
     
    +  test("parseHostPort") {
    +    assert(Utils.parseHostPort("abc:123")          === (("abc", 123)))
    +    assert(Utils.parseHostPort("example.com")      === (("example.com", 0)))
    +    assert(Utils.parseHostPort("example.com:123")  === (("example.com", 123)))
    +    assert(Utils.parseHostPort("127.0.0.1")        === (("127.0.0.1", 0)))
    +    assert(Utils.parseHostPort("127.0.0.1:123")    === (("127.0.0.1", 123)))
    +    assert(Utils.parseHostPort("2001:db8::1")      === (("2001:db8::1", 0)))
    +    assert(Utils.parseHostPort("2001:DB8::1")      === (("2001:DB8::1", 0)))
    +    assert(Utils.parseHostPort("2001:dB8::1")      === (("2001:dB8::1", 0)))
    +    assert(Utils.parseHostPort("0:0:0:0:0:0:0:0")  === (("0:0:0:0:0:0:0:0", 0)))
    +    assert(Utils.parseHostPort("::1")              === (("::1", 0)))
    +    assert(Utils.parseHostPort("[::1]:123")        === (("[::1]", 123)))
    +    assert(Utils.parseHostPort("[2001:db8:42::1]:123")  === (("[2001:db8:42::1]", 123)))
    --- End diff --
    
    What is the prefered way to handle this kind of parse errors in Spark?
    Changing the signature of this method to something like :Try[..], :Option ... is no option!?
    Error log messages?
    Unchecked Exceptions?
    ...


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19408: [SPARK-22180][CORE] Allow IPv6 address in org.apache.spa...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19408
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85062/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19408: [SPARK-22180][CORE] Allow IPv6

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19408
  
    Can one of the admins verify this patch?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19408: [SPARK-22180][CORE] Allow IPv6 address in org.apa...

Posted by obermeier <gi...@git.apache.org>.
GitHub user obermeier reopened a pull request:

    https://github.com/apache/spark/pull/19408

    [SPARK-22180][CORE] Allow IPv6 address in org.apache.spark.util.Utils.parseHostPort

    External applications like Apache Cassandra are able to deal with IPv6 addresses. Libraries like spark-cassandra-connector combine Apache Cassandra with Apache Spark.
    This combination is very useful IMHO.
    
    One problem is that `org.apache.spark.util.Utils.parseHostPort(hostPort:` `String)` takes the last colon to sepperate the port from host path. This conflicts with literal IPv6 addresses.
    
    I think we can take `hostPort` as literal IPv6 address if it contains tow ore more colons. If IPv6 addresses are enclosed in square brackets port definition is still possible.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/obermeier/spark issue/SPARK-22180

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/19408.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #19408
    
----
commit 453e104c3e6ed6a5ca310f599c274d6c66a3d3c8
Author: Stefan Obermeier <ma...@...>
Date:   2017-10-01T14:28:58Z

    [SPARK-22180][CORE] Allow IPv6 address in org.apache.spark.util.Utils.parseHostPort
    
    ## What changes were proposed in this pull request?
    
    Take ```hostPort``` as literal IPv6 address if it contains tow ore more colons. If IPv6 addresses are enclosed in square brackets port definition is still possible.
    
    ## How was this patch tested?
    
    Added a new test case into UtilsSuite
    
    Remove comment

commit 1e6623f518b43df4079bdc2680288063c6be13c6
Author: obermeier <ob...@...>
Date:   2017-12-14T22:52:21Z

    Merge branch 'master' into issue/SPARK-22180

commit 1400299808631da0196a61f3588ede786dd0b041
Author: Stefan Obermeier <ma...@...>
Date:   2017-12-15T07:39:45Z

    Fix build problem

commit 68c322129305a35c9d3e04f8cacc011be5fbaec4
Author: Stefan Obermeier <ma...@...>
Date:   2017-12-18T13:51:04Z

    Fix style checks violation.
    
    Remove whitespace at end of line.

commit 8220d95a99f4564e735f22947cb1cb698613efa5
Author: Stefan Obermeier <sc...@...>
Date:   2018-10-09T21:54:20Z

    Add log message if hostname:port is not valid

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19408: [SPARK-22180][CORE] Allow IPv6 address in org.apa...

Posted by jiangxb1987 <gi...@git.apache.org>.
Github user jiangxb1987 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19408#discussion_r150743565
  
    --- Diff: core/src/test/scala/org/apache/spark/util/UtilsSuite.scala ---
    @@ -1146,6 +1146,20 @@ class UtilsSuite extends SparkFunSuite with ResetSystemProperties with Logging {
         }
       }
     
    +  test("parseHostPort") {
    +    assert(Utils.parseHostPort("abc:123")          === (("abc", 123)))
    +    assert(Utils.parseHostPort("example.com")      === (("example.com", 0)))
    +    assert(Utils.parseHostPort("example.com:123")  === (("example.com", 123)))
    +    assert(Utils.parseHostPort("127.0.0.1")        === (("127.0.0.1", 0)))
    +    assert(Utils.parseHostPort("127.0.0.1:123")    === (("127.0.0.1", 123)))
    +    assert(Utils.parseHostPort("2001:db8::1")      === (("2001:db8::1", 0)))
    +    assert(Utils.parseHostPort("2001:DB8::1")      === (("2001:DB8::1", 0)))
    +    assert(Utils.parseHostPort("2001:dB8::1")      === (("2001:dB8::1", 0)))
    +    assert(Utils.parseHostPort("0:0:0:0:0:0:0:0")  === (("0:0:0:0:0:0:0:0", 0)))
    +    assert(Utils.parseHostPort("::1")              === (("::1", 0)))
    +    assert(Utils.parseHostPort("[::1]:123")        === (("[::1]", 123)))
    +    assert(Utils.parseHostPort("[2001:db8:42::1]:123")  === (("[2001:db8:42::1]", 123)))
    --- End diff --
    
    nit: we should add more test cases to cover the invalid cases.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19408: [SPARK-22180][CORE] Allow IPv6 address in org.apache.spa...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19408
  
    Can one of the admins verify this patch?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19408: [SPARK-22180][CORE] Allow IPv6 address in org.apa...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19408#discussion_r142038575
  
    --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
    @@ -981,7 +981,13 @@ private[spark] object Utils extends Logging {
           return cached
         }
     
    -    val indx: Int = hostPort.lastIndexOf(':')
    +    val indx: Int =
    +      // Interpret hostPort as literal IPv6 address if it contains two ore more colons
    +      // scalastyle:off SingleSpaceBetweenRParenAndLCurlyBrace
    +      if (hostPort.matches("(([0-9a-f]*):([0-9a-f]*)){2,}")) -1
    +      // Else last colon defines start of port definition
    +      else hostPort.lastIndexOf(':')
    --- End diff --
    
    Final nit, use braces for both parts of the if-else


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19408: [SPARK-22180][CORE] Allow IPv6 address in org.apa...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19408#discussion_r142038556
  
    --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
    @@ -981,7 +981,13 @@ private[spark] object Utils extends Logging {
           return cached
         }
     
    -    val indx: Int = hostPort.lastIndexOf(':')
    +    val indx: Int =
    +      // Interpret hostPort as literal IPv6 address if it contains two ore more colons
    +      // scalastyle:off SingleSpaceBetweenRParenAndLCurlyBrace
    --- End diff --
    
    Turn the rule back on after the block?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19408: [SPARK-22180][CORE] Allow IPv6 address in org.apache.spa...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19408
  
    Can one of the admins verify this patch?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19408: [SPARK-22180][CORE] Allow IPv6 address in org.apache.spa...

Posted by jiangxb1987 <gi...@git.apache.org>.
Github user jiangxb1987 commented on the issue:

    https://github.com/apache/spark/pull/19408
  
    Sounds good.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19408: [SPARK-22180][CORE] Allow IPv6 address in org.apache.spa...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19408
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19408: [SPARK-22180][CORE] Allow IPv6 address in org.apache.spa...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19408
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84982/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19408: [SPARK-22180][CORE] Allow IPv6 address in org.apache.spa...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the issue:

    https://github.com/apache/spark/pull/19408
  
    ok to test


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19408: [SPARK-22180][CORE] Allow IPv6 address in org.apache.spa...

Posted by obermeier <gi...@git.apache.org>.
Github user obermeier commented on the issue:

    https://github.com/apache/spark/pull/19408
  
    I chose this function because I had some exceptions like this [1] if I used IPv6 hosts.
    In this example ```org.apache.spark.util.Utils$.parseHostPort``` decided to use f904 as port but it was the last 16 bit chunk of an IPv6 address.
    
    I do not have an overview over the Spark code so I am currently not able to provide an general IPv6 solution.
    
    I think this code snipets will improve the situation and enables us to use IPv6 hosts in some cases.
    
    [1]
    ```
    java.lang.NumberFormatException: For input string: "f904"
    	at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
    	at java.lang.Integer.parseInt(Integer.java:580)
    	at java.lang.Integer.parseInt(Integer.java:615)
    	at scala.collection.immutable.StringLike$class.toInt(StringLike.scala:272)
    	at scala.collection.immutable.StringOps.toInt(StringOps.scala:29)
    	at org.apache.spark.util.Utils$.parseHostPort(Utils.scala:935)
    	at org.apache.spark.scheduler.cluster.YarnScheduler.getRackForHost(YarnScheduler.scala:36)
    	at org.apache.spark.scheduler.TaskSetManager$$anonfun$org$apache$spark$scheduler$TaskSetManager$$addPendingTask$1.apply(TaskSetManager.scala:206)
    	at org.apache.spark.scheduler.TaskSetManager$$anonfun$org$apache$spark$scheduler$TaskSetManager$$addPendingTask$1.apply(TaskSetManager.scala:187)
    	at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
    	at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
    	at org.apache.spark.scheduler.TaskSetManager.org$apache$spark$scheduler$TaskSetManager$$addPendingTask(TaskSetManager.scala:187)
    	at org.apache.spark.scheduler.TaskSetManager$$anonfun$1.apply$mcVI$sp(TaskSetManager.scala:166)
    	at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:160)
    	at org.apache.spark.scheduler.TaskSetManager.<init>(TaskSetManager.scala:165)
    	at org.apache.spark.scheduler.TaskSchedulerImpl.createTaskSetManager(TaskSchedulerImpl.scala:205)
    	at org.apache.spark.scheduler.TaskSchedulerImpl.submitTasks(TaskSchedulerImpl.scala:169)
    	at org.apache.spark.scheduler.DAGScheduler.submitMissingTasks(DAGScheduler.scala:1058)
    	at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$submitStage(DAGScheduler.scala:933)
    	at org.apache.spark.scheduler.DAGScheduler.handleJobSubmitted(DAGScheduler.scala:873)
    	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1626)
    	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1618)
    	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1607)
    	at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
    ```


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19408: [SPARK-22180][CORE] Allow IPv6 address in org.apache.spa...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19408
  
    Can one of the admins verify this patch?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19408: [SPARK-22180][CORE] Allow IPv6 address in org.apache.spa...

Posted by obermeier <gi...@git.apache.org>.
Github user obermeier commented on the issue:

    https://github.com/apache/spark/pull/19408
  
    This issue seems to be fixed in Spark 2.3.2


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19408: [SPARK-22180][CORE] Allow IPv6 address in org.apa...

Posted by obermeier <gi...@git.apache.org>.
Github user obermeier commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19408#discussion_r150907312
  
    --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
    @@ -982,7 +982,19 @@ private[spark] object Utils extends Logging {
           return cached
         }
     
    -    val indx: Int = hostPort.lastIndexOf(':')
    +    val indx: Int =
    +      // Interpret hostPort as literal IPv6 address if it contains multiple colons.
    +      // IPv6 addresses  enclosed in square brackets like [::1]:123 are not included.
    +      // scalastyle:off SingleSpaceBetweenRParenAndLCurlyBrace
    +      if (hostPort.matches("(([0-9a-fA-F]*):([0-9a-fA-F]*)){2,}")) {
    +      // scalastyle:on SingleSpaceBetweenRParenAndLCurlyBrace
    +        -1
    +      } else {
    +        // Else last colon defines start of port definition
    +        hostPort.lastIndexOf(':')
    +      }
    +
    +
         // This is potentially broken - when dealing with ipv6 addresses for example, sigh ...
         // but then hadoop does not support ipv6 right now.
         // For now, we assume that if port exists, then it is valid - not check if it is an int > 0
    --- End diff --
    
    I removed this comment


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19408: [SPARK-22180][CORE] Allow IPv6 address in org.apache.spa...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19408
  
    Can one of the admins verify this patch?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19408: [SPARK-22180][CORE] Allow IPv6 address in org.apache.spa...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19408
  
    Can one of the admins verify this patch?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19408: [SPARK-22180][CORE] Allow IPv6 address in org.apache.spa...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19408
  
    Merged build finished. Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19408: [SPARK-22180][CORE] Allow IPv6 address in org.apa...

Posted by obermeier <gi...@git.apache.org>.
Github user obermeier commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19408#discussion_r142047496
  
    --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
    @@ -981,7 +981,13 @@ private[spark] object Utils extends Logging {
           return cached
         }
     
    -    val indx: Int = hostPort.lastIndexOf(':')
    +    val indx: Int =
    +      // Interpret hostPort as literal IPv6 address if it contains two ore more colons
    +      // scalastyle:off SingleSpaceBetweenRParenAndLCurlyBrace
    +      if (hostPort.matches("(([0-9a-f]*):([0-9a-f]*)){2,}")) -1
    --- End diff --
    
    Yes a real parser would be much better!! 
    I hope the methods will check the input. Like the name resolver... 
    
    At this point I thought more about the separation of the port.
    I think it is important to check if two colons exists, otherwise this expression accepts hostnames like abc:123


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19408: [SPARK-22180][CORE] Allow IPv6 address in org.apache.spa...

Posted by jiangxb1987 <gi...@git.apache.org>.
Github user jiangxb1987 commented on the issue:

    https://github.com/apache/spark/pull/19408
  
    @obermeier Could you please rebase this with the latest master? Thanks!


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19408: [SPARK-22180][CORE] Allow IPv6 address in org.apache.spa...

Posted by obermeier <gi...@git.apache.org>.
Github user obermeier commented on the issue:

    https://github.com/apache/spark/pull/19408
  
    Done


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19408: [SPARK-22180][CORE] Allow IPv6 address in org.apache.spa...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19408
  
    Can one of the admins verify this patch?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19408: [SPARK-22180][CORE] Allow IPv6 address in org.apache.spa...

Posted by jiangxb1987 <gi...@git.apache.org>.
Github user jiangxb1987 commented on the issue:

    https://github.com/apache/spark/pull/19408
  
    Fully agree with @mridulm 's big picture suggestion, and I also think supporting IPv6 should be designed as a integral feature, instead of just putting together some PRs.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19408: [SPARK-22180][CORE] Allow IPv6 address in org.apache.spa...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19408
  
    Can one of the admins verify this patch?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19408: [SPARK-22180][CORE] Allow IPv6 address in org.apache.spa...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19408
  
    **[Test build #84982 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84982/testReport)** for PR 19408 at commit [`1400299`](https://github.com/apache/spark/commit/1400299808631da0196a61f3588ede786dd0b041).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19408: [SPARK-22180][CORE] Allow IPv6 address in org.apa...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19408#discussion_r142038567
  
    --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
    @@ -981,7 +981,13 @@ private[spark] object Utils extends Logging {
           return cached
         }
     
    -    val indx: Int = hostPort.lastIndexOf(':')
    +    val indx: Int =
    +      // Interpret hostPort as literal IPv6 address if it contains two ore more colons
    +      // scalastyle:off SingleSpaceBetweenRParenAndLCurlyBrace
    +      if (hostPort.matches("(([0-9a-f]*):([0-9a-f]*)){2,}")) -1
    --- End diff --
    
    Hex digits can be uppercase right?
    Should the pattern it not be more like `[0-9a-f]*(:[0-9a-f]*)+` match a number, then colon-number colon-number pairs, not number-colon-number number-colon-number sequences?
    It might end up being equivalent because the match is for 0 or more digits.
    
    This allows some strings that it shouldn't like "::::", but, the purpose isn't to catch every possible case I guess. It would fail name resolution.
    
    I thought Inet6Address would just provide parsing for this but I guess not.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org