You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by obermeier <gi...@git.apache.org> on 2017/10/01 14:55:36 UTC
[GitHub] spark pull request #19408: [SPARK-22180][CORE] Allow IPv6
GitHub user obermeier opened a pull request:
https://github.com/apache/spark/pull/19408
[SPARK-22180][CORE] Allow IPv6
External applications like Apache Cassandra are able to deal with IPv6 addresses. Libraries like spark-cassandra-connector combine Apache Cassandra with Apache Spark.
This combination is very useful IMHO.
One problem is that `org.apache.spark.util.Utils.parseHostPort(hostPort:` `String)` takes the last colon to sepperate the port from host path. This conflicts with literal IPv6 addresses.
I think we can take `hostPort` as literal IPv6 address if it contains tow ore more colons. If IPv6 addresses are enclosed in square brackets port definition is still possible.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/obermeier/spark issue/SPARK-22180
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/19408.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #19408
----
commit 3783f85b18540ed8746c078ad9c2f12d7167be9d
Author: Stefan Obermeier <ma...@stefan-obermeier.de>
Date: 2017-10-01T14:28:58Z
[SPARK-22180][CORE] Allow IPv6
----
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19408: [SPARK-22180][CORE] Allow IPv6 address in org.apache.spa...
Posted by obermeier <gi...@git.apache.org>.
Github user obermeier commented on the issue:
https://github.com/apache/spark/pull/19408
If Spark runs in YARN Cluster this issue still exists
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #19408: [SPARK-22180][CORE] Allow IPv6 address in org.apa...
Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/19408#discussion_r150787836
--- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
@@ -982,7 +982,19 @@ private[spark] object Utils extends Logging {
return cached
}
- val indx: Int = hostPort.lastIndexOf(':')
+ val indx: Int =
+ // Interpret hostPort as literal IPv6 address if it contains multiple colons.
+ // IPv6 addresses enclosed in square brackets like [::1]:123 are not included.
+ // scalastyle:off SingleSpaceBetweenRParenAndLCurlyBrace
+ if (hostPort.matches("(([0-9a-fA-F]*):([0-9a-fA-F]*)){2,}")) {
+ // scalastyle:on SingleSpaceBetweenRParenAndLCurlyBrace
+ -1
+ } else {
+ // Else last colon defines start of port definition
+ hostPort.lastIndexOf(':')
+ }
+
+
// This is potentially broken - when dealing with ipv6 addresses for example, sigh ...
// but then hadoop does not support ipv6 right now.
// For now, we assume that if port exists, then it is valid - not check if it is an int > 0
--- End diff --
does this comment still valid?
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #19408: [SPARK-22180][CORE] Allow IPv6 address in org.apa...
Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on a diff in the pull request:
https://github.com/apache/spark/pull/19408#discussion_r142038519
--- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
@@ -981,7 +981,13 @@ private[spark] object Utils extends Logging {
return cached
}
- val indx: Int = hostPort.lastIndexOf(':')
+ val indx: Int =
+ // Interpret hostPort as literal IPv6 address if it contains two ore more colons
--- End diff --
Nit: ore
You might note here that you're checking that you _don't_ have a `[::1]:123` IPv6 address here -- the braces are key.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19408: [SPARK-22180][CORE] Allow IPv6 address in org.apache.spa...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19408
**[Test build #85062 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85062/testReport)** for PR 19408 at commit [`68c3221`](https://github.com/apache/spark/commit/68c322129305a35c9d3e04f8cacc011be5fbaec4).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19408: [SPARK-22180][CORE] Allow IPv6 address in org.apache.spa...
Posted by jiangxb1987 <gi...@git.apache.org>.
Github user jiangxb1987 commented on the issue:
https://github.com/apache/spark/pull/19408
Are you planning to fully address the IPv6 issues? If not, why do we choose to adapt this single function separately?
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19408: [SPARK-22180][CORE] Allow IPv6 address in org.apache.spa...
Posted by obermeier <gi...@git.apache.org>.
Github user obermeier commented on the issue:
https://github.com/apache/spark/pull/19408
I total agree with you.
What do you think about just adding a log message if the given string is obviously not a valid host name.
Because the given _NumberFormatException_ much later after the parsing component was a little bit confusing.
```
java.lang.NumberFormatException: For input string: "f904"
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Integer.parseInt(Integer.java:580)
at java.lang.Integer.parseInt(Integer.java:615)
at scala.collection.immutable.StringLike$class.toInt(StringLike.scala:272)
at scala.collection.immutable.StringOps.toInt(StringOps.scala:29)
at org.apache.spark.util.Utils$.parseHostPort(Utils.scala:935)
at org.apache.spark.scheduler.cluster.YarnScheduler.getRackForHost(YarnScheduler.scala:36)
at org.apache.spark.scheduler.TaskSetManager$$anonfun$org$apache$spark$scheduler$TaskSetManager$$addPendingTask$1.apply(TaskSetManager.scala:206)
at org.apache.spark.scheduler.TaskSetManager$$anonfun$org$apache$spark$scheduler$TaskSetManager$$addPendingTask$1.apply(TaskSetManager.scala:187)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
at org.apache.spark.scheduler.TaskSetManager.org$apache$spark$scheduler$TaskSetManager$$addPendingTask(TaskSetManager.scala:187)
at org.apache.spark.scheduler.TaskSetManager$$anonfun$1.apply$mcVI$sp(TaskSetManager.scala:166)
at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:160)
at org.apache.spark.scheduler.TaskSetManager.<init>(TaskSetManager.scala:165)
at org.apache.spark.scheduler.TaskSchedulerImpl.createTaskSetManager(TaskSchedulerImpl.scala:205)
at org.apache.spark.scheduler.TaskSchedulerImpl.submitTasks(TaskSchedulerImpl.scala:169)
at org.apache.spark.scheduler.DAGScheduler.submitMissingTasks(DAGScheduler.scala:1058)
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$submitStage(DAGScheduler.scala:933)
at org.apache.spark.scheduler.DAGScheduler.handleJobSubmitted(DAGScheduler.scala:873)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1626)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1618)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1607)
at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
```
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19408: [SPARK-22180][CORE] Allow IPv6 address in org.apache.spa...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19408
**[Test build #84982 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84982/testReport)** for PR 19408 at commit [`1400299`](https://github.com/apache/spark/commit/1400299808631da0196a61f3588ede786dd0b041).
* This patch **fails Scala style tests**.
* This patch merges cleanly.
* This patch adds no public classes.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19408: [SPARK-22180][CORE] Allow IPv6 address in org.apache.spa...
Posted by mridulm <gi...@git.apache.org>.
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/19408
To rephrase @jiangxb1987's question - supporting IPv6 is a much larger effort, which spark currently does not. We should be addressing that as the problem to solve, and fix this as part of IPv6 support : eliminating individual exceptions could simply result in spark platform going into inconsistent states, without any telemetry in the logs on why (because we removed/'fixed' the expected exceptions).
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19408: [SPARK-22180][CORE] Allow IPv6 address in org.apache.spa...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19408
Can one of the admins verify this patch?
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19408: [SPARK-22180][CORE] Allow IPv6 address in org.apache.spa...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19408
**[Test build #85062 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85062/testReport)** for PR 19408 at commit [`68c3221`](https://github.com/apache/spark/commit/68c322129305a35c9d3e04f8cacc011be5fbaec4).
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #19408: [SPARK-22180][CORE] Allow IPv6 address in org.apa...
Posted by obermeier <gi...@git.apache.org>.
Github user obermeier commented on a diff in the pull request:
https://github.com/apache/spark/pull/19408#discussion_r150798868
--- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
@@ -982,7 +982,19 @@ private[spark] object Utils extends Logging {
return cached
}
- val indx: Int = hostPort.lastIndexOf(':')
+ val indx: Int =
+ // Interpret hostPort as literal IPv6 address if it contains multiple colons.
+ // IPv6 addresses enclosed in square brackets like [::1]:123 are not included.
+ // scalastyle:off SingleSpaceBetweenRParenAndLCurlyBrace
+ if (hostPort.matches("(([0-9a-fA-F]*):([0-9a-fA-F]*)){2,}")) {
+ // scalastyle:on SingleSpaceBetweenRParenAndLCurlyBrace
+ -1
+ } else {
+ // Else last colon defines start of port definition
+ hostPort.lastIndexOf(':')
+ }
+
+
// This is potentially broken - when dealing with ipv6 addresses for example, sigh ...
// but then hadoop does not support ipv6 right now.
// For now, we assume that if port exists, then it is valid - not check if it is an int > 0
--- End diff --
I think yes because some checks are not implemented
E.g. no check how many colons are used, double colon should appear only once ...
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #19408: [SPARK-22180][CORE] Allow IPv6 address in org.apa...
Posted by obermeier <gi...@git.apache.org>.
Github user obermeier closed the pull request at:
https://github.com/apache/spark/pull/19408
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19408: [SPARK-22180][CORE] Allow IPv6 address in org.apache.spa...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19408
Can one of the admins verify this patch?
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #19408: [SPARK-22180][CORE] Allow IPv6 address in org.apa...
Posted by obermeier <gi...@git.apache.org>.
Github user obermeier commented on a diff in the pull request:
https://github.com/apache/spark/pull/19408#discussion_r150909772
--- Diff: core/src/test/scala/org/apache/spark/util/UtilsSuite.scala ---
@@ -1146,6 +1146,20 @@ class UtilsSuite extends SparkFunSuite with ResetSystemProperties with Logging {
}
}
+ test("parseHostPort") {
+ assert(Utils.parseHostPort("abc:123") === (("abc", 123)))
+ assert(Utils.parseHostPort("example.com") === (("example.com", 0)))
+ assert(Utils.parseHostPort("example.com:123") === (("example.com", 123)))
+ assert(Utils.parseHostPort("127.0.0.1") === (("127.0.0.1", 0)))
+ assert(Utils.parseHostPort("127.0.0.1:123") === (("127.0.0.1", 123)))
+ assert(Utils.parseHostPort("2001:db8::1") === (("2001:db8::1", 0)))
+ assert(Utils.parseHostPort("2001:DB8::1") === (("2001:DB8::1", 0)))
+ assert(Utils.parseHostPort("2001:dB8::1") === (("2001:dB8::1", 0)))
+ assert(Utils.parseHostPort("0:0:0:0:0:0:0:0") === (("0:0:0:0:0:0:0:0", 0)))
+ assert(Utils.parseHostPort("::1") === (("::1", 0)))
+ assert(Utils.parseHostPort("[::1]:123") === (("[::1]", 123)))
+ assert(Utils.parseHostPort("[2001:db8:42::1]:123") === (("[2001:db8:42::1]", 123)))
--- End diff --
What is the prefered way to handle this kind of parse errors in Spark?
Changing the signature of this method to something like :Try[..], :Option ... is no option!?
Error log messages?
Unchecked Exceptions?
...
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19408: [SPARK-22180][CORE] Allow IPv6 address in org.apache.spa...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19408
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85062/
Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19408: [SPARK-22180][CORE] Allow IPv6
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19408
Can one of the admins verify this patch?
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #19408: [SPARK-22180][CORE] Allow IPv6 address in org.apa...
Posted by obermeier <gi...@git.apache.org>.
GitHub user obermeier reopened a pull request:
https://github.com/apache/spark/pull/19408
[SPARK-22180][CORE] Allow IPv6 address in org.apache.spark.util.Utils.parseHostPort
External applications like Apache Cassandra are able to deal with IPv6 addresses. Libraries like spark-cassandra-connector combine Apache Cassandra with Apache Spark.
This combination is very useful IMHO.
One problem is that `org.apache.spark.util.Utils.parseHostPort(hostPort:` `String)` takes the last colon to sepperate the port from host path. This conflicts with literal IPv6 addresses.
I think we can take `hostPort` as literal IPv6 address if it contains tow ore more colons. If IPv6 addresses are enclosed in square brackets port definition is still possible.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/obermeier/spark issue/SPARK-22180
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/19408.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #19408
----
commit 453e104c3e6ed6a5ca310f599c274d6c66a3d3c8
Author: Stefan Obermeier <ma...@...>
Date: 2017-10-01T14:28:58Z
[SPARK-22180][CORE] Allow IPv6 address in org.apache.spark.util.Utils.parseHostPort
## What changes were proposed in this pull request?
Take ```hostPort``` as literal IPv6 address if it contains tow ore more colons. If IPv6 addresses are enclosed in square brackets port definition is still possible.
## How was this patch tested?
Added a new test case into UtilsSuite
Remove comment
commit 1e6623f518b43df4079bdc2680288063c6be13c6
Author: obermeier <ob...@...>
Date: 2017-12-14T22:52:21Z
Merge branch 'master' into issue/SPARK-22180
commit 1400299808631da0196a61f3588ede786dd0b041
Author: Stefan Obermeier <ma...@...>
Date: 2017-12-15T07:39:45Z
Fix build problem
commit 68c322129305a35c9d3e04f8cacc011be5fbaec4
Author: Stefan Obermeier <ma...@...>
Date: 2017-12-18T13:51:04Z
Fix style checks violation.
Remove whitespace at end of line.
commit 8220d95a99f4564e735f22947cb1cb698613efa5
Author: Stefan Obermeier <sc...@...>
Date: 2018-10-09T21:54:20Z
Add log message if hostname:port is not valid
----
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #19408: [SPARK-22180][CORE] Allow IPv6 address in org.apa...
Posted by jiangxb1987 <gi...@git.apache.org>.
Github user jiangxb1987 commented on a diff in the pull request:
https://github.com/apache/spark/pull/19408#discussion_r150743565
--- Diff: core/src/test/scala/org/apache/spark/util/UtilsSuite.scala ---
@@ -1146,6 +1146,20 @@ class UtilsSuite extends SparkFunSuite with ResetSystemProperties with Logging {
}
}
+ test("parseHostPort") {
+ assert(Utils.parseHostPort("abc:123") === (("abc", 123)))
+ assert(Utils.parseHostPort("example.com") === (("example.com", 0)))
+ assert(Utils.parseHostPort("example.com:123") === (("example.com", 123)))
+ assert(Utils.parseHostPort("127.0.0.1") === (("127.0.0.1", 0)))
+ assert(Utils.parseHostPort("127.0.0.1:123") === (("127.0.0.1", 123)))
+ assert(Utils.parseHostPort("2001:db8::1") === (("2001:db8::1", 0)))
+ assert(Utils.parseHostPort("2001:DB8::1") === (("2001:DB8::1", 0)))
+ assert(Utils.parseHostPort("2001:dB8::1") === (("2001:dB8::1", 0)))
+ assert(Utils.parseHostPort("0:0:0:0:0:0:0:0") === (("0:0:0:0:0:0:0:0", 0)))
+ assert(Utils.parseHostPort("::1") === (("::1", 0)))
+ assert(Utils.parseHostPort("[::1]:123") === (("[::1]", 123)))
+ assert(Utils.parseHostPort("[2001:db8:42::1]:123") === (("[2001:db8:42::1]", 123)))
--- End diff --
nit: we should add more test cases to cover the invalid cases.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19408: [SPARK-22180][CORE] Allow IPv6 address in org.apache.spa...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19408
Can one of the admins verify this patch?
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #19408: [SPARK-22180][CORE] Allow IPv6 address in org.apa...
Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on a diff in the pull request:
https://github.com/apache/spark/pull/19408#discussion_r142038575
--- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
@@ -981,7 +981,13 @@ private[spark] object Utils extends Logging {
return cached
}
- val indx: Int = hostPort.lastIndexOf(':')
+ val indx: Int =
+ // Interpret hostPort as literal IPv6 address if it contains two ore more colons
+ // scalastyle:off SingleSpaceBetweenRParenAndLCurlyBrace
+ if (hostPort.matches("(([0-9a-f]*):([0-9a-f]*)){2,}")) -1
+ // Else last colon defines start of port definition
+ else hostPort.lastIndexOf(':')
--- End diff --
Final nit, use braces for both parts of the if-else
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #19408: [SPARK-22180][CORE] Allow IPv6 address in org.apa...
Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on a diff in the pull request:
https://github.com/apache/spark/pull/19408#discussion_r142038556
--- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
@@ -981,7 +981,13 @@ private[spark] object Utils extends Logging {
return cached
}
- val indx: Int = hostPort.lastIndexOf(':')
+ val indx: Int =
+ // Interpret hostPort as literal IPv6 address if it contains two ore more colons
+ // scalastyle:off SingleSpaceBetweenRParenAndLCurlyBrace
--- End diff --
Turn the rule back on after the block?
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19408: [SPARK-22180][CORE] Allow IPv6 address in org.apache.spa...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19408
Can one of the admins verify this patch?
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19408: [SPARK-22180][CORE] Allow IPv6 address in org.apache.spa...
Posted by jiangxb1987 <gi...@git.apache.org>.
Github user jiangxb1987 commented on the issue:
https://github.com/apache/spark/pull/19408
Sounds good.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19408: [SPARK-22180][CORE] Allow IPv6 address in org.apache.spa...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19408
Merged build finished. Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19408: [SPARK-22180][CORE] Allow IPv6 address in org.apache.spa...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19408
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84982/
Test FAILed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19408: [SPARK-22180][CORE] Allow IPv6 address in org.apache.spa...
Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the issue:
https://github.com/apache/spark/pull/19408
ok to test
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19408: [SPARK-22180][CORE] Allow IPv6 address in org.apache.spa...
Posted by obermeier <gi...@git.apache.org>.
Github user obermeier commented on the issue:
https://github.com/apache/spark/pull/19408
I chose this function because I had some exceptions like this [1] if I used IPv6 hosts.
In this example ```org.apache.spark.util.Utils$.parseHostPort``` decided to use f904 as port but it was the last 16 bit chunk of an IPv6 address.
I do not have an overview over the Spark code so I am currently not able to provide an general IPv6 solution.
I think this code snipets will improve the situation and enables us to use IPv6 hosts in some cases.
[1]
```
java.lang.NumberFormatException: For input string: "f904"
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Integer.parseInt(Integer.java:580)
at java.lang.Integer.parseInt(Integer.java:615)
at scala.collection.immutable.StringLike$class.toInt(StringLike.scala:272)
at scala.collection.immutable.StringOps.toInt(StringOps.scala:29)
at org.apache.spark.util.Utils$.parseHostPort(Utils.scala:935)
at org.apache.spark.scheduler.cluster.YarnScheduler.getRackForHost(YarnScheduler.scala:36)
at org.apache.spark.scheduler.TaskSetManager$$anonfun$org$apache$spark$scheduler$TaskSetManager$$addPendingTask$1.apply(TaskSetManager.scala:206)
at org.apache.spark.scheduler.TaskSetManager$$anonfun$org$apache$spark$scheduler$TaskSetManager$$addPendingTask$1.apply(TaskSetManager.scala:187)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
at org.apache.spark.scheduler.TaskSetManager.org$apache$spark$scheduler$TaskSetManager$$addPendingTask(TaskSetManager.scala:187)
at org.apache.spark.scheduler.TaskSetManager$$anonfun$1.apply$mcVI$sp(TaskSetManager.scala:166)
at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:160)
at org.apache.spark.scheduler.TaskSetManager.<init>(TaskSetManager.scala:165)
at org.apache.spark.scheduler.TaskSchedulerImpl.createTaskSetManager(TaskSchedulerImpl.scala:205)
at org.apache.spark.scheduler.TaskSchedulerImpl.submitTasks(TaskSchedulerImpl.scala:169)
at org.apache.spark.scheduler.DAGScheduler.submitMissingTasks(DAGScheduler.scala:1058)
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$submitStage(DAGScheduler.scala:933)
at org.apache.spark.scheduler.DAGScheduler.handleJobSubmitted(DAGScheduler.scala:873)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1626)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1618)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1607)
at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
```
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19408: [SPARK-22180][CORE] Allow IPv6 address in org.apache.spa...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19408
Can one of the admins verify this patch?
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19408: [SPARK-22180][CORE] Allow IPv6 address in org.apache.spa...
Posted by obermeier <gi...@git.apache.org>.
Github user obermeier commented on the issue:
https://github.com/apache/spark/pull/19408
This issue seems to be fixed in Spark 2.3.2
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #19408: [SPARK-22180][CORE] Allow IPv6 address in org.apa...
Posted by obermeier <gi...@git.apache.org>.
Github user obermeier commented on a diff in the pull request:
https://github.com/apache/spark/pull/19408#discussion_r150907312
--- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
@@ -982,7 +982,19 @@ private[spark] object Utils extends Logging {
return cached
}
- val indx: Int = hostPort.lastIndexOf(':')
+ val indx: Int =
+ // Interpret hostPort as literal IPv6 address if it contains multiple colons.
+ // IPv6 addresses enclosed in square brackets like [::1]:123 are not included.
+ // scalastyle:off SingleSpaceBetweenRParenAndLCurlyBrace
+ if (hostPort.matches("(([0-9a-fA-F]*):([0-9a-fA-F]*)){2,}")) {
+ // scalastyle:on SingleSpaceBetweenRParenAndLCurlyBrace
+ -1
+ } else {
+ // Else last colon defines start of port definition
+ hostPort.lastIndexOf(':')
+ }
+
+
// This is potentially broken - when dealing with ipv6 addresses for example, sigh ...
// but then hadoop does not support ipv6 right now.
// For now, we assume that if port exists, then it is valid - not check if it is an int > 0
--- End diff --
I removed this comment
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19408: [SPARK-22180][CORE] Allow IPv6 address in org.apache.spa...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19408
Can one of the admins verify this patch?
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19408: [SPARK-22180][CORE] Allow IPv6 address in org.apache.spa...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19408
Can one of the admins verify this patch?
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19408: [SPARK-22180][CORE] Allow IPv6 address in org.apache.spa...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19408
Merged build finished. Test FAILed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #19408: [SPARK-22180][CORE] Allow IPv6 address in org.apa...
Posted by obermeier <gi...@git.apache.org>.
Github user obermeier commented on a diff in the pull request:
https://github.com/apache/spark/pull/19408#discussion_r142047496
--- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
@@ -981,7 +981,13 @@ private[spark] object Utils extends Logging {
return cached
}
- val indx: Int = hostPort.lastIndexOf(':')
+ val indx: Int =
+ // Interpret hostPort as literal IPv6 address if it contains two ore more colons
+ // scalastyle:off SingleSpaceBetweenRParenAndLCurlyBrace
+ if (hostPort.matches("(([0-9a-f]*):([0-9a-f]*)){2,}")) -1
--- End diff --
Yes a real parser would be much better!!
I hope the methods will check the input. Like the name resolver...
At this point I thought more about the separation of the port.
I think it is important to check if two colons exists, otherwise this expression accepts hostnames like abc:123
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19408: [SPARK-22180][CORE] Allow IPv6 address in org.apache.spa...
Posted by jiangxb1987 <gi...@git.apache.org>.
Github user jiangxb1987 commented on the issue:
https://github.com/apache/spark/pull/19408
@obermeier Could you please rebase this with the latest master? Thanks!
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19408: [SPARK-22180][CORE] Allow IPv6 address in org.apache.spa...
Posted by obermeier <gi...@git.apache.org>.
Github user obermeier commented on the issue:
https://github.com/apache/spark/pull/19408
Done
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19408: [SPARK-22180][CORE] Allow IPv6 address in org.apache.spa...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19408
Can one of the admins verify this patch?
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19408: [SPARK-22180][CORE] Allow IPv6 address in org.apache.spa...
Posted by jiangxb1987 <gi...@git.apache.org>.
Github user jiangxb1987 commented on the issue:
https://github.com/apache/spark/pull/19408
Fully agree with @mridulm 's big picture suggestion, and I also think supporting IPv6 should be designed as a integral feature, instead of just putting together some PRs.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19408: [SPARK-22180][CORE] Allow IPv6 address in org.apache.spa...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19408
Can one of the admins verify this patch?
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19408: [SPARK-22180][CORE] Allow IPv6 address in org.apache.spa...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19408
**[Test build #84982 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84982/testReport)** for PR 19408 at commit [`1400299`](https://github.com/apache/spark/commit/1400299808631da0196a61f3588ede786dd0b041).
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #19408: [SPARK-22180][CORE] Allow IPv6 address in org.apa...
Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on a diff in the pull request:
https://github.com/apache/spark/pull/19408#discussion_r142038567
--- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
@@ -981,7 +981,13 @@ private[spark] object Utils extends Logging {
return cached
}
- val indx: Int = hostPort.lastIndexOf(':')
+ val indx: Int =
+ // Interpret hostPort as literal IPv6 address if it contains two ore more colons
+ // scalastyle:off SingleSpaceBetweenRParenAndLCurlyBrace
+ if (hostPort.matches("(([0-9a-f]*):([0-9a-f]*)){2,}")) -1
--- End diff --
Hex digits can be uppercase right?
Should the pattern it not be more like `[0-9a-f]*(:[0-9a-f]*)+` match a number, then colon-number colon-number pairs, not number-colon-number number-colon-number sequences?
It might end up being equivalent because the match is for 0 or more digits.
This allows some strings that it shouldn't like "::::", but, the purpose isn't to catch every possible case I guess. It would fail name resolution.
I thought Inet6Address would just provide parsing for this but I guess not.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org