Posted to reviews@spark.apache.org by cloud-fan <gi...@git.apache.org> on 2018/06/12 22:46:21 UTC
[GitHub] spark pull request #17702: [SPARK-20408][SQL] Get the glob path in parallel ...
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/17702#discussion_r194911415
--- Diff: core/src/main/scala/org/apache/spark/deploy/SparkHadoopUtil.scala ---
@@ -252,6 +252,18 @@ class SparkHadoopUtil extends Logging {
if (isGlobPath(pattern)) globPath(fs, pattern) else Seq(pattern)
}
+ def expandGlobPath(fs: FileSystem, pattern: Path): Seq[String] = {
+ val arr = pattern.toString.split("/")
--- End diff --
We should not parse the path string ourselves; it's too risky, since we may miss special cases such as Windows paths, escape characters, etc. Let's take a look at `org.apache.hadoop.fs.Globber` and see if we can reuse a parser API from there.
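As a sketch of what the reviewer is suggesting (note that `org.apache.hadoop.fs.Globber` itself is not public API, so this hypothetical version leans on the public `FileSystem.globStatus` instead), the method could delegate pattern expansion to Hadoop rather than splitting the path string on `/`:

```scala
import org.apache.hadoop.fs.{FileSystem, Path}

// Hypothetical sketch, not the PR's actual implementation: let Hadoop's own
// glob machinery expand the pattern instead of tokenizing pattern.toString
// on "/" ourselves. globStatus already understands URI schemes, Windows
// paths, and escaped glob characters.
def expandGlobPath(fs: FileSystem, pattern: Path): Seq[String] = {
  // globStatus returns null when the pattern contains no glob characters
  // and the path does not exist; fall back to the pattern itself in that case.
  Option(fs.globStatus(pattern)) match {
    case Some(statuses) => statuses.map(_.getPath.toString).toSeq
    case None           => Seq(pattern.toString)
  }
}
```

A naive `pattern.toString.split("/")` illustrates the risk: for a scheme-qualified path like `hdfs://host:9000/data/*`, the `//` after the scheme produces an empty token, and a Windows-style `C:\data\*` would not split at all.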
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org