Posted to reviews@spark.apache.org by peter-toth <gi...@git.apache.org> on 2018/10/01 19:35:49 UTC
[GitHub] spark pull request #22603: SPARK-25062: clean up BlockLocations in InMemoryF...
GitHub user peter-toth opened a pull request:
https://github.com/apache/spark/pull/22603
SPARK-25062: clean up BlockLocations in InMemoryFileIndex
## What changes were proposed in this pull request?
`InMemoryFileIndex` caches `FileStatus` objects for paths. Each `FileStatus` object can contain several `BlockLocation` objects. Depending on the parallel discovery threshold (`spark.sql.sources.parallelPartitionDiscovery.threshold`), the file listing happens either on the driver or on the executors. If the listing happens on the executors, the block location objects are converted to plain `BlockLocation` objects to meet serialization requirements. If it happens on the driver, there is no conversion, and depending on the file system a `BlockLocation` object can be an instance of a subclass such as `HdfsBlockLocation`, which consumes more memory. This PR adds the conversion to the driver-side case as well, which decreases memory consumption.
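As a rough illustration of the conversion described above, the sketch below uses stand-in classes rather than the Hadoop types. `SimpleLocation`, `HeavyLocation`, and `toSimple` are hypothetical names invented for this example; they only mimic the relationship between `BlockLocation` and a subclass such as `HdfsBlockLocation`, and are not the code in this patch.

```scala
object BlockLocationSketch {
  // Hypothetical stand-in for Hadoop's BlockLocation: only the basic fields.
  class SimpleLocation(
      val names: Array[String],
      val hosts: Array[String],
      val offset: Long,
      val length: Long)

  // Hypothetical stand-in for a file-system-specific subclass that carries
  // extra state (modelled here as an opaque byte array).
  class HeavyLocation(
      names: Array[String],
      hosts: Array[String],
      offset: Long,
      length: Long,
      val extra: Array[Byte])
    extends SimpleLocation(names, hosts, offset, length)

  // Keep exact SimpleLocation instances as they are; copy anything else into
  // a plain SimpleLocation so the extra state is not retained by a cache.
  def toSimple(loc: SimpleLocation): SimpleLocation =
    if (loc.getClass == classOf[SimpleLocation]) {
      loc
    } else {
      new SimpleLocation(loc.names, loc.hosts, loc.offset, loc.length)
    }

  def main(args: Array[String]): Unit = {
    val heavy = new HeavyLocation(
      Array("name-1"), Array("host-1"), 0L, 4L, new Array[Byte](1024))
    val cached = toSimple(heavy)
    // Reports whether the cached object is a plain SimpleLocation.
    println(cached.getClass == classOf[SimpleLocation])
  }
}
```

The copy keeps only the four basic fields, so whatever additional state the subclass holds becomes unreachable once the original object is dropped.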
## How was this patch tested?
Added unit test.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/peter-toth/spark SPARK-25062
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/22603.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #22603
----
commit 5735dba9a43c336843ccc531831105d6c23b4586
Author: Peter Toth <pe...@...>
Date: 2018-09-30T13:01:36Z
SPARK-25062: clean up BlockLocations in InMemoryFileIndex
----
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #22603: [SPARK-25062][SQL] Clean up BlockLocations in InMemoryFi...
Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:
https://github.com/apache/spark/pull/22603
Could you review this, @cloud-fan , @gatorsmile , @HyukjinKwon ?
---
[GitHub] spark issue #22603: [SPARK-25062][SQL] Clean up BlockLocations in InMemoryFi...
Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:
https://github.com/apache/spark/pull/22603
@peter-toth . What is your Apache JIRA user id? I need to assign you to the resolved SPARK-25062, but I cannot find your id and user name `Peter Toth`.
---
[GitHub] spark issue #22603: [SPARK-25062][SQL] Clean up BlockLocations in InMemoryFi...
Posted by peter-toth <gi...@git.apache.org>.
Github user peter-toth commented on the issue:
https://github.com/apache/spark/pull/22603
Thanks @cloud-fan for the review. I've fixed your findings.
---
[GitHub] spark issue #22603: [SPARK-25062][SQL] Clean up BlockLocations in InMemoryFi...
Posted by peter-toth <gi...@git.apache.org>.
Github user peter-toth commented on the issue:
https://github.com/apache/spark/pull/22603
Thanks @dongjoon-hyun for the review. I've fixed your findings.
---
[GitHub] spark issue #22603: [SPARK-25062][SQL] Clean up BlockLocations in InMemoryFi...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22603
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96856/
---
[GitHub] spark issue #22603: [SPARK-25062][SQL] Clean up BlockLocations in InMemoryFi...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/22603
**[Test build #96932 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96932/testReport)** for PR 22603 at commit [`7b0bc56`](https://github.com/apache/spark/commit/7b0bc568baa69e74fc98ac34f13b6c54b3d4f7a7).
---
[GitHub] spark pull request #22603: [SPARK-25062][SQL] Clean up BlockLocations in InM...
Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:
https://github.com/apache/spark/pull/22603#discussion_r222410955
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/FileIndexSuite.scala ---
@@ -248,6 +248,25 @@ class FileIndexSuite extends SharedSQLContext {
assert(spark.read.parquet(path.getAbsolutePath).schema.exists(_.name == colToUnescape))
}
}
+
+ test("SPARK-25062 - InMemoryCache stores only simple BlockLocations") {
+ withSQLConf("fs.file.impl" -> classOf[SpecialBlockLocationFileSystem].getName) {
+ withTempDir { dir =>
+ val file = new File(dir, "text.txt")
+ stringToFile(file, "text")
+
+ val inMemoryFileIndex = new InMemoryFileIndex(
+ spark, Seq(new Path(file.getCanonicalPath)), Map.empty, None) {
+ def leafFileStatuses = leafFiles.map(_._2)
--- End diff --
nit, `def leafFileStatuses = leafFiles.values`?
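A small sketch of why the two spellings are interchangeable here, assuming `leafFiles` is an insertion-ordered mutable map from path to status. The map below is a hypothetical stand-in that uses plain strings and file lengths instead of the Hadoop `Path` and `FileStatus` types.

```scala
import scala.collection.mutable

object LeafValuesSketch {
  // Hypothetical stand-in for the path -> status map.
  def sampleLeafFiles(): mutable.LinkedHashMap[String, Long] =
    mutable.LinkedHashMap("/tmp/b.txt" -> 4L, "/tmp/a.txt" -> 7L)

  // Projects the second element of every (key, value) pair.
  def viaMap(m: mutable.LinkedHashMap[String, Long]): List[Long] =
    m.map(_._2).toList

  // Asks the map for its values directly.
  def viaValues(m: mutable.LinkedHashMap[String, Long]): List[Long] =
    m.values.toList

  def main(args: Array[String]): Unit = {
    val m = sampleLeafFiles()
    // Both forms yield the values in insertion order.
    println(viaMap(m) == viaValues(m))
  }
}
```

Both forms visit the values in insertion order; `.values` simply states the intent more directly.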
---
[GitHub] spark pull request #22603: SPARK-25062: clean up BlockLocations in InMemoryF...
Posted by peter-toth <gi...@git.apache.org>.
Github user peter-toth commented on a diff in the pull request:
https://github.com/apache/spark/pull/22603#discussion_r221898450
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/InMemoryFileIndex.scala ---
@@ -315,7 +315,12 @@ object InMemoryFileIndex extends Logging {
// which is very slow on some file system (RawLocalFileSystem, which is launch a
// subprocess and parse the stdout).
try {
- val locations = fs.getFileBlockLocations(f, 0, f.getLen)
+ val locations = fs.getFileBlockLocations(f, 0, f.getLen).map(
+ loc => if (loc.getClass == classOf[BlockLocation]) {
--- End diff --
:thumbsup:
---
[GitHub] spark issue #22603: [SPARK-25062][SQL] Clean up BlockLocations in InMemoryFi...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22603
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96932/
---
[GitHub] spark pull request #22603: SPARK-25062: clean up BlockLocations in InMemoryF...
Posted by peter-toth <gi...@git.apache.org>.
Github user peter-toth commented on a diff in the pull request:
https://github.com/apache/spark/pull/22603#discussion_r221890344
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/InMemoryFileIndex.scala ---
@@ -315,7 +315,12 @@ object InMemoryFileIndex extends Logging {
// which is very slow on some file system (RawLocalFileSystem, which is launch a
// subprocess and parse the stdout).
try {
- val locations = fs.getFileBlockLocations(f, 0, f.getLen)
+ val locations = fs.getFileBlockLocations(f, 0, f.getLen).map(
+ loc => if (loc.getClass == classOf[BlockLocation]) {
--- End diff --
Thanks @mgaido91, but `loc` is always an instance of `BlockLocation` (it might be an instance of a subclass such as `HdfsBlockLocation`), so `isInstanceOf[BlockLocation]` or pattern matching would always return true.
I want to test that the class of `loc` is exactly `BlockLocation`; if it is, we don't need to convert it.
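The difference between the three checks can be seen in a minimal sketch. `Base` and `Derived` are hypothetical stand-ins (not Hadoop classes): `Base` plays the role of `BlockLocation` and `Derived` the role of a subclass such as `HdfsBlockLocation`.

```scala
object ExactClassCheck {
  // Hypothetical stand-ins for BlockLocation and one of its subclasses.
  class Base
  class Derived extends Base

  // True for Base and for every subclass of Base.
  def isInstance(x: AnyRef): Boolean = x.isInstanceOf[Base]

  // A type pattern behaves like isInstanceOf: it also matches subclasses.
  def matchesPattern(x: AnyRef): Boolean = x match {
    case _: Base => true
    case _ => false
  }

  // True only when the runtime class is exactly Base.
  def isExactly(x: AnyRef): Boolean = x.getClass == classOf[Base]

  def main(args: Array[String]): Unit = {
    val d = new Derived
    println(s"isInstanceOf: ${isInstance(d)}")
    println(s"pattern match: ${matchesPattern(d)}")
    println(s"exact class: ${isExactly(d)}")
  }
}
```

With these definitions, a `Derived` instance passes the first two checks but not the third, which is why only the `getClass` comparison can tell a plain object from a subclass instance.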
---
[GitHub] spark pull request #22603: [SPARK-25062][SQL] Clean up BlockLocations in InM...
Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:
https://github.com/apache/spark/pull/22603#discussion_r222397997
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/InMemoryFileIndex.scala ---
@@ -315,7 +315,13 @@ object InMemoryFileIndex extends Logging {
// which is very slow on some file system (RawLocalFileSystem, which is launch a
// subprocess and parse the stdout).
try {
- val locations = fs.getFileBlockLocations(f, 0, f.getLen)
+ val locations = fs.getFileBlockLocations(f, 0, f.getLen).map { loc =>
--- End diff --
Hi, @peter-toth .
Could you add a one-line comment to explain this conversion?
---
[GitHub] spark issue #22603: SPARK-25062: clean up BlockLocations in InMemoryFileInde...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22603
Can one of the admins verify this patch?
---
[GitHub] spark issue #22603: [SPARK-25062][SQL] Clean up BlockLocations in InMemoryFi...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/22603
**[Test build #96932 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96932/testReport)** for PR 22603 at commit [`7b0bc56`](https://github.com/apache/spark/commit/7b0bc568baa69e74fc98ac34f13b6c54b3d4f7a7).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
---
[GitHub] spark issue #22603: [SPARK-25062][SQL] Clean up BlockLocations in InMemoryFi...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22603
Merged build finished. Test PASSed.
---
[GitHub] spark issue #22603: SPARK-25062: clean up BlockLocations in InMemoryFileInde...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/22603
**[Test build #96856 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96856/testReport)** for PR 22603 at commit [`45f0c81`](https://github.com/apache/spark/commit/45f0c8111e439ccf0591b0c7ae1cf9d2069458e3).
---
[GitHub] spark issue #22603: [SPARK-25062][SQL] Clean up BlockLocations in InMemoryFi...
Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/22603
LGTM
---
[GitHub] spark pull request #22603: [SPARK-25062][SQL] Clean up BlockLocations in InM...
Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/22603#discussion_r223048148
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/FileIndexSuite.scala ---
@@ -257,3 +277,19 @@ class FakeParentPathFileSystem extends RawLocalFileSystem {
URI.create("mockFs://some-bucket")
}
}
+
+class SpecialBlockLocationFileSystem extends RawLocalFileSystem {
+
+ class SpecialBlockLocation(
+ names: Array[String],
+ hosts: Array[String],
+ offset: Long,
+ length: Long) extends BlockLocation(names, hosts, offset, length)
+
+ override def getFileBlockLocations(
+ file: FileStatus,
--- End diff --
ditto
---
[GitHub] spark issue #22603: [SPARK-25062][SQL] Clean up BlockLocations in InMemoryFi...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22603
Merged build finished. Test PASSed.
---
[GitHub] spark pull request #22603: [SPARK-25062][SQL] Clean up BlockLocations in InM...
Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:
https://github.com/apache/spark/pull/22603#discussion_r222412549
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/FileIndexSuite.scala ---
@@ -248,6 +248,25 @@ class FileIndexSuite extends SharedSQLContext {
assert(spark.read.parquet(path.getAbsolutePath).schema.exists(_.name == colToUnescape))
}
}
+
+ test("SPARK-25062 - InMemoryCache stores only simple BlockLocations") {
--- End diff --
`InMemoryCache` -> `InMemoryFileIndex`? And, `simple BlockLocations` may look unclear later.
---
[GitHub] spark issue #22603: [SPARK-25062][SQL] Clean up BlockLocations in InMemoryFi...
Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:
https://github.com/apache/spark/pull/22603
@peter-toth . Could you address @cloud-fan 's comments?
---
[GitHub] spark issue #22603: [SPARK-25062][SQL] Clean up BlockLocations in InMemoryFi...
Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:
https://github.com/apache/spark/pull/22603
@peter-toth . Could you comment on the JIRA?
- https://issues.apache.org/jira/browse/SPARK-25062?focusedCommentId=16641131&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16641131
---
[GitHub] spark issue #22603: [SPARK-25062][SQL] Clean up BlockLocations in InMemoryFi...
Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:
https://github.com/apache/spark/pull/22603
Congratulations on your first contribution, @peter-toth . And, thank you, @cloud-fan and @mgaido91 .
Merged to master.
---
[GitHub] spark pull request #22603: SPARK-25062: clean up BlockLocations in InMemoryF...
Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on a diff in the pull request:
https://github.com/apache/spark/pull/22603#discussion_r221876275
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/InMemoryFileIndex.scala ---
@@ -315,7 +315,12 @@ object InMemoryFileIndex extends Logging {
// which is very slow on some file system (RawLocalFileSystem, which is launch a
// subprocess and parse the stdout).
try {
- val locations = fs.getFileBlockLocations(f, 0, f.getLen)
+ val locations = fs.getFileBlockLocations(f, 0, f.getLen).map(
+ loc => if (loc.getClass == classOf[BlockLocation]) {
--- End diff --
`loc.isInstanceOf[BlockLocation]`? Or even better, what about using pattern matching?
---
[GitHub] spark issue #22603: [SPARK-25062][SQL] Clean up BlockLocations in InMemoryFi...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/22603
**[Test build #97065 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97065/testReport)** for PR 22603 at commit [`a50ae71`](https://github.com/apache/spark/commit/a50ae71f4c9b035482df20d2565ae553cac350bc).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
---
[GitHub] spark issue #22603: [SPARK-25062][SQL] Clean up BlockLocations in InMemoryFi...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22603
Merged build finished. Test PASSed.
---
[GitHub] spark issue #22603: [SPARK-25062][SQL] Clean up BlockLocations in InMemoryFi...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22603
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97065/
---
[GitHub] spark pull request #22603: SPARK-25062: clean up BlockLocations in InMemoryF...
Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on a diff in the pull request:
https://github.com/apache/spark/pull/22603#discussion_r221895032
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/InMemoryFileIndex.scala ---
@@ -315,7 +315,12 @@ object InMemoryFileIndex extends Logging {
// which is very slow on some file system (RawLocalFileSystem, which is launch a
// subprocess and parse the stdout).
try {
- val locations = fs.getFileBlockLocations(f, 0, f.getLen)
+ val locations = fs.getFileBlockLocations(f, 0, f.getLen).map(
+ loc => if (loc.getClass == classOf[BlockLocation]) {
--- End diff --
Ah right, sorry @peter-toth. Thanks. Anyway, please move `loc` to the previous line and use curly braces for `map`. I think that is the most widespread syntax in the codebase. Thanks.
---
[GitHub] spark issue #22603: SPARK-25062: clean up BlockLocations in InMemoryFileInde...
Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/22603
ok to test
---
[GitHub] spark issue #22603: [SPARK-25062][SQL] Clean up BlockLocations in InMemoryFi...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/22603
**[Test build #96856 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96856/testReport)** for PR 22603 at commit [`45f0c81`](https://github.com/apache/spark/commit/45f0c8111e439ccf0591b0c7ae1cf9d2069458e3).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
---
[GitHub] spark issue #22603: [SPARK-25062][SQL] Clean up BlockLocations in InMemoryFi...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/22603
**[Test build #97065 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97065/testReport)** for PR 22603 at commit [`a50ae71`](https://github.com/apache/spark/commit/a50ae71f4c9b035482df20d2565ae553cac350bc).
---
[GitHub] spark pull request #22603: [SPARK-25062][SQL] Clean up BlockLocations in InM...
Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:
https://github.com/apache/spark/pull/22603
---
[GitHub] spark issue #22603: SPARK-25062: clean up BlockLocations in InMemoryFileInde...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22603
Can one of the admins verify this patch?
---
[GitHub] spark issue #22603: SPARK-25062: clean up BlockLocations in InMemoryFileInde...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22603
Can one of the admins verify this patch?
---
[GitHub] spark pull request #22603: [SPARK-25062][SQL] Clean up BlockLocations in InM...
Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/22603#discussion_r223048108
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/FileIndexSuite.scala ---
@@ -257,3 +277,19 @@ class FakeParentPathFileSystem extends RawLocalFileSystem {
URI.create("mockFs://some-bucket")
}
}
+
+class SpecialBlockLocationFileSystem extends RawLocalFileSystem {
+
+ class SpecialBlockLocation(
+ names: Array[String],
--- End diff --
4 spaces indentation
---
[GitHub] spark issue #22603: [SPARK-25062][SQL] Clean up BlockLocations in InMemoryFi...
Posted by peter-toth <gi...@git.apache.org>.
Github user peter-toth commented on the issue:
https://github.com/apache/spark/pull/22603
Thanks @dongjoon-hyun , `petertoth` is my JIRA user id.
---