Posted to reviews@spark.apache.org by peter-toth <gi...@git.apache.org> on 2018/10/01 19:35:49 UTC

[GitHub] spark pull request #22603: SPARK-25062: clean up BlockLocations in InMemoryF...

GitHub user peter-toth opened a pull request:

    https://github.com/apache/spark/pull/22603

    SPARK-25062: clean up BlockLocations in InMemoryFileIndex

    ## What changes were proposed in this pull request?
    
    `InMemoryFileIndex` caches the `FileStatus` objects of the listed paths. Each `FileStatus` object can contain several `BlockLocation`s. Depending on the parallel discovery threshold (`spark.sql.sources.parallelPartitionDiscovery.threshold`), the file listing happens either on the driver or on the executors. If the listing happens on the executors, the block locations are converted to plain `BlockLocation` objects to satisfy serialization requirements. If it happens on the driver, there is no conversion, and depending on the file system a `BlockLocation` object can be an instance of a subclass such as `HdfsBlockLocation`, which consumes more memory. This PR adds the conversion to the driver-side case as well, which decreases memory consumption.
    
    ## How was this patch tested?
    
    Added unit test.
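
Editorial sketch of the conversion described above. It is not the code from the patch: `BlockLocation` and `HeavyBlockLocation` here are simplified, hypothetical stand-ins for the Hadoop class and for a file-system-specific subclass such as `HdfsBlockLocation`, and `toPlain` is an illustrative helper name.

```scala
// Hypothetical, simplified stand-in for org.apache.hadoop.fs.BlockLocation.
// The real class exposes getters such as getNames/getHosts instead of fields.
class BlockLocation(
    val names: Array[String],
    val hosts: Array[String],
    val offset: Long,
    val length: Long)

// Hypothetical stand-in for a heavier subclass that carries extra state.
class HeavyBlockLocation(
    names: Array[String],
    hosts: Array[String],
    offset: Long,
    length: Long,
    val extraPayload: Array[Byte]) extends BlockLocation(names, hosts, offset, length)

object BlockLocationCleanup {
  // Keep plain instances as they are; for subclasses, copy only the basic
  // fields into a new plain BlockLocation so the extra state can be dropped.
  def toPlain(loc: BlockLocation): BlockLocation = {
    if (loc.getClass == classOf[BlockLocation]) {
      loc
    } else {
      new BlockLocation(loc.names, loc.hosts, loc.offset, loc.length)
    }
  }
}
```

With this shape, a cache that stores only the result of `toPlain` never holds a reference to the subclass instance or its extra payload.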


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/peter-toth/spark SPARK-25062

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/22603.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #22603
    
----
commit 5735dba9a43c336843ccc531831105d6c23b4586
Author: Peter Toth <pe...@...>
Date:   2018-09-30T13:01:36Z

    SPARK-25062: clean up BlockLocations in InMemoryFileIndex

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22603: [SPARK-25062][SQL] Clean up BlockLocations in InMemoryFi...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:

    https://github.com/apache/spark/pull/22603
  
    Could you review this, @cloud-fan , @gatorsmile , @HyukjinKwon ?


---



[GitHub] spark issue #22603: [SPARK-25062][SQL] Clean up BlockLocations in InMemoryFi...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:

    https://github.com/apache/spark/pull/22603
  
    @peter-toth . What is your Apache JIRA user id? I need to assign the resolved SPARK-25062 to you, but I cannot find your id under the user name `Peter Toth`.


---



[GitHub] spark issue #22603: [SPARK-25062][SQL] Clean up BlockLocations in InMemoryFi...

Posted by peter-toth <gi...@git.apache.org>.
Github user peter-toth commented on the issue:

    https://github.com/apache/spark/pull/22603
  
    Thanks @cloud-fan for the review. I've fixed your findings.


---



[GitHub] spark issue #22603: [SPARK-25062][SQL] Clean up BlockLocations in InMemoryFi...

Posted by peter-toth <gi...@git.apache.org>.
Github user peter-toth commented on the issue:

    https://github.com/apache/spark/pull/22603
  
    Thanks @dongjoon-hyun for the review. I've fixed your findings.  


---



[GitHub] spark issue #22603: [SPARK-25062][SQL] Clean up BlockLocations in InMemoryFi...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22603
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96856/


---



[GitHub] spark issue #22603: [SPARK-25062][SQL] Clean up BlockLocations in InMemoryFi...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22603
  
    **[Test build #96932 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96932/testReport)** for PR 22603 at commit [`7b0bc56`](https://github.com/apache/spark/commit/7b0bc568baa69e74fc98ac34f13b6c54b3d4f7a7).


---



[GitHub] spark pull request #22603: [SPARK-25062][SQL] Clean up BlockLocations in InM...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22603#discussion_r222410955
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/FileIndexSuite.scala ---
    @@ -248,6 +248,25 @@ class FileIndexSuite extends SharedSQLContext {
           assert(spark.read.parquet(path.getAbsolutePath).schema.exists(_.name == colToUnescape))
         }
       }
    +
    +  test("SPARK-25062 - InMemoryCache stores only simple BlockLocations") {
    +    withSQLConf("fs.file.impl" -> classOf[SpecialBlockLocationFileSystem].getName) {
    +      withTempDir { dir =>
    +        val file = new File(dir, "text.txt")
    +        stringToFile(file, "text")
    +
    +        val inMemoryFileIndex = new InMemoryFileIndex(
    +          spark, Seq(new Path(file.getCanonicalPath)), Map.empty, None) {
    +          def leafFileStatuses = leafFiles.map(_._2)
    --- End diff --
    
    nit, `def leafFileStatuses = leafFiles.values`?
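
Editorial sketch of the nit above, assuming `leafFiles` is a map from paths to file statuses (the `_._2` in the diff suggests key/value pairs). The map contents and the `String`/`Long` types are hypothetical placeholders.

```scala
import scala.collection.mutable

object LeafFilesDemo {
  // Hypothetical stand-in for the cached path -> status map.
  val leafFiles: mutable.LinkedHashMap[String, Long] =
    mutable.LinkedHashMap("a.txt" -> 4L, "b.txt" -> 8L)

  // Projects the second element of every (key, value) pair.
  def viaMap: Iterable[Long] = leafFiles.map(_._2)

  // Same elements, stated more directly.
  def viaValues: Iterable[Long] = leafFiles.values
}
```

Both definitions yield the same values in the same order; `values` simply states the intent without the tuple projection.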


---



[GitHub] spark pull request #22603: SPARK-25062: clean up BlockLocations in InMemoryF...

Posted by peter-toth <gi...@git.apache.org>.
Github user peter-toth commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22603#discussion_r221898450
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/InMemoryFileIndex.scala ---
    @@ -315,7 +315,12 @@ object InMemoryFileIndex extends Logging {
             // which is very slow on some file system (RawLocalFileSystem, which is launch a
             // subprocess and parse the stdout).
             try {
    -          val locations = fs.getFileBlockLocations(f, 0, f.getLen)
    +          val locations = fs.getFileBlockLocations(f, 0, f.getLen).map(
    +            loc => if (loc.getClass == classOf[BlockLocation]) {
    --- End diff --
    
    :thumbsup:


---



[GitHub] spark issue #22603: [SPARK-25062][SQL] Clean up BlockLocations in InMemoryFi...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22603
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96932/


---



[GitHub] spark pull request #22603: SPARK-25062: clean up BlockLocations in InMemoryF...

Posted by peter-toth <gi...@git.apache.org>.
Github user peter-toth commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22603#discussion_r221890344
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/InMemoryFileIndex.scala ---
    @@ -315,7 +315,12 @@ object InMemoryFileIndex extends Logging {
             // which is very slow on some file system (RawLocalFileSystem, which is launch a
             // subprocess and parse the stdout).
             try {
    -          val locations = fs.getFileBlockLocations(f, 0, f.getLen)
    +          val locations = fs.getFileBlockLocations(f, 0, f.getLen).map(
    +            loc => if (loc.getClass == classOf[BlockLocation]) {
    --- End diff --
    
    Thanks @mgaido91, but `loc` is always an instance of `BlockLocation` (it might be an instance of a subclass such as `HdfsBlockLocation`), so `isInstanceOf[BlockLocation]` or pattern matching would always return true.
    I want to test that the class of `loc` is exactly `BlockLocation`, and if it is, we don't need to convert it.
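
Editorial sketch of the distinction made in this comment. `Location` and `SpecialLocation` are hypothetical stand-ins for `BlockLocation` and one of its subclasses; the helper names are illustrative.

```scala
// Hypothetical stand-ins for BlockLocation and a subclass of it.
class Location
class SpecialLocation extends Location

object ClassCheckDemo {
  // True for Location and for every subclass of it.
  def isInstance(x: Location): Boolean = x.isInstanceOf[Location]

  // A type pattern behaves like isInstanceOf: subclasses match too.
  def matchesType(x: Location): Boolean = x match {
    case _: Location => true
  }

  // True only when the runtime class is exactly Location.
  def isExactlyLocation(x: Location): Boolean = x.getClass == classOf[Location]
}
```

Only the `getClass` comparison can tell a plain instance apart from a subclass instance, which is what the conversion needs in order to decide whether a copy is required.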


---



[GitHub] spark pull request #22603: [SPARK-25062][SQL] Clean up BlockLocations in InM...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22603#discussion_r222397997
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/InMemoryFileIndex.scala ---
    @@ -315,7 +315,13 @@ object InMemoryFileIndex extends Logging {
             // which is very slow on some file system (RawLocalFileSystem, which is launch a
             // subprocess and parse the stdout).
             try {
    -          val locations = fs.getFileBlockLocations(f, 0, f.getLen)
    +          val locations = fs.getFileBlockLocations(f, 0, f.getLen).map { loc =>
    --- End diff --
    
    Hi, @peter-toth .
    Could you add a one-line comment to explain this conversion?


---



[GitHub] spark issue #22603: SPARK-25062: clean up BlockLocations in InMemoryFileInde...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22603
  
    Can one of the admins verify this patch?


---



[GitHub] spark issue #22603: [SPARK-25062][SQL] Clean up BlockLocations in InMemoryFi...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22603
  
    **[Test build #96932 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96932/testReport)** for PR 22603 at commit [`7b0bc56`](https://github.com/apache/spark/commit/7b0bc568baa69e74fc98ac34f13b6c54b3d4f7a7).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark issue #22603: [SPARK-25062][SQL] Clean up BlockLocations in InMemoryFi...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22603
  
    Merged build finished. Test PASSed.


---



[GitHub] spark issue #22603: SPARK-25062: clean up BlockLocations in InMemoryFileInde...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22603
  
    **[Test build #96856 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96856/testReport)** for PR 22603 at commit [`45f0c81`](https://github.com/apache/spark/commit/45f0c8111e439ccf0591b0c7ae1cf9d2069458e3).


---



[GitHub] spark issue #22603: [SPARK-25062][SQL] Clean up BlockLocations in InMemoryFi...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/22603
  
    LGTM


---



[GitHub] spark pull request #22603: [SPARK-25062][SQL] Clean up BlockLocations in InM...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22603#discussion_r223048148
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/FileIndexSuite.scala ---
    @@ -257,3 +277,19 @@ class FakeParentPathFileSystem extends RawLocalFileSystem {
         URI.create("mockFs://some-bucket")
       }
     }
    +
    +class SpecialBlockLocationFileSystem extends RawLocalFileSystem {
    +
    +  class SpecialBlockLocation(
    +    names: Array[String],
    +    hosts: Array[String],
    +    offset: Long,
    +    length: Long) extends BlockLocation(names, hosts, offset, length)
    +
    +  override def getFileBlockLocations(
    +    file: FileStatus,
    --- End diff --
    
    ditto


---



[GitHub] spark issue #22603: [SPARK-25062][SQL] Clean up BlockLocations in InMemoryFi...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22603
  
    Merged build finished. Test PASSed.


---



[GitHub] spark pull request #22603: [SPARK-25062][SQL] Clean up BlockLocations in InM...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22603#discussion_r222412549
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/FileIndexSuite.scala ---
    @@ -248,6 +248,25 @@ class FileIndexSuite extends SharedSQLContext {
           assert(spark.read.parquet(path.getAbsolutePath).schema.exists(_.name == colToUnescape))
         }
       }
    +
    +  test("SPARK-25062 - InMemoryCache stores only simple BlockLocations") {
    --- End diff --
    
    `InMemoryCache` -> `InMemoryFileIndex`? And, `simple BlockLocations` may look unclear later.


---



[GitHub] spark issue #22603: [SPARK-25062][SQL] Clean up BlockLocations in InMemoryFi...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:

    https://github.com/apache/spark/pull/22603
  
    @peter-toth . Could you address @cloud-fan 's comments?


---



[GitHub] spark issue #22603: [SPARK-25062][SQL] Clean up BlockLocations in InMemoryFi...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:

    https://github.com/apache/spark/pull/22603
  
    @peter-toth . Could you comment on the JIRA?
    
    - https://issues.apache.org/jira/browse/SPARK-25062?focusedCommentId=16641131&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16641131


---



[GitHub] spark issue #22603: [SPARK-25062][SQL] Clean up BlockLocations in InMemoryFi...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:

    https://github.com/apache/spark/pull/22603
  
    Congratulations on your first contribution, @peter-toth . And, thank you, @cloud-fan and @mgaido91 .
    
    Merged to master.


---



[GitHub] spark pull request #22603: SPARK-25062: clean up BlockLocations in InMemoryF...

Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22603#discussion_r221876275
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/InMemoryFileIndex.scala ---
    @@ -315,7 +315,12 @@ object InMemoryFileIndex extends Logging {
             // which is very slow on some file system (RawLocalFileSystem, which is launch a
             // subprocess and parse the stdout).
             try {
    -          val locations = fs.getFileBlockLocations(f, 0, f.getLen)
    +          val locations = fs.getFileBlockLocations(f, 0, f.getLen).map(
    +            loc => if (loc.getClass == classOf[BlockLocation]) {
    --- End diff --
    
    `loc.isInstanceOf[BlockLocation]`? Or even better, what about using pattern matching?


---



[GitHub] spark issue #22603: [SPARK-25062][SQL] Clean up BlockLocations in InMemoryFi...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22603
  
    **[Test build #97065 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97065/testReport)** for PR 22603 at commit [`a50ae71`](https://github.com/apache/spark/commit/a50ae71f4c9b035482df20d2565ae553cac350bc).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark issue #22603: [SPARK-25062][SQL] Clean up BlockLocations in InMemoryFi...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22603
  
    Merged build finished. Test PASSed.


---



[GitHub] spark issue #22603: [SPARK-25062][SQL] Clean up BlockLocations in InMemoryFi...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22603
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97065/


---



[GitHub] spark pull request #22603: SPARK-25062: clean up BlockLocations in InMemoryF...

Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22603#discussion_r221895032
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/InMemoryFileIndex.scala ---
    @@ -315,7 +315,12 @@ object InMemoryFileIndex extends Logging {
             // which is very slow on some file system (RawLocalFileSystem, which is launch a
             // subprocess and parse the stdout).
             try {
    -          val locations = fs.getFileBlockLocations(f, 0, f.getLen)
    +          val locations = fs.getFileBlockLocations(f, 0, f.getLen).map(
    +            loc => if (loc.getClass == classOf[BlockLocation]) {
    --- End diff --
    
    Ah, right, sorry @peter-toth. Thanks. Anyway, please move `loc` to the previous line and use curly braces for `map`. I think that is the most widely used syntax in the codebase. Thanks.
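
Editorial sketch of the two layouts being discussed, using a hypothetical list of lengths instead of block locations. Both forms compile and behave identically; the second is the curly-brace form requested here.

```scala
object MapStyleDemo {
  // Hypothetical input data for illustration only.
  val lengths: Seq[Long] = Seq(0L, 128L, 256L)

  // Parenthesized form, lambda parameter on the continuation line.
  val parenthesized: Seq[String] = lengths.map(
    len => if (len == 0L) {
      "empty"
    } else {
      "non-empty"
    })

  // Curly-brace form, lambda parameter on the same line as `map`.
  val braced: Seq[String] = lengths.map { len =>
    if (len == 0L) "empty" else "non-empty"
  }
}
```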


---



[GitHub] spark issue #22603: SPARK-25062: clean up BlockLocations in InMemoryFileInde...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/22603
  
    ok to test


---



[GitHub] spark issue #22603: [SPARK-25062][SQL] Clean up BlockLocations in InMemoryFi...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22603
  
    **[Test build #96856 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96856/testReport)** for PR 22603 at commit [`45f0c81`](https://github.com/apache/spark/commit/45f0c8111e439ccf0591b0c7ae1cf9d2069458e3).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark issue #22603: [SPARK-25062][SQL] Clean up BlockLocations in InMemoryFi...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22603
  
    **[Test build #97065 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97065/testReport)** for PR 22603 at commit [`a50ae71`](https://github.com/apache/spark/commit/a50ae71f4c9b035482df20d2565ae553cac350bc).


---



[GitHub] spark pull request #22603: [SPARK-25062][SQL] Clean up BlockLocations in InM...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/22603


---



[GitHub] spark pull request #22603: [SPARK-25062][SQL] Clean up BlockLocations in InM...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22603#discussion_r223048108
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/FileIndexSuite.scala ---
    @@ -257,3 +277,19 @@ class FakeParentPathFileSystem extends RawLocalFileSystem {
         URI.create("mockFs://some-bucket")
       }
     }
    +
    +class SpecialBlockLocationFileSystem extends RawLocalFileSystem {
    +
    +  class SpecialBlockLocation(
    +    names: Array[String],
    --- End diff --
    
    4 spaces indentation
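
Editorial sketch of the layout requested here: constructor parameters that wrap onto their own lines are indented by 4 spaces. `BlockLocation` below is a simplified, hypothetical stand-in for the Hadoop class so that the snippet compiles on its own.

```scala
// Hypothetical, simplified stand-in for org.apache.hadoop.fs.BlockLocation.
class BlockLocation(
    val names: Array[String],
    val hosts: Array[String],
    val offset: Long,
    val length: Long)

// Wrapped constructor parameters use a 4-space indent.
class SpecialBlockLocation(
    names: Array[String],
    hosts: Array[String],
    offset: Long,
    length: Long) extends BlockLocation(names, hosts, offset, length)
```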


---



[GitHub] spark issue #22603: [SPARK-25062][SQL] Clean up BlockLocations in InMemoryFi...

Posted by peter-toth <gi...@git.apache.org>.
Github user peter-toth commented on the issue:

    https://github.com/apache/spark/pull/22603
  
    Thanks @dongjoon-hyun , `petertoth` is my JIRA user id.



---
