Posted to issues@spark.apache.org by "Apache Spark (JIRA)" <ji...@apache.org> on 2017/02/27 07:54:45 UTC

[jira] [Commented] (SPARK-19748) refresh for InMemoryFileIndex with FileStatusCache does not work correctly

    [ https://issues.apache.org/jira/browse/SPARK-19748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15885307#comment-15885307 ] 

Apache Spark commented on SPARK-19748:
--------------------------------------

User 'windpiger' has created a pull request for this issue:
https://github.com/apache/spark/pull/17079

> refresh for InMemoryFileIndex with FileStatusCache does not work correctly
> --------------------------------------------------------------------------
>
>                 Key: SPARK-19748
>                 URL: https://issues.apache.org/jira/browse/SPARK-19748
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.2.0
>            Reporter: Song Jun
>
> If we refresh an InMemoryFileIndex that uses a FileStatusCache, refresh() first uses the FileStatusCache to rebuild cachedLeafFiles etc., and only afterwards calls FileStatusCache.invalidateAll(). These two actions are performed in the wrong order, so the refresh does not actually take effect.
> {code}
>   override def refresh(): Unit = {
>     refresh0()
>     fileStatusCache.invalidateAll()
>   }
>   private def refresh0(): Unit = {
>     val files = listLeafFiles(rootPaths)
>     cachedLeafFiles =
>       new mutable.LinkedHashMap[Path, FileStatus]() ++= files.map(f => f.getPath -> f)
>     cachedLeafDirToChildrenFiles = files.toArray.groupBy(_.getPath.getParent)
>     cachedPartitionSpec = null
>   }
> {code}
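
For illustration, a minimal sketch of how the two calls could be reordered so that refresh() first drops the cached file statuses and only then relists the leaf files. This is an illustrative reordering only, not necessarily the change made in the pull request above:

{code}
  override def refresh(): Unit = {
    // Sketch only: invalidate the shared FileStatusCache first, so that the
    // subsequent listing in refresh0() cannot be served from stale cache entries.
    fileStatusCache.invalidateAll()
    refresh0()
  }
{code}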



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org