Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2018/12/18 19:11:37 UTC

[GitHub] rxin commented on a change in pull request #23327: [SPARK-26222][SQL] Track file listing time

rxin commented on a change in pull request #23327: [SPARK-26222][SQL] Track file listing time
URL: https://github.com/apache/spark/pull/23327#discussion_r242662526
 
 

 ##########
 File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileIndex.scala
 ##########
 @@ -74,12 +74,11 @@ trait FileIndex {
   def partitionSchema: StructType
 
   /**
-   * Returns an optional metadata operation time, in nanoseconds, for listing files.
+   * Returns an optional file listing phase summary.
    *
-   * We do file listing in query optimization (in order to get the proper statistics) and we want
-   * to account for file listing time in physical execution (as metrics). To do that, we save the
-   * file listing time in some implementations and physical execution calls it in this method
-   * to update the metrics.
+   * This is called during physical execution to report file listing time as SQL metrics. If
+   * partition pruning happened in query planning, the phase also includes that part of the
+   * cost; otherwise it only covers the file listing done when the FileIndex was initialized.
    */
-  def metadataOpsTimeNs: Option[Long] = None
 
 Review comment:
   I'd keep this to avoid changing too many call sites. Just make it a function that returns a value based on fileListingPhase.
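
   The suggestion could look roughly like the sketch below: keep `metadataOpsTimeNs` in the trait and derive its value from the new listing-phase summary, so existing callers are untouched. `FileListingPhase` and its fields are hypothetical illustration names, not the actual types in this PR:

   ```scala
   // Hypothetical summary of one file listing phase (not the PR's real type).
   case class FileListingPhase(startNs: Long, endNs: Long) {
     def durationNs: Long = endNs - startNs
   }

   trait FileIndex {
     // Implementations that record their listing phase override this; None otherwise.
     protected def fileListingPhase: Option[FileListingPhase] = None

     // Existing API preserved: callers keep using metadataOpsTimeNs,
     // but the value is now computed from fileListingPhase.
     def metadataOpsTimeNs: Option[Long] = fileListingPhase.map(_.durationNs)
   }

   object Demo extends App {
     val idx = new FileIndex {
       override protected val fileListingPhase = Some(FileListingPhase(1000L, 5000L))
     }
     assert(idx.metadataOpsTimeNs.contains(4000L))
   }
   ```

   This way the metric plumbing in physical execution keeps calling `metadataOpsTimeNs` unchanged, and only the implementations that track a listing phase need to change.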

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org