Posted to issues@spark.apache.org by "Micah Kornfield (Jira)" <ji...@apache.org> on 2020/06/24 19:32:00 UTC

[jira] [Created] (SPARK-32095) [DataSource V2] Documentation on SupportsReportStatistics Outdated?

Micah Kornfield created SPARK-32095:
---------------------------------------

             Summary: [DataSource V2] Documentation on SupportsReportStatistics Outdated?
                 Key: SPARK-32095
                 URL: https://issues.apache.org/jira/browse/SPARK-32095
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 3.0.0, 2.4.6
            Reporter: Micah Kornfield


I was wondering if the documentation on SupportsReportStatistics [1][3] about its interaction with the planner and predicate pushdowns is still accurate. It says:

"Implementations that return more accurate statistics based on pushed operators will not improve query performance until the planner can push operators before getting stats."

Is this still the case? Looking through the code, it seems there is now functionality that explicitly expects the operators to be pushed down [2]. Is the documentation for SupportsReportStatistics referring to something other than [2], or should it be updated?
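
To make the ordering concern in the quoted sentence concrete, here is a minimal sketch. It does not use the real Spark DataSource V2 interfaces: ToyScan, pushFilter, and estimateNumRows are hypothetical stand-ins invented for illustration. It only shows why statistics requested before a filter is pushed down cannot reflect that filter, while statistics requested afterwards can.

```java
import java.util.List;
import java.util.function.Predicate;

// Hypothetical stand-ins for illustration only; NOT the real Spark DataSource V2 API.
public class StatsOrderingSketch {

    /** Toy source that holds rows and accumulates pushed-down filters. */
    static final class ToyScan {
        private final List<Integer> rows;
        private Predicate<Integer> pushed = r -> true; // nothing pushed yet

        ToyScan(List<Integer> rows) {
            this.rows = rows;
        }

        /** Assumed analogue of the planner pushing a filter into the source. */
        void pushFilter(Predicate<Integer> filter) {
            pushed = pushed.and(filter);
        }

        /** Assumed analogue of reporting statistics: rows left after pushed filters. */
        long estimateNumRows() {
            return rows.stream().filter(pushed).count();
        }
    }

    public static void main(String[] args) {
        List<Integer> data = List.of(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);

        // Case 1: statistics are requested before the filter is pushed.
        ToyScan statsFirst = new ToyScan(data);
        long before = statsFirst.estimateNumRows();
        statsFirst.pushFilter(r -> r > 8);

        // Case 2: the filter is pushed first, then statistics are requested.
        ToyScan pushFirst = new ToyScan(data);
        pushFirst.pushFilter(r -> r > 8);
        long after = pushFirst.estimateNumRows();

        System.out.println("rows reported before pushdown: " + before); // 10
        System.out.println("rows reported after pushdown:  " + after);  // 2
    }
}
```

In this toy model the more accurate estimate only helps if the consumer of the statistics asks for them after pushdown, which is the situation the question above is about.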

[1] https://spark.apache.org/docs/2.4.6/api/java/org/apache/spark/sql/sources/v2/reader/SupportsReportStatistics.html

[2] https://github.com/apache/spark/blob/d0800fc8e2e71a79bf0f72c3e4bc608ae34053e7/sql/catalyst/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Relation.scala#L86

[3] https://spark.apache.org/docs/3.0.0-preview/api/java/org/apache/spark/sql/connector/read/SupportsReportStatistics.html


