Posted to issues@spark.apache.org by "Dongjoon Hyun (JIRA)" <ji...@apache.org> on 2019/04/11 15:56:00 UTC

[jira] [Resolved] (SPARK-27403) Failed to update the table size automatically even though spark.sql.statistics.size.autoUpdate.enabled is set as true

     [ https://issues.apache.org/jira/browse/SPARK-27403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dongjoon Hyun resolved SPARK-27403.
-----------------------------------
       Resolution: Fixed
         Assignee: Sujith Chacko
    Fix Version/s: 3.0.0

This is resolved via https://github.com/apache/spark/pull/24315

> Failed to update the table size automatically even though spark.sql.statistics.size.autoUpdate.enabled is set as true
> ---------------------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-27403
>                 URL: https://issues.apache.org/jira/browse/SPARK-27403
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.3.0, 2.3.1, 2.3.2, 2.3.3, 2.4.0, 2.4.1
>            Reporter: Sujith Chacko
>            Assignee: Sujith Chacko
>            Priority: Major
>             Fix For: 3.0.0
>
>
> The system should update the table statistics automatically when the user sets spark.sql.statistics.size.autoUpdate.enabled to true. Currently this property has no effect, whether it is enabled or disabled. This feature is similar to Hive's auto-gather feature, where statistics are computed automatically by default when the feature is enabled.
> Reference:
> [https://cwiki.apache.org/confluence/display/Hive/StatsDev]
> Reproducing steps:
> scala> spark.sql("create table table1 (name string,age int) stored as parquet")
> scala> spark.sql("insert into table1 select 'a',29")
>  res2: org.apache.spark.sql.DataFrame = []
> scala> spark.sql("desc extended table1").show(false)
> +----------------------------+--------------------------------------------------------------+-------+
> |col_name                    |data_type                                                     |comment|
> +----------------------------+--------------------------------------------------------------+-------+
> |name                        |string                                                        |null   |
> |age                         |int                                                           |null   |
> |                            |                                                              |       |
> |# Detailed Table Information|                                                              |       |
> |Database                    |default                                                       |       |
> |Table                       |table1                                                        |       |
> |Owner                       |Administrator                                                 |       |
> |Created Time                |Sun Apr 07 23:41:56 IST 2019                                  |       |
> |Last Access                 |Thu Jan 01 05:30:00 IST 1970                                  |       |
> |Created By                  |Spark 2.4.1                                                   |       |
> |Type                        |MANAGED                                                       |       |
> |Provider                    |hive                                                          |       |
> |Table Properties            |[transient_lastDdlTime=1554660716]                            |       |
> |Location                    |file:/D:/spark-2.4.1-bin-hadoop2.7/bin/spark-warehouse/table1 |       |
> |Serde Library               |org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe   |       |
> |InputFormat                 |org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat |       |
> |OutputFormat                |org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat|       |
> |Storage Properties          |[serialization.format=1]                                      |       |
> |Partition Provider          |Catalog                                                       |       |
> +----------------------------+--------------------------------------------------------------+-------+
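
The output above contains no row describing the table size, which is the symptom reported. A minimal sketch of how one might check for it programmatically follows. It rests on several assumptions: that it is pasted into spark-shell (with :paste) on a build with Hive support, that table1 does not already exist, and that DESC EXTENDED reports collected statistics in a row whose col_name is "Statistics". It is only an illustration of the check, not the test that accompanies the fix.

```scala
import org.apache.spark.sql.SparkSession

// In spark-shell this returns the session the shell already created;
// run standalone, it builds a local one. Hive support is requested
// because "stored as parquet" is Hive-style DDL.
val spark = SparkSession.builder()
  .master("local[*]")
  .enableHiveSupport()
  .getOrCreate()

// The property named in the issue title, set at runtime for this session.
spark.conf.set("spark.sql.statistics.size.autoUpdate.enabled", "true")

// Same steps as in the report (assumes table1 does not exist yet).
spark.sql("create table table1 (name string, age int) stored as parquet")
spark.sql("insert into table1 select 'a', 29")

// Assumption: statistics, when present, show up as a "Statistics" row.
val statsRows = spark.sql("desc extended table1")
  .filter("col_name = 'Statistics'")
  .collect()

if (statsRows.isEmpty) {
  println("No Statistics row: the table size was not updated automatically")
} else {
  statsRows.foreach(r => println(s"Statistics: ${r.getString(1)}"))
}
```

On the affected versions listed above, the report implies the first branch is taken even with the property enabled; on a build that includes the fix, a Statistics row would be expected after the insert.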


