Posted to issues@spark.apache.org by "Sujith Chacko (JIRA)" <ji...@apache.org> on 2019/04/07 18:31:00 UTC

[jira] [Commented] (SPARK-27403) Failed to update the table size automatically even though spark.sql.statistics.size.autoUpdate.enabled is set as true

    [ https://issues.apache.org/jira/browse/SPARK-27403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16811940#comment-16811940 ] 

Sujith Chacko commented on SPARK-27403:
---------------------------------------

I will analyze this further and raise a PR to handle the issue. Please let me know if you have any suggestions. Thanks.

> Failed to update the table size automatically even though spark.sql.statistics.size.autoUpdate.enabled is set as true
> ---------------------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-27403
>                 URL: https://issues.apache.org/jira/browse/SPARK-27403
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.4.1
>            Reporter: Sujith Chacko
>            Priority: Major
>
> The system should update the table statistics automatically when the user sets spark.sql.statistics.size.autoUpdate.enabled to true. Currently this property has no effect, whether it is enabled or disabled. This feature is similar to Hive's auto-gather feature, where statistics are computed automatically by default if the feature is enabled.
> Reference:
> [https://cwiki.apache.org/confluence/display/Hive/StatsDev]
> Reproducing steps:
> scala> spark.sql("create table table1 (name string,age int) stored as parquet")
> scala> spark.sql("insert into table1 select 'a',29")
>  res2: org.apache.spark.sql.DataFrame = []
> scala> spark.sql("desc extended table1").show(false)
> +----------------------------+--------------------------------------------------------------+-------+
> |col_name                    |data_type                                                     |comment|
> +----------------------------+--------------------------------------------------------------+-------+
> |name                        |string                                                        |null   |
> |age                         |int                                                           |null   |
> |                            |                                                              |       |
> |# Detailed Table Information|                                                              |       |
> |Database                    |default                                                       |       |
> |Table                       |table1                                                        |       |
> |Owner                       |Administrator                                                 |       |
> |Created Time                |Sun Apr 07 23:41:56 IST 2019                                  |       |
> |Last Access                 |Thu Jan 01 05:30:00 IST 1970                                  |       |
> |Created By                  |Spark 2.4.1                                                   |       |
> |Type                        |MANAGED                                                       |       |
> |Provider                    |hive                                                          |       |
> |Table Properties            |[transient_lastDdlTime=1554660716]                            |       |
> |Location                    |file:/D:/spark-2.4.1-bin-hadoop2.7/bin/spark-warehouse/table1 |       |
> |Serde Library               |org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe   |       |
> |InputFormat                 |org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat |       |
> |OutputFormat                |org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat|       |
> |Storage Properties          |[serialization.format=1]                                      |       |
> |Partition Provider          |Catalog                                                       |       |
> +----------------------------+--------------------------------------------------------------+-------+
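
To make the check in the reproducing steps explicit, the following is a minimal sketch of how the missing statistics might be confirmed and worked around. It assumes a spark-shell session (where `spark` is the predefined SparkSession) and that table1 from the steps above already exists. The inserted row values are illustrative only. The filter on a "Statistics" row is an assumption about how DESC EXTENDED reports size information. Running ANALYZE TABLE is offered as a possible workaround, not as the fix for this issue.

```scala
// Assumes spark-shell: `spark` is the predefined SparkSession,
// and table1 was created by the reproducing steps above.

// Enable automatic size updates for the current session.
spark.conf.set("spark.sql.statistics.size.autoUpdate.enabled", "true")

// Insert another row (values are illustrative only).
spark.sql("insert into table1 select 'b',30")

// If the size had been recorded, DESC EXTENDED would be expected to contain
// a "Statistics" row (assumed row name); an empty result suggests no stats.
spark.sql("desc extended table1").filter("col_name = 'Statistics'").show(false)

// Possible workaround: compute the table statistics explicitly, then re-check.
spark.sql("analyze table table1 compute statistics")
spark.sql("desc extended table1").filter("col_name = 'Statistics'").show(false)
```

Comparing the two filtered outputs should show whether the size is populated only after the explicit ANALYZE TABLE, which would match the behaviour reported here.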


