Posted to issues@spark.apache.org by "Sujith Chacko (Jira)" <ji...@apache.org> on 2019/08/30 13:47:00 UTC
[jira] [Commented] (SPARK-28930) Spark DESC FORMATTED TABLENAME information display issues
[ https://issues.apache.org/jira/browse/SPARK-28930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16919563#comment-16919563 ]
Sujith Chacko commented on SPARK-28930:
---------------------------------------
[~jobitmathew] As I remember, Issue 3 was already handled as part of SPARK-24812 some time back; I need to recheck. I will check the other issues and get back to you.
cc [~dongjoon]
> Spark DESC FORMATTED TABLENAME information display issues
> ---------------------------------------------------------
>
> Key: SPARK-28930
> URL: https://issues.apache.org/jira/browse/SPARK-28930
> Project: Spark
> Issue Type: Bug
> Components: Spark Shell, SQL
> Affects Versions: 2.4.3
> Reporter: jobit mathew
> Priority: Minor
>
> Spark DESC FORMATTED TABLENAME has information display issues: it shows an incorrect *Last Access* time, and some of the other information could be displayed better.
> Test steps:
> 1. Open spark-sql
> 2. Create a partitioned table
> CREATE EXTERNAL TABLE IF NOT EXISTS employees_info_extended ( id INT, name STRING, usd_flag STRING, salary DOUBLE, deductions MAP<STRING, DOUBLE>, address STRING ) PARTITIONED BY (entrytime STRING) STORED AS TEXTFILE location 'hdfs://hacluster/user/sparkhive/warehouse';
> 3. From spark-sql, check the table description
> desc formatted tablename;
> 4. From the Scala shell, check the table description
> sql("desc formatted tablename").show()
> *Issue1:*
> If a column has no comment, the Spark Scala shell shows *"null" in lowercase*, but Hive beeline, Spark beeline and Spark SQL all show *"NULL" in capitals*. It would be better to show the same value everywhere.
>
> *scala>* sql("desc formatted employees_info_extended").show(false);
> +----------------------------+----------------------------+-------+
> |col_name                    |data_type                   |comment|
> +----------------------------+----------------------------+-------+
> |id                          |int                         |null   |
> |name                        |string                      |null   |
> |usd_flag                    |string                      |null   |
> |salary                      |double                      |null   |
> |deductions                  |map<string,double>          |null   |
> |address                     |string                      |null   |
> |entrytime                   |string                      |null   |
> |# Partition Information     |                            |       |
> |# col_name                  |data_type                   |comment|
> |entrytime                   |string                      |null   |
> |                            |                            |       |
> |# Detailed Table Information|                            |       |
> |Database                    |sparkdb__                   |       |
> |Table                       |employees_info_extended     |       |
> |Owner                       |root                        |       |
> |Created Time                |Tue Aug 20 13:42:06 CST 2019|       |
> |Last Access                 |Thu Jan 01 08:00:00 CST 1970|       |
> |Created By                  |Spark 2.4.3                 |       |
> |Type                        |EXTERNAL                    |       |
> |Provider                    |hive                        |       |
> +----------------------------+----------------------------+-------+
> only showing top 20 rows
> *scala>*
> *Issue 2:*
> In Spark SQL, "desc formatted tablename" does not show the header (# col_name, data_type, comment) at the top of the query result, although the header is shown above the partition description. For better readability, show the header at the top of the query result as well.
> *spark-sql>* desc formatted employees_info_extended1;
> id int *NULL*
> name string *NULL*
> usd_flag string NULL
> salary double NULL
> deductions map<string,double> NULL
> address string NULL
> entrytime string NULL
> *# Partition Information*
> *# col_name data_type comment*
> entrytime string *NULL*
> # Detailed Table Information
> Database sparkdb__
> Table employees_info_extended1
> Owner spark
> *Created Time Tue Aug 20 14:50:37 CST 2019*
> *Last Access Thu Jan 01 08:00:00 CST 1970*
> Created By Spark 2.3.2.0201
> Type EXTERNAL
> Provider hive
> Table Properties [transient_lastDdlTime=1566286655]
> Location hdfs://hacluster/user/sparkhive/warehouse
> Serde Library org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
> InputFormat org.apache.hadoop.mapred.TextInputFormat
> OutputFormat org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
> Storage Properties [serialization.format=1]
> Partition Provider Catalog
> Time taken: 0.477 seconds, Fetched 27 row(s)
> *spark-sql>*
>
> *Issue 3:*
> I created the table on Aug 20, so the Created Time is correct, *but Last Access shows 1970 Jan 01*. Showing a last access time that is earlier than the created time is misleading. It would be better to show the correct date and time, or else show UNKNOWN.
> *[Created Time,Tue Aug 20 13:42:06 CST 2019,]*
> *[Last Access,Thu Jan 01 08:00:00 CST 1970,]*
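
The 1970 value in Issue 3 is what an unset last access time looks like when it is formatted as a date. Below is a minimal Scala sketch of that effect and of the UNKNOWN fallback the reporter asks for. It assumes the unset value is stored as epoch 0 and that the session runs at UTC+8, as the 08:00:00 CST output suggests. The object and method names are hypothetical, and this is not Spark's actual code.

```scala
import java.time.{Instant, ZoneOffset}
import java.time.format.DateTimeFormatter

// Hypothetical helper, for illustration only (not part of Spark).
object LastAccessSketch {
  // Assumption: a fixed UTC+8 offset, inferred from "08:00:00 CST" in the report.
  private val offset: ZoneOffset = ZoneOffset.ofHours(8)

  // Formats the epoch value as-is, which is how an unset time ends up in 1970.
  def renderRaw(epochMillis: Long): String =
    DateTimeFormatter.ISO_OFFSET_DATE_TIME.format(
      Instant.ofEpochMilli(epochMillis).atOffset(offset))

  // Treats zero or negative values as "never recorded" and reports UNKNOWN.
  def renderLastAccess(epochMillis: Long): String =
    if (epochMillis <= 0L) "UNKNOWN" else renderRaw(epochMillis)

  def main(args: Array[String]): Unit = {
    println(renderRaw(0L))         // the epoch itself, shifted to 08:00 by the offset
    println(renderLastAccess(0L))  // UNKNOWN
    // A positive timestamp, reusing transient_lastDdlTime from the report as sample input.
    println(renderLastAccess(1566286655L * 1000L))
  }
}
```

Treating every non-positive value as unset is a design choice in this sketch. Whether the catalog uses 0, -1, or both as its "unset" marker would need to be checked against SPARK-24812.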