Posted to issues@spark.apache.org by "philipse (Jira)" <ji...@apache.org> on 2020/05/14 10:31:00 UTC

[jira] [Updated] (SPARK-31710) result is not the same when querying and executing jobs

     [ https://issues.apache.org/jira/browse/SPARK-31710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

philipse updated SPARK-31710:
-----------------------------
    Description: 
Hi Team

Steps to reproduce.
{code:java}
create table test(id bigint);
insert into test select 1586318188000;
create table test1(id bigint) partitioned by (year string);
insert overwrite table test1 partition(year) select 234,cast(id as TIMESTAMP) from test;
{code}
Let's check the results.

Case 1:

*select * from test1;*

234 | 52238-06-04 13:06:400.0

--the result is wrong
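
For illustration only (a standalone JVM sketch I am adding here, not Spark code): the wrong value is what you get when the stored epoch milliseconds are interpreted as epoch seconds before being turned into microseconds:
{code:java}
import java.time.Instant

object CastArithmeticDemo {
  def main(args: Array[String]): Unit = {
    val id = 1586318188000L
    // Interpreted as seconds (what the cast assumes), the value lands in year 52238,
    // matching the wrong result above.
    println(Instant.ofEpochSecond(id))
    // Interpreted as milliseconds (what the data actually holds), it is April 2020:
    // 2020-04-08T03:56:28Z, i.e. 2020-04-08 11:56:28 in UTC+8.
    println(Instant.ofEpochMilli(id))
  }
}
{code}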

Case 2:

*select 234,cast(id as TIMESTAMP) from test;*

 

{code:java}
java.lang.IllegalArgumentException: Timestamp format must be yyyy-mm-dd hh:mm:ss[.fffffffff]
 at java.sql.Timestamp.valueOf(Timestamp.java:237)
 at org.apache.hive.jdbc.HiveBaseResultSet.evaluate(HiveBaseResultSet.java:441)
 at org.apache.hive.jdbc.HiveBaseResultSet.getColumnValue(HiveBaseResultSet.java:421)
 at org.apache.hive.jdbc.HiveBaseResultSet.getString(HiveBaseResultSet.java:530)
 at org.apache.hive.beeline.Rows$Row.<init>(Rows.java:166)
 at org.apache.hive.beeline.BufferedRows.<init>(BufferedRows.java:43)
 at org.apache.hive.beeline.BeeLine.print(BeeLine.java:1756)
 at org.apache.hive.beeline.Commands.execute(Commands.java:826)
 at org.apache.hive.beeline.Commands.sql(Commands.java:670)
 at org.apache.hive.beeline.BeeLine.dispatch(BeeLine.java:974)
 at org.apache.hive.beeline.BeeLine.execute(BeeLine.java:810)
 at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:767)
 at org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:480)
 at org.apache.hive.beeline.BeeLine.main(BeeLine.java:463)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:498)
 at org.apache.hadoop.util.RunJar.run(RunJar.java:226)
 at org.apache.hadoop.util.RunJar.main(RunJar.java:141)
Error: Unrecognized column type:TIMESTAMP_TYPE (state=,code=0)
{code}

 

I tried Hive; it works well, and the conversion is correct:
{code:java}
select 234,cast(id as TIMESTAMP) from test;
 234   2020-04-08 11:56:28
{code}
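
As a side note, a workaround sketch I am adding here (not from the original report): dividing the millisecond value by 1000 before the cast gives the intended result in Spark as well, assuming the usual behaviour that casting a fractional number to TIMESTAMP treats it as seconds:
{code:java}
import org.apache.spark.sql.SparkSession

object MillisCastWorkaround {
  def main(args: Array[String]): Unit = {
    // Hive support is needed to see the `test` table created above.
    val spark = SparkSession.builder()
      .appName("millis-cast-workaround")
      .enableHiveSupport()
      .getOrCreate()

    // `id / 1000` turns epoch milliseconds into (fractional) epoch seconds,
    // which the cast then interprets correctly.
    spark.sql("SELECT 234, CAST(id / 1000 AS TIMESTAMP) FROM test").show(false)
    // Expected (session time zone dependent): 234 | 2020-04-08 11:56:28

    spark.stop()
  }
}
{code}
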
Two questions:

Q1:

If we forbid this conversion, should we keep the behavior consistent across all cases?

Q2:

If we allow the conversion in some cases, should we validate the magnitude of the long value? The code seems to always multiply by 1,000,000 (seconds to microseconds) no matter how large the input is; if the input cannot represent a sensible timestamp, we could raise an error instead (a possible check is sketched after the snippet below).
{code:java}
// converting seconds to us
private[this] def longToTimestamp(t: Long): Long = t * 1000000L
{code}
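
A minimal sketch of what such a check could look like (my own illustration with an assumed year-9999 cutoff, not Spark's actual code):
{code:java}
import java.time.Instant

object CheckedTimestampCast {
  // Cutoff chosen only for illustration: the largest epoch-seconds value that
  // still falls within year 9999.
  private val MaxSeconds: Long = Instant.parse("9999-12-31T23:59:59Z").getEpochSecond

  // Same seconds-to-microseconds conversion as longToTimestamp above, but it
  // rejects values that cannot plausibly be epoch seconds instead of silently
  // producing a far-future timestamp.
  def longToTimestampChecked(t: Long): Long = {
    if (t > MaxSeconds || t < -MaxSeconds) {
      throw new IllegalArgumentException(
        s"$t is out of range for epoch seconds; is the value actually in milliseconds?")
    }
    t * 1000000L
  }
}
{code}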
 

Thanks!

 

> result is not the same when querying and executing jobs
> -------------------------------------------------------
>
>                 Key: SPARK-31710
>                 URL: https://issues.apache.org/jira/browse/SPARK-31710
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.4.5
>         Environment: hdp:2.7.7
> spark:2.4.5
>            Reporter: philipse
>            Priority: Major
>


