Posted to dev@hive.apache.org by "Gopal V (JIRA)" <ji...@apache.org> on 2013/12/07 10:54:36 UTC
[jira] [Commented] (HIVE-5979) Failure in cast to timestamps.

    [ https://issues.apache.org/jira/browse/HIVE-5979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13842163#comment-13842163 ] 

Gopal V commented on HIVE-5979:
-------------------------------

(Pasted from an email)

The nanosecond SQL timestamp handling in Java (java.sql.Timestamp) is horribly broken for usability.

https://github.com/apache/hive/blob/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorUDFTimestampFieldLong.java#L52

Read my comments there on how it handles negative timestamps and sub-second values.

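Because of the way integer division works in Java, you can end up rounding towards zero, so the remainder of a negative value is itself negative. This causes hell with the restriction that the argument to setNanos() must never be negative (it has to be between 0 and 999999999).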
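
To make the failure mode concrete, here is a minimal Java sketch (not the Hive code; the class and method names are made up for illustration) of what truncating division does to a negative nanosecond count, and how setNanos() reacts to the resulting remainder.

```java
import java.sql.Timestamp;

// Illustrative only: names here are hypothetical, not taken from Hive.
final class TruncatingDivisionDemo {
    static final long NANOS_PER_SEC = 1_000_000_000L;

    // Java's / truncates towards zero ...
    static long naiveSeconds(long epochNanos) {
        return epochNanos / NANOS_PER_SEC;
    }

    // ... so % returns a remainder with the sign of the dividend.
    static long naiveNanos(long epochNanos) {
        return epochNanos % NANOS_PER_SEC;
    }

    // Returns true if Timestamp.setNanos() rejects the given value.
    static boolean setNanosRejects(int nanos) {
        Timestamp ts = new Timestamp(0L);
        try {
            ts.setNanos(nanos);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        long t = -1_500_000_000L; // 1.5 seconds before the epoch
        System.out.println(naiveSeconds(t));                      // -1
        System.out.println(naiveNanos(t));                        // -500000000
        System.out.println(setNanosRejects((int) naiveNanos(t))); // true
    }
}
```

The negative remainder is exactly the kind of value that triggers "nanos > 999999999 or < 0" in the stack trace quoted below.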

On top of that, it always uses one long and one integer to store the time (Unix-epoch seconds + nanos). The millisecond fraction is stored in the nanos field, so setNanos() always overwrites the millisecond fraction of the time, which is why the getNanos() value is added back to it.
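
A small sketch of that overwrite, assuming only the documented java.sql.Timestamp behaviour (the helper names are hypothetical): calling setNanos() on its own discards the millisecond fraction, while adding the existing getNanos() value keeps it.

```java
import java.sql.Timestamp;

// Illustrative only: helper names are hypothetical.
final class SetNanosOverwriteDemo {
    // setNanos() replaces the whole sub-second part, milliseconds included.
    static long millisAfterSetNanos(long epochMillis, int nanos) {
        Timestamp ts = new Timestamp(epochMillis);
        ts.setNanos(nanos);
        return ts.getTime();
    }

    // Adding getNanos() first preserves the millisecond fraction.
    // Assumes the sum stays below 1,000,000,000.
    static long millisAfterAddingNanos(long epochMillis, int extraNanos) {
        Timestamp ts = new Timestamp(epochMillis);
        ts.setNanos(ts.getNanos() + extraNanos);
        return ts.getTime();
    }

    public static void main(String[] args) {
        System.out.println(millisAfterSetNanos(1500L, 123));    // 1000
        System.out.println(millisAfterAddingNanos(1500L, 123)); // 1500
    }
}
```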

http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/6-b14/java/sql/Timestamp.java#Timestamp.setTime%28long%29

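That makes sense, until you realize that a negative millisecond time is stored as one second less (rounded towards negative infinity) plus a positive nanosecond value. For example, -1 ms is held as -1 second plus 999,000,000 nanos.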
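
The representation can be checked directly. This is a sketch with hypothetical helper names; it relies only on the Timestamp(long) constructor, getNanos() and getTime().

```java
import java.sql.Timestamp;

// Illustrative only: helper names are hypothetical.
final class NegativeMillisDemo {
    // The nanos field java.sql.Timestamp derives from a millisecond time.
    static int nanosFor(long epochMillis) {
        return new Timestamp(epochMillis).getNanos();
    }

    // getTime() recombines the whole seconds and the nanos field.
    static long roundTrip(long epochMillis) {
        return new Timestamp(epochMillis).getTime();
    }

    public static void main(String[] args) {
        System.out.println(nanosFor(-1L));     // 999000000
        System.out.println(nanosFor(-1500L));  // 500000000
        System.out.println(nanosFor(-1000L));  // 0
        System.out.println(roundTrip(-1500L)); // -1500
    }
}
```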

So when you mix that with the negative modulo in Java, you end up with a fairly ugly kludge which needs to take care of several edge cases related to the java.sql.Timestamp implementation.
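
For illustration, one way such a conversion could look is sketched below. This is an assumption about the general shape, not the code in TimestampUtils or the actual fix for this issue; fromEpochNanos is a hypothetical name. The idea is to floor the seconds and normalize the remainder before calling setNanos().

```java
import java.sql.Timestamp;

// Illustrative sketch only, not the Hive implementation.
final class NanoTimestampSketch {
    static final long NANOS_PER_SEC = 1_000_000_000L;

    // Converts nanoseconds since the Unix epoch (may be negative) to a Timestamp.
    static Timestamp fromEpochNanos(long epochNanos) {
        long secs = epochNanos / NANOS_PER_SEC;   // truncates towards zero
        long nanos = epochNanos % NANOS_PER_SEC;  // may be negative
        if (nanos < 0) {
            // Borrow one second so the fraction lands in [0, 999999999].
            secs -= 1;
            nanos += NANOS_PER_SEC;
        }
        // Whole seconds only, so the constructor leaves the nanos field at 0.
        Timestamp ts = new Timestamp(secs * 1000L);
        ts.setNanos((int) nanos);
        return ts;
    }

    public static void main(String[] args) {
        Timestamp ts = fromEpochNanos(-1L); // 1 ns before the epoch
        System.out.println(ts.getNanos());  // 999999999
        System.out.println(ts.getTime());   // -1
    }
}
```

On Java 8 and later, Math.floorDiv and Math.floorMod express the same normalization without the explicit branch.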

> Failure in cast to timestamps.
> ------------------------------
>
>                 Key: HIVE-5979
>                 URL: https://issues.apache.org/jira/browse/HIVE-5979
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Jitendra Nath Pandey
>            Assignee: Jitendra Nath Pandey
>
> Query ran:
> {code}
> select cast(t as timestamp), cast(si as timestamp),
>        cast(i as timestamp), cast(b as timestamp),
>        cast(f as string), cast(d as timestamp),
>        cast(bo as timestamp), cast(b * 0 as timestamp),
>        cast(ts as timestamp), cast(s as timestamp),
>        cast(substr(s, 1, 1) as timestamp)
> from Table1;
> {code}
> Running this query with hive.vectorized.execution.enabled=true fails with the following exception:
> {noformat}
> 13/12/05 07:56:36 ERROR tez.TezJobMonitor: Status: Failed
> Vertex failed, vertexName=Map 1, vertexId=vertex_1386227234886_0482_1_00, diagnostics=[Task failed, taskId=task_1386227234886_0482_1_00_000000, diagnostics=[AttemptID:attempt_1386227234886_0482_1_00_000000_0 Info:Error: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row
>         at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.processRow(MapRecordProcessor.java:205)
>         at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:171)
>         at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:112)
>         at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:201)
>         at org.apache.hadoop.mapred.YarnTezDagChild$4.run(YarnTezDagChild.java:484)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:396)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
>         at org.apache.hadoop.mapred.YarnTezDagChild.main(YarnTezDagChild.java:474)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row
>         at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:45)
>         at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.processRow(MapRecordProcessor.java:193)
>         ... 8 more
> Caused by: java.lang.IllegalArgumentException: nanos > 999999999 or < 0
>         at java.sql.Timestamp.setNanos(Timestamp.java:383)
>         at org.apache.hadoop.hive.ql.exec.vector.TimestampUtils.assignTimeInNanoSec(TimestampUtils.java:27)
>         at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpressionWriterFactory$1.writeValue(VectorExpressionWriterFactory.java:412)
>         at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpressionWriterFactory$VectorExpressionWriterLong.writeValue(VectorExpressionWriterFactory.java:162)
>         at org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch.toString(VectorizedRowBatch.java:152)
>         at org.apache.hadoop.hive.ql.exec.vector.VectorFileSinkOperator.processOp(VectorFileSinkOperator.java:85)
>         at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:786)
>         at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:129)
>         at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:786)
>         at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:93)
>         at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:786)
>         at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:43)
>         ... 9 more
> {noformat}
> Full log is attached.
> Schema for the table is as follows:
> {code}
> hive> desc Table1;
> OK
> t                   	tinyint             	from deserializer
> si                  	smallint            	from deserializer
> i                   	int                 	from deserializer
> b                   	bigint              	from deserializer
> f                   	float               	from deserializer
> d                   	double              	from deserializer
> bo                  	boolean             	from deserializer
> s                   	string              	from deserializer
> s2                  	string              	from deserializer
> ts                  	timestamp           	from deserializer
> Time taken: 0.521 seconds, Fetched: 10 row(s)
> {code}


