Posted to user@spark.apache.org by tonsat <to...@gmail.com> on 2014/09/30 18:45:16 UTC

timestamp not implemented yet

We have installed Spark 1.1 in standalone master mode. When we try to access
a Parquet-format table in which one of the fields is defined as timestamp, we
get the error below. Based on the information provided at spark.apache.org,
Spark supports Parquet and pretty much all Hive datatypes, including
timestamp.
Any help is appreciated.
BTW, the Spark server is up and running with no issues, and a table in
TextFile format with a timestamp field works fine.

14/09/30 12:27:07 WARN thrift.ThriftCLIService: Error fetching results:
org.apache.hive.service.cli.HiveSQLException: java.lang.UnsupportedOperationException: timestamp not implemented yet
        at org.apache.spark.sql.hive.thriftserver.server.SparkSQLOperationManager$$anon$1.run(SparkSQLOperationManager.scala:201)
        at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:193)
        at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatement(HiveSessionImpl.java:175)
        at org.apache.hive.service.cli.CLIService.executeStatement(CLIService.java:150)
        at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:207)
        at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1133)
        at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1118)
        at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
        at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
        at org.apache.hive.service.auth.TUGIContainingProcessor$1.run(TUGIContainingProcessor.java:58)
        at org.apache.hive.service.auth.TUGIContainingProcessor$1.run(TUGIContainingProcessor.java:55)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
        at org.apache.hadoop.hive.shims.HadoopShimsSecure.doAs(HadoopShimsSecure.java:526)
        at org.apache.hive.service.auth.TUGIContainingProcessor.process(TUGIContainingProcessor.java:55)
        at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
14/09/30 12:27:07 INFO cli.CLIService: SessionHandle [32f1c10a-ae27-4759-b2ff-0b8cc0321222]: closeSession()




--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/timestamp-not-implemented-yet-tp15414.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
For additional commands, e-mail: user-help@spark.apache.org

Re: timestamp not implemented yet

Posted by Michael Armbrust <mi...@databricks.com>.

That is a pretty reasonable workaround.  Also, please feel free to file a
JIRA when you find gaps in functionality like this that are impacting your
workloads:

https://issues.apache.org/jira/browse/SPARK/

On Wed, Oct 1, 2014 at 5:09 PM, barge.nilesh <ba...@gmail.com> wrote:

> Parquet format seems to be comparatively better for analytic load, it has
> performance & compression benefits for large analytic workload.
> A workaround could be to use long datatype to store epoch timestamp value.
> If you already have existing parquet files (impala tables) then you may need
> to consider doing some migration.

Re: timestamp not implemented yet

Posted by "barge.nilesh" <ba...@gmail.com>.

The Parquet format seems comparatively better for analytic loads: it has
performance and compression benefits for large analytic workloads.
A workaround could be to use the long datatype to store the epoch timestamp
value. If you already have existing Parquet files (Impala tables), you may
need to consider doing some migration.
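
To illustrate the workaround above, here is a minimal sketch of converting a
timestamp to an epoch value that fits in a long column, and back again. The
function names are hypothetical, and the choice of millisecond resolution and
of treating naive datetimes as UTC are assumptions, not something the thread
specifies. It uses only the Python standard library, not any Spark or Hive
API.

```python
from datetime import datetime, timedelta, timezone

# Reference point for epoch arithmetic (1970-01-01 00:00:00 UTC).
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_millis(ts: datetime) -> int:
    """Return milliseconds since the Unix epoch as an int (fits in a long)."""
    if ts.tzinfo is None:
        # Assumption: naive datetimes are interpreted as UTC.
        ts = ts.replace(tzinfo=timezone.utc)
    # Integer division of timedeltas avoids float rounding errors.
    return (ts - EPOCH) // timedelta(milliseconds=1)


def from_epoch_millis(millis: int) -> datetime:
    """Rebuild a UTC datetime from milliseconds since the Unix epoch."""
    return EPOCH + timedelta(milliseconds=millis)


if __name__ == "__main__":
    original = datetime(2014, 9, 30, 18, 45, 16, tzinfo=timezone.utc)
    stored = to_epoch_millis(original)   # value to write to the long column
    restored = from_epoch_millis(stored)  # value recovered when reading
    print(stored, restored.isoformat())
```

Storing the value this way keeps the column a plain long, so any reader of the
table has to apply the reverse conversion itself and agree on the resolution
and time zone.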




--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/timestamp-not-implemented-yet-tp15414p15571.html

Re: timestamp not implemented yet

Posted by tonsat <to...@gmail.com>.

Thank you, Nilesh. The tables were originally created in Impala, and we are
trying to access the same tables through spark-sql. Any idea which file format
is best for running Spark jobs if we have a date field?



--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/timestamp-not-implemented-yet-tp15414p15419.html

Re: timestamp not implemented yet

Posted by "barge.nilesh" <ba...@gmail.com>.

Spark 1.1 comes with Hive 0.12, and Hive 0.12 doesn't support the timestamp
datatype for the Parquet format.

https://cwiki.apache.org/confluence/display/Hive/Parquet#Parquet-Limitations





--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/timestamp-not-implemented-yet-tp15414p15417.html