Posted to issues@spark.apache.org by "Dominic Ricard (JIRA)" <ji...@apache.org> on 2017/06/12 20:50:00 UTC

[jira] [Created] (SPARK-21067) Thrift Server - CTAS fail with Unable to move source

Dominic Ricard created SPARK-21067:
--------------------------------------

             Summary: Thrift Server - CTAS fail with Unable to move source
                 Key: SPARK-21067
                 URL: https://issues.apache.org/jira/browse/SPARK-21067
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 2.1.1
         Environment: Yarn, Hive MetaStore, HDFS (HA)
            Reporter: Dominic Ricard


After upgrading our Thrift cluster to 2.1.1, we ran into an issue where CTAS (CREATE TABLE AS SELECT) fails intermittently.

Most of the time, the CTAS works only once after starting the Thrift server. After that, dropping the table and re-issuing the same CTAS fails with the following message (sometimes it fails right away, sometimes it works for a long period of time):

{noformat}
Error: org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to move source hdfs://nameservice1//tmp/hive-staging/thrift_hive_2017-06-12_16-56-18_464_7598877199323198104-31/-ext-10000/part-00000 to destination hdfs://nameservice1/user/hive/warehouse/dricard.db/test/part-00000; (state=,code=0)
{noformat}
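
The sequence described above can be sketched roughly as follows. This is only an illustration: the SELECT and the column name are assumptions, not the exact statement we ran, and the table name is taken from the destination path in the error.

{noformat}
-- Hypothetical statement, for illustration only.
-- First run after starting the Thrift server: usually succeeds.
CREATE TABLE dricard.test AS SELECT 1 AS col1;

-- Drop the table, then re-issue the same CTAS.
DROP TABLE dricard.test;

-- Second run: this is where "Unable to move source" is typically raised.
CREATE TABLE dricard.test AS SELECT 1 AS col1;
{noformat}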

We have already found the following Jira (https://issues.apache.org/jira/browse/SPARK-11021), which states that {{hive.exec.stagingdir}} had to be added in order for Spark to handle CREATE TABLE properly as of 2.0. As you can see in the error, we have ours set to "/tmp/hive-staging/\{user.name\}".

The same issue occurs with INSERT statements:
{noformat}
CREATE TABLE IF NOT EXISTS dricard.test (col1 int); INSERT INTO TABLE dricard.test SELECT 1;
Error: org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to move source hdfs://nameservice1/tmp/hive-staging/thrift_hive_2017-06-12_20-41-12_964_3086448130033637241-16/-ext-10000/part-00000 to destination hdfs://nameservice1/user/hive/warehouse/dricard.db/test/part-00000; (state=,code=0)
{noformat}

This worked fine in 1.6.2, which we currently run in our production environment, but since 2.0+ we haven't been able to run CREATE TABLE consistently on the cluster.



