Posted to issues@spark.apache.org by "Arun (JIRA)" <ji...@apache.org> on 2018/02/09 13:44:00 UTC

[jira] [Commented] (SPARK-12378) CREATE EXTERNAL TABLE AS SELECT EXPORT AWS S3 ERROR

    [ https://issues.apache.org/jira/browse/SPARK-12378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16358393#comment-16358393 ] 

Arun commented on SPARK-12378:
------------------------------

I am also getting the same issue when trying to insert data into Hive from Spark.

My table is an external table stored in AWS S3.

The data does get inserted into the table, but the operation prints this message:

{code:java}
-chgrp: '' does not match expected pattern for group
Usage: hadoop fs [generic options] -chgrp [-R] GROUP PATH...
18/02/09 13:25:56 ERROR KeyProviderCache: Could not find uri with key [dfs.encryption.key.provider.uri] to create a keyProvider !!
-chgrp: '' does not match expected pattern for group
Usage: hadoop fs [generic options] -chgrp [-R] GROUP PATH...{code}
Any resolution, please?
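
For context, the -chgrp failures appear to come from Hive's permission inheritance during loadTable: it runs hadoop fs -chgrp on the written files to match the warehouse group, and S3 objects have no POSIX group, so the group string is empty. The KeyProviderCache error is, as far as I know, benign noise when HDFS transparent encryption is not configured. Below is a minimal sketch of a possible workaround for Spark 1.x, assuming Hive's hive.warehouse.subdir.inherit.perms property takes effect through HiveContext.setConf (the table names are hypothetical):

{code:java}
// Sketch only: disable Hive's warehouse permission inheritance so that
// loadTable stops issuing -chgrp/-chmod against S3 paths, which have no
// POSIX group. hive.warehouse.subdir.inherit.perms is a standard Hive
// property; whether setConf reaches the Hive client on this code path
// is an assumption to verify on your cluster.
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

val sc = new SparkContext(new SparkConf().setAppName("s3-insert"))
val hive = new HiveContext(sc)
hive.setConf("hive.warehouse.subdir.inherit.perms", "false")

// The insert itself is unchanged; only the permission chatter should stop.
// my_s3_table and staging are hypothetical table names.
hive.sql("INSERT INTO TABLE my_s3_table SELECT * FROM staging")
{code}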

> CREATE EXTERNAL TABLE AS SELECT EXPORT AWS S3 ERROR
> ---------------------------------------------------
>
>                 Key: SPARK-12378
>                 URL: https://issues.apache.org/jira/browse/SPARK-12378
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.5.2
>         Environment: AWS EMR 4.2.0
> Single master node, m3.xlarge
> Applications:
> Hive 1.0.0
> Spark 1.5.2
>            Reporter: CESAR MICHELETTI
>            Priority: Major
>
> I receive the error below when trying to export data to AWS S3 in spark-sql.
> Command:
> CREATE EXTERNAL TABLE export
>   ROW FORMAT DELIMITED FIELDS TERMINATED BY '\054'
>   -- LINES TERMINATED BY '\n'
>   STORED AS TEXTFILE
>   LOCATION 's3://xxx/yyy'
> AS
> SELECT
>   xxx
>   ....
>   (complete query)
> ;
> Error:
> -chgrp: '' does not match expected pattern for group
> Usage: hadoop fs [generic options] -chgrp [-R] GROUP PATH...
> -chgrp: '' does not match expected pattern for group
> Usage: hadoop fs [generic options] -chgrp [-R] GROUP PATH...
> 15/12/16 21:09:25 ERROR SparkSQLDriver: Failed in [CREATE external TABLE csvexport
> ...
> (create table + query)
> ...
> java.lang.reflect.InvocationTargetException
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:606)
>         at org.apache.spark.sql.hive.client.Shim_v0_14.loadTable(HiveShim.scala:441)
>         at org.apache.spark.sql.hive.client.ClientWrapper$$anonfun$loadTable$1.apply$mcV$sp(ClientWrapper.scala:489)
>         at org.apache.spark.sql.hive.client.ClientWrapper$$anonfun$loadTable$1.apply(ClientWrapper.scala:489)
>         at org.apache.spark.sql.hive.client.ClientWrapper$$anonfun$loadTable$1.apply(ClientWrapper.scala:489)
>         at org.apache.spark.sql.hive.client.ClientWrapper$$anonfun$withHiveState$1.apply(ClientWrapper.scala:256)
>         at org.apache.spark.sql.hive.client.ClientWrapper.retryLocked(ClientWrapper.scala:211)
>         at org.apache.spark.sql.hive.client.ClientWrapper.withHiveState(ClientWrapper.scala:248)
>         at org.apache.spark.sql.hive.client.ClientWrapper.loadTable(ClientWrapper.scala:488)
>         at org.apache.spark.sql.hive.execution.InsertIntoHiveTable.sideEffectResult$lzycompute(InsertIntoHiveTable.scala:243)
>         at org.apache.spark.sql.hive.execution.InsertIntoHiveTable.sideEffectResult(InsertIntoHiveTable.scala:127)
>         at org.apache.spark.sql.hive.execution.InsertIntoHiveTable.doExecute(InsertIntoHiveTable.scala:263)
>         at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:140)
>         at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:138)
>         at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:147)
>         at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:138)
>         at org.apache.spark.sql.SQLContext$QueryExecution.toRdd$lzycompute(SQLContext.scala:933)
>         at org.apache.spark.sql.SQLContext$QueryExecution.toRdd(SQLContext.scala:933)
>         at org.apache.spark.sql.hive.execution.CreateTableAsSelect.run(CreateTableAsSelect.scala:89)
>         at org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult$lzycompute(commands.scala:57)
>         at org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult(commands.scala:57)
>         at org.apache.spark.sql.execution.ExecutedCommand.doExecute(commands.scala:69)
>         at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:140)
>         at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:138)
>         at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:147)
>         at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:138)
>         at org.apache.spark.sql.SQLContext$QueryExecution.toRdd$lzycompute(SQLContext.scala:933)
>         at org.apache.spark.sql.SQLContext$QueryExecution.toRdd(SQLContext.scala:933)
>         at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:144)
>         at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:129)
>         at org.apache.spark.sql.DataFrame$.apply(DataFrame.scala:51)
>         at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:725)
>         at org.apache.spark.sql.hive.thriftserver.SparkSQLDriver.run(SparkSQLDriver.scala:62)
>         at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.processCmd(SparkSQLCLIDriver.scala:308)
>         at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
>         at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$.main(SparkSQLCLIDriver.scala:226)
>         at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.main(SparkSQLCLIDriver.scala)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:606)
>         at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:674)
>         at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
>         at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
>         at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:120)
>         at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to alter table. Invalid method name: 'alter_table_with_cascade'
>         at org.apache.hadoop.hive.ql.metadata.Hive.alterTable(Hive.java:500)
>         at org.apache.hadoop.hive.ql.metadata.Hive.alterTable(Hive.java:484)
>         at org.apache.hadoop.hive.ql.metadata.Hive.loadTable(Hive.java:1668)
>         ... 49 more
> Caused by: org.apache.thrift.TApplicationException: Invalid method name: 'alter_table_with_cascade'
>         at org.apache.thrift.TApplicationException.read(TApplicationException.java:111)
>         at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:71)
>         at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_alter_table_with_cascade(ThriftHiveMetastore.java:1374)
>         at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.alter_table_with_cascade(ThriftHiveMetastore.java:1358)
>         at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.alter_table(HiveMetaStoreClient.java:340)
>         at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.alter_table(SessionHiveMetaStoreClient.java:251)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:606)
>         at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:156)
>         at com.sun.proxy.$Proxy29.alter_table(Unknown Source)
>         at org.apache.hadoop.hive.ql.metadata.Hive.alterTable(Hive.java:496)
>         ... 51 more
> Detail:
> The exported file is generated in the S3 bucket, but the process terminates with the error above.
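
For reference, the root cause in the trace, org.apache.thrift.TApplicationException: Invalid method name: 'alter_table_with_cascade', usually indicates a client/metastore version mismatch: Spark 1.5's built-in Hive 1.2.1 client calls a Thrift method that the Hive 1.0.0 metastore shipped with EMR 4.2.0 does not implement. Here is a minimal sketch of pinning the client to the metastore's version, using Spark's documented spark.sql.hive.metastore.version and spark.sql.hive.metastore.jars settings (untested on EMR 4.2.0, so treat it as a starting point):

{code:java}
// Sketch only: have Spark 1.5 talk to the metastore with a Hive 1.0.0
// client instead of the built-in 1.2.1 one, so it no longer calls the
// newer alter_table_with_cascade Thrift method.
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

val conf = new SparkConf()
  .setAppName("ctas-to-s3")
  .set("spark.sql.hive.metastore.version", "1.0.0")
  // "maven" downloads matching Hive client jars at startup; pointing
  // spark.sql.hive.metastore.jars at local Hive 1.0.0 jars also works.
  .set("spark.sql.hive.metastore.jars", "maven")

val hiveContext = new HiveContext(new SparkContext(conf))
// Then run the CTAS from the report, e.g.:
// hiveContext.sql("CREATE EXTERNAL TABLE export ... LOCATION 's3://xxx/yyy' AS SELECT ...")
{code}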


