You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@spark.apache.org by "Cheng Lian (JIRA)" <ji...@apache.org> on 2014/11/28 09:30:12 UTC

[jira] [Updated] (SPARK-4645) Asynchronous execution in HiveThriftServer2 with Hive 0.13.1 doesn't play well with Simba ODBC driver

     [ https://issues.apache.org/jira/browse/SPARK-4645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Cheng Lian updated SPARK-4645:
------------------------------
    Description: 
Hive 0.13.1 enables asynchronous execution for {{SQLOperation}} by default. So does Spark SQL HiveThriftServer2 when built with Hive 0.13.1. This works well for normal JDBC clients like BeeLine, but throws exception when using Simba ODBC driver v0.1.0000.

Simba ODBC driver tries to execute two statement while connecting to Spark SQL HiveThriftServer2:

- {{use `default`}}
- {{set -v}}

However, HiveThriftServer2 throws exception when executing them:
{code}
14/11/28 15:18:37 ERROR SparkExecuteStatementOperation: Error executing query:
org.apache.spark.sql.execution.QueryExecutionException: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Java heap space
	at org.apache.spark.sql.hive.HiveContext.runHive(HiveContext.scala:309)
	at org.apache.spark.sql.hive.HiveContext.runSqlHive(HiveContext.scala:276)
	at org.apache.spark.sql.hive.execution.NativeCommand.sideEffectResult$lzycompute(NativeCommand.scala:35)
	at org.apache.spark.sql.hive.execution.NativeCommand.sideEffectResult(NativeCommand.scala:35)
	at org.apache.spark.sql.execution.Command$class.execute(commands.scala:46)
	at org.apache.spark.sql.hive.execution.NativeCommand.execute(NativeCommand.scala:30)
	at org.apache.spark.sql.SQLContext$QueryExecution.toRdd$lzycompute(SQLContext.scala:425)
	at org.apache.spark.sql.SQLContext$QueryExecution.toRdd(SQLContext.scala:425)
	at org.apache.spark.sql.SchemaRDDLike$class.$init$(SchemaRDDLike.scala:58)
	at org.apache.spark.sql.SchemaRDD.<init>(SchemaRDD.scala:108)
	at org.apache.spark.sql.hive.HiveContext.sql(HiveContext.scala:94)
	at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$runInternal(Shim13.scala:84)
	at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(Shim13.scala:224)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
	at org.apache.hadoop.hive.shims.HadoopShimsSecure.doAs(HadoopShimsSecure.java:493)
	at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1.run(Shim13.scala:234)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)
14/11/28 15:18:37 ERROR SparkExecuteStatementOperation: Error running hive query: 
org.apache.hive.service.cli.HiveSQLException: org.apache.spark.sql.execution.QueryExecutionException: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Java heap space
	at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$runInternal(Shim13.scala:104)
	at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(Shim13.scala:224)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
	at org.apache.hadoop.hive.shims.HadoopShimsSecure.doAs(HadoopShimsSecure.java:493)
	at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1.run(Shim13.scala:234)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)
{code}

  was:
Hive 0.13.1 enables asynchronous execution for {{SQLOperation}} by default. So does Spark SQL HiveThriftServer2 when built with Hive 0.13.1. This works well for normal JDBC clients like BeeLine, but throws exception when using Simba ODBC driver.

Simba ODBC driver tries to execute two statement while connecting to Spark SQL HiveThriftServer2:

- {{use `default`}}
- {{set -v}}

However, HiveThriftServer2 throws exception when executing them:
{code}
14/11/28 15:18:37 ERROR SparkExecuteStatementOperation: Error executing query:
org.apache.spark.sql.execution.QueryExecutionException: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Java heap space
	at org.apache.spark.sql.hive.HiveContext.runHive(HiveContext.scala:309)
	at org.apache.spark.sql.hive.HiveContext.runSqlHive(HiveContext.scala:276)
	at org.apache.spark.sql.hive.execution.NativeCommand.sideEffectResult$lzycompute(NativeCommand.scala:35)
	at org.apache.spark.sql.hive.execution.NativeCommand.sideEffectResult(NativeCommand.scala:35)
	at org.apache.spark.sql.execution.Command$class.execute(commands.scala:46)
	at org.apache.spark.sql.hive.execution.NativeCommand.execute(NativeCommand.scala:30)
	at org.apache.spark.sql.SQLContext$QueryExecution.toRdd$lzycompute(SQLContext.scala:425)
	at org.apache.spark.sql.SQLContext$QueryExecution.toRdd(SQLContext.scala:425)
	at org.apache.spark.sql.SchemaRDDLike$class.$init$(SchemaRDDLike.scala:58)
	at org.apache.spark.sql.SchemaRDD.<init>(SchemaRDD.scala:108)
	at org.apache.spark.sql.hive.HiveContext.sql(HiveContext.scala:94)
	at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$runInternal(Shim13.scala:84)
	at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(Shim13.scala:224)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
	at org.apache.hadoop.hive.shims.HadoopShimsSecure.doAs(HadoopShimsSecure.java:493)
	at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1.run(Shim13.scala:234)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)
14/11/28 15:18:37 ERROR SparkExecuteStatementOperation: Error running hive query: 
org.apache.hive.service.cli.HiveSQLException: org.apache.spark.sql.execution.QueryExecutionException: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Java heap space
	at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$runInternal(Shim13.scala:104)
	at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(Shim13.scala:224)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
	at org.apache.hadoop.hive.shims.HadoopShimsSecure.doAs(HadoopShimsSecure.java:493)
	at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1.run(Shim13.scala:234)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)
{code}


> Asynchronous execution in HiveThriftServer2 with Hive 0.13.1 doesn't play well with Simba ODBC driver
> -----------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-4645
>                 URL: https://issues.apache.org/jira/browse/SPARK-4645
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>            Reporter: Cheng Lian
>            Priority: Blocker
>
> Hive 0.13.1 enables asynchronous execution for {{SQLOperation}} by default. So does Spark SQL HiveThriftServer2 when built with Hive 0.13.1. This works well for normal JDBC clients like BeeLine, but throws exception when using Simba ODBC driver v0.1.0000.
> Simba ODBC driver tries to execute two statement while connecting to Spark SQL HiveThriftServer2:
> - {{use `default`}}
> - {{set -v}}
> However, HiveThriftServer2 throws exception when executing them:
> {code}
> 14/11/28 15:18:37 ERROR SparkExecuteStatementOperation: Error executing query:
> org.apache.spark.sql.execution.QueryExecutionException: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Java heap space
> 	at org.apache.spark.sql.hive.HiveContext.runHive(HiveContext.scala:309)
> 	at org.apache.spark.sql.hive.HiveContext.runSqlHive(HiveContext.scala:276)
> 	at org.apache.spark.sql.hive.execution.NativeCommand.sideEffectResult$lzycompute(NativeCommand.scala:35)
> 	at org.apache.spark.sql.hive.execution.NativeCommand.sideEffectResult(NativeCommand.scala:35)
> 	at org.apache.spark.sql.execution.Command$class.execute(commands.scala:46)
> 	at org.apache.spark.sql.hive.execution.NativeCommand.execute(NativeCommand.scala:30)
> 	at org.apache.spark.sql.SQLContext$QueryExecution.toRdd$lzycompute(SQLContext.scala:425)
> 	at org.apache.spark.sql.SQLContext$QueryExecution.toRdd(SQLContext.scala:425)
> 	at org.apache.spark.sql.SchemaRDDLike$class.$init$(SchemaRDDLike.scala:58)
> 	at org.apache.spark.sql.SchemaRDD.<init>(SchemaRDD.scala:108)
> 	at org.apache.spark.sql.hive.HiveContext.sql(HiveContext.scala:94)
> 	at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$runInternal(Shim13.scala:84)
> 	at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(Shim13.scala:224)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:415)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
> 	at org.apache.hadoop.hive.shims.HadoopShimsSecure.doAs(HadoopShimsSecure.java:493)
> 	at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1.run(Shim13.scala:234)
> 	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> 	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> 	at java.lang.Thread.run(Thread.java:745)
> 14/11/28 15:18:37 ERROR SparkExecuteStatementOperation: Error running hive query: 
> org.apache.hive.service.cli.HiveSQLException: org.apache.spark.sql.execution.QueryExecutionException: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Java heap space
> 	at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$runInternal(Shim13.scala:104)
> 	at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(Shim13.scala:224)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:415)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
> 	at org.apache.hadoop.hive.shims.HadoopShimsSecure.doAs(HadoopShimsSecure.java:493)
> 	at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1.run(Shim13.scala:234)
> 	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> 	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> 	at java.lang.Thread.run(Thread.java:745)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org