Posted to issues@spark.apache.org by "Sean Owen (JIRA)" <ji...@apache.org> on 2016/12/23 13:35:58 UTC

[jira] [Resolved] (SPARK-18879) Spark SQL support for Hive hooks regressed

     [ https://issues.apache.org/jira/browse/SPARK-18879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Owen resolved SPARK-18879.
-------------------------------
    Resolution: Not A Problem

To my understanding, this is not functionality that Spark supports.

> Spark SQL support for Hive hooks regressed
> ------------------------------------------
>
>                 Key: SPARK-18879
>                 URL: https://issues.apache.org/jira/browse/SPARK-18879
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.0.0, 2.0.2
>            Reporter: Atul Payapilly
>
> As per the following stack trace from this post (run on Spark 1.3.1): http://ihorbobak.com/index.php/2015/05/08/113/
> hive.exec.pre.hooks Class not found:org.apache.hadoop.hive.ql.hooks.ATSHook
> FAILED: Hive Internal Error: java.lang.ClassNotFoundException(org.apache.hadoop.hive.ql.hooks.ATSHook)
> java.lang.ClassNotFoundException: org.apache.hadoop.hive.ql.hooks.ATSHook
>     at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>     at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
>     at java.lang.Class.forName0(Native Method)
>     at java.lang.Class.forName(Class.java:270)
>     at org.apache.hadoop.hive.ql.hooks.HookUtils.getHooks(HookUtils.java:59)
>     at org.apache.hadoop.hive.ql.Driver.getHooks(Driver.java:1172)
>     at org.apache.hadoop.hive.ql.Driver.getHooks(Driver.java:1156)
>     at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1206)
>     at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1088)
>     at org.apache.hadoop.hive.ql.Driver.run(Driver.java:911)
>     at org.apache.hadoop.hive.ql.Driver.run(Driver.java:901)
>     at org.apache.spark.sql.hive.HiveContext.runHive(HiveContext.scala:318)
>     at org.apache.spark.sql.hive.HiveContext.runSqlHive(HiveContext.scala:290)
>     at org.apache.spark.sql.hive.execution.HiveNativeCommand.run(HiveNativeCommand.scala:33)
>     at org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult$lzycompute(commands.scala:54)
>     at org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult(commands.scala:54)
>     at org.apache.spark.sql.execution.ExecutedCommand.execute(commands.scala:64)
>     at org.apache.spark.sql.SQLContext$QueryExecution.toRdd$lzycompute(SQLContext.scala:1099)
>     at org.apache.spark.sql.SQLContext$QueryExecution.toRdd(SQLContext.scala:1099)
>     at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:147)
>     at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:130)
>     at org.apache.spark.sql.DataFrame$.apply(DataFrame.scala:51)
>     at org.apache.spark.sql.hive.HiveContext.sql(HiveContext.scala:101)
>     at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.run(Shim13.scala:164)
>     at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:231)
>     at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:218)
>     at org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:233)
>     at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:344)
>     at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1313)
>     at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1298)
>     at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>     at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
>     at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:55)
>     at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:744)
> It looks like Spark used to rely on the Hive Driver for execution and therefore supported Hive hooks. The current code path does not go through the Hive Driver, and support for Hive hooks has regressed. This is problematic: for example, there is no longer any way to tell which partitions were updated as part of a query. (Sketches of the hook API involved, and of the closest callback Spark 2.x still offers, follow below.)
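>
> For reference, the hooks in question are Hive's pre/post execution hooks: classes named in hive.exec.pre.hooks (or hive.exec.post.hooks) that implement Hive's ExecuteWithHookContext interface and that the Hive Driver instantiates and invokes around query execution. A minimal sketch in Scala (the class and package names here are hypothetical, chosen only for illustration):
>
>     package com.example.hooks  // hypothetical package
>
>     import org.apache.hadoop.hive.ql.hooks.{ExecuteWithHookContext, HookContext}
>     import scala.collection.JavaConverters._
>
>     // Hive calls run() before query execution when this class is listed
>     // in hive.exec.pre.hooks (after execution for hive.exec.post.hooks).
>     class LoggingPreHook extends ExecuteWithHookContext {
>       override def run(context: HookContext): Unit = {
>         // The write entities are the tables/partitions the query touches,
>         // which is exactly the information the report says is lost.
>         val outputs = context.getOutputs.asScala.map(_.getName).mkString(", ")
>         System.err.println(s"pre-hook fired, outputs = [$outputs]")
>       }
>     }
>
> Such a hook would then be enabled with, e.g., SET hive.exec.pre.hooks=com.example.hooks.LoggingPreHook. On Spark 1.3.1 it ran (or failed with the ClassNotFoundException above when the class was missing) because the statement went through the Hive Driver; on 2.x, per this report, the setting has no effect.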
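>
> On the Spark side, the closest supported callback in 2.x is a QueryExecutionListener registered on the session's listener manager. It is not a replacement for Hive hooks: it fires after execution rather than before, and it exposes Spark's query plans rather than Hive's entities, so it still does not report which partitions were updated. As a sketch:
>
>     import org.apache.spark.sql.SparkSession
>     import org.apache.spark.sql.execution.QueryExecution
>     import org.apache.spark.sql.util.QueryExecutionListener
>
>     val spark = SparkSession.builder().enableHiveSupport().getOrCreate()
>
>     spark.listenerManager.register(new QueryExecutionListener {
>       override def onSuccess(funcName: String, qe: QueryExecution,
>                              durationNs: Long): Unit = {
>         // qe.logical and qe.executedPlan expose the plans that ran.
>         System.err.println(s"$funcName succeeded in ${durationNs / 1e6} ms")
>       }
>       override def onFailure(funcName: String, qe: QueryExecution,
>                              exception: Exception): Unit = {
>         System.err.println(s"$funcName failed: ${exception.getMessage}")
>       }
>     })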



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org