Posted to issues@hive.apache.org by "Daniel Dai (JIRA)" <ji...@apache.org> on 2018/09/28 07:44:00 UTC

[jira] [Updated] (HIVE-20652) JdbcStorageHandler push join of two different datasource to jdbc driver

     [ https://issues.apache.org/jira/browse/HIVE-20652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated HIVE-20652:
------------------------------
    Attachment: external_jdbc_table2.q

> JdbcStorageHandler push join of two different datasource to jdbc driver
> -----------------------------------------------------------------------
>
>                 Key: HIVE-20652
>                 URL: https://issues.apache.org/jira/browse/HIVE-20652
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Planning
>            Reporter: Daniel Dai
>            Priority: Major
>         Attachments: external_jdbc_table2.q
>
>
> Test case attached. The following query fails:
> {code}
> SELECT * FROM ext_auth1 JOIN ext_auth2 ON ext_auth1.ikey = ext_auth2.ikey
> {code}
> Error message:
> {code}
> 2018-09-28T00:36:23,860 DEBUG [17b954d9-3250-45a9-995e-1b3f8277a681 main] dao.GenericJdbcDatabaseAccessor: Query to execute is [SELECT *
> FROM (SELECT *
> FROM "SIMPLE_DERBY_TABLE1"
> WHERE "ikey" IS NOT NULL) AS "t"
> INNER JOIN (SELECT *
> FROM "SIMPLE_DERBY_TABLE2"
> WHERE "ikey" IS NOT NULL) AS "t0" ON "t"."ikey" = "t0"."ikey" {LIMIT 1}]
> 2018-09-28T00:36:23,864 ERROR [17b954d9-3250-45a9-995e-1b3f8277a681 main] dao.GenericJdbcDatabaseAccessor: Error while trying to get column names.
> java.sql.SQLSyntaxErrorException: Table/View 'SIMPLE_DERBY_TABLE2' does not exist.
>         at org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(Unknown Source) ~[derby-10.14.1.0.jar:?]
>         at org.apache.derby.impl.jdbc.Util.generateCsSQLException(Unknown Source) ~[derby-10.14.1.0.jar:?]
>         at org.apache.derby.impl.jdbc.TransactionResourceImpl.wrapInSQLException(Unknown Source) ~[derby-10.14.1.0.jar:?]
>         at org.apache.derby.impl.jdbc.TransactionResourceImpl.handleException(Unknown Source) ~[derby-10.14.1.0.jar:?]
>         at org.apache.derby.impl.jdbc.EmbedConnection.handleException(Unknown Source) ~[derby-10.14.1.0.jar:?]
>         at org.apache.derby.impl.jdbc.ConnectionChild.handleException(Unknown Source) ~[derby-10.14.1.0.jar:?]
>         at org.apache.derby.impl.jdbc.EmbedPreparedStatement.<init>(Unknown Source) ~[derby-10.14.1.0.jar:?]
>         at org.apache.derby.impl.jdbc.EmbedPreparedStatement42.<init>(Unknown Source) ~[derby-10.14.1.0.jar:?]
>         at org.apache.derby.jdbc.Driver42.newEmbedPreparedStatement(Unknown Source) ~[derby-10.14.1.0.jar:?]
>         at org.apache.derby.impl.jdbc.EmbedConnection.prepareStatement(Unknown Source) ~[derby-10.14.1.0.jar:?]
>         at org.apache.derby.impl.jdbc.EmbedConnection.prepareStatement(Unknown Source) ~[derby-10.14.1.0.jar:?]
>         at org.apache.commons.dbcp.DelegatingConnection.prepareStatement(DelegatingConnection.java:281) ~[commons-dbcp-1.4.jar:1.4]
>         at org.apache.commons.dbcp.PoolingDataSource$PoolGuardConnectionWrapper.prepareStatement(PoolingDataSource.java:313) ~[commons-dbcp-1.4.jar:1.4]
>         at org.apache.hive.storage.jdbc.dao.GenericJdbcDatabaseAccessor.getColumnNames(GenericJdbcDatabaseAccessor.java:74) [hive-jdbc-handler-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>         at org.apache.hive.storage.jdbc.JdbcSerDe.initialize(JdbcSerDe.java:78) [hive-jdbc-handler-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>         at org.apache.hadoop.hive.serde2.AbstractSerDe.initialize(AbstractSerDe.java:54) [hive-serde-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>         at org.apache.hadoop.hive.serde2.SerDeUtils.initializeSerDe(SerDeUtils.java:540) [hive-serde-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>         at org.apache.hadoop.hive.metastore.HiveMetaStoreUtils.getDeserializer(HiveMetaStoreUtils.java:90) [hive-metastore-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>         at org.apache.hadoop.hive.metastore.HiveMetaStoreUtils.getDeserializer(HiveMetaStoreUtils.java:77) [hive-metastore-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.metadata.Table.getDeserializerFromMetaStore(Table.java:295) [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.metadata.Table.getDeserializer(Table.java:277) [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genTablePlan(SemanticAnalyzer.java:11100) [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:11468) [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:11427) [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:525) [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12319) [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:356) [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:289) [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:669) [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1872) [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1819) [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1814) [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:126) [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:214) [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>         at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:239) [hive-cli-4.0.0-SNAPSHOT.jar:?]
>         at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:188) [hive-cli-4.0.0-SNAPSHOT.jar:?]
>         at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:402) [hive-cli-4.0.0-SNAPSHOT.jar:?]
>         at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:335) [hive-cli-4.0.0-SNAPSHOT.jar:?]
> {code}
> Hive is pushing the join into the JDBC driver even though the two tables refer to different data sources.
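> To illustrate the setup, the two external tables point at different Derby databases through JdbcStorageHandler, roughly as in the sketch below (table names, columns, URLs, and properties here are only illustrative; the actual definitions are in the attached external_jdbc_table2.q):
> {code}
> -- sketch only: two external tables backed by *different* JDBC data sources
> -- first Derby data source
> CREATE EXTERNAL TABLE ext_auth1 (ikey INT)
> STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler'
> TBLPROPERTIES (
>   "hive.sql.database.type" = "DERBY",
>   "hive.sql.jdbc.driver" = "org.apache.derby.jdbc.EmbeddedDriver",
>   "hive.sql.jdbc.url" = "jdbc:derby:;databaseName=db1;create=true",
>   "hive.sql.table" = "SIMPLE_DERBY_TABLE1"
> );
>
> -- second Derby data source, not visible from the first connection
> CREATE EXTERNAL TABLE ext_auth2 (ikey INT)
> STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler'
> TBLPROPERTIES (
>   "hive.sql.database.type" = "DERBY",
>   "hive.sql.jdbc.driver" = "org.apache.derby.jdbc.EmbeddedDriver",
>   "hive.sql.jdbc.url" = "jdbc:derby:;databaseName=db2;create=true",
>   "hive.sql.table" = "SIMPLE_DERBY_TABLE2"
> );
> {code}
> The pushed-down query shown in the log is then executed over a single JDBC connection, where only SIMPLE_DERBY_TABLE1 is visible, hence the "Table/View 'SIMPLE_DERBY_TABLE2' does not exist" error.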
> A related case occurs when I change the join query into a UNION ALL:
> {code}
> SELECT * FROM ext_auth1 UNION ALL SELECT * FROM ext_auth2
> {code}
> Calcite fails with an assertion error about the calling convention:
> {code}
> java.lang.AssertionError: Relational expression HepRelVertex#387 has calling-convention JDBC.DERBY but does not implement the required interface 'interface org.apache.calcite.adapter.jdbc.JdbcRel' of that convention
> 	at org.apache.calcite.plan.volcano.VolcanoPlanner.registerImpl(VolcanoPlanner.java:1475)
> 	at org.apache.calcite.plan.volcano.VolcanoPlanner.register(VolcanoPlanner.java:859)
> 	at org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:879)
> 	at org.apache.calcite.plan.volcano.VolcanoPlanner.changeTraits(VolcanoPlanner.java:544)
> 	at org.apache.calcite.plan.RelOptRule.convert(RelOptRule.java:572)
> 	at org.apache.calcite.plan.RelOptRule.lambda$convertList$2(RelOptRule.java:607)
> 	at com.google.common.collect.Lists$TransformingRandomAccessList$1.transform(Lists.java:640)
> 	at com.google.common.collect.TransformedIterator.next(TransformedIterator.java:48)
> 	at java.util.AbstractCollection.toArray(AbstractCollection.java:141)
> 	at com.google.common.collect.ImmutableList.copyOf(ImmutableList.java:239)
> 	at org.apache.calcite.rel.core.SetOp.<init>(SetOp.java:61)
> 	at org.apache.calcite.rel.core.Union.<init>(Union.java:43)
> 	at org.apache.calcite.adapter.jdbc.JdbcRules$JdbcUnion.<init>(JdbcRules.java:708)
> 	at org.apache.calcite.adapter.jdbc.JdbcRules$JdbcUnionRule.convert(JdbcRules.java:697)
> 	at org.apache.hadoop.hive.ql.optimizer.calcite.rules.jdbc.JDBCUnionPushDownRule.onMatch(JDBCUnionPushDownRule.java:80)
> 	at org.apache.calcite.plan.AbstractRelOptPlanner.fireRule(AbstractRelOptPlanner.java:315)
> 	at org.apache.calcite.plan.hep.HepPlanner.applyRule(HepPlanner.java:556)
> 	at org.apache.calcite.plan.hep.HepPlanner.applyRules(HepPlanner.java:415)
> 	at org.apache.calcite.plan.hep.HepPlanner.executeInstruction(HepPlanner.java:280)
> 	at org.apache.calcite.plan.hep.HepInstruction$RuleCollection.execute(HepInstruction.java:74)
> 	at org.apache.calcite.plan.hep.HepPlanner.executeProgram(HepPlanner.java:211)
> 	at org.apache.calcite.plan.hep.HepPlanner.findBestExp(HepPlanner.java:198)
> 	at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.hepPlan(CalcitePlanner.java:2348)
> 	at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1917)
> 	at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1670)
> 	at org.apache.calcite.tools.Frameworks$1.apply(Frameworks.java:118)
> 	at org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:1043)
> 	at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:154)
> 	at org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:111)
> 	at org.apache.hadoop.hive.ql.parse.CalcitePlanner.logicalPlan(CalcitePlanner.java:1429)
> 	at org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:476)
> 	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12319)
> 	at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:356)
> 	at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:289)
> 	at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:669)
> 	at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1872)
> 	at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1819)
> 	at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1814)
> 	at org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:126)
> 	at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:214)
> 	at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:239)
> 	at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:188)
> 	at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:402)
> 	at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:335)
> 	at org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:1397)
> 	at org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:1371)
> 	at org.apache.hadoop.hive.cli.control.CoreCliDriver.runTest(CoreCliDriver.java:182)
> 	at org.apache.hadoop.hive.cli.control.CliAdapter.runTest(CliAdapter.java:104)
> 	at org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver(TestMiniLlapLocalCliDriver.java:59)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:498)
> 	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
> 	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> 	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
> 	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> 	at org.apache.hadoop.hive.cli.control.CliAdapter$2$1.evaluate(CliAdapter.java:92)
> 	at org.junit.rules.RunRules.evaluate(RunRules.java:20)
> 	at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
> 	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
> 	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
> 	at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
> 	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
> 	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
> 	at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
> 	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
> 	at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
> 	at org.junit.runners.Suite.runChild(Suite.java:127)
> 	at org.junit.runners.Suite.runChild(Suite.java:26)
> 	at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
> 	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
> 	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
> 	at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
> 	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
> 	at org.apache.hadoop.hive.cli.control.CliAdapter$1$1.evaluate(CliAdapter.java:73)
> 	at org.junit.rules.RunRules.evaluate(RunRules.java:20)
> 	at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
> 	at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
> 	at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
> 	at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
> 	at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
> 	at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:379)
> 	at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:340)
> 	at org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:125)
> 	at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:413)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)