Posted to issues@hive.apache.org by "Eric Wohlstadter (JIRA)" <ji...@apache.org> on 2018/10/30 02:24:00 UTC

[jira] [Reopened] (HIVE-20552) Get Schema from LogicalPlan faster

     [ https://issues.apache.org/jira/browse/HIVE-20552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Wohlstadter reopened HIVE-20552:
-------------------------------------

Reopened. The patch is failing for me with:
{code:java}
Caused by: java.lang.NullPointerException: null
        at org.apache.hadoop.hive.ql.parse.QBSubQuery.<init>(QBSubQuery.java:489)
        at org.apache.hadoop.hive.ql.parse.SubQueryUtils.buildSubQuery(SubQueryUtils.java:249)
        at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.subqueryRestrictionCheck(CalcitePlanner.java:3141)
        at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genSubQueryRelNode(CalcitePlanner.java:3322)
        at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genFilterRelNode(CalcitePlanner.java:3379)
        at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genFilterLogicalPlan(CalcitePlanner.java:3434)
        at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genLogicalPlan(CalcitePlanner.java:4970)
        at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1722)
        at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1670)
        at org.apache.calcite.tools.Frameworks$1.apply(Frameworks.java:118)
        at org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:1052)
        at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:154)
        at org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:111)
        at org.apache.hadoop.hive.ql.parse.CalcitePlanner.logicalPlan(CalcitePlanner.java:1431)
        at org.apache.hadoop.hive.ql.parse.CalcitePlanner.genLogicalPlan(CalcitePlanner.java:393)
        at org.apache.hadoop.hive.ql.parse.ParseUtils.parseQueryAndGetSchema(ParseUtils.java:554)
        at org.apache.hadoop.hive.ql.udf.generic.GenericUDTFGetSplits.createPlanFragment(GenericUDTFGetSplits.java:254)
        at org.apache.hadoop.hive.ql.udf.generic.GenericUDTFGetSplits.process(GenericUDTFGetSplits.java:206)
        at org.apache.hadoop.hive.ql.exec.UDTFOperator.process(UDTFOperator.java:116)
        at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:994)
        at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:940)
        at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:927)
        at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:95)
        at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:994)
        at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:940)
        at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:125)
        at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:519)
        at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:511)
        at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:146)
        ... 16 more{code}

> Get Schema from LogicalPlan faster
> ----------------------------------
>
>                 Key: HIVE-20552
>                 URL: https://issues.apache.org/jira/browse/HIVE-20552
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Teddy Choi
>            Assignee: Teddy Choi
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 4.0.0, 3.2.0
>
>         Attachments: HIVE-20552.1.patch, HIVE-20552.2.patch, HIVE-20552.3.patch
>
>
> Getting the schema of a query currently requires compiling, optimizing, and generating a TezPlan, which adds unnecessary overhead when only the LogicalPlan is needed. To get the schema faster:
> 1. Copy the method {{HiveMaterializedViewsRegistry.parseQuery}}, making it {{public static}} and putting it in a utility class.
> 2. Change the return statement of the method to {{return analyzer.getResultSchema();}}
> 3. Change the return type of the method to {{List<FieldSchema>}}
> 4. Call the new method from {{GenericUDTFGetSplits.createPlanFragment}} replacing the current code which does this:
> {code}
>  if(num == 0) {
>  //Schema only
>  return new PlanFragment(null, schema, null);
>  }
> {code}
> moving the call earlier in {{getPlanFragment}}, right after the HiveConf is created, so that it bypasses the code that uses {{HiveTxnManager}} and {{Driver}}.
> 5. Convert the {{List<FieldSchema>}} to {{org.apache.hadoop.hive.llap.Schema}}.
> 6. Return {{new PlanFragment(null, schema, null)}} from {{getPlanFragment}}.
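
The control flow proposed in steps 4 to 6 can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the attached patches: {{FieldSchema}}, {{Schema}} and {{PlanFragment}} are simplified stand-in classes rather than the real Hive types, {{convertSchema}} is a hypothetical helper name, and {{parseQueryAndGetSchema}} (the method name that appears in the stack trace above) is stubbed with fixed columns instead of invoking the planner.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SchemaOnlySketch {

    // Stand-in for Hive's FieldSchema (column name and type only).
    static final class FieldSchema {
        final String name;
        final String type;
        FieldSchema(String name, String type) { this.name = name; this.type = type; }
    }

    // Stand-in for org.apache.hadoop.hive.llap.Schema; simplified to a list of strings.
    static final class Schema {
        final List<String> columns;
        Schema(List<String> columns) { this.columns = columns; }
    }

    // Stand-in for PlanFragment; only the schema slot is filled on the schema-only path.
    static final class PlanFragment {
        final Object work;
        final Schema schema;
        final Object jobConf;
        PlanFragment(Object work, Schema schema, Object jobConf) {
            this.work = work;
            this.schema = schema;
            this.jobConf = jobConf;
        }
    }

    // Steps 1-3 (stubbed): the real method would build the logical plan and
    // return analyzer.getResultSchema(). Fixed columns are an assumption here.
    static List<FieldSchema> parseQueryAndGetSchema(String query) {
        return Arrays.asList(new FieldSchema("id", "int"), new FieldSchema("name", "string"));
    }

    // Step 5 (hypothetical helper): convert List<FieldSchema> to the LLAP-style schema.
    static Schema convertSchema(List<FieldSchema> fields) {
        List<String> columns = new ArrayList<>();
        for (FieldSchema f : fields) {
            columns.add(f.name + ":" + f.type);
        }
        return new Schema(columns);
    }

    // Steps 4 and 6: when num == 0 only the schema is wanted, so return early,
    // before any transaction manager or Driver work would happen.
    static PlanFragment createPlanFragment(String query, int num) {
        if (num == 0) {
            Schema schema = convertSchema(parseQueryAndGetSchema(query));
            return new PlanFragment(null, schema, null);
        }
        throw new UnsupportedOperationException("full compile path omitted in this sketch");
    }

    public static void main(String[] args) {
        PlanFragment fragment = createPlanFragment("select id, name from t", 0);
        System.out.println(fragment.schema.columns);
    }
}
```

The point of the early return is that the schema-only request never reaches the compile, optimize, and TezPlan generation path; any failure in the logical-plan step, such as the NullPointerException reported above, would surface inside the stubbed method.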



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)