Posted to issues@drill.apache.org by "Arina Ielchiieva (JIRA)" <ji...@apache.org> on 2017/05/31 16:32:04 UTC

[jira] [Comment Edited] (DRILL-5538) Exclude ProjectRemoveRule during PHYSICAL phase if it comes from storage plugins

    [ https://issues.apache.org/jira/browse/DRILL-5538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16031464#comment-16031464 ] 

Arina Ielchiieva edited comment on DRILL-5538 at 5/31/17 4:31 PM:
------------------------------------------------------------------

[~jni] I have discussed this problem with the Calcite team (you can see the conversation in the CALCITE-1584 Jira). Basically, Julian is saying that
{quote}
Calcite doesn't promise to retain field names. 95% of the time it can and does, but 5% of the time it really can't retain field names without seriously compromising the Volcano engine.
{quote}
I agree that removing this rule from Drill right away might seem too drastic a decision. Regarding the impact on the JDBC storage plugin, I am not sure what it would be; from the code I can't tell whether the plugin needs this rule or not. I guess we might need to think of some other options here.
For example, would it make sense to separate alias assignment into its own stage after the project (scan -> project -> alias)? I am not sure if that's possible, though. And I guess it would add overhead :)
Another option is to replace ProjectRemoveRule in the JDBC plugin with a DrillProjectRemoveRule. It would be the same as the Calcite rule, except that its isTrivial method would consider not only row types but also column names when deciding whether a project is trivial. Actually, this method in Calcite previously considered column names too, but that logic was removed.
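
To make the difference concrete, here is a minimal, self-contained sketch of the two triviality checks. It does not use Calcite's API: ColumnField, isTrivialTypesOnly and isTrivialWithNames are hypothetical names invented for illustration, and a project is modelled simply as a list of output columns compared position by position with its input. The real rule works on project expressions and row types, so treat this only as an outline of the idea.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for a planner field: a column name plus a type name. */
final class ColumnField {
    final String name;
    final String type;

    ColumnField(String name, String type) {
        this.name = name;
        this.type = type;
    }
}

/** Illustration only; these are not Calcite or Drill classes. */
final class ProjectTrivialitySketch {

    /** Type-only check: column names are ignored, so a pure rename looks trivial. */
    static boolean isTrivialTypesOnly(List<ColumnField> input, List<ColumnField> output) {
        if (input.size() != output.size()) {
            return false;
        }
        for (int i = 0; i < input.size(); i++) {
            if (!input.get(i).type.equals(output.get(i).type)) {
                return false;
            }
        }
        return true;
    }

    /** Name-aware check: a project that renames a column is not trivial. */
    static boolean isTrivialWithNames(List<ColumnField> input, List<ColumnField> output) {
        if (!isTrivialTypesOnly(input, output)) {
            return false;
        }
        for (int i = 0; i < input.size(); i++) {
            if (!input.get(i).name.equals(output.get(i).name)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Models "select version as current_version from t".
        List<ColumnField> scan = Arrays.asList(new ColumnField("version", "VARCHAR"));
        List<ColumnField> project = Arrays.asList(new ColumnField("current_version", "VARCHAR"));

        System.out.println(isTrivialTypesOnly(scan, project)); // true: the alias would be lost
        System.out.println(isTrivialWithNames(scan, project)); // false: the project is kept
    }
}
```

With the name-aware check, the aliasing project from the example query would no longer be treated as removable, while a project that keeps both names and types would still be dropped.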



> Exclude ProjectRemoveRule during PHYSICAL phase if it comes from storage plugins
> --------------------------------------------------------------------------------
>
>                 Key: DRILL-5538
>                 URL: https://issues.apache.org/jira/browse/DRILL-5538
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Query Planning & Optimization
>    Affects Versions: 1.10.0
>            Reporter: Arina Ielchiieva
>            Assignee: Arina Ielchiieva
>
> When the [RDBMS storage plugin|https://drill.apache.org/docs/rdbms-storage-plugin/] is enabled, certain JDBC rules are added during query execution.
> One of the rules is [ProjectRemoveRule|https://github.com/apache/drill/blob/master/contrib/storage-jdbc/src/main/java/org/apache/drill/exec/store/jdbc/JdbcStoragePlugin.java#L140]. Drill also uses this rule, but only during phases where it considers it useful, for example LOGICAL and JOIN_PLANNING. Storage plugin rules, by contrast, are added to every phase of query planning. As a result, the project stage can be removed when it is actually needed.
> Sometimes ProjectRemoveRule decides that a project is trivial and removes it, even though during that stage Drill added a column alias or removed implicit columns.
> For example, with the RDBMS plugin enabled, the alias is not displayed for a simple query:
> {noformat}
> 0: jdbc:drill:zk=local> create temporary table t as select * from sys.version;
> SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
> SLF4J: Defaulting to no-operation (NOP) logger implementation
> SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
> +-----------+----------------------------+
> | Fragment  | Number of records written  |
> +-----------+----------------------------+
> | 0_0       | 1                          |
> +-----------+----------------------------+
> 1 row selected (0.623 seconds)
> 0: jdbc:drill:zk=local> select version as current_version from t;
> +------------------+
> |     version      |
> +------------------+
> | 1.11.0-SNAPSHOT  |
> +------------------+
> 1 row selected (0.28 seconds)
> {noformat}
> The proposed fix is to exclude ProjectRemoveRule during the PHYSICAL phase if it comes from storage plugins, to prevent Drill from losing column aliases or displaying implicit columns.
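
The proposed fix can be sketched as a phase-aware filter over the rules contributed by storage plugins. This is only an outline under stated assumptions: rules are represented by their names as plain strings, the Phase enum lists just the three phases named in the description, and PhaseRuleFilterSketch, storagePluginRulesFor and the "HypotheticalJdbcFilterRule" entry are made up for illustration and are not Drill classes or rules.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustration only; not Drill's planner code. */
final class PhaseRuleFilterSketch {

    /** Only the phases named in the issue description. */
    enum Phase { LOGICAL, JOIN_PLANNING, PHYSICAL }

    /** Returns the plugin rules to register for a phase, dropping ProjectRemoveRule in PHYSICAL. */
    static List<String> storagePluginRulesFor(Phase phase, List<String> pluginRules) {
        List<String> result = new ArrayList<>();
        for (String rule : pluginRules) {
            boolean excluded = phase == Phase.PHYSICAL && rule.equals("ProjectRemoveRule");
            if (!excluded) {
                result.add(rule);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // "HypotheticalJdbcFilterRule" is a placeholder name, not a real rule.
        List<String> pluginRules = Arrays.asList("ProjectRemoveRule", "HypotheticalJdbcFilterRule");

        System.out.println(storagePluginRulesFor(Phase.LOGICAL, pluginRules));
        System.out.println(storagePluginRulesFor(Phase.PHYSICAL, pluginRules));
    }
}
```

Other phases keep the full rule set, so the plugin's behaviour during LOGICAL and JOIN_PLANNING would be unchanged in this sketch.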



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)