Posted to issues-all@impala.apache.org by "Tim Armstrong (JIRA)" <ji...@apache.org> on 2018/11/02 01:10:00 UTC

[jira] [Commented] (IMPALA-7785) GROUP BY clause not analyzed prior to rewrite step

    [ https://issues.apache.org/jira/browse/IMPALA-7785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16672439#comment-16672439 ] 

Tim Armstrong commented on IMPALA-7785:
---------------------------------------

Seems similar to IMPALA-7083

> GROUP BY clause not analyzed prior to rewrite step
> --------------------------------------------------
>
>                 Key: IMPALA-7785
>                 URL: https://issues.apache.org/jira/browse/IMPALA-7785
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Frontend
>    Affects Versions: Impala 3.0
>            Reporter: Paul Rogers
>            Priority: Minor
>
> The FE fails to analyze a {{GROUP BY}} clause prior to invoking the rewrite rules, causing the rules to fail to do any rewrites.
> For the {{SELECT}} list, the analyzer processes each expression and marks it as analyzed.
> The rewrite rules, however, tend to skip unanalyzed nodes. (And, according to IMPALA-7754, nodes often are not re-analyzed after a rewrite.)
> Consider this simple query:
> {code:sql}
> SELECT case when string_col is not null then string_col else 'foo' end
> FROM functional.alltypestiny
> GROUP BY case when string_col is not null then string_col else 'foo' end
> {code}
> This query works. Now use the new feature in IMPALA-7655 with a query that will be rewritten to the above:
> {code:sql}
> SELECT coalesce(string_col, 'foo')
> FROM functional.alltypes
> GROUP BY coalesce(string_col, 'foo')
> {code}
> The above is rewritten using the new conditional function rewrite rules. Result:
> {noformat}
> org.apache.impala.common.AnalysisException:
>   select list expression not produced by aggregation output
>   (missing from GROUP BY clause?):
>   CASE WHEN string_col IS NOT NULL THEN string_col ELSE 'foo' END
> {noformat}
> The reason is the check used in multiple rewrite rules:
> {code:java}
>   public Expr apply(Expr expr, Analyzer analyzer) throws AnalysisException {
>     if (!expr.isAnalyzed()) return expr;
> {code}
> Step through the code: the {{coalesce()}} expression in the {{SELECT}} clause is analyzed, but the one in the {{GROUP BY}} is not. This creates a problem because SQL semantics require the identical expression in both clauses for them to match. (It also means no other rewrite rules, at least not those with this check, are invoked, leading to an unintended code path.)
> This query makes it a bit clearer:
> {code:sql}
> SELECT 1 + 2
> FROM functional.alltypestiny
> GROUP BY 1 + 2
> {code}
> This works. But, if we use test code to inspect the "rewritten" {{GROUP BY}}, we find that it is still at "1 + 2" while the {{SELECT}} expression has been rewritten to "3".
> Seems that, when working with rewrites, we must be very careful because, as the code currently is written, we rewrite some clauses but not others. Then, we have to know when it is safe to have the SELECT clause differ from the GROUP BY clause. (Looks like it is OK for constants to differ, but not for functions...)
> VERY confusing, would be better to just fix the darn thing.
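
The mismatch described in the quoted report can be illustrated with a small standalone sketch. It is not Impala code: ToyExpr, apply() and selectMatchesGroupBy() are hypothetical stand-ins, and the only part taken from the report is the guard that returns an unanalyzed expression unchanged. Assuming the SELECT expression is analyzed and the GROUP BY expression is not, the same coalesce() text ends up in two different forms:

```java
// Toy model only; none of these types are Impala's real classes.
class RewriteSkipSketch {

  // Hypothetical stand-in for an expression node: its SQL text plus an "analyzed" flag.
  static final class ToyExpr {
    final String sql;
    final boolean analyzed;

    ToyExpr(String sql, boolean analyzed) {
      this.sql = sql;
      this.analyzed = analyzed;
    }
  }

  static final String COALESCE_SQL = "coalesce(string_col, 'foo')";
  static final String CASE_SQL =
      "CASE WHEN string_col IS NOT NULL THEN string_col ELSE 'foo' END";

  // Mirrors the guard quoted in the report: unanalyzed expressions are skipped.
  static ToyExpr apply(ToyExpr expr) {
    if (!expr.analyzed) return expr;
    if (expr.sql.equals(COALESCE_SQL)) {
      // Assumed rewrite for illustration: coalesce(...) becomes a CASE expression.
      return new ToyExpr(CASE_SQL, true);
    }
    return expr;
  }

  // Simplified matching check: the two clauses must carry identical expression text.
  static boolean selectMatchesGroupBy(ToyExpr selectExpr, ToyExpr groupByExpr) {
    return selectExpr.sql.equals(groupByExpr.sql);
  }

  public static void main(String[] args) {
    // SELECT expression was analyzed; GROUP BY expression was not.
    ToyExpr selectExpr = apply(new ToyExpr(COALESCE_SQL, true));
    ToyExpr groupByExpr = apply(new ToyExpr(COALESCE_SQL, false));

    System.out.println("SELECT:   " + selectExpr.sql);
    System.out.println("GROUP BY: " + groupByExpr.sql);
    System.out.println("match:    " + selectMatchesGroupBy(selectExpr, groupByExpr));

    // If both clauses are analyzed before the rewrite, both are rewritten alike.
    ToyExpr analyzedGroupBy = apply(new ToyExpr(COALESCE_SQL, true));
    System.out.println("match when both analyzed: "
        + selectMatchesGroupBy(selectExpr, analyzedGroupBy));
  }
}
```

In this toy model the fix is simply to mark the GROUP BY expression as analyzed before the rule runs. Whether the real fix belongs in the analyzer or in the rewrite rules is not something the sketch can decide.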



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
