You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues-all@impala.apache.org by "Paul Rogers (JIRA)" <ji...@apache.org> on 2018/11/17 20:46:00 UTC

[jira] [Comment Edited] (IMPALA-7855) Excessive type widening leads to unnecessary casts

    [ https://issues.apache.org/jira/browse/IMPALA-7855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16690692#comment-16690692 ] 

Paul Rogers edited comment on IMPALA-7855 at 11/17/18 8:45 PM:
---------------------------------------------------------------

A related issue occurs in {{ExprRewriterTest.TestToSql()}} in the CTAS test. (This test will be made into a separate method, {{TestCTASToSql()}}). When run with the "integrated rewrite" feature enabled, we get into this odd situation:

 * Analyze the {{CreateTableAsSelect}} statement. Create a temporary copy of the associated {{SELECT}} statement.
 * Rewrite the {{SELECT}} statement from {{SELECT 1 + 1}} (both {{TINYINT}}) to {{SELECT 2}} (as type {{SMALLINT}}.)
 * Perform column substitutions, reset and reanalyze. Because the value is 2, it takes the type TINYINT.
 * Create the base table expressions using the newly rewritten value ({{TINYINT}}) though the result expression is still {{SMALLINT}}.
 * Use the base expressions from the above (type as {{TINY}}) to declare the target table column.
 * Now, try to map the result expression {{SMALLINT}} into the newly created table column {{TINYINT}}. Fails with a overflow error.

The problem here is in one, or both, of two places:

 * The constant folding rule changes the type of the rewritten expression to {{SMALLINT}} (a conservative guess), though the {{NumericLiteral.analyzeImpl()}} method will choose {{TINYINT}} for the same value.
 * The {{Expr.substituteImpl()}} method forces a re-analyze of numeric literals, losing the wider type created by the constant folding rule.

There is another inconsistency which has not (yet) manifested as a test failure (probably because we have no suitable tests):

 * After the above, the base table expression for a {{SELECT}} statement has one schema ({{TINYINT}}), the result expression has another ({{SMALLINT}}).

While the inconsistency in types may seem a minor issue, it does lead to analysis failures and does need to be addressed.

Perhaps two fixes are needed:

 * When rewriting a numeric literal in the constant folding rule, apply the rules from {{NumericLiteral}} to override the type guessed by the constant evaluation.
 * Modify the {{substituteImpl}} method to a) don't reset numeric literals, or, more generally, b) don't reset expressions that did not change (or their children did not change.)


was (Author: paul.rogers):
A related issue occurs in {{ExprRewriterTest.TestToSql()}} in the CTAS test. (This test will be made into a separate method, {{TestCTASToSql()}}). When run with the "integrated rewrite" feature enabled, we get into this odd situation:

* Analyze the CreateTableAsSelect statement. Create a temporary copy of the associated {{SELECT}} statement.
* Rewrite the {{SELECT}} statement from {{SELECT 1 + 1}} (both {{TINYINT}}) to {{SELECT 2}} (as type {{SMALLINT}}.)
* Perform column substitutions, reset and reanalyze. Because the value is 2, it takes the type TINYINT.
* Create the base table expressions using the newly rewritten value ({{TINYINT}}) though the result expression is still {{SMALLINT}}.
* Use the base expressions from the above (type as {{TINY}}) to declare the target table column.
* Now, try to map the result expression {{SMALLINT}} into the newly created table column {{TINYINT}}. Fails with a overflow error.

The problem here is in one, or both, of two places:

* The constant folding rule changes the type of the rewritten expression to {{SMALLINT}} (a conservative guess), though the {{NumericLiteral.analyzeImpl()}} method will choose {{TINYINT}} for the same value.
* The {{Expr.trySubstitute()} method forces a re-analyze of numeric literals, losing the wider type created by the constant folding rule.

There is another inconsistency which has not (yet) manifested as a test failure (probably because we have no suitable tests):

* After the above, the base table expression for a {{SELECT}} statement has one schema ({{TINYINT}}), the result expression has another ({{SMALLINT}}).

While the inconsistency in types may seem a minor issue, it does lead to analysis failures and does need to be addressed.

Perhaps two fixes are needed:

* When rewriting a numeric literal in the constant folding rule, apply the rules from {{NumericLiteral}} to override the type guessed by the constant evaluation.
* Modify the {{trySubstitute}} method to a) don't reset numeric literals, or, more generally, b) don't reset expressions that did not change (or their children did not change.)

> Excessive type widening leads to unnecessary casts
> --------------------------------------------------
>
>                 Key: IMPALA-7855
>                 URL: https://issues.apache.org/jira/browse/IMPALA-7855
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Frontend
>    Affects Versions: Impala 3.0
>            Reporter: Paul Rogers
>            Priority: Minor
>
> When writing unit tests, created the following query:
> {code:sql}
> with
>   query1 (a, b) as (
>     select 1 + 1 + id, 2 + 3 + int_col from functional.alltypestiny)
> insert into functional.alltypestiny (id, int_col)
>   partition (month = 5, year = 2018)
>   select * from query1
> {code}
> The above fails with the following error:
> {noformat}
> ERROR: AnalysisException: Possible loss of precision for target table 'functional.alltypestiny'.
> Expression 'query1.a' (type: BIGINT) would need to be cast to INT for column 'id'
> {noformat}
> The following does work (for planning, may not actually execute):
> {code:sql}
> with
>   query1 (a, b) as (
>     select
>       cast(1 + 1 + id as int),
>       cast(2 + 3 + int_col as int)
>     from functional.alltypestiny)
> insert into functional.alltypestiny (id, int_col)
>   partition (month = 5, year = 2018)
>   select * from query1
> {code}
> What this says is the the planner selected type {{BIGINT}} for the (rewritten) expression {{2 + id}} where {{id}} is of type {{INT}}. {{BIGINT}} is a conservative guess: adding 2 to the largest {{INT}} could overflow and require a {{BIGINT}}.
> Yet, for such a simple case, such aggressive type promotion may be overly cautious.
> To verify that this is an issue, let's try something similar with Postgres to see if it is as aggressive.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-all-unsubscribe@impala.apache.org
For additional commands, e-mail: issues-all-help@impala.apache.org