Posted to issues-all@impala.apache.org by "ASF subversion and git services (JIRA)" <ji...@apache.org> on 2019/01/02 11:24:00 UTC

[jira] [Commented] (IMPALA-7896) Literals should not need explicit analyze step

    [ https://issues.apache.org/jira/browse/IMPALA-7896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16731962#comment-16731962 ] 

ASF subversion and git services commented on IMPALA-7896:
---------------------------------------------------------

Commit 27577dd652554dda5a03016e2d1e3ab66fe6b1f5 in impala's branch refs/heads/master from Paul Rogers
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=27577dd ]

IMPALA-7902: NumericLiteral fixes, refactoring

The work to clean up the rewriter logic must start with a stable AST,
which in turn requires fixing some issues with the leaf nodes. This
CR tackles NumericLiteral, the node used to hold numbers.

IMPALA-7896: Literals should not need explicit analyze step

Partial fix: removes the need to analyze a numeric literal, since
analyze() is now a no-op. This eliminates the need to do a "fake"
analysis with a null analyzer; numeric literals are now created
analyzed. This is useful because the catalog module creates numeric
literals outside of a query (and outside of an analyzer).

A literal is immutable except for type. Modified the constructor to set
the type and cost, then mark the node as analyzed. A later call to
analyze() has nothing to do.

Code that created and dummy-analyzed numeric literals now uses static
create() methods, which simplifies literal creation and eliminates the
special "analyzer == null" checks in analyze().
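
As an illustration of the "created analyzed" pattern described in the
two paragraphs above, here is a minimal sketch. It is not the code from
this commit: SketchNumericLiteral, SketchType and narrowestIntType() are
hypothetical names, and choosing the narrowest integer type that fits is
an assumption made only for the example.

```java
import java.math.BigDecimal;

// Hypothetical, simplified stand-in for a numeric literal node.
// None of these names are taken from the Impala code base.
final class SketchNumericLiteral {
  enum SketchType { TINYINT, SMALLINT, INT, BIGINT, DECIMAL }

  private final BigDecimal value;  // immutable
  private SketchType type;         // the one attribute that could change later
  private final boolean analyzed;

  // The constructor sets everything analysis would have set, so the
  // node is created analyzed and never needs an analyzer instance.
  private SketchNumericLiteral(BigDecimal value, SketchType type) {
    this.value = value;
    this.type = type;
    this.analyzed = true;
  }

  // Factory for integer values. Picking the narrowest type that fits
  // is an assumption of this sketch.
  static SketchNumericLiteral create(long value) {
    return new SketchNumericLiteral(
        BigDecimal.valueOf(value), narrowestIntType(value));
  }

  // Factory for callers that already know the type, for example code
  // that runs outside of a query.
  static SketchNumericLiteral create(BigDecimal value, SketchType type) {
    return new SketchNumericLiteral(value, type);
  }

  private static SketchType narrowestIntType(long v) {
    if (v >= Byte.MIN_VALUE && v <= Byte.MAX_VALUE) return SketchType.TINYINT;
    if (v >= Short.MIN_VALUE && v <= Short.MAX_VALUE) return SketchType.SMALLINT;
    if (v >= Integer.MIN_VALUE && v <= Integer.MAX_VALUE) return SketchType.INT;
    return SketchType.BIGINT;
  }

  // Nothing left to do: the type was fixed at construction time.
  void analyze() { }

  boolean isAnalyzed() { return analyzed; }
  SketchType getType() { return type; }
  BigDecimal getValue() { return value; }
}
```

With this shape a caller writes SketchNumericLiteral.create(100) and
gets a node that is already analyzed; a later analyze() call changes
nothing.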

IMPALA-7886: NumericLiteral constructor fails to round values to
             Decimal type
IMPALA-7887: NumericLiteral fails to detect numeric overflow
IMPALA-7888: Incorrect NumericLiteral overflow checks for FLOAT,
             DOUBLE
IMPALA-7891: Analyzer does not detect numeric overflow in CAST
IMPALA-7894: Parser does not catch double overflow

These are all caused by the somewhat cluttered state of the numeric
range check code after years of incremental changes. This patch
centralizes all checks into a series of constants and methods for
uniformity. All values are set in the constructor, which now checks
that the value is legal for the type. Cast operations verify that the
cast is valid. Multiple semi-parallel versions of the same logic are
replaced by calls to a single implementation.

The numeric checks now follow the SQL standard, which says that
implementations should fail if a cast would truncate the most
significant digits, but round when truncating the least significant.
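
The rule above can be sketched with java.math.BigDecimal. This is an
illustration only, not the centralized check code from this commit:
DecimalCastSketch, castToDecimal() and fitsInDouble() are hypothetical
names, and HALF_UP is an assumed rounding mode.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Hypothetical helper, not part of Impala.
final class DecimalCastSketch {

  // Casts value to DECIMAL(precision, scale).
  // Least significant digits: rounded (HALF_UP is an assumption).
  // Most significant digits: never dropped; the cast fails instead.
  static BigDecimal castToDecimal(BigDecimal value, int precision, int scale) {
    BigDecimal rounded = value.setScale(scale, RoundingMode.HALF_UP);
    // After setScale() the scale matches, so any excess precision
    // means there are too many integer digits for the target type.
    if (rounded.precision() > precision) {
      throw new ArithmeticException(
          value + " does not fit in DECIMAL(" + precision + "," + scale + ")");
    }
    return rounded;
  }

  // A value overflows DOUBLE if converting it yields an infinity.
  static boolean fitsInDouble(BigDecimal value) {
    return !Double.isInfinite(value.doubleValue());
  }
}
```

Under these assumptions, casting 123.456 to DECIMAL(5,2) yields 123.46,
while casting 12345.6 to DECIMAL(5,2) throws. Rounding happens before
the range check, so 99.995 cast to DECIMAL(4,2) also throws, because it
rounds up to 100.00.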

IMPALA-7865: Repeated type widening of arithmetic expressions

Partial fix. Replaces the "is explicit cast" flag in the numeric literal
with the explicit type. This allows resetting an implicit type back to
the explicit type if an arithmetic expression is analyzed multiple
times. A later patch will feed this type information into the type
inference mechanism to complete the fix.
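
To show why storing the explicit type helps more than storing a flag,
here is a minimal sketch. TypedLiteralSketch and its methods are
hypothetical names, and falling back to a "natural" type when no
explicit cast exists is an assumption of the example, not something
this message specifies.

```java
// Hypothetical sketch, not the Impala NumericLiteral class.
final class TypedLiteralSketch {
  enum SketchType { TINYINT, SMALLINT, INT, BIGINT }

  private final SketchType naturalType;   // type implied by the value
  private final SketchType explicitType;  // null when there was no explicit cast
  private SketchType type;                // current, possibly widened, type

  TypedLiteralSketch(SketchType naturalType, SketchType explicitType) {
    this.naturalType = naturalType;
    this.explicitType = explicitType;
    resetType();
  }

  // Called when an enclosing arithmetic expression widens the literal.
  void implicitCastTo(SketchType wider) {
    type = wider;
  }

  // Called before re-analysis. A boolean flag could only say that an
  // explicit cast existed; the stored type says what to go back to.
  void resetType() {
    type = (explicitType != null) ? explicitType : naturalType;
  }

  SketchType getType() { return type; }
}
```

Calling resetType() before each analysis pass keeps repeated passes
from stacking one widening on top of another.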

Finally, adds a set of new exceptions that begin to unify error
reporting. These handle casts (SqlCastException), value validation
(InvalidValueException), and unsupported features
(UnsupportedFeatureException). These all derive from AnalysisException
for backward compatibility. Tests use the new exceptions to check for
expected errors rather than parsing text strings (which tend to
change).
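
The hierarchy described above can be sketched as follows. The classes
are simplified stand-ins that borrow the names from this message; the
constructors, the message text and the checkTinyInt() helper are
hypothetical and do not reproduce the real classes.

```java
// Simplified stand-ins: same names as in the message above, but the
// constructors and bodies are assumptions made for this sketch.
class AnalysisException extends Exception {
  AnalysisException(String msg) { super(msg); }
}

class SqlCastException extends AnalysisException {
  SqlCastException(String msg) { super(msg); }
}

class InvalidValueException extends AnalysisException {
  InvalidValueException(String msg) { super(msg); }
}

class UnsupportedFeatureException extends AnalysisException {
  UnsupportedFeatureException(String msg) { super(msg); }
}

// Hypothetical validation helper used only to demonstrate the idea.
final class ExceptionSketch {
  static void checkTinyInt(long value) throws InvalidValueException {
    if (value < Byte.MIN_VALUE || value > Byte.MAX_VALUE) {
      throw new InvalidValueException(
          "Value out of range for TINYINT: " + value);
    }
  }
}
```

A test can then catch InvalidValueException by type instead of matching
the message text, and existing code that catches AnalysisException
keeps working because the new classes are subclasses of it.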

Testing:

* Added unit tests just for numeric literals. Refactored code to
  simplify the tests.
* Added a test case for the obscure case in Decimal V1 of an implicit
  cast overflow.
* The depth-check tests needed one extra level of nesting to trigger
  failure.
* Ran all FE tests.

Change-Id: I484600747b2871d3a6fe9153751973af9a8534f2
Reviewed-on: http://gerrit.cloudera.org:8080/12001
Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>


> Literals should not need explicit analyze step
> ----------------------------------------------
>
>                 Key: IMPALA-7896
>                 URL: https://issues.apache.org/jira/browse/IMPALA-7896
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Frontend
>    Affects Versions: Impala 3.0
>            Reporter: Paul Rogers
>            Assignee: Paul Rogers
>            Priority: Minor
>
> The Impala FE has the concept of a _literal_ (string, boolean, null, number.) Originally, literals could only be created as part of the AST. Hence, all literals are subclasses of {{LiteralExpr}} which are {{ExprNodes}}. The analysis step is used to set the type of the literal numbers, when not known at create time. If literals were used only in the AST, this would be fine: they could be analyzed with an analyzer.
> In fact, as the code has evolved, {{LiteralExpr}} nodes are created via the catalog, which has no analyzer. To fudge the issue, the {{LiteralExpr.create()}} function does analysis with a null analyzer. This, in turn, means that the {{analyze()}} code needs to special-case a null analyzer. This, in turn, leads to brittle, error-prone code.
> Since literals are immutable (except, sadly, for type), it is better that they start analyzed. Since the only attribute which must be set is the type, and the type can be known at create time, we can make {{analyze()}} an optional no-op, leading to cleaner semantics.


