Posted to notifications@asterixdb.apache.org by "ASF subversion and git services (JIRA)" <ji...@apache.org> on 2019/06/06 21:27:00 UTC

[jira] [Commented] (ASTERIXDB-2574) Inconsistency in min() and max() with respect to arrays/records

    [ https://issues.apache.org/jira/browse/ASTERIXDB-2574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16858103#comment-16858103 ] 

ASF subversion and git services commented on ASTERIXDB-2574:
------------------------------------------------------------

Commit 77450e6a1ad081d2dc2d00b2847f45d8b9a407e5 in asterixdb's branch refs/heads/master from Ali Alsuliman
[ https://gitbox.apache.org/repos/asf?p=asterixdb.git;h=77450e6 ]

[ASTERIXDB-2574][COMP] Fix min/max functions

- user model changes: no
- storage format changes: no
- interface changes: no

This change does two main things. First, min/max no longer throw an
exception when the type of the aggregated field is invalid for min/max
(e.g. record or rectangle) or when min/max receive incompatible data such
as string and int. The result in these cases is NULL. Second, ARRAYs
are now compared correctly by using logical comparison. When
a partition runs into type invalidity, it will output NULL. The global
aggregator interprets NULL received from a partition as type invalidity
and outputs NULL as the final result. Both SQL and SQL++ will do that.
Special treatment is needed for the scalar and distinct versions of SQL
min/max, since SQL min/max ignore NULL values and continue aggregating,
and the scalar and distinct versions are normally set up as a global
aggregator because they behave like the global aggregator in a two-step
aggregation. Currently, only the local min and max functions are
separate. The other min/max functions are used for everything else: the
global function of two-step aggregation as well as scalar and distinct
min/max. To differentiate, global min/max functions are added that will
be used for the two-step aggregation.
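
The NULL handling described above can be illustrated with a minimal
Python sketch. This is not the AsterixDB implementation: the names
sql_min, global_min and _kind are hypothetical, None stands in for NULL,
and a dict stands in for a record. The sketch also assumes every
partition holds at least one non-NULL value, so it does not model how an
all-NULL or empty partition is told apart from type invalidity.

```python
from typing import Any, Iterable, Optional


def _kind(v: Any) -> Optional[str]:
    """Hypothetical type family of a value; None means invalid for min/max."""
    if isinstance(v, bool):
        return None  # assumption: booleans are not aggregated in this sketch
    if isinstance(v, (int, float)):
        return "number"
    if isinstance(v, str):
        return "string"
    return None  # e.g. a dict standing in for a record


def sql_min(values: Iterable[Any]) -> Any:
    """SQL-style min (local, scalar or distinct): skips NULL input values,
    returns NULL (None) on an invalid type or on incompatible types."""
    best = None
    for v in values:
        if v is None:
            continue  # SQL min/max ignore NULL values and keep aggregating
        kind = _kind(v)
        if kind is None or (best is not None and kind != _kind(best)):
            return None  # type invalidity: NULL instead of an exception
        if best is None or v < best:
            best = v
    return best


def global_min(partials: Iterable[Any]) -> Any:
    """Global step of two-step aggregation: a NULL partial result is read
    as type invalidity in that partition, so the final result is NULL."""
    best = None
    for p in partials:
        if p is None:
            return None  # do not skip: NULL here signals invalidity
        kind = _kind(p)
        if kind is None or (best is not None and kind != _kind(best)):
            return None
        if best is None or p < best:
            best = p
    return best


partitions = [[3, 1, None], [2, "x"]]
# Second partition mixes int and string, so the whole result is NULL.
result = global_min(sql_min(p) for p in partitions)
```

The only difference between the two functions is how they treat a NULL
input, which is why a separate global function is needed once NULL
doubles as the invalidity signal between partitions.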

Details:
- fixed listify to open up elements when adding them to a collection
whose item type is ANY, and changed the type inferer of listify to
enable that.
- fixed AbstractCollectionType to make sure itemType is never null.
- changed MinMaxAggTypeComputer to not throw an exception but return
NULL for invalid types.
- changed min/max descriptors to implement inferer to propagate the
type of the field and pass that when getting a comparator.
- switched min/max comparison to the logical comparison.
- refactored method inequalityUndefined to be shared by logical comparison
and min/max functions.
- added global max/min functions to enable differentiating between
scalar min/max, distinct min/max and two-step min/max (global & local).
- code clean-up for LogicalScalarBinaryComparator; created two INSTANCES
and re-used them.
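
As a rough illustration of min/max over ARRAYs using a logical
comparison, here is a small Python sketch. The exact rules of AsterixDB's
logical comparator are not stated in this message, so the element-wise,
shorter-prefix-is-smaller ordering below is an assumption, and the names
logical_compare and array_max are hypothetical. Returning None for an
incomparable pair plays the role of an undefined inequality, which the
aggregate turns into NULL.

```python
from typing import Any, Iterable, Optional


def logical_compare(a: Any, b: Any) -> Optional[int]:
    """Return -1, 0 or 1, or None when the two values are incomparable."""
    if isinstance(a, list) and isinstance(b, list):
        # Assumed ordering: compare element by element, then by length.
        for x, y in zip(a, b):
            c = logical_compare(x, y)
            if c is None or c != 0:
                return c
        return (len(a) > len(b)) - (len(a) < len(b))
    number = (int, float)
    both_numbers = (
        isinstance(a, number) and isinstance(b, number)
        and not isinstance(a, bool) and not isinstance(b, bool)
    )
    both_strings = isinstance(a, str) and isinstance(b, str)
    if both_numbers or both_strings:
        return (a > b) - (a < b)  # compare by value, so 1 equals 1.0
    return None  # e.g. string vs. int, or a dict standing in for a record


def array_max(values: Iterable[Any]) -> Any:
    """max() built on logical_compare; None (NULL) on any incomparable pair."""
    it = iter(values)
    best = next(it, None)
    if best is None or logical_compare(best, best) is None:
        return None  # empty input, or a first value that is itself invalid
    for v in it:
        c = logical_compare(v, best)
        if c is None:
            return None
        if c > 0:
            best = v
    return best
```

For example, array_max([[1, 2], [1, 3], [0, 9]]) picks [1, 3] under the
assumed ordering, while mixing an array with a record-like value yields
None.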

Change-Id: I1231cfe558099d167bae0b2fa7fc4879b756baf0
Reviewed-on: https://asterix-gerrit.ics.uci.edu/3427
Contrib: Jenkins <je...@fulliautomatix.ics.uci.edu>
Tested-by: Jenkins <je...@fulliautomatix.ics.uci.edu>
Integration-Tests: Jenkins <je...@fulliautomatix.ics.uci.edu>
Reviewed-by: Dmitry Lychagin <dm...@couchbase.com>


> Inconsistency in min() and max() with respect to arrays/records
> ---------------------------------------------------------------
>
>                 Key: ASTERIXDB-2574
>                 URL: https://issues.apache.org/jira/browse/ASTERIXDB-2574
>             Project: Apache AsterixDB
>          Issue Type: Bug
>          Components: FUN - Functions
>    Affects Versions: 0.9.4.1
>            Reporter: Ali Alsuliman
>            Assignee: Ali Alsuliman
>            Priority: Major
>
> min() and max() functions used on record/array fields sometimes throw an exception and other times process input. That's because if the field type is known the compiler will validate and check the field is numeric, string or datetime. If the field type is not known (ANY), it would pass and the function runtime would process the input data.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)