Posted to notifications@asterixdb.apache.org by "ASF subversion and git services (JIRA)" <ji...@apache.org> on 2019/03/29 15:10:00 UTC
[jira] [Commented] (ASTERIXDB-2483) Out of Memory error doing aggregation - need a rewrite
[ https://issues.apache.org/jira/browse/ASTERIXDB-2483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16805077#comment-16805077 ]
ASF subversion and git services commented on ASTERIXDB-2483:
------------------------------------------------------------
Commit 0cca97d57d047427c4ed27e3817870fd6325437e in asterixdb's branch refs/heads/master from Dmitry Lychagin
[ https://gitbox.apache.org/repos/asf?p=asterixdb.git;h=0cca97d ]
[ASTERIXDB-2483][COMP][FUN] Eliminate listify for distinct aggregates
- user model changes: no
- storage format changes: no
- interface changes: no
Details:
- Move distinct aggregate rewriting from SqlppQueryRewriter to RewriteDistinctAggregateRule in the optimizer
- Add runtime for scalar distinct aggregates
- Fix ExtractCommonOperatorsRule handling of binary operators
- Additional tests for distinct aggregates
Change-Id: If13ea2696e9e0a8a639db684656e5642991c1f99
Reviewed-on: https://asterix-gerrit.ics.uci.edu/3293
Reviewed-by: Ali Alsuliman <al...@gmail.com>
Tested-by: Dmitry Lychagin <dm...@couchbase.com>
> Out of Memory error doing aggregation - need a rewrite
> ------------------------------------------------------
>
> Key: ASTERIXDB-2483
> URL: https://issues.apache.org/jira/browse/ASTERIXDB-2483
> Project: Apache AsterixDB
> Issue Type: Bug
> Components: COMP - Compiler, RT - Runtime, SQL - Translator SQL++
> Affects Versions: 0.9.5
> Environment: Linux
> Reporter: Michael J. Carey
> Assignee: Dmitry Lychagin
> Priority: Critical
>
> This is the schema:
> {noformat}
> CREATE TYPE Test AS open { unique2: int64 };
> CREATE DATASET wisconsin_5gb(Test) PRIMARY KEY unique2;
> {noformat}
> This is the query:
> {noformat}
> SELECT
> min(t.oddOnePercent) as min,
> max(t.oddOnePercent) as max,
> count(distinct t.oddOnePercent) as cnt
> FROM wisconsin_5gb t;
> {noformat}
> The plan for this query:
> {noformat}
> distribute result [$$46]
> -- DISTRIBUTE_RESULT |UNPARTITIONED|
>   exchange
>   -- ONE_TO_ONE_EXCHANGE |UNPARTITIONED|
>     project ([$$46])
>     -- STREAM_PROJECT |UNPARTITIONED|
>       assign [$$46] <- [{"min": $$48, "max": $$49, "cnt": $$50}]
>       -- ASSIGN |UNPARTITIONED|
>         project ([$$48, $$49, $$50])
>         -- STREAM_PROJECT |UNPARTITIONED|
>           subplan {
>               aggregate [$$50] <- [agg-sql-sum($$53)]
>               -- AGGREGATE |LOCAL|
>                 aggregate [$$53] <- [agg-sql-count($$43)]
>                 -- AGGREGATE |LOCAL|
>                   distinct ([$$43])
>                   -- MICRO_PRE_SORTED_DISTINCT_BY |LOCAL|
>                     order (ASC, $$43)
>                     -- IN_MEMORY_STABLE_SORT [$$43(ASC)] |LOCAL|
>                       assign [$$43] <- [$$52.getField("oddOnePercent")]
>                       -- ASSIGN |UNPARTITIONED|
>                         assign [$$52] <- [$#4.getField(0)]
>                         -- ASSIGN |UNPARTITIONED|
>                           unnest $#4 <- scan-collection($$28)
>                           -- UNNEST |UNPARTITIONED|
>                             nested tuple source
>                             -- NESTED_TUPLE_SOURCE |UNPARTITIONED|
>           }
>           -- SUBPLAN |UNPARTITIONED|
>             aggregate [$$28, $$48, $$49] <- [listify($$27), agg-sql-min($$33), agg-sql-max($$33)]
>             -- AGGREGATE |UNPARTITIONED|
>               exchange
>               -- RANDOM_MERGE_EXCHANGE |PARTITIONED|
>                 project ([$$27, $$33])
>                 -- STREAM_PROJECT |PARTITIONED|
>                   assign [$$33, $$27] <- [$$t.getField("oddOnePercent"), {"t": $$t}]
>                   -- ASSIGN |PARTITIONED|
>                     project ([$$t])
>                     -- STREAM_PROJECT |PARTITIONED|
>                       exchange
>                       -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
>                         data-scan []<-[$$47, $$t] <- Default.wisconsin_5gb
>                         -- DATASOURCE_SCAN |PARTITIONED|
>                           exchange
>                           -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
>                             empty-tuple-source
>                             -- EMPTY_TUPLE_SOURCE |PARTITIONED|
> {noformat}
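
For readers following the plan above: the unpartitioned AGGREGATE applies listify($$27), so every record of wisconsin_5gb is collected into a single list before the subplan unnests it, sorts it in memory, and counts the distinct values. The Python sketch below is only a conceptual illustration of why that shape is memory-hungry compared with an aggregate that consumes its input as a stream. It is not AsterixDB code: the function names are hypothetical, and the set-based streaming version is an assumption chosen for brevity, since the commit message does not describe how the new scalar distinct aggregate runtime is implemented.

```python
from typing import Any, Dict, Iterable

Record = Dict[str, Any]


def count_distinct_via_listify(records: Iterable[Record]) -> int:
    """Mimics the plan in the issue: materialize everything, then de-duplicate."""
    # listify($$27): every wrapped record is held in memory at once.
    listified = [{"t": r} for r in records]
    # Subplan: unnest the list and extract the field, skipping absent values
    # (SQL count ignores them).
    values = [item["t"].get("oddOnePercent") for item in listified]
    values = [v for v in values if v is not None]
    # IN_MEMORY_STABLE_SORT followed by a pre-sorted distinct and a count.
    values.sort()
    count = 0
    previous = object()  # sentinel that equals no real value
    for v in values:
        if v != previous:
            count += 1
            previous = v
    return count


def count_distinct_streaming(records: Iterable[Record]) -> int:
    """Hypothetical streaming alternative: only distinct values are retained."""
    seen = set()
    for r in records:
        v = r.get("oddOnePercent")
        if v is not None:
            seen.add(v)
    return len(seen)
```

Both functions return the same count for the same input. The first holds every record at once, so its memory use grows with the size of the dataset; the second holds only the distinct field values, so its memory use grows with the number of distinct values.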
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)