Posted to issues@tajo.apache.org by Hyunsik Choi <hy...@apache.org> on 2014/02/18 13:03:45 UTC
Review Request 18210: TAJO-601: Improve distinct aggregation query
processing.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18210/
-----------------------------------------------------------
Review request for Tajo.
Bugs: TAJO-601
https://issues.apache.org/jira/browse/TAJO-601
Repository: tajo
Description
-------
Currently, distinct aggregation queries are executed as follows:
* the first stage: it just shuffles tuples by hashing grouping keys.
* the second stage: it sorts them and executes sort aggregation.
This approach executes queries that include distinct aggregation functions in only two stages, but it produces large intermediate data during the shuffle phase.
This kind of query can be rewritten as a nested query with two aggregation levels:
[Original query]
----------
SELECT grp1, grp2, count(*) as total, count(distinct grp3) as distinct_col from rel1 group by grp1, grp2;
----------
[Rewritten query]
----------
SELECT grp1, grp2, sum(cnt) as total, count(grp3) as distinct_col from (
SELECT grp1, grp2, grp3, count(*) as cnt from rel1 group by grp1, grp2, grp3
) tmp1 group by grp1, grp2;
----------
I expect this rewrite to significantly reduce the intermediate data volume and query response time in most cases.
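As a rough sanity check of the rewrite, the sketch below runs both forms against an in-memory SQLite database rather than Tajo. The table contents are made up for illustration, the rewritten query is written with balanced parentheses, and ORDER BY is added only so the two result lists can be compared directly.

```python
import sqlite3

# Original form: one aggregation level with count(distinct ...).
ORIGINAL = """
SELECT grp1, grp2, count(*) AS total, count(DISTINCT grp3) AS distinct_col
FROM rel1 GROUP BY grp1, grp2 ORDER BY grp1, grp2
"""

# Rewritten form: the inner query removes duplicates of (grp1, grp2, grp3),
# the outer query sums the partial counts and counts the surviving grp3 values.
REWRITTEN = """
SELECT grp1, grp2, sum(cnt) AS total, count(grp3) AS distinct_col FROM (
  SELECT grp1, grp2, grp3, count(*) AS cnt FROM rel1 GROUP BY grp1, grp2, grp3
) tmp1 GROUP BY grp1, grp2 ORDER BY grp1, grp2
"""

def run_both(rows):
    """Load rows into a scratch table and return the results of both queries."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE rel1 (grp1 TEXT, grp2 TEXT, grp3 INTEGER)")
    conn.executemany("INSERT INTO rel1 VALUES (?, ?, ?)", rows)
    original = conn.execute(ORIGINAL).fetchall()
    rewritten = conn.execute(REWRITTEN).fetchall()
    conn.close()
    return original, rewritten

# Illustrative data only; not taken from the Tajo test suite.
SAMPLE_ROWS = [
    ("a", "x", 1), ("a", "x", 1), ("a", "x", 2),
    ("a", "y", 3),
    ("b", "x", 4), ("b", "x", 4), ("b", "x", 4),
]

original, rewritten = run_both(SAMPLE_ROWS)
print(original)
print(rewritten)
```

The inner GROUP BY collapses duplicate (grp1, grp2, grp3) rows before the outer aggregation, which is where the reduction in intermediate data would come from.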
Diffs
-----
tajo-common/src/main/java/org/apache/tajo/util/TUtil.java cc694d4
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/eval/EvalTreeUtil.java da05739
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/function/builtin/SumDoubleDistinct.java PRE-CREATION
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/function/builtin/SumFloat.java 10fd720
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/function/builtin/SumFloatDistinct.java PRE-CREATION
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/function/builtin/SumIntDistinct.java PRE-CREATION
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/function/builtin/SumLongDistinct.java PRE-CREATION
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/ExprsVerifier.java b14c448
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/LogicalPlanner.java f7c0bfa
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/PlannerUtil.java 624518b
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/PreLogicalPlanVerifier.java 6dac031
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/global/DataChannel.java efa1e05
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/global/GlobalPlanner.java f390b52
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/global/MasterPlan.java 91f658d
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/physical/SeqScanExec.java a0c0eeb
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/rewrite/FilterPushDownRule.java 399903c
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/rewrite/PartitionedTableRewriter.java e5f7fb4
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/rewrite/ProjectionPushDownRule.java 633d0c1
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/querymaster/QueryMaster.java ae6d5eb
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/querymaster/QueryMasterManagerService.java 3c30e38
tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/engine/eval/TestEvalTreeUtil.java d756242
tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/engine/query/TestGroupByQuery.java 1f80bce
tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/master/TestExecutionBlockCursor.java 053c028
tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/master/TestGlobalPlanner.java 2d3124d
tajo-core/tajo-core-backend/src/test/resources/queries/TestGroupByQuery/testCountDistinct.sql 6fe604e
tajo-core/tajo-core-backend/src/test/resources/queries/TestGroupByQuery/testCountDistinct2.sql 6bf8a8a
tajo-core/tajo-core-backend/src/test/resources/queries/TestGroupByQuery/testDistinctAggregation1.sql PRE-CREATION
tajo-core/tajo-core-backend/src/test/resources/queries/TestGroupByQuery/testDistinctAggregation2.sql PRE-CREATION
tajo-core/tajo-core-backend/src/test/resources/queries/TestGroupByQuery/testDistinctAggregation3.sql PRE-CREATION
tajo-core/tajo-core-backend/src/test/resources/queries/TestGroupByQuery/testDistinctAggregation4.sql PRE-CREATION
tajo-core/tajo-core-backend/src/test/resources/queries/TestGroupByQuery/testDistinctAggregation5.sql PRE-CREATION
tajo-core/tajo-core-backend/src/test/resources/queries/TestGroupByQuery/testDistinctAggregationWithHaving1.sql PRE-CREATION
tajo-core/tajo-core-backend/src/test/resources/results/TestGroupByQuery/testCountDistinct.result f2ad32a
tajo-core/tajo-core-backend/src/test/resources/results/TestGroupByQuery/testCountDistinct2.result 9164120
tajo-core/tajo-core-backend/src/test/resources/results/TestGroupByQuery/testDistinctAggregation1.result PRE-CREATION
tajo-core/tajo-core-backend/src/test/resources/results/TestGroupByQuery/testDistinctAggregation2.result PRE-CREATION
tajo-core/tajo-core-backend/src/test/resources/results/TestGroupByQuery/testDistinctAggregation3.result PRE-CREATION
tajo-core/tajo-core-backend/src/test/resources/results/TestGroupByQuery/testDistinctAggregation4.result PRE-CREATION
tajo-core/tajo-core-backend/src/test/resources/results/TestGroupByQuery/testDistinctAggregation5.result PRE-CREATION
tajo-core/tajo-core-backend/src/test/resources/results/TestGroupByQuery/testDistinctAggregationWithHaving1.result PRE-CREATION
Diff: https://reviews.apache.org/r/18210/diff/
Testing
-------
mvn clean install
Thanks,
Hyunsik Choi
Re: Review Request 18210: TAJO-601: Improve distinct aggregation query
processing.
Posted by Hyunsik Choi <hy...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18210/#review34693
-----------------------------------------------------------
The current approach performs poorly. It is described in the description of this issue.
This patch improves the performance of distinct aggregation. Unlike the current approach, in this patch GlobalPlanner builds a three-phase plan using two hash shuffles. GlobalPlanner then adds an enforcer of sort aggregation to the final execution block. As a result, it can significantly reduce the intermediate data volume, depending on the cardinality of the grouping columns.
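The following is a minimal sketch of one way a three-phase plan with two hash shuffles could behave. It is not the GlobalPlanner code. The choice of shuffle keys (group plus distinct value first, then group only), the function names, and the sample data are all assumptions made for illustration.

```python
from collections import Counter
from itertools import groupby

def hash_shuffle(records, key_fn, num_partitions):
    """Route each record to the partition chosen by hashing its shuffle key."""
    partitions = [[] for _ in range(num_partitions)]
    for record in records:
        partitions[hash(key_fn(record)) % num_partitions].append(record)
    return partitions

def three_phase_distinct(input_splits, num_partitions=2):
    """Compute count(*) and count(distinct val) per group in three phases."""
    # Phase 1: each task pre-aggregates its own split by (group, value).
    phase1 = []
    for split in input_splits:
        for (grp, val), cnt in Counter(split).items():
            phase1.append((grp, val, cnt))

    # Shuffle 1 hashes on (group, value); phase 2 merges the partial counts,
    # leaving exactly one record per distinct (group, value) pair.
    phase2 = []
    for part in hash_shuffle(phase1, lambda r: (r[0], r[1]), num_partitions):
        merged = Counter()
        for grp, val, cnt in part:
            merged[(grp, val)] += cnt
        phase2.extend((grp, val, cnt) for (grp, val), cnt in merged.items())

    # Shuffle 2 hashes on the group only; phase 3 sorts each partition and
    # aggregates runs of equal group keys (a sort-based aggregation).
    result = {}
    for part in hash_shuffle(phase2, lambda r: r[0], num_partitions):
        for grp, run in groupby(sorted(part), key=lambda r: r[0]):
            rows = list(run)
            result[grp] = (sum(cnt for _, _, cnt in rows), len(rows))
    return result, len(phase1), len(phase2)

# Two input splits of (group, value) rows; hypothetical data.
splits = [
    [("k1", 10), ("k1", 10), ("k2", 20)],
    [("k1", 10), ("k1", 11), ("k2", 20)],
]
result, shuffled_first, shuffled_second = three_phase_distinct(splits)
print(result, shuffled_first, shuffled_second)
```

In this toy run, six input rows become five records in the first shuffle and three in the second. The saving grows as more rows share the same group and distinct value.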
This patch also allows Tajo to support multiple distinct functions. For example, the following query works well.
select l_orderkey, count(distinct l_partkey), sum(distinct l_partkey) from lineitem group by l_orderkey;
However, the current patch still has some limitations. The above query includes two distinct aggregation functions: count(distinct) and sum(distinct). They use the same distinct column 'l_partkey', so it works well. In contrast, the following case, where there are two or more distinct columns, is not supported yet.
select l_orderkey, count(distinct l_partkey), sum(distinct l_linenumber) from lineitem group by l_orderkey;
If you submit such a query, you will see the following message: "different DISTINCT columns are not supported yet: l_partkey, l_linenumber". To support this kind of query, we need additional physical executors. I'll add this feature later in another Jira issue.
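The restriction can be pictured with a small sketch. This is not the verifier from the patch: the `PlanningError` class, the `verify_distinct_columns` function, and the tuple representation of aggregate calls are hypothetical. Only the message text follows the one quoted above.

```python
class PlanningError(Exception):
    """Hypothetical stand-in for the planner's exception type."""

def verify_distinct_columns(aggregations):
    """Reject a query whose DISTINCT aggregates use more than one column.

    aggregations: list of (function_name, is_distinct, column_name) tuples.
    """
    distinct_columns = []
    for _func, is_distinct, column in aggregations:
        # Remember each DISTINCT column once, in order of first appearance.
        if is_distinct and column not in distinct_columns:
            distinct_columns.append(column)
    if len(distinct_columns) > 1:
        raise PlanningError(
            "different DISTINCT columns are not supported yet: "
            + ", ".join(distinct_columns)
        )

# Same distinct column in both functions: accepted.
supported = [("count", True, "l_partkey"), ("sum", True, "l_partkey")]
verify_distinct_columns(supported)

# Two different distinct columns: rejected.
unsupported = [("count", True, "l_partkey"), ("sum", True, "l_linenumber")]
try:
    verify_distinct_columns(unsupported)
except PlanningError as err:
    print(err)
```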
- Hyunsik Choi
Re: Review Request 18210: TAJO-601: Improve distinct aggregation query
processing.
Posted by Hyunsik Choi <hy...@apache.org>.
> On Feb. 20, 2014, 10:51 a.m., Jung JaeHwa wrote:
> > Hyunsik, thank you for waiting.
> >
> > I tested the patch on my local cluster, but validation for different columns doesn't work as expected. For example, the following queries finished without a PlanningException.
> >
> > - select count(distinct id), sum(distinct score) from table1
> > - select id, count(distinct id), sum(distinct name) from table1 group by id
> >
> > For reference, I created the table described on the Tajo wiki.
> >
> > Anyway, I found that the validation is never called. Please check this.
> >
> > And if that's okay with you, I'd like to suggest adding unit test cases for unsupported queries.
> > But if you think that's a waste of resources, feel free to disregard it. :)
Could you check the patch once again? I tried your tests, and I see the following messages:
tajo> select count(distinct l_orderkey), sum(distinct l_partkey) from lineitem;
different DISTINCT columns are not supported yet: l_orderkey, l_partkey
tajo> select id, count(distinct l_orderkey), sum(distinct l_partkey) from lineitem group by id;
different DISTINCT columns are not supported yet: l_orderkey, l_partkey
tajo> select count(distinct id), sum(distinct score) from table1;
different DISTINCT columns are not supported yet: id, score
tajo> select id, count(distinct id), sum(distinct name) from table1 group by id;
different DISTINCT columns are not supported yet: id, name
Thanks!
- Hyunsik
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18210/#review34962
-----------------------------------------------------------
Re: Review Request 18210: TAJO-601: Improve distinct aggregation query
processing.
Posted by Jung JaeHwa <jh...@gruter.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18210/#review34962
-----------------------------------------------------------
Hyunsik, thank you for waiting.
I tested the patch on my local cluster, but validation for different columns doesn't work as expected. For example, the following queries finished without a PlanningException.
- select count(distinct id), sum(distinct score) from table1
- select id, count(distinct id), sum(distinct name) from table1 group by id
For reference, I created the table described on the Tajo wiki.
Anyway, I found that the validation is never called. Please check this.
And if that's okay with you, I'd like to suggest adding unit test cases for unsupported queries.
But if you think that's a waste of resources, feel free to disregard it. :)
- Jung JaeHwa
Re: Review Request 18210: TAJO-601: Improve distinct aggregation query
processing.
Posted by Hyunsik Choi <hy...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18210/#review34987
-----------------------------------------------------------
Thank you for the review. I've fixed everything you mentioned and committed the patch to the master branch.
- Hyunsik Choi
On Feb. 20, 2014, 2:28 p.m., Hyunsik Choi wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/18210/
> -----------------------------------------------------------
>
> (Updated Feb. 20, 2014, 2:28 p.m.)
>
>
> Review request for Tajo.
>
>
> Bugs: TAJO-601
> https://issues.apache.org/jira/browse/TAJO-601
>
>
> Repository: tajo
>
>
> Diffs
> -----
>
> tajo-catalog/tajo-catalog-common/src/main/java/org/apache/tajo/catalog/SortSpec.java 3ef73d5c5385b40fcfb3b0ecbbc35b783224c760
> tajo-common/src/main/java/org/apache/tajo/util/TUtil.java cc694d43f42f68945cf53a7b8b9bbdca97a4f205
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/eval/EvalTreeUtil.java da05739b8feff0e04b1762f8000b1f3818c773a2
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/eval/FunctionEval.java 0555bdec8aff6fa79c02b640c81ad55d4666b90a
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/function/builtin/SumDoubleDistinct.java PRE-CREATION
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/function/builtin/SumFloat.java 10fd7205f29c82adf87816737598ce762ee0ebc9
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/function/builtin/SumFloatDistinct.java PRE-CREATION
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/function/builtin/SumIntDistinct.java PRE-CREATION
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/function/builtin/SumLongDistinct.java PRE-CREATION
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/ExprsVerifier.java b14c448ee5b3ce0dfca67c6a9b942f1803cc91f9
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/LogicalPlanner.java f7c0bfab78cb3416e7a2ed263cc362917023e3ca
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/PhysicalPlannerImpl.java 67f56303e04787bf950c4a9a703faec58fb74cd4
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/PlannerUtil.java 7d5e2fc7e085cc36527383a208277384035263e7
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/PreLogicalPlanVerifier.java 6dac031218c650b9c1c86811b4552fe6d82da0c1
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/enforce/Enforcer.java dd46996eca7eb9c38f87d97813f5dcc7220429ed
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/global/DataChannel.java 9f5c6bf9dd7b549308724ce1e8044aff1630cef1
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/global/GlobalPlanner.java f390b52f378a2d7e84e40876df4a4b416af912ef
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/global/MasterPlan.java 91f658dab395620f5a891f51407b3676b07a8fa5
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/physical/ExternalSortExec.java 791781e526c54f216152e935682bc2c3147a9e0c
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/physical/SeqScanExec.java 53a1c24197c40c77153f79f90c05882c90aae957
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/rewrite/FilterPushDownRule.java 399903c66bb8a62074facd0bbbe9b3b8e891c067
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/rewrite/PartitionedTableRewriter.java e5f7fb40414e0b2e2e40bccebe24069ee4d9301b
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/rewrite/ProjectionPushDownRule.java 633d0c1857533b02c4ecc6913c740fd2e3722845
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/querymaster/QueryMaster.java ae6d5ebb97f8c4287ffd11262b2932d2f8b1250c
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/querymaster/QueryMasterManagerService.java 3c30e3854abaa891f72b368144942164e5dffab7
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/querymaster/Repartitioner.java 56c26797aad1dbe95945567961e9425fef72fa96
> tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/engine/eval/TestEvalTreeUtil.java d7562426647a6a9d6aae5207a67ddcdd03d0ee3a
> tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/engine/query/TestGroupByQuery.java 1f80bce23c74e3abdcbf9bc0553ec30244d6bd93
> tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/master/TestExecutionBlockCursor.java 053c02833e80dd931807fa6314965e687d7b26c0
> tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/master/TestGlobalPlanner.java 2d3124d7e9d7853b0f872eee1016cbae504c9c6b
> tajo-core/tajo-core-backend/src/test/resources/queries/TestGroupByQuery/testCountDistinct.sql
> tajo-core/tajo-core-backend/src/test/resources/queries/TestGroupByQuery/testCountDistinct2.sql
> tajo-core/tajo-core-backend/src/test/resources/queries/TestGroupByQuery/testDistinctAggregation3.sql PRE-CREATION
> tajo-core/tajo-core-backend/src/test/resources/queries/TestGroupByQuery/testDistinctAggregation4.sql PRE-CREATION
> tajo-core/tajo-core-backend/src/test/resources/queries/TestGroupByQuery/testDistinctAggregation5.sql PRE-CREATION
> tajo-core/tajo-core-backend/src/test/resources/queries/TestGroupByQuery/testDistinctAggregationWithHaving1.sql PRE-CREATION
> tajo-core/tajo-core-backend/src/test/resources/queries/TestGroupByQuery/testDistinctAggregationWithUnion1.sql PRE-CREATION
> tajo-core/tajo-core-backend/src/test/resources/results/TestGroupByQuery/testCountDistinct.result
> tajo-core/tajo-core-backend/src/test/resources/results/TestGroupByQuery/testCountDistinct2.result
> tajo-core/tajo-core-backend/src/test/resources/results/TestGroupByQuery/testDistinctAggregation3.result PRE-CREATION
> tajo-core/tajo-core-backend/src/test/resources/results/TestGroupByQuery/testDistinctAggregation4.result PRE-CREATION
> tajo-core/tajo-core-backend/src/test/resources/results/TestGroupByQuery/testDistinctAggregation5.result PRE-CREATION
> tajo-core/tajo-core-backend/src/test/resources/results/TestGroupByQuery/testDistinctAggregationWithHaving1.result PRE-CREATION
> tajo-core/tajo-core-backend/src/test/resources/results/TestGroupByQuery/testDistinctAggregationWithUnion1.result PRE-CREATION
> tajo-storage/src/main/java/org/apache/tajo/storage/RawFile.java c3a7525154e0f36d51dcca211949f21f57a9f1c8
>
> Diff: https://reviews.apache.org/r/18210/diff/
>
>
> Testing
> -------
>
> mvn clean install
>
>
> Thanks,
>
> Hyunsik Choi
>
>
Re: Review Request 18210: TAJO-601: Improve distinct aggregation query
processing.
Posted by Jung JaeHwa <jh...@gruter.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18210/#review34986
-----------------------------------------------------------
Ship it!
+1 for the patch.
Sorry, Hyunsik.
I found a misconfiguration on my local cluster; PlanningException works as expected.
- Jung JaeHwa
On Feb. 20, 2014, 5:28 a.m., Hyunsik Choi wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/18210/
> -----------------------------------------------------------
>
> (Updated Feb. 20, 2014, 5:28 a.m.)
>
>
> Review request for Tajo.
>
>
> Bugs: TAJO-601
> https://issues.apache.org/jira/browse/TAJO-601
>
>
> Repository: tajo
>
>
> Description
> -------
>
> Currently, distinct aggregation queries are executed as follows:
> * the first stage: it just shuffles tuples by hashing grouping keys.
> * the second stage: it sorts them and executes sort aggregation.
>
> This approach executes queries that include distinct aggregation functions in only two stages, but it produces large intermediate data during the shuffle phase.
>
> This kind of query can be rewritten as two queries:
>
> [Original query]
> ----------
> SELECT grp1, grp2, count(*) as total, count(distinct grp3) as distinct_col from rel1 group by grp1, grp2;
> ----------
>
> [Rewritten query]
> ----------
> SELECT grp1, grp2, sum(cnt) as total, count(grp3) as distinct_col from (
> SELECT grp1, grp2, grp3, count(*) as cnt from rel1 group by grp1, grp2, grp3
> ) tmp1 group by grp1, grp2;
> ----------
>
> I'm expecting that this rewrite will significantly reduce the intermediate data volume and query response time in most cases.
>
>
> Diffs
> -----
>
> tajo-catalog/tajo-catalog-common/src/main/java/org/apache/tajo/catalog/SortSpec.java 3ef73d5c5385b40fcfb3b0ecbbc35b783224c760
> tajo-common/src/main/java/org/apache/tajo/util/TUtil.java cc694d43f42f68945cf53a7b8b9bbdca97a4f205
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/eval/EvalTreeUtil.java da05739b8feff0e04b1762f8000b1f3818c773a2
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/eval/FunctionEval.java 0555bdec8aff6fa79c02b640c81ad55d4666b90a
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/function/builtin/SumDoubleDistinct.java PRE-CREATION
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/function/builtin/SumFloat.java 10fd7205f29c82adf87816737598ce762ee0ebc9
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/function/builtin/SumFloatDistinct.java PRE-CREATION
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/function/builtin/SumIntDistinct.java PRE-CREATION
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/function/builtin/SumLongDistinct.java PRE-CREATION
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/ExprsVerifier.java b14c448ee5b3ce0dfca67c6a9b942f1803cc91f9
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/LogicalPlanner.java f7c0bfab78cb3416e7a2ed263cc362917023e3ca
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/PhysicalPlannerImpl.java 67f56303e04787bf950c4a9a703faec58fb74cd4
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/PlannerUtil.java 7d5e2fc7e085cc36527383a208277384035263e7
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/PreLogicalPlanVerifier.java 6dac031218c650b9c1c86811b4552fe6d82da0c1
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/enforce/Enforcer.java dd46996eca7eb9c38f87d97813f5dcc7220429ed
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/global/DataChannel.java 9f5c6bf9dd7b549308724ce1e8044aff1630cef1
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/global/GlobalPlanner.java f390b52f378a2d7e84e40876df4a4b416af912ef
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/global/MasterPlan.java 91f658dab395620f5a891f51407b3676b07a8fa5
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/physical/ExternalSortExec.java 791781e526c54f216152e935682bc2c3147a9e0c
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/physical/SeqScanExec.java 53a1c24197c40c77153f79f90c05882c90aae957
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/rewrite/FilterPushDownRule.java 399903c66bb8a62074facd0bbbe9b3b8e891c067
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/rewrite/PartitionedTableRewriter.java e5f7fb40414e0b2e2e40bccebe24069ee4d9301b
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/rewrite/ProjectionPushDownRule.java 633d0c1857533b02c4ecc6913c740fd2e3722845
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/querymaster/QueryMaster.java ae6d5ebb97f8c4287ffd11262b2932d2f8b1250c
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/querymaster/QueryMasterManagerService.java 3c30e3854abaa891f72b368144942164e5dffab7
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/querymaster/Repartitioner.java 56c26797aad1dbe95945567961e9425fef72fa96
> tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/engine/eval/TestEvalTreeUtil.java d7562426647a6a9d6aae5207a67ddcdd03d0ee3a
> tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/engine/query/TestGroupByQuery.java 1f80bce23c74e3abdcbf9bc0553ec30244d6bd93
> tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/master/TestExecutionBlockCursor.java 053c02833e80dd931807fa6314965e687d7b26c0
> tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/master/TestGlobalPlanner.java 2d3124d7e9d7853b0f872eee1016cbae504c9c6b
> tajo-core/tajo-core-backend/src/test/resources/queries/TestGroupByQuery/testCountDistinct.sql
> tajo-core/tajo-core-backend/src/test/resources/queries/TestGroupByQuery/testCountDistinct2.sql
> tajo-core/tajo-core-backend/src/test/resources/queries/TestGroupByQuery/testDistinctAggregation3.sql PRE-CREATION
> tajo-core/tajo-core-backend/src/test/resources/queries/TestGroupByQuery/testDistinctAggregation4.sql PRE-CREATION
> tajo-core/tajo-core-backend/src/test/resources/queries/TestGroupByQuery/testDistinctAggregation5.sql PRE-CREATION
> tajo-core/tajo-core-backend/src/test/resources/queries/TestGroupByQuery/testDistinctAggregationWithHaving1.sql PRE-CREATION
> tajo-core/tajo-core-backend/src/test/resources/queries/TestGroupByQuery/testDistinctAggregationWithUnion1.sql PRE-CREATION
> tajo-core/tajo-core-backend/src/test/resources/results/TestGroupByQuery/testCountDistinct.result
> tajo-core/tajo-core-backend/src/test/resources/results/TestGroupByQuery/testCountDistinct2.result
> tajo-core/tajo-core-backend/src/test/resources/results/TestGroupByQuery/testDistinctAggregation3.result PRE-CREATION
> tajo-core/tajo-core-backend/src/test/resources/results/TestGroupByQuery/testDistinctAggregation4.result PRE-CREATION
> tajo-core/tajo-core-backend/src/test/resources/results/TestGroupByQuery/testDistinctAggregation5.result PRE-CREATION
> tajo-core/tajo-core-backend/src/test/resources/results/TestGroupByQuery/testDistinctAggregationWithHaving1.result PRE-CREATION
> tajo-core/tajo-core-backend/src/test/resources/results/TestGroupByQuery/testDistinctAggregationWithUnion1.result PRE-CREATION
> tajo-storage/src/main/java/org/apache/tajo/storage/RawFile.java c3a7525154e0f36d51dcca211949f21f57a9f1c8
>
> Diff: https://reviews.apache.org/r/18210/diff/
>
>
> Testing
> -------
>
> mvn clean install
>
>
> Thanks,
>
> Hyunsik Choi
>
>
Re: Review Request 18210: TAJO-601: Improve distinct aggregation query
processing.
Posted by Hyunsik Choi <hy...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18210/
-----------------------------------------------------------
(Updated Feb. 20, 2014, 2:28 p.m.)
Review request for Tajo.
Changes
-------
Rebased against the latest revision.
Bugs: TAJO-601
https://issues.apache.org/jira/browse/TAJO-601
Repository: tajo
Description
-------
Currently, distinct aggregation queries are executed as follows:
* the first stage: it just shuffles tuples by hashing grouping keys.
* the second stage: it sorts them and executes sort aggregation.
This approach executes queries that include distinct aggregation functions in only two stages, but it produces large intermediate data during the shuffle phase.
This kind of query can be rewritten as two queries:
[Original query]
----------
SELECT grp1, grp2, count(*) as total, count(distinct grp3) as distinct_col from rel1 group by grp1, grp2;
----------
[Rewritten query]
----------
SELECT grp1, grp2, sum(cnt) as total, count(grp3) as distinct_col from (
SELECT grp1, grp2, grp3, count(*) as cnt from rel1 group by grp1, grp2, grp3
) tmp1 group by grp1, grp2;
----------
I'm expecting that this rewrite will significantly reduce the intermediate data volume and query response time in most cases.
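As a rough illustration only (not part of the patch), the equivalence of the two query forms can be checked on a small in-memory table. The sketch below assumes SQLite via Python's sqlite3 module rather than Tajo, uses made-up sample rows for rel1, writes the rewritten query with balanced parentheses, and adds an ORDER BY to both queries so the results compare deterministically. One row has a NULL grp3 to show that count(distinct grp3) and the outer count(grp3) both skip it.

```python
import sqlite3

# Hypothetical sample rows for rel1 (grp1, grp2, grp3); None models SQL NULL.
ROWS = [
    ("a", "x", 1),
    ("a", "x", 1),
    ("a", "x", 2),
    ("a", "y", 3),
    ("b", "x", None),
    ("b", "x", 4),
    ("b", "x", 4),
]

# Original form: one aggregation with a distinct aggregate.
ORIGINAL = """
SELECT grp1, grp2, count(*) AS total, count(DISTINCT grp3) AS distinct_col
FROM rel1
GROUP BY grp1, grp2
ORDER BY grp1, grp2
"""

# Rewritten form: the inner query collapses duplicates of grp3 per group,
# the outer query sums the partial counts and counts the surviving grp3 values.
REWRITTEN = """
SELECT grp1, grp2, sum(cnt) AS total, count(grp3) AS distinct_col
FROM (
    SELECT grp1, grp2, grp3, count(*) AS cnt
    FROM rel1
    GROUP BY grp1, grp2, grp3
) tmp1
GROUP BY grp1, grp2
ORDER BY grp1, grp2
"""


def run(conn, query):
    """Execute a query and return all result rows as a list of tuples."""
    return conn.execute(query).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rel1 (grp1 TEXT, grp2 TEXT, grp3 INTEGER)")
conn.executemany("INSERT INTO rel1 VALUES (?, ?, ?)", ROWS)

original_rows = run(conn, ORIGINAL)
rewritten_rows = run(conn, REWRITTEN)

print(original_rows)
print(rewritten_rows)
print("equivalent:", original_rows == rewritten_rows)
```

The inner GROUP BY is where the data reduction would come from: each (grp1, grp2, grp3) combination is emitted once with its count instead of once per input tuple, so the benefit depends on how many duplicate grp3 values each group contains.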
Diffs (updated)
-----
tajo-catalog/tajo-catalog-common/src/main/java/org/apache/tajo/catalog/SortSpec.java 3ef73d5c5385b40fcfb3b0ecbbc35b783224c760
tajo-common/src/main/java/org/apache/tajo/util/TUtil.java cc694d43f42f68945cf53a7b8b9bbdca97a4f205
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/eval/EvalTreeUtil.java da05739b8feff0e04b1762f8000b1f3818c773a2
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/eval/FunctionEval.java 0555bdec8aff6fa79c02b640c81ad55d4666b90a
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/function/builtin/SumDoubleDistinct.java PRE-CREATION
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/function/builtin/SumFloat.java 10fd7205f29c82adf87816737598ce762ee0ebc9
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/function/builtin/SumFloatDistinct.java PRE-CREATION
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/function/builtin/SumIntDistinct.java PRE-CREATION
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/function/builtin/SumLongDistinct.java PRE-CREATION
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/ExprsVerifier.java b14c448ee5b3ce0dfca67c6a9b942f1803cc91f9
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/LogicalPlanner.java f7c0bfab78cb3416e7a2ed263cc362917023e3ca
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/PhysicalPlannerImpl.java 67f56303e04787bf950c4a9a703faec58fb74cd4
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/PlannerUtil.java 7d5e2fc7e085cc36527383a208277384035263e7
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/PreLogicalPlanVerifier.java 6dac031218c650b9c1c86811b4552fe6d82da0c1
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/enforce/Enforcer.java dd46996eca7eb9c38f87d97813f5dcc7220429ed
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/global/DataChannel.java 9f5c6bf9dd7b549308724ce1e8044aff1630cef1
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/global/GlobalPlanner.java f390b52f378a2d7e84e40876df4a4b416af912ef
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/global/MasterPlan.java 91f658dab395620f5a891f51407b3676b07a8fa5
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/physical/ExternalSortExec.java 791781e526c54f216152e935682bc2c3147a9e0c
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/physical/SeqScanExec.java 53a1c24197c40c77153f79f90c05882c90aae957
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/rewrite/FilterPushDownRule.java 399903c66bb8a62074facd0bbbe9b3b8e891c067
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/rewrite/PartitionedTableRewriter.java e5f7fb40414e0b2e2e40bccebe24069ee4d9301b
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/rewrite/ProjectionPushDownRule.java 633d0c1857533b02c4ecc6913c740fd2e3722845
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/querymaster/QueryMaster.java ae6d5ebb97f8c4287ffd11262b2932d2f8b1250c
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/querymaster/QueryMasterManagerService.java 3c30e3854abaa891f72b368144942164e5dffab7
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/querymaster/Repartitioner.java 56c26797aad1dbe95945567961e9425fef72fa96
tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/engine/eval/TestEvalTreeUtil.java d7562426647a6a9d6aae5207a67ddcdd03d0ee3a
tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/engine/query/TestGroupByQuery.java 1f80bce23c74e3abdcbf9bc0553ec30244d6bd93
tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/master/TestExecutionBlockCursor.java 053c02833e80dd931807fa6314965e687d7b26c0
tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/master/TestGlobalPlanner.java 2d3124d7e9d7853b0f872eee1016cbae504c9c6b
tajo-core/tajo-core-backend/src/test/resources/queries/TestGroupByQuery/testCountDistinct.sql
tajo-core/tajo-core-backend/src/test/resources/queries/TestGroupByQuery/testCountDistinct2.sql
tajo-core/tajo-core-backend/src/test/resources/queries/TestGroupByQuery/testDistinctAggregation3.sql PRE-CREATION
tajo-core/tajo-core-backend/src/test/resources/queries/TestGroupByQuery/testDistinctAggregation4.sql PRE-CREATION
tajo-core/tajo-core-backend/src/test/resources/queries/TestGroupByQuery/testDistinctAggregation5.sql PRE-CREATION
tajo-core/tajo-core-backend/src/test/resources/queries/TestGroupByQuery/testDistinctAggregationWithHaving1.sql PRE-CREATION
tajo-core/tajo-core-backend/src/test/resources/queries/TestGroupByQuery/testDistinctAggregationWithUnion1.sql PRE-CREATION
tajo-core/tajo-core-backend/src/test/resources/results/TestGroupByQuery/testCountDistinct.result
tajo-core/tajo-core-backend/src/test/resources/results/TestGroupByQuery/testCountDistinct2.result
tajo-core/tajo-core-backend/src/test/resources/results/TestGroupByQuery/testDistinctAggregation3.result PRE-CREATION
tajo-core/tajo-core-backend/src/test/resources/results/TestGroupByQuery/testDistinctAggregation4.result PRE-CREATION
tajo-core/tajo-core-backend/src/test/resources/results/TestGroupByQuery/testDistinctAggregation5.result PRE-CREATION
tajo-core/tajo-core-backend/src/test/resources/results/TestGroupByQuery/testDistinctAggregationWithHaving1.result PRE-CREATION
tajo-core/tajo-core-backend/src/test/resources/results/TestGroupByQuery/testDistinctAggregationWithUnion1.result PRE-CREATION
tajo-storage/src/main/java/org/apache/tajo/storage/RawFile.java c3a7525154e0f36d51dcca211949f21f57a9f1c8
Diff: https://reviews.apache.org/r/18210/diff/
Testing
-------
mvn clean install
Thanks,
Hyunsik Choi
Re: Review Request 18210: TAJO-601: Improve distinct aggregation query
processing.
Posted by Jung JaeHwa <jh...@gruter.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18210/#review34855
-----------------------------------------------------------
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/function/builtin/SumIntDistinct.java
<https://reviews.apache.org/r/18210/#comment65247>
It needs to be updated as follows:
INT8 sum(value INT4)
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/function/builtin/SumLongDistinct.java
<https://reviews.apache.org/r/18210/#comment65248>
It needs to be updated as follows:
INT8 sum(value INT8)
- Jung JaeHwa
On Feb. 18, 2014, 12:03 p.m., Hyunsik Choi wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/18210/
> -----------------------------------------------------------
>
> (Updated Feb. 18, 2014, 12:03 p.m.)
>
>
> Review request for Tajo.
>
>
> Bugs: TAJO-601
> https://issues.apache.org/jira/browse/TAJO-601
>
>
> Repository: tajo
>
>
> Description
> -------
>
> Currently, distinct aggregation queries are executed as follows:
> * the first stage: it just shuffles tuples by hashing grouping keys.
> * the second stage: it sorts them and executes sort aggregation.
>
> This approach executes queries that include distinct aggregation functions in only two stages, but it produces large intermediate data during the shuffle phase.
>
> This kind of query can be rewritten as two queries:
>
> [Original query]
> ----------
> SELECT grp1, grp2, count(*) as total, count(distinct grp3) as distinct_col from rel1 group by grp1, grp2;
> ----------
>
> [Rewritten query]
> ----------
> SELECT grp1, grp2, sum(cnt) as total, count(grp3) as distinct_col from (
> SELECT grp1, grp2, grp3, count(*) as cnt from rel1 group by grp1, grp2, grp3
> ) tmp1 group by grp1, grp2;
> ----------
>
> I'm expecting that this rewrite will significantly reduce the intermediate data volume and query response time in most cases.
>
>
> Diffs
> -----
>
> tajo-common/src/main/java/org/apache/tajo/util/TUtil.java cc694d4
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/eval/EvalTreeUtil.java da05739
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/function/builtin/SumDoubleDistinct.java PRE-CREATION
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/function/builtin/SumFloat.java 10fd720
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/function/builtin/SumFloatDistinct.java PRE-CREATION
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/function/builtin/SumIntDistinct.java PRE-CREATION
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/function/builtin/SumLongDistinct.java PRE-CREATION
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/ExprsVerifier.java b14c448
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/LogicalPlanner.java f7c0bfa
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/PlannerUtil.java 624518b
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/PreLogicalPlanVerifier.java 6dac031
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/global/DataChannel.java efa1e05
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/global/GlobalPlanner.java f390b52
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/global/MasterPlan.java 91f658d
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/physical/SeqScanExec.java a0c0eeb
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/rewrite/FilterPushDownRule.java 399903c
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/rewrite/PartitionedTableRewriter.java e5f7fb4
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/rewrite/ProjectionPushDownRule.java 633d0c1
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/querymaster/QueryMaster.java ae6d5eb
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/querymaster/QueryMasterManagerService.java 3c30e38
> tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/engine/eval/TestEvalTreeUtil.java d756242
> tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/engine/query/TestGroupByQuery.java 1f80bce
> tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/master/TestExecutionBlockCursor.java 053c028
> tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/master/TestGlobalPlanner.java 2d3124d
> tajo-core/tajo-core-backend/src/test/resources/queries/TestGroupByQuery/testCountDistinct.sql 6fe604e
> tajo-core/tajo-core-backend/src/test/resources/queries/TestGroupByQuery/testCountDistinct2.sql 6bf8a8a
> tajo-core/tajo-core-backend/src/test/resources/queries/TestGroupByQuery/testDistinctAggregation1.sql PRE-CREATION
> tajo-core/tajo-core-backend/src/test/resources/queries/TestGroupByQuery/testDistinctAggregation2.sql PRE-CREATION
> tajo-core/tajo-core-backend/src/test/resources/queries/TestGroupByQuery/testDistinctAggregation3.sql PRE-CREATION
> tajo-core/tajo-core-backend/src/test/resources/queries/TestGroupByQuery/testDistinctAggregation4.sql PRE-CREATION
> tajo-core/tajo-core-backend/src/test/resources/queries/TestGroupByQuery/testDistinctAggregation5.sql PRE-CREATION
> tajo-core/tajo-core-backend/src/test/resources/queries/TestGroupByQuery/testDistinctAggregationWithHaving1.sql PRE-CREATION
> tajo-core/tajo-core-backend/src/test/resources/results/TestGroupByQuery/testCountDistinct.result f2ad32a
> tajo-core/tajo-core-backend/src/test/resources/results/TestGroupByQuery/testCountDistinct2.result 9164120
> tajo-core/tajo-core-backend/src/test/resources/results/TestGroupByQuery/testDistinctAggregation1.result PRE-CREATION
> tajo-core/tajo-core-backend/src/test/resources/results/TestGroupByQuery/testDistinctAggregation2.result PRE-CREATION
> tajo-core/tajo-core-backend/src/test/resources/results/TestGroupByQuery/testDistinctAggregation3.result PRE-CREATION
> tajo-core/tajo-core-backend/src/test/resources/results/TestGroupByQuery/testDistinctAggregation4.result PRE-CREATION
> tajo-core/tajo-core-backend/src/test/resources/results/TestGroupByQuery/testDistinctAggregation5.result PRE-CREATION
> tajo-core/tajo-core-backend/src/test/resources/results/TestGroupByQuery/testDistinctAggregationWithHaving1.result PRE-CREATION
>
> Diff: https://reviews.apache.org/r/18210/diff/
>
>
> Testing
> -------
>
> mvn clean install
>
>
> Thanks,
>
> Hyunsik Choi
>
>
Re: Review Request 18210: TAJO-601: Improve distinct aggregation query
processing.
Posted by Jung JaeHwa <jh...@gruter.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18210/#review34856
-----------------------------------------------------------
Hi Hyunsik.
I'm reviewing your patch.
First, I found some typos. I'll comment again after I review the rest of the code.
- Jung JaeHwa
On Feb. 18, 2014, 12:03 p.m., Hyunsik Choi wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/18210/
> -----------------------------------------------------------
>
> (Updated Feb. 18, 2014, 12:03 p.m.)
>
>
> Review request for Tajo.
>
>
> Bugs: TAJO-601
> https://issues.apache.org/jira/browse/TAJO-601
>
>
> Repository: tajo
>
>
> Description
> -------
>
> Currently, distinct aggregation queries are executed as follows:
> * the first stage: it just shuffles tuples by hashing grouping keys.
> * the second stage: it sorts them and executes sort aggregation.
>
> This approach executes queries that include distinct aggregation functions in only two stages, but it produces large intermediate data during the shuffle phase.
>
> This kind of query can be rewritten as two queries:
>
> [Original query]
> ----------
> SELECT grp1, grp2, count(*) as total, count(distinct grp3) as distinct_col from rel1 group by grp1, grp2;
> ----------
>
> [Rewritten query]
> ----------
> SELECT grp1, grp2, sum(cnt) as total, count(grp3) as distinct_col from (
> SELECT grp1, grp2, grp3, count(*) as cnt from rel1 group by grp1, grp2, grp3
> ) tmp1 group by grp1, grp2;
> ----------
>
> I'm expecting that this rewrite will significantly reduce the intermediate data volume and query response time in most cases.
>
>
> Diffs
> -----
>
> tajo-common/src/main/java/org/apache/tajo/util/TUtil.java cc694d4
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/eval/EvalTreeUtil.java da05739
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/function/builtin/SumDoubleDistinct.java PRE-CREATION
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/function/builtin/SumFloat.java 10fd720
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/function/builtin/SumFloatDistinct.java PRE-CREATION
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/function/builtin/SumIntDistinct.java PRE-CREATION
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/function/builtin/SumLongDistinct.java PRE-CREATION
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/ExprsVerifier.java b14c448
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/LogicalPlanner.java f7c0bfa
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/PlannerUtil.java 624518b
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/PreLogicalPlanVerifier.java 6dac031
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/global/DataChannel.java efa1e05
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/global/GlobalPlanner.java f390b52
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/global/MasterPlan.java 91f658d
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/physical/SeqScanExec.java a0c0eeb
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/rewrite/FilterPushDownRule.java 399903c
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/rewrite/PartitionedTableRewriter.java e5f7fb4
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/rewrite/ProjectionPushDownRule.java 633d0c1
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/querymaster/QueryMaster.java ae6d5eb
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/querymaster/QueryMasterManagerService.java 3c30e38
> tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/engine/eval/TestEvalTreeUtil.java d756242
> tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/engine/query/TestGroupByQuery.java 1f80bce
> tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/master/TestExecutionBlockCursor.java 053c028
> tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/master/TestGlobalPlanner.java 2d3124d
> tajo-core/tajo-core-backend/src/test/resources/queries/TestGroupByQuery/testCountDistinct.sql 6fe604e
> tajo-core/tajo-core-backend/src/test/resources/queries/TestGroupByQuery/testCountDistinct2.sql 6bf8a8a
> tajo-core/tajo-core-backend/src/test/resources/queries/TestGroupByQuery/testDistinctAggregation1.sql PRE-CREATION
> tajo-core/tajo-core-backend/src/test/resources/queries/TestGroupByQuery/testDistinctAggregation2.sql PRE-CREATION
> tajo-core/tajo-core-backend/src/test/resources/queries/TestGroupByQuery/testDistinctAggregation3.sql PRE-CREATION
> tajo-core/tajo-core-backend/src/test/resources/queries/TestGroupByQuery/testDistinctAggregation4.sql PRE-CREATION
> tajo-core/tajo-core-backend/src/test/resources/queries/TestGroupByQuery/testDistinctAggregation5.sql PRE-CREATION
> tajo-core/tajo-core-backend/src/test/resources/queries/TestGroupByQuery/testDistinctAggregationWithHaving1.sql PRE-CREATION
> tajo-core/tajo-core-backend/src/test/resources/results/TestGroupByQuery/testCountDistinct.result f2ad32a
> tajo-core/tajo-core-backend/src/test/resources/results/TestGroupByQuery/testCountDistinct2.result 9164120
> tajo-core/tajo-core-backend/src/test/resources/results/TestGroupByQuery/testDistinctAggregation1.result PRE-CREATION
> tajo-core/tajo-core-backend/src/test/resources/results/TestGroupByQuery/testDistinctAggregation2.result PRE-CREATION
> tajo-core/tajo-core-backend/src/test/resources/results/TestGroupByQuery/testDistinctAggregation3.result PRE-CREATION
> tajo-core/tajo-core-backend/src/test/resources/results/TestGroupByQuery/testDistinctAggregation4.result PRE-CREATION
> tajo-core/tajo-core-backend/src/test/resources/results/TestGroupByQuery/testDistinctAggregation5.result PRE-CREATION
> tajo-core/tajo-core-backend/src/test/resources/results/TestGroupByQuery/testDistinctAggregationWithHaving1.result PRE-CREATION
>
> Diff: https://reviews.apache.org/r/18210/diff/
>
>
> Testing
> -------
>
> mvn clean install
>
>
> Thanks,
>
> Hyunsik Choi
>
>