You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@carbondata.apache.org by ravipesala <gi...@git.apache.org> on 2018/07/04 06:42:31 UTC
[GitHub] carbondata pull request #2444: [CARBONDATA-2686] Implement Left join on MV d...
GitHub user ravipesala opened a pull request:
https://github.com/apache/carbondata/pull/2444
[CARBONDATA-2686] Implement Left join on MV datamap
Code ported from another branch done by @eychih to implement Left outer join.
Be sure to do all of the following checklist to help us incorporate
your contribution quickly and easily:
- [ ] Any interfaces changed?
- [ ] Any backward compatibility impacted?
- [ ] Document update required?
- [ ] Testing done
Please provide details on
- Whether new unit test cases have been added or why no new tests are required?
- How it is tested? Please attach test report.
- Is it a performance related change? Please attach the performance test report.
- Any additional information to help reviewers in testing this change.
- [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/ravipesala/incubator-carbondata left-join
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/carbondata/pull/2444.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #2444
----
commit 354cc01a1be116efbba4f1e5fecf97f76ce3ee61
Author: eychih <ey...@...>
Date: 2018-06-01T02:10:34Z
add matching LEFTJOIN VIEW
commit 07828da6834290e713d45fe4acc81a61de3656b3
Author: eychih <ey...@...>
Date: 2018-06-07T01:14:24Z
add changes for advisor to recommend leftjoin view
----
---
[GitHub] carbondata pull request #2444: [CARBONDATA-2686] Implement Left join on MV d...
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2444#discussion_r200568159
--- Diff: datamap/mv/plan/src/main/scala/org/apache/carbondata/mv/plans/modular/Harmonizer.scala ---
@@ -36,17 +36,33 @@ abstract class Harmonizer(conf: SQLConf)
def batches: Seq[Batch] = {
Batch(
"Data Harmonizations", fixedPoint,
- HarmonizeDimensionTable,
- HarmonizeFactTable) :: Nil
+ Seq( HarmonizeDimensionTable) ++
+ extendedOperatorHarmonizationRules: _*) :: Nil
+// HarmonizeFactTable) :: Nil
--- End diff --
ok
---
[GitHub] carbondata issue #2444: [CARBONDATA-2686] Implement Left join on MV datamap
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2444
Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5676/
---
[GitHub] carbondata issue #2444: [CARBONDATA-2686] Implement Left join on MV datamap
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2444
SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5640/
---
[GitHub] carbondata issue #2444: [CARBONDATA-2686] Implement Left join on MV datamap
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2444
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6748/
---
[GitHub] carbondata pull request #2444: [CARBONDATA-2686] Implement Left join on MV d...
Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2444#discussion_r200560747
--- Diff: datamap/mv/core/src/main/scala/org/apache/carbondata/mv/rewrite/DefaultMatchMaker.scala ---
@@ -245,15 +269,22 @@ object SelectSelectNoChildDelta extends DefaultMatchPattern with PredicateHelper
}
}
val tPredicateList = sel_1q.predicateList.filter { p =>
- !sel_1a.predicateList.exists(_.semanticEquals(p)) }
- val wip = sel_1q.copy(
- predicateList = tPredicateList,
- children = tChildren,
- joinEdges = tJoinEdges.filter(_ != null),
- aliasMap = tAliasMap.toMap)
-
- val done = factorOutSubsumer(wip, usel_1a, wip.aliasMap)
- Seq(done)
+ !sel_1a.predicateList.exists(_.semanticEquals(p))
+ } ++ (if (isLeftJoinView(sel_1a) &&
+ sel_1q.joinEdges.head.joinType == Inner) {
+ sel_1a.children(1)
+ .asInstanceOf[HarmonizedRelation].tag.map(IsNotNull(_)).toSeq
+ } else {
+ Seq.empty
+ })
+ val wip = sel_1q.copy(
--- End diff --
can you give a better name for this variable
---
[GitHub] carbondata pull request #2444: [CARBONDATA-2686] Implement Left join on MV d...
Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2444#discussion_r200560875
--- Diff: datamap/mv/plan/src/main/scala/org/apache/carbondata/mv/plans/modular/Harmonizer.scala ---
@@ -36,17 +36,33 @@ abstract class Harmonizer(conf: SQLConf)
def batches: Seq[Batch] = {
Batch(
"Data Harmonizations", fixedPoint,
- HarmonizeDimensionTable,
- HarmonizeFactTable) :: Nil
+ Seq( HarmonizeDimensionTable) ++
+ extendedOperatorHarmonizationRules: _*) :: Nil
+// HarmonizeFactTable) :: Nil
--- End diff --
If it is not require, remove it
---
[GitHub] carbondata issue #2444: [CARBONDATA-2686] Implement Left join on MV datamap
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2444
SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5594/
---
[GitHub] carbondata issue #2444: [CARBONDATA-2686] Implement Left join on MV datamap
Posted by brijoobopanna <gi...@git.apache.org>.
Github user brijoobopanna commented on the issue:
https://github.com/apache/carbondata/pull/2444
retest this please
---
[GitHub] carbondata issue #2444: [CARBONDATA-2686] Implement Left join on MV datamap
Posted by brijoobopanna <gi...@git.apache.org>.
Github user brijoobopanna commented on the issue:
https://github.com/apache/carbondata/pull/2444
retest this please
---
[GitHub] carbondata issue #2444: [CARBONDATA-2686] Implement Left join on MV datamap
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2444
Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5657/
---
[GitHub] carbondata issue #2444: [CARBONDATA-2686] Implement Left join on MV datamap
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2444
Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5578/
---
[GitHub] carbondata issue #2444: [CARBONDATA-2686] Implement Left join on MV datamap
Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on the issue:
https://github.com/apache/carbondata/pull/2444
LGTM
---
[GitHub] carbondata pull request #2444: [CARBONDATA-2686] Implement Left join on MV d...
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2444#discussion_r200568031
--- Diff: datamap/mv/plan/src/main/scala/org/apache/carbondata/mv/plans/modular/ModularRelation.scala ---
@@ -86,10 +87,28 @@ object HarmonizedRelation {
Select(_, _, _, _, _, dim :: Nil, NoFlags, Nil, Nil, _),
NoFlags,
Nil, _) if (dim.isInstanceOf[ModularRelation]) =>
- if (g.outputList
- .forall(col => col.isInstanceOf[AttributeReference] ||
- (col.isInstanceOf[Alias] &&
- col.asInstanceOf[Alias].child.isInstanceOf[AttributeReference]))) {
+ if (g.outputList.forall(col => {
--- End diff --
ok
---
[GitHub] carbondata issue #2444: [CARBONDATA-2686] Implement Left join on MV datamap
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2444
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6867/
---
[GitHub] carbondata pull request #2444: [CARBONDATA-2686] Implement Left join on MV d...
Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:
https://github.com/apache/carbondata/pull/2444
---
[GitHub] carbondata issue #2444: [CARBONDATA-2686] Implement Left join on MV datamap
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2444
Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5630/
---
[GitHub] carbondata issue #2444: [CARBONDATA-2686] Implement Left join on MV datamap
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2444
Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5622/
---
[GitHub] carbondata issue #2444: [CARBONDATA-2686] Implement Left join on MV datamap
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2444
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6841/
---
[GitHub] carbondata issue #2444: [CARBONDATA-2686] Implement Left join on MV datamap
Posted by brijoobopanna <gi...@git.apache.org>.
Github user brijoobopanna commented on the issue:
https://github.com/apache/carbondata/pull/2444
retest sdv please
---
[GitHub] carbondata issue #2444: [CARBONDATA-2686] Implement Left join on MV datamap
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2444
Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5639/
---
[GitHub] carbondata issue #2444: [CARBONDATA-2686] Implement Left join on MV datamap
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2444
Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6824/
---
[GitHub] carbondata pull request #2444: [CARBONDATA-2686] Implement Left join on MV d...
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2444#discussion_r200568172
--- Diff: datamap/mv/core/src/main/scala/org/apache/carbondata/mv/rewrite/DefaultMatchMaker.scala ---
@@ -245,15 +269,22 @@ object SelectSelectNoChildDelta extends DefaultMatchPattern with PredicateHelper
}
}
val tPredicateList = sel_1q.predicateList.filter { p =>
- !sel_1a.predicateList.exists(_.semanticEquals(p)) }
- val wip = sel_1q.copy(
- predicateList = tPredicateList,
- children = tChildren,
- joinEdges = tJoinEdges.filter(_ != null),
- aliasMap = tAliasMap.toMap)
-
- val done = factorOutSubsumer(wip, usel_1a, wip.aliasMap)
- Seq(done)
+ !sel_1a.predicateList.exists(_.semanticEquals(p))
+ } ++ (if (isLeftJoinView(sel_1a) &&
+ sel_1q.joinEdges.head.joinType == Inner) {
+ sel_1a.children(1)
+ .asInstanceOf[HarmonizedRelation].tag.map(IsNotNull(_)).toSeq
+ } else {
+ Seq.empty
+ })
+ val wip = sel_1q.copy(
--- End diff --
ok
---
[GitHub] carbondata pull request #2444: [CARBONDATA-2686] Implement Left join on MV d...
Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2444#discussion_r200560651
--- Diff: datamap/mv/core/src/main/scala/org/apache/carbondata/mv/rewrite/DefaultMatchMaker.scala ---
@@ -127,8 +126,27 @@ object SelectSelectNoChildDelta extends DefaultMatchPattern with PredicateHelper
}
}
- def apply(
- subsumer: ModularPlan,
+ private def isLeftJoinView(subsumer: ModularPlan): Boolean = {
+ if (subsumer.isInstanceOf[modular.Select]) {
+ val sel = subsumer.asInstanceOf[modular.Select]
--- End diff --
better to use match clause to simply the logic in this function
---
[GitHub] carbondata issue #2444: [CARBONDATA-2686] Implement Left join on MV datamap
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2444
retest this please
---
[GitHub] carbondata pull request #2444: [CARBONDATA-2686] Implement Left join on MV d...
Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2444#discussion_r200560931
--- Diff: datamap/mv/plan/src/main/scala/org/apache/carbondata/mv/plans/modular/Harmonizer.scala ---
@@ -36,17 +36,33 @@ abstract class Harmonizer(conf: SQLConf)
def batches: Seq[Batch] = {
Batch(
"Data Harmonizations", fixedPoint,
- HarmonizeDimensionTable,
- HarmonizeFactTable) :: Nil
+ Seq( HarmonizeDimensionTable) ++
+ extendedOperatorHarmonizationRules: _*) :: Nil
+// HarmonizeFactTable) :: Nil
}
+
+ /**
+ * Override to provide additional rules for the modular operator harmonization batch.
+ */
+ def extendedOperatorHarmonizationRules: Seq[Rule[ModularPlan]] = Nil
+}
+
+/**
+ * A full Harmonizer - harmonize both fact and dimension tables
+ */
+object FullHarmonizer extends FullHarmonizer
+
+class FullHarmonizer extends Harmonizer(new SQLConf()) {
+ override def extendedOperatorHarmonizationRules: Seq[Rule[ModularPlan]] =
+ super.extendedOperatorHarmonizationRules ++ (HarmonizeFactTable :: Nil)
}
/**
- * An default Harmonizer
+ * A semi Harmonizer - harmonize dimension tables only
--- End diff --
Can you add description for semi harmonizer?
---
[GitHub] carbondata issue #2444: [CARBONDATA-2686] Implement Left join on MV datamap
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2444
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6881/
---
[GitHub] carbondata pull request #2444: [CARBONDATA-2686] Implement Left join on MV d...
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2444#discussion_r200568187
--- Diff: datamap/mv/core/src/main/scala/org/apache/carbondata/mv/rewrite/DefaultMatchMaker.scala ---
@@ -127,8 +126,27 @@ object SelectSelectNoChildDelta extends DefaultMatchPattern with PredicateHelper
}
}
- def apply(
- subsumer: ModularPlan,
+ private def isLeftJoinView(subsumer: ModularPlan): Boolean = {
+ if (subsumer.isInstanceOf[modular.Select]) {
+ val sel = subsumer.asInstanceOf[modular.Select]
--- End diff --
ok
---
[GitHub] carbondata pull request #2444: [CARBONDATA-2686] Implement Left join on MV d...
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2444#discussion_r200568141
--- Diff: datamap/mv/plan/src/main/scala/org/apache/carbondata/mv/plans/modular/Harmonizer.scala ---
@@ -36,17 +36,33 @@ abstract class Harmonizer(conf: SQLConf)
def batches: Seq[Batch] = {
Batch(
"Data Harmonizations", fixedPoint,
- HarmonizeDimensionTable,
- HarmonizeFactTable) :: Nil
+ Seq( HarmonizeDimensionTable) ++
+ extendedOperatorHarmonizationRules: _*) :: Nil
+// HarmonizeFactTable) :: Nil
}
+
+ /**
+ * Override to provide additional rules for the modular operator harmonization batch.
+ */
+ def extendedOperatorHarmonizationRules: Seq[Rule[ModularPlan]] = Nil
+}
+
+/**
+ * A full Harmonizer - harmonize both fact and dimension tables
+ */
+object FullHarmonizer extends FullHarmonizer
+
+class FullHarmonizer extends Harmonizer(new SQLConf()) {
+ override def extendedOperatorHarmonizationRules: Seq[Rule[ModularPlan]] =
+ super.extendedOperatorHarmonizationRules ++ (HarmonizeFactTable :: Nil)
}
/**
- * An default Harmonizer
+ * A semi Harmonizer - harmonize dimension tables only
--- End diff --
ok
---
[GitHub] carbondata pull request #2444: [CARBONDATA-2686] Implement Left join on MV d...
Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2444#discussion_r200561054
--- Diff: datamap/mv/plan/src/main/scala/org/apache/carbondata/mv/plans/modular/ModularRelation.scala ---
@@ -86,10 +87,28 @@ object HarmonizedRelation {
Select(_, _, _, _, _, dim :: Nil, NoFlags, Nil, Nil, _),
NoFlags,
Nil, _) if (dim.isInstanceOf[ModularRelation]) =>
- if (g.outputList
- .forall(col => col.isInstanceOf[AttributeReference] ||
- (col.isInstanceOf[Alias] &&
- col.asInstanceOf[Alias].child.isInstanceOf[AttributeReference]))) {
+ if (g.outputList.forall(col => {
--- End diff --
better to move the logic out of if condition. and have it better formatted
---