You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@carbondata.apache.org by ravipesala <gi...@git.apache.org> on 2018/07/04 06:42:31 UTC

[GitHub] carbondata pull request #2444: [CARBONDATA-2686] Implement Left join on MV d...

GitHub user ravipesala opened a pull request:

    https://github.com/apache/carbondata/pull/2444

    [CARBONDATA-2686] Implement Left join on MV datamap

    Code ported from another branch done by @eychih to implement Left outer join. 
    
    Be sure to do all of the following checklist to help us incorporate 
    your contribution quickly and easily:
    
     - [ ] Any interfaces changed?
     
     - [ ] Any backward compatibility impacted?
     
     - [ ] Document update required?
    
     - [ ] Testing done
            Please provide details on 
            - Whether new unit test cases have been added or why no new tests are required?
            - How it is tested? Please attach test report.
            - Is it a performance related change? Please attach the performance test report.
            - Any additional information to help reviewers in testing this change.
           
     - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. 
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ravipesala/incubator-carbondata left-join

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/2444.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2444
    
----
commit 354cc01a1be116efbba4f1e5fecf97f76ce3ee61
Author: eychih <ey...@...>
Date:   2018-06-01T02:10:34Z

    add matching LEFTJOIN VIEW

commit 07828da6834290e713d45fe4acc81a61de3656b3
Author: eychih <ey...@...>
Date:   2018-06-07T01:14:24Z

    add changes for advisor to recommend leftjoin view

----


---

[GitHub] carbondata pull request #2444: [CARBONDATA-2686] Implement Left join on MV d...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2444#discussion_r200568159
  
    --- Diff: datamap/mv/plan/src/main/scala/org/apache/carbondata/mv/plans/modular/Harmonizer.scala ---
    @@ -36,17 +36,33 @@ abstract class Harmonizer(conf: SQLConf)
       def batches: Seq[Batch] = {
         Batch(
           "Data Harmonizations", fixedPoint,
    -      HarmonizeDimensionTable,
    -      HarmonizeFactTable) :: Nil
    +      Seq( HarmonizeDimensionTable) ++
    +      extendedOperatorHarmonizationRules: _*) :: Nil
    +//      HarmonizeFactTable) :: Nil
    --- End diff --
    
    ok


---

[GitHub] carbondata issue #2444: [CARBONDATA-2686] Implement Left join on MV datamap

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2444
  
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5676/



---

[GitHub] carbondata issue #2444: [CARBONDATA-2686] Implement Left join on MV datamap

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2444
  
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5640/



---

[GitHub] carbondata issue #2444: [CARBONDATA-2686] Implement Left join on MV datamap

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2444
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6748/



---

[GitHub] carbondata pull request #2444: [CARBONDATA-2686] Implement Left join on MV d...

Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2444#discussion_r200560747
  
    --- Diff: datamap/mv/core/src/main/scala/org/apache/carbondata/mv/rewrite/DefaultMatchMaker.scala ---
    @@ -245,15 +269,22 @@ object SelectSelectNoChildDelta extends DefaultMatchPattern with PredicateHelper
                       }
                   }
                   val tPredicateList = sel_1q.predicateList.filter { p =>
    -                !sel_1a.predicateList.exists(_.semanticEquals(p)) }
    -                val wip = sel_1q.copy(
    -                  predicateList = tPredicateList,
    -                  children = tChildren,
    -                  joinEdges = tJoinEdges.filter(_ != null),
    -                  aliasMap = tAliasMap.toMap)
    -
    -                val done = factorOutSubsumer(wip, usel_1a, wip.aliasMap)
    -                Seq(done)
    +                !sel_1a.predicateList.exists(_.semanticEquals(p))
    +              } ++ (if (isLeftJoinView(sel_1a) &&
    +                        sel_1q.joinEdges.head.joinType == Inner) {
    +                sel_1a.children(1)
    +                  .asInstanceOf[HarmonizedRelation].tag.map(IsNotNull(_)).toSeq
    +              } else {
    +                Seq.empty
    +              })
    +              val wip = sel_1q.copy(
    --- End diff --
    
    can you give a better name for this variable


---

[GitHub] carbondata pull request #2444: [CARBONDATA-2686] Implement Left join on MV d...

Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2444#discussion_r200560875
  
    --- Diff: datamap/mv/plan/src/main/scala/org/apache/carbondata/mv/plans/modular/Harmonizer.scala ---
    @@ -36,17 +36,33 @@ abstract class Harmonizer(conf: SQLConf)
       def batches: Seq[Batch] = {
         Batch(
           "Data Harmonizations", fixedPoint,
    -      HarmonizeDimensionTable,
    -      HarmonizeFactTable) :: Nil
    +      Seq( HarmonizeDimensionTable) ++
    +      extendedOperatorHarmonizationRules: _*) :: Nil
    +//      HarmonizeFactTable) :: Nil
    --- End diff --
    
    If it is not require, remove it


---

[GitHub] carbondata issue #2444: [CARBONDATA-2686] Implement Left join on MV datamap

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2444
  
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5594/



---

[GitHub] carbondata issue #2444: [CARBONDATA-2686] Implement Left join on MV datamap

Posted by brijoobopanna <gi...@git.apache.org>.
Github user brijoobopanna commented on the issue:

    https://github.com/apache/carbondata/pull/2444
  
    retest this please


---

[GitHub] carbondata issue #2444: [CARBONDATA-2686] Implement Left join on MV datamap

Posted by brijoobopanna <gi...@git.apache.org>.
Github user brijoobopanna commented on the issue:

    https://github.com/apache/carbondata/pull/2444
  
    retest this please


---

[GitHub] carbondata issue #2444: [CARBONDATA-2686] Implement Left join on MV datamap

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2444
  
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5657/



---

[GitHub] carbondata issue #2444: [CARBONDATA-2686] Implement Left join on MV datamap

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2444
  
    Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5578/



---

[GitHub] carbondata issue #2444: [CARBONDATA-2686] Implement Left join on MV datamap

Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on the issue:

    https://github.com/apache/carbondata/pull/2444
  
    LGTM


---

[GitHub] carbondata pull request #2444: [CARBONDATA-2686] Implement Left join on MV d...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2444#discussion_r200568031
  
    --- Diff: datamap/mv/plan/src/main/scala/org/apache/carbondata/mv/plans/modular/ModularRelation.scala ---
    @@ -86,10 +87,28 @@ object HarmonizedRelation {
           Select(_, _, _, _, _, dim :: Nil, NoFlags, Nil, Nil, _),
           NoFlags,
           Nil, _) if (dim.isInstanceOf[ModularRelation]) =>
    -        if (g.outputList
    -          .forall(col => col.isInstanceOf[AttributeReference] ||
    -                         (col.isInstanceOf[Alias] &&
    -                          col.asInstanceOf[Alias].child.isInstanceOf[AttributeReference]))) {
    +        if (g.outputList.forall(col => {
    --- End diff --
    
    ok


---

[GitHub] carbondata issue #2444: [CARBONDATA-2686] Implement Left join on MV datamap

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2444
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6867/



---

[GitHub] carbondata pull request #2444: [CARBONDATA-2686] Implement Left join on MV d...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/carbondata/pull/2444


---

[GitHub] carbondata issue #2444: [CARBONDATA-2686] Implement Left join on MV datamap

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2444
  
    Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5630/



---

[GitHub] carbondata issue #2444: [CARBONDATA-2686] Implement Left join on MV datamap

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2444
  
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5622/



---

[GitHub] carbondata issue #2444: [CARBONDATA-2686] Implement Left join on MV datamap

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2444
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6841/



---

[GitHub] carbondata issue #2444: [CARBONDATA-2686] Implement Left join on MV datamap

Posted by brijoobopanna <gi...@git.apache.org>.
Github user brijoobopanna commented on the issue:

    https://github.com/apache/carbondata/pull/2444
  
    retest sdv please


---

[GitHub] carbondata issue #2444: [CARBONDATA-2686] Implement Left join on MV datamap

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2444
  
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5639/



---

[GitHub] carbondata issue #2444: [CARBONDATA-2686] Implement Left join on MV datamap

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2444
  
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6824/



---

[GitHub] carbondata pull request #2444: [CARBONDATA-2686] Implement Left join on MV d...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2444#discussion_r200568172
  
    --- Diff: datamap/mv/core/src/main/scala/org/apache/carbondata/mv/rewrite/DefaultMatchMaker.scala ---
    @@ -245,15 +269,22 @@ object SelectSelectNoChildDelta extends DefaultMatchPattern with PredicateHelper
                       }
                   }
                   val tPredicateList = sel_1q.predicateList.filter { p =>
    -                !sel_1a.predicateList.exists(_.semanticEquals(p)) }
    -                val wip = sel_1q.copy(
    -                  predicateList = tPredicateList,
    -                  children = tChildren,
    -                  joinEdges = tJoinEdges.filter(_ != null),
    -                  aliasMap = tAliasMap.toMap)
    -
    -                val done = factorOutSubsumer(wip, usel_1a, wip.aliasMap)
    -                Seq(done)
    +                !sel_1a.predicateList.exists(_.semanticEquals(p))
    +              } ++ (if (isLeftJoinView(sel_1a) &&
    +                        sel_1q.joinEdges.head.joinType == Inner) {
    +                sel_1a.children(1)
    +                  .asInstanceOf[HarmonizedRelation].tag.map(IsNotNull(_)).toSeq
    +              } else {
    +                Seq.empty
    +              })
    +              val wip = sel_1q.copy(
    --- End diff --
    
    ok


---

[GitHub] carbondata pull request #2444: [CARBONDATA-2686] Implement Left join on MV d...

Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2444#discussion_r200560651
  
    --- Diff: datamap/mv/core/src/main/scala/org/apache/carbondata/mv/rewrite/DefaultMatchMaker.scala ---
    @@ -127,8 +126,27 @@ object SelectSelectNoChildDelta extends DefaultMatchPattern with PredicateHelper
         }
       }
     
    -  def apply(
    -      subsumer: ModularPlan,
    +  private def isLeftJoinView(subsumer: ModularPlan): Boolean = {
    +    if (subsumer.isInstanceOf[modular.Select]) {
    +      val sel = subsumer.asInstanceOf[modular.Select]
    --- End diff --
    
    better to use match clause to simply the logic in this function


---

[GitHub] carbondata issue #2444: [CARBONDATA-2686] Implement Left join on MV datamap

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2444
  
    retest this please


---

[GitHub] carbondata pull request #2444: [CARBONDATA-2686] Implement Left join on MV d...

Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2444#discussion_r200560931
  
    --- Diff: datamap/mv/plan/src/main/scala/org/apache/carbondata/mv/plans/modular/Harmonizer.scala ---
    @@ -36,17 +36,33 @@ abstract class Harmonizer(conf: SQLConf)
       def batches: Seq[Batch] = {
         Batch(
           "Data Harmonizations", fixedPoint,
    -      HarmonizeDimensionTable,
    -      HarmonizeFactTable) :: Nil
    +      Seq( HarmonizeDimensionTable) ++
    +      extendedOperatorHarmonizationRules: _*) :: Nil
    +//      HarmonizeFactTable) :: Nil
       }
    +
    +  /**
    +   * Override to provide additional rules for the modular operator harmonization batch.
    +   */
    +  def extendedOperatorHarmonizationRules: Seq[Rule[ModularPlan]] = Nil
    +}
    +
    +/**
    + * A full Harmonizer - harmonize both fact and dimension tables
    + */
    +object FullHarmonizer extends FullHarmonizer
    +
    +class FullHarmonizer extends Harmonizer(new SQLConf()) {
    +  override def extendedOperatorHarmonizationRules: Seq[Rule[ModularPlan]] =
    +    super.extendedOperatorHarmonizationRules ++ (HarmonizeFactTable :: Nil)
     }
     
     /**
    - * An default Harmonizer
    + * A semi Harmonizer - harmonize dimension tables only
    --- End diff --
    
    Can you add description for semi harmonizer?


---

[GitHub] carbondata issue #2444: [CARBONDATA-2686] Implement Left join on MV datamap

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2444
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6881/



---

[GitHub] carbondata pull request #2444: [CARBONDATA-2686] Implement Left join on MV d...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2444#discussion_r200568187
  
    --- Diff: datamap/mv/core/src/main/scala/org/apache/carbondata/mv/rewrite/DefaultMatchMaker.scala ---
    @@ -127,8 +126,27 @@ object SelectSelectNoChildDelta extends DefaultMatchPattern with PredicateHelper
         }
       }
     
    -  def apply(
    -      subsumer: ModularPlan,
    +  private def isLeftJoinView(subsumer: ModularPlan): Boolean = {
    +    if (subsumer.isInstanceOf[modular.Select]) {
    +      val sel = subsumer.asInstanceOf[modular.Select]
    --- End diff --
    
    ok


---

[GitHub] carbondata pull request #2444: [CARBONDATA-2686] Implement Left join on MV d...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2444#discussion_r200568141
  
    --- Diff: datamap/mv/plan/src/main/scala/org/apache/carbondata/mv/plans/modular/Harmonizer.scala ---
    @@ -36,17 +36,33 @@ abstract class Harmonizer(conf: SQLConf)
       def batches: Seq[Batch] = {
         Batch(
           "Data Harmonizations", fixedPoint,
    -      HarmonizeDimensionTable,
    -      HarmonizeFactTable) :: Nil
    +      Seq( HarmonizeDimensionTable) ++
    +      extendedOperatorHarmonizationRules: _*) :: Nil
    +//      HarmonizeFactTable) :: Nil
       }
    +
    +  /**
    +   * Override to provide additional rules for the modular operator harmonization batch.
    +   */
    +  def extendedOperatorHarmonizationRules: Seq[Rule[ModularPlan]] = Nil
    +}
    +
    +/**
    + * A full Harmonizer - harmonize both fact and dimension tables
    + */
    +object FullHarmonizer extends FullHarmonizer
    +
    +class FullHarmonizer extends Harmonizer(new SQLConf()) {
    +  override def extendedOperatorHarmonizationRules: Seq[Rule[ModularPlan]] =
    +    super.extendedOperatorHarmonizationRules ++ (HarmonizeFactTable :: Nil)
     }
     
     /**
    - * An default Harmonizer
    + * A semi Harmonizer - harmonize dimension tables only
    --- End diff --
    
    ok


---

[GitHub] carbondata pull request #2444: [CARBONDATA-2686] Implement Left join on MV d...

Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2444#discussion_r200561054
  
    --- Diff: datamap/mv/plan/src/main/scala/org/apache/carbondata/mv/plans/modular/ModularRelation.scala ---
    @@ -86,10 +87,28 @@ object HarmonizedRelation {
           Select(_, _, _, _, _, dim :: Nil, NoFlags, Nil, Nil, _),
           NoFlags,
           Nil, _) if (dim.isInstanceOf[ModularRelation]) =>
    -        if (g.outputList
    -          .forall(col => col.isInstanceOf[AttributeReference] ||
    -                         (col.isInstanceOf[Alias] &&
    -                          col.asInstanceOf[Alias].child.isInstanceOf[AttributeReference]))) {
    +        if (g.outputList.forall(col => {
    --- End diff --
    
    better to move the logic out of if condition. and have it better formatted


---