You are viewing a plain text version of this content. The canonical link for it is here.

Posted to issues@carbondata.apache.org by GitBox <gi...@apache.org> on 2020/02/17 08:41:16 UTC

[GitHub] [carbondata] jackylk opened a new pull request #3625: [CARBONDATA-3705] Support create and load MV for spark datasource table

jackylk opened a new pull request #3625: [CARBONDATA-3705] Support create and load MV for spark datasource table
URL: https://github.com/apache/carbondata/pull/3625
 
 
    ### Why is this PR needed?
   Materialized View is a feature built on top of Spark, it should support not only carbondata format but also other formats.
    
    ### What changes were proposed in this PR?
   1. Added an API in CarbonMetaStore to lookup any relation, not just carbon relation
   2. When creating MV, use newly added lookup relation to get all parent table relations. Use CatalogTable instead of CarbonTable whenever possible
   3. Skip the segment handling when loading MV
       
    ### Does this PR introduce any user interface change?
    - No
   
    ### Is any new testcase added?
    - Yes (WIP, need to add more testcase)
   
       
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [carbondata] CarbonDataQA1 commented on issue #3625: [CARBONDATA-3705] Support create and load MV for spark datasource table

Posted by GitBox <gi...@apache.org>.

CarbonDataQA1 commented on issue #3625: [CARBONDATA-3705] Support create and load MV for spark datasource table
URL: https://github.com/apache/carbondata/pull/3625#issuecomment-590777086
 
 
   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/2158/
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [carbondata] CarbonDataQA1 commented on issue #3625: [CARBONDATA-3705] Support create and load MV for spark datasource table

Posted by GitBox <gi...@apache.org>.

CarbonDataQA1 commented on issue #3625: [CARBONDATA-3705] Support create and load MV for spark datasource table
URL: https://github.com/apache/carbondata/pull/3625#issuecomment-586884120
 
 
   Build Failed  with Spark 2.4.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/317/
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [carbondata] jackylk commented on a change in pull request #3625: [CARBONDATA-3705] Support create and load MV for spark datasource table

Posted by GitBox <gi...@apache.org>.

jackylk commented on a change in pull request #3625: [CARBONDATA-3705] Support create and load MV for spark datasource table
URL: https://github.com/apache/carbondata/pull/3625#discussion_r383697114
 
 

 ##########
 File path: mv/core/src/main/scala/org/apache/carbondata/mv/extension/MVHelper.scala
 ##########
 @@ -72,42 +69,41 @@ object MVHelper {
       dataMapSchema: DataMapSchema,
       queryString: String,
       ifNotExistsSet: Boolean = false): Unit = {
-    val dmProperties = dataMapSchema.getProperties.asScala
-    if (dmProperties.contains("streaming") && dmProperties("streaming").equalsIgnoreCase("true")) {
+    val properties = dataMapSchema.getProperties.asScala
+    if (properties.contains("streaming") && properties("streaming").equalsIgnoreCase("true")) {
       throw new MalformedCarbonCommandException(
         s"Materialized view does not support streaming"
       )
     }
     val mvUtil = new MVUtil
-    mvUtil.validateDMProperty(dmProperties)
-    val logicalPlan = dropDummyFunc(
-      MVParser.getMVPlan(queryString, sparkSession))
+    mvUtil.validateDMProperty(properties)
+    val queryPlan = dropDummyFunc(MVParser.getMVPlan(queryString, sparkSession))
     // if there is limit in MV ctas query string, throw exception, as its not a valid usecase
-    logicalPlan match {
+    queryPlan match {
       case Limit(_, _) =>
         throw new MalformedCarbonCommandException("Materialized view does not support the query " +
                                                   "with limit")
       case _ =>
     }
-    val selectTables = getTables(logicalPlan)
-    if (selectTables.isEmpty) {
+    val mainTables = getCatalogTables(queryPlan)
+    if (mainTables.isEmpty) {
       throw new MalformedCarbonCommandException(
         s"Non-Carbon table does not support creating MV datamap")
     }
-    val modularPlan = validateMVQuery(sparkSession, logicalPlan)
+    val modularPlan = validateMVQuery(sparkSession, queryPlan)
     val updatedQueryWithDb = modularPlan.asCompactSQL
-    val (timeSeriesColumn, granularity): (String, String) = validateMVTimeSeriesQuery(
-      logicalPlan,
-      dataMapSchema)
-    val fullRebuild = isFullReload(logicalPlan)
+    val (timeSeriesColumn, granularity) = validateMVTimeSeriesQuery(queryPlan, dataMapSchema)
+    val isQueryNeedFullRebuild = checkIsQueryNeedFullRebuild(queryPlan)
+    val isHasNonCarbonProvider = checkIsMainTableHasNonCarbonSource(mainTables)
 
 Review comment:
   fixed

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [carbondata] CarbonDataQA1 commented on issue #3625: [CARBONDATA-3705] Support create and load MV for spark datasource table

Posted by GitBox <gi...@apache.org>.

CarbonDataQA1 commented on issue #3625: [CARBONDATA-3705] Support create and load MV for spark datasource table
URL: https://github.com/apache/carbondata/pull/3625#issuecomment-586909581
 
 
   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/2019/
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [carbondata] akashrn5 commented on a change in pull request #3625: [CARBONDATA-3705] Support create and load MV for spark datasource table

Posted by GitBox <gi...@apache.org>.

akashrn5 commented on a change in pull request #3625: [CARBONDATA-3705] Support create and load MV for spark datasource table
URL: https://github.com/apache/carbondata/pull/3625#discussion_r383443633
 
 

 ##########
 File path: mv/core/src/test/scala/org/apache/carbondata/mv/rewrite/MVCreateTestCase.scala
 ##########
 @@ -104,6 +104,79 @@ class MVCreateTestCase extends QueryTest with BeforeAndAfterAll {
     sql(s"""LOAD DATA local inpath '$resourcesPath/data_big.csv' INTO TABLE fact_table6 OPTIONS('DELIMITER'= ',', 'QUOTECHAR'= '"')""")
   }
 
+  test("test create mv on parquet spark table") {
+    sql("drop materialized view if exists mv1")
+    sql("drop table if exists source")
+    sql("create table source using parquet as select * from fact_table1")
+    sql("create materialized view mv1 as select empname, deptname, avg(salary) from source group by empname, deptname")
+    var df = sql("select empname, avg(salary) from source group by empname")
+    assert(TestUtil.verifyMVDataMap(df.queryExecution.optimizedPlan, "mv1"))
+    checkAnswer(df, sql("select empname, avg(salary) from fact_table2 group by empname"))
+
+    // load to parquet table and check again
+    sql("insert into source select * from fact_table1")
+    df = sql("select empname, avg(salary) from source group by empname")
+    assert(TestUtil.verifyMVDataMap(df.queryExecution.optimizedPlan, "mv1"))
+    checkAnswer(df, sql("select empname, avg(salary) from fact_table2 group by empname"))
+
+    sql(s"drop materialized view mv1")
+    sql("drop table source")
+  }
+
+  test("test create mv on orc spark table") {
+    sql("drop materialized view if exists mv1")
+    sql("drop table if exists source")
+    sql("create table source using orc as select * from fact_table1")
+    sql("create materialized view mv1 as select empname, deptname, avg(salary) from source group by empname, deptname")
+    var df = sql("select empname, avg(salary) from source group by empname")
+    assert(TestUtil.verifyMVDataMap(df.queryExecution.optimizedPlan, "mv1"))
+    checkAnswer(df, sql("select empname, avg(salary) from fact_table2 group by empname"))
+
+    // load to orc table and check again
+    sql("insert into source select * from fact_table1")
+    df = sql("select empname, avg(salary) from source group by empname")
+    assert(TestUtil.verifyMVDataMap(df.queryExecution.optimizedPlan, "mv1"))
+    checkAnswer(df, sql("select empname, avg(salary) from fact_table2 group by empname"))
+
+    sql(s"drop materialized view mv1")
+    sql("drop table source")
+  }
+
+  test("test create mv on parquet hive table") {
+    sql("drop materialized view if exists mv1")
+    sql("drop table if exists source")
+    sql("create table source stored as parquet as select * from fact_table1")
+    sql("create materialized view mv1 as select empname, deptname, avg(salary) from source group by empname, deptname")
+    var df = sql("select empname, avg(salary) from source group by empname")
+    assert(TestUtil.verifyMVDataMap(df.queryExecution.optimizedPlan, "mv1"))
+    checkAnswer(df, sql("select empname, avg(salary) from fact_table2 group by empname"))
+
+    // load to parquet table and check again
+    sql("insert into source select * from fact_table1")
+    df = sql("select empname, avg(salary) from source group by empname")
+    assert(TestUtil.verifyMVDataMap(df.queryExecution.optimizedPlan, "mv1"))
+    checkAnswer(df, sql("select empname, avg(salary) from fact_table2 group by empname"))
+
+    sql(s"drop materialized view mv1")
+    sql("drop table source")
+  }
+
 
 Review comment:
   can you add a test case for partition also?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [carbondata] jackylk commented on a change in pull request #3625: [CARBONDATA-3705] Support create and load MV for spark datasource table

Posted by GitBox <gi...@apache.org>.

jackylk commented on a change in pull request #3625: [CARBONDATA-3705] Support create and load MV for spark datasource table
URL: https://github.com/apache/carbondata/pull/3625#discussion_r383700870
 
 

 ##########
 File path: mv/core/src/test/scala/org/apache/carbondata/mv/rewrite/MVCreateTestCase.scala
 ##########
 @@ -104,6 +104,79 @@ class MVCreateTestCase extends QueryTest with BeforeAndAfterAll {
     sql(s"""LOAD DATA local inpath '$resourcesPath/data_big.csv' INTO TABLE fact_table6 OPTIONS('DELIMITER'= ',', 'QUOTECHAR'= '"')""")
   }
 
+  test("test create mv on parquet spark table") {
+    sql("drop materialized view if exists mv1")
+    sql("drop table if exists source")
+    sql("create table source using parquet as select * from fact_table1")
+    sql("create materialized view mv1 as select empname, deptname, avg(salary) from source group by empname, deptname")
+    var df = sql("select empname, avg(salary) from source group by empname")
+    assert(TestUtil.verifyMVDataMap(df.queryExecution.optimizedPlan, "mv1"))
+    checkAnswer(df, sql("select empname, avg(salary) from fact_table2 group by empname"))
+
+    // load to parquet table and check again
+    sql("insert into source select * from fact_table1")
+    df = sql("select empname, avg(salary) from source group by empname")
+    assert(TestUtil.verifyMVDataMap(df.queryExecution.optimizedPlan, "mv1"))
+    checkAnswer(df, sql("select empname, avg(salary) from fact_table2 group by empname"))
+
+    sql(s"drop materialized view mv1")
+    sql("drop table source")
+  }
+
+  test("test create mv on orc spark table") {
+    sql("drop materialized view if exists mv1")
+    sql("drop table if exists source")
+    sql("create table source using orc as select * from fact_table1")
+    sql("create materialized view mv1 as select empname, deptname, avg(salary) from source group by empname, deptname")
+    var df = sql("select empname, avg(salary) from source group by empname")
+    assert(TestUtil.verifyMVDataMap(df.queryExecution.optimizedPlan, "mv1"))
+    checkAnswer(df, sql("select empname, avg(salary) from fact_table2 group by empname"))
+
+    // load to orc table and check again
+    sql("insert into source select * from fact_table1")
+    df = sql("select empname, avg(salary) from source group by empname")
+    assert(TestUtil.verifyMVDataMap(df.queryExecution.optimizedPlan, "mv1"))
+    checkAnswer(df, sql("select empname, avg(salary) from fact_table2 group by empname"))
+
+    sql(s"drop materialized view mv1")
+    sql("drop table source")
+  }
+
+  test("test create mv on parquet hive table") {
+    sql("drop materialized view if exists mv1")
+    sql("drop table if exists source")
+    sql("create table source stored as parquet as select * from fact_table1")
+    sql("create materialized view mv1 as select empname, deptname, avg(salary) from source group by empname, deptname")
+    var df = sql("select empname, avg(salary) from source group by empname")
+    assert(TestUtil.verifyMVDataMap(df.queryExecution.optimizedPlan, "mv1"))
+    checkAnswer(df, sql("select empname, avg(salary) from fact_table2 group by empname"))
+
+    // load to parquet table and check again
+    sql("insert into source select * from fact_table1")
+    df = sql("select empname, avg(salary) from source group by empname")
+    assert(TestUtil.verifyMVDataMap(df.queryExecution.optimizedPlan, "mv1"))
+    checkAnswer(df, sql("select empname, avg(salary) from fact_table2 group by empname"))
+
+    sql(s"drop materialized view mv1")
+    sql("drop table source")
+  }
+
 
 Review comment:
   added, see "test create mv on partitioned parquet spark table"

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [carbondata] akashrn5 commented on a change in pull request #3625: [CARBONDATA-3705] Support create and load MV for spark datasource table

Posted by GitBox <gi...@apache.org>.

akashrn5 commented on a change in pull request #3625: [CARBONDATA-3705] Support create and load MV for spark datasource table
URL: https://github.com/apache/carbondata/pull/3625#discussion_r383454314
 
 

 ##########
 File path: mv/core/src/main/scala/org/apache/carbondata/mv/extension/MVHelper.scala
 ##########
 @@ -568,17 +573,18 @@ object MVHelper {
     generatePartitionerField(allPartitionColumn.toList, Seq.empty)
   }
 
-  private def inheritTablePropertiesFromMainTable(parentTable: CarbonTable,
+  private def inheritTablePropertiesFromMainTable(
 
 Review comment:
   this method we can skip for non carbon tables right?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [carbondata] CarbonDataQA1 commented on issue #3625: [CARBONDATA-3705] Support create and load MV for spark datasource table

Posted by GitBox <gi...@apache.org>.

CarbonDataQA1 commented on issue #3625: [CARBONDATA-3705] Support create and load MV for spark datasource table
URL: https://github.com/apache/carbondata/pull/3625#issuecomment-590418277
 
 
   Build Success with Spark 2.4.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/440/
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [carbondata] akashrn5 commented on a change in pull request #3625: [CARBONDATA-3705] Support create and load MV for spark datasource table

Posted by GitBox <gi...@apache.org>.

akashrn5 commented on a change in pull request #3625: [CARBONDATA-3705] Support create and load MV for spark datasource table
URL: https://github.com/apache/carbondata/pull/3625#discussion_r383437391
 
 

 ##########
 File path: mv/core/src/main/scala/org/apache/carbondata/mv/extension/MVHelper.scala
 ##########
 @@ -610,13 +616,12 @@ object MVHelper {
         val relationList = fields._2.columnTableRelationList
         // check if columns present in datamap are long_string_col in parent table. If they are
         // long_string_columns in parent, make them long_string_columns in datamap
-        if (aggFunc.isEmpty &&
-            relationList.size == 1 &&
-            longStringColumnInParents.contains(relationList.head.parentColumnName)) {
-          varcharDatamapFields += relationList.head.parentColumnName
+        if (aggFunc.isEmpty && relationList.size == 1 && longStringColumnInParents
+          .contains(relationList.head.columnName)) {
+          varcharDatamapFields += relationList.head.columnName
         }
       })
-      if (varcharDatamapFields.nonEmpty) {
+      if (!varcharDatamapFields.isEmpty) {
 
 Review comment:
   revert this change, old one was correct

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [carbondata] CarbonDataQA1 commented on issue #3625: [CARBONDATA-3705] Support create and load MV for spark datasource table

Posted by GitBox <gi...@apache.org>.

CarbonDataQA1 commented on issue #3625: [CARBONDATA-3705] Support create and load MV for spark datasource table
URL: https://github.com/apache/carbondata/pull/3625#issuecomment-590743314
 
 
   Build Success with Spark 2.4.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/457/
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [carbondata] CarbonDataQA1 commented on issue #3625: [CARBONDATA-3705] Support create and load MV for spark datasource table

Posted by GitBox <gi...@apache.org>.

CarbonDataQA1 commented on issue #3625: [CARBONDATA-3705] Support create and load MV for spark datasource table
URL: https://github.com/apache/carbondata/pull/3625#issuecomment-587468262
 
 
   Build Success with Spark 2.4.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/339/
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [carbondata] akashrn5 commented on a change in pull request #3625: [CARBONDATA-3705] Support create and load MV for spark datasource table

Posted by GitBox <gi...@apache.org>.

akashrn5 commented on a change in pull request #3625: [CARBONDATA-3705] Support create and load MV for spark datasource table
URL: https://github.com/apache/carbondata/pull/3625#discussion_r383454639
 
 

 ##########
 File path: mv/core/src/main/scala/org/apache/carbondata/mv/extension/MVHelper.scala
 ##########
 @@ -568,17 +573,18 @@ object MVHelper {
     generatePartitionerField(allPartitionColumn.toList, Seq.empty)
   }
 
-  private def inheritTablePropertiesFromMainTable(parentTable: CarbonTable,
+  private def inheritTablePropertiesFromMainTable(
+      parentTable: CarbonTable,
       fields: Seq[Field],
       fieldRelationMap: scala.collection.mutable.LinkedHashMap[Field, MVField],
       tableProperties: mutable.Map[String, String]): Unit = {
     var neworder = Seq[String]()
-    val parentOrder = parentTable.getSortColumns.asScala
+    val parentOrder = parentTable.getSortColumns().asScala
 
 Review comment:
   please revert the change as old one is correct

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [carbondata] jackylk commented on a change in pull request #3625: [CARBONDATA-3705] Support create and load MV for spark datasource table

Posted by GitBox <gi...@apache.org>.

jackylk commented on a change in pull request #3625: [CARBONDATA-3705] Support create and load MV for spark datasource table
URL: https://github.com/apache/carbondata/pull/3625#discussion_r383700870
 
 

 ##########
 File path: mv/core/src/test/scala/org/apache/carbondata/mv/rewrite/MVCreateTestCase.scala
 ##########
 @@ -104,6 +104,79 @@ class MVCreateTestCase extends QueryTest with BeforeAndAfterAll {
     sql(s"""LOAD DATA local inpath '$resourcesPath/data_big.csv' INTO TABLE fact_table6 OPTIONS('DELIMITER'= ',', 'QUOTECHAR'= '"')""")
   }
 
+  test("test create mv on parquet spark table") {
+    sql("drop materialized view if exists mv1")
+    sql("drop table if exists source")
+    sql("create table source using parquet as select * from fact_table1")
+    sql("create materialized view mv1 as select empname, deptname, avg(salary) from source group by empname, deptname")
+    var df = sql("select empname, avg(salary) from source group by empname")
+    assert(TestUtil.verifyMVDataMap(df.queryExecution.optimizedPlan, "mv1"))
+    checkAnswer(df, sql("select empname, avg(salary) from fact_table2 group by empname"))
+
+    // load to parquet table and check again
+    sql("insert into source select * from fact_table1")
+    df = sql("select empname, avg(salary) from source group by empname")
+    assert(TestUtil.verifyMVDataMap(df.queryExecution.optimizedPlan, "mv1"))
+    checkAnswer(df, sql("select empname, avg(salary) from fact_table2 group by empname"))
+
+    sql(s"drop materialized view mv1")
+    sql("drop table source")
+  }
+
+  test("test create mv on orc spark table") {
+    sql("drop materialized view if exists mv1")
+    sql("drop table if exists source")
+    sql("create table source using orc as select * from fact_table1")
+    sql("create materialized view mv1 as select empname, deptname, avg(salary) from source group by empname, deptname")
+    var df = sql("select empname, avg(salary) from source group by empname")
+    assert(TestUtil.verifyMVDataMap(df.queryExecution.optimizedPlan, "mv1"))
+    checkAnswer(df, sql("select empname, avg(salary) from fact_table2 group by empname"))
+
+    // load to orc table and check again
+    sql("insert into source select * from fact_table1")
+    df = sql("select empname, avg(salary) from source group by empname")
+    assert(TestUtil.verifyMVDataMap(df.queryExecution.optimizedPlan, "mv1"))
+    checkAnswer(df, sql("select empname, avg(salary) from fact_table2 group by empname"))
+
+    sql(s"drop materialized view mv1")
+    sql("drop table source")
+  }
+
+  test("test create mv on parquet hive table") {
+    sql("drop materialized view if exists mv1")
+    sql("drop table if exists source")
+    sql("create table source stored as parquet as select * from fact_table1")
+    sql("create materialized view mv1 as select empname, deptname, avg(salary) from source group by empname, deptname")
+    var df = sql("select empname, avg(salary) from source group by empname")
+    assert(TestUtil.verifyMVDataMap(df.queryExecution.optimizedPlan, "mv1"))
+    checkAnswer(df, sql("select empname, avg(salary) from fact_table2 group by empname"))
+
+    // load to parquet table and check again
+    sql("insert into source select * from fact_table1")
+    df = sql("select empname, avg(salary) from source group by empname")
+    assert(TestUtil.verifyMVDataMap(df.queryExecution.optimizedPlan, "mv1"))
+    checkAnswer(df, sql("select empname, avg(salary) from fact_table2 group by empname"))
+
+    sql(s"drop materialized view mv1")
+    sql("drop table source")
+  }
+
 
 Review comment:
   added

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [carbondata] Indhumathi27 commented on a change in pull request #3625: [CARBONDATA-3705] Support create and load MV for spark datasource table

Posted by GitBox <gi...@apache.org>.

Indhumathi27 commented on a change in pull request #3625: [CARBONDATA-3705] Support create and load MV for spark datasource table
URL: https://github.com/apache/carbondata/pull/3625#discussion_r382534896
 
 

 ##########
 File path: datamap/mv/core/src/test/scala/org/apache/carbondata/mv/rewrite/MVCreateTestCase.scala
 ##########
 @@ -104,6 +104,58 @@ class MVCreateTestCase extends QueryTest with BeforeAndAfterAll {
     sql(s"""LOAD DATA local inpath '$resourcesPath/data_big.csv' INTO TABLE fact_table6 OPTIONS('DELIMITER'= ',', 'QUOTECHAR'= '"')""")
   }
 
+  test("test create mv on parquet spark table") {
+    sql("drop materialized view if exists mv1")
+    sql("drop table if exists source")
+    sql("create table source using parquet as select * from fact_table1")
+    sql("create materialized view mv1 as select empname, deptname, avg(salary) from source group by empname, deptname")
+    val df = sql("select empname, avg(salary) from source group by empname")
 
 Review comment:
   Please add a testcase with following steps:
   1. create parquet table, create mv, chekc results
   2. load to parquet table
   3. check results

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [carbondata] jackylk commented on a change in pull request #3625: [CARBONDATA-3705] Support create and load MV for spark datasource table

Posted by GitBox <gi...@apache.org>.

jackylk commented on a change in pull request #3625: [CARBONDATA-3705] Support create and load MV for spark datasource table
URL: https://github.com/apache/carbondata/pull/3625#discussion_r382912976
 
 

 ##########
 File path: datamap/mv/core/src/test/scala/org/apache/carbondata/mv/rewrite/MVCreateTestCase.scala
 ##########
 @@ -104,6 +104,58 @@ class MVCreateTestCase extends QueryTest with BeforeAndAfterAll {
     sql(s"""LOAD DATA local inpath '$resourcesPath/data_big.csv' INTO TABLE fact_table6 OPTIONS('DELIMITER'= ',', 'QUOTECHAR'= '"')""")
   }
 
+  test("test create mv on parquet spark table") {
+    sql("drop materialized view if exists mv1")
+    sql("drop table if exists source")
+    sql("create table source using parquet as select * from fact_table1")
+    sql("create materialized view mv1 as select empname, deptname, avg(salary) from source group by empname, deptname")
+    val df = sql("select empname, avg(salary) from source group by empname")
 
 Review comment:
   ok, added. see testcase name "test create mv on parquet spark table"

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [carbondata] jackylk commented on a change in pull request #3625: [CARBONDATA-3705] Support create and load MV for spark datasource table

Posted by GitBox <gi...@apache.org>.

jackylk commented on a change in pull request #3625: [CARBONDATA-3705] Support create and load MV for spark datasource table
URL: https://github.com/apache/carbondata/pull/3625#discussion_r383712152
 
 

 ##########
 File path: mv/core/src/main/scala/org/apache/carbondata/mv/extension/MVHelper.scala
 ##########
 @@ -220,55 +209,55 @@ object MVHelper {
       } else {
         Seq()
       }
-      partitionerFields = getPartitionerFields(parentPartitionColumns, fieldRelationMap)
+      getPartitionerFields(parentPartitionColumns, fieldRelationMap)
+    } else {
+      Seq.empty
     }
 
-    var order = 0
     val columnOrderMap = new java.util.HashMap[Integer, String]()
     if (partitionerFields.nonEmpty) {
-      fields.foreach { field =>
-        columnOrderMap.put(order, field.column)
-        order += 1
+      fields.zipWithIndex.foreach { case (field, index) =>
+        columnOrderMap.put(index, field.column)
       }
     }
 
-    // TODO Use a proper DB
-    val tableIdentifier = TableIdentifier(
-      dataMapSchema.getDataMapName + "_table", selectTables.head.identifier.database)
+    val mvTableIdentifier = TableIdentifier(
+      dataMapSchema.getDataMapName + "_table", mainTables.head.identifier.database)
+
     // prepare table model of the collected tokens
-    val tableModel: TableModel = CarbonParserUtil.prepareTableModel(
+    val mvTableModel: TableModel = CarbonParserUtil.prepareTableModel(
       ifNotExistPresent = ifNotExistsSet,
-      CarbonParserUtil.convertDbNameToLowerCase(tableIdentifier.database),
-      tableIdentifier.table.toLowerCase,
+      CarbonParserUtil.convertDbNameToLowerCase(mvTableIdentifier.database),
+      mvTableIdentifier.table.toLowerCase,
       fields,
       partitionerFields,
       tableProperties,
       None,
       isAlterFlow = false,
       None)
 
-    val tablePath = if (dmProperties.contains("path")) {
-      dmProperties("path")
+    val mvTablePath = if (properties.contains("path")) {
+      properties("path")
     } else {
-      CarbonEnv.getTablePath(tableModel.databaseNameOp, tableModel.tableName)(sparkSession)
+      CarbonEnv.getTablePath(mvTableModel.databaseNameOp, mvTableModel.tableName)(sparkSession)
     }
-    CarbonCreateTableCommand(TableNewProcessor(tableModel),
-      tableModel.ifNotExistsSet, Some(tablePath), isVisible = false).run(sparkSession)
+    CarbonCreateTableCommand(TableNewProcessor(mvTableModel),
 
 Review comment:
   Do you mean providing sort_columns property when creating the MV? If user put sort_columns property in parent table (parquet table),  carbon MV should not inherit it, right?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [carbondata] akashrn5 commented on a change in pull request #3625: [CARBONDATA-3705] Support create and load MV for spark datasource table

Posted by GitBox <gi...@apache.org>.

akashrn5 commented on a change in pull request #3625: [CARBONDATA-3705] Support create and load MV for spark datasource table
URL: https://github.com/apache/carbondata/pull/3625#discussion_r383722655
 
 

 ##########
 File path: mv/core/src/main/scala/org/apache/carbondata/mv/extension/MVHelper.scala
 ##########
 @@ -220,55 +209,55 @@ object MVHelper {
       } else {
         Seq()
       }
-      partitionerFields = getPartitionerFields(parentPartitionColumns, fieldRelationMap)
+      getPartitionerFields(parentPartitionColumns, fieldRelationMap)
+    } else {
+      Seq.empty
     }
 
-    var order = 0
     val columnOrderMap = new java.util.HashMap[Integer, String]()
     if (partitionerFields.nonEmpty) {
-      fields.foreach { field =>
-        columnOrderMap.put(order, field.column)
-        order += 1
+      fields.zipWithIndex.foreach { case (field, index) =>
+        columnOrderMap.put(index, field.column)
       }
     }
 
-    // TODO Use a proper DB
-    val tableIdentifier = TableIdentifier(
-      dataMapSchema.getDataMapName + "_table", selectTables.head.identifier.database)
+    val mvTableIdentifier = TableIdentifier(
+      dataMapSchema.getDataMapName + "_table", mainTables.head.identifier.database)
+
     // prepare table model of the collected tokens
-    val tableModel: TableModel = CarbonParserUtil.prepareTableModel(
+    val mvTableModel: TableModel = CarbonParserUtil.prepareTableModel(
       ifNotExistPresent = ifNotExistsSet,
-      CarbonParserUtil.convertDbNameToLowerCase(tableIdentifier.database),
-      tableIdentifier.table.toLowerCase,
+      CarbonParserUtil.convertDbNameToLowerCase(mvTableIdentifier.database),
+      mvTableIdentifier.table.toLowerCase,
       fields,
       partitionerFields,
       tableProperties,
       None,
       isAlterFlow = false,
       None)
 
-    val tablePath = if (dmProperties.contains("path")) {
-      dmProperties("path")
+    val mvTablePath = if (properties.contains("path")) {
+      properties("path")
     } else {
-      CarbonEnv.getTablePath(tableModel.databaseNameOp, tableModel.tableName)(sparkSession)
+      CarbonEnv.getTablePath(mvTableModel.databaseNameOp, mvTableModel.tableName)(sparkSession)
     }
-    CarbonCreateTableCommand(TableNewProcessor(tableModel),
-      tableModel.ifNotExistsSet, Some(tablePath), isVisible = false).run(sparkSession)
+    CarbonCreateTableCommand(TableNewProcessor(mvTableModel),
 
 Review comment:
   yes

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [carbondata] CarbonDataQA1 commented on issue #3625: [CARBONDATA-3705] Support create and load MV for spark datasource table

Posted by GitBox <gi...@apache.org>.

CarbonDataQA1 commented on issue #3625: [CARBONDATA-3705] Support create and load MV for spark datasource table
URL: https://github.com/apache/carbondata/pull/3625#issuecomment-590463974
 
 
   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/2140/
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [carbondata] jackylk commented on a change in pull request #3625: [CARBONDATA-3705] Support create and load MV for spark datasource table

Posted by GitBox <gi...@apache.org>.

jackylk commented on a change in pull request #3625: [CARBONDATA-3705] Support create and load MV for spark datasource table
URL: https://github.com/apache/carbondata/pull/3625#discussion_r383694832
 
 

 ##########
 File path: mv/core/src/main/scala/org/apache/carbondata/mv/extension/MVHelper.scala
 ##########
 @@ -610,13 +616,12 @@ object MVHelper {
         val relationList = fields._2.columnTableRelationList
         // check if columns present in datamap are long_string_col in parent table. If they are
         // long_string_columns in parent, make them long_string_columns in datamap
-        if (aggFunc.isEmpty &&
-            relationList.size == 1 &&
-            longStringColumnInParents.contains(relationList.head.parentColumnName)) {
-          varcharDatamapFields += relationList.head.parentColumnName
+        if (aggFunc.isEmpty && relationList.size == 1 && longStringColumnInParents
+          .contains(relationList.head.columnName)) {
+          varcharDatamapFields += relationList.head.columnName
         }
       })
-      if (varcharDatamapFields.nonEmpty) {
+      if (!varcharDatamapFields.isEmpty) {
 
 Review comment:
   fixed

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [carbondata] akashrn5 commented on a change in pull request #3625: [CARBONDATA-3705] Support create and load MV for spark datasource table

Posted by GitBox <gi...@apache.org>.

akashrn5 commented on a change in pull request #3625: [CARBONDATA-3705] Support create and load MV for spark datasource table
URL: https://github.com/apache/carbondata/pull/3625#discussion_r383437109
 
 

 ##########
 File path: mv/core/src/main/scala/org/apache/carbondata/mv/extension/MVHelper.scala
 ##########
 @@ -568,17 +573,18 @@ object MVHelper {
     generatePartitionerField(allPartitionColumn.toList, Seq.empty)
   }
 
-  private def inheritTablePropertiesFromMainTable(parentTable: CarbonTable,
+  private def inheritTablePropertiesFromMainTable(
+      parentTable: CarbonTable,
       fields: Seq[Field],
       fieldRelationMap: scala.collection.mutable.LinkedHashMap[Field, MVField],
       tableProperties: mutable.Map[String, String]): Unit = {
     var neworder = Seq[String]()
-    val parentOrder = parentTable.getSortColumns.asScala
+    val parentOrder = parentTable.getSortColumns().asScala
     parentOrder.foreach(parentcol =>
       fields.filter(col => fieldRelationMap(col).aggregateFunction.isEmpty &&
                            fieldRelationMap(col).columnTableRelationList.size == 1 &&
                            parentcol.equalsIgnoreCase(
-                             fieldRelationMap(col).columnTableRelationList(0).parentColumnName))
+                             fieldRelationMap(col).columnTableRelationList(0).columnName))
 
 Review comment:
   replace with `.head`

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [carbondata] akashrn5 commented on a change in pull request #3625: [CARBONDATA-3705] Support create and load MV for spark datasource table

Posted by GitBox <gi...@apache.org>.

akashrn5 commented on a change in pull request #3625: [CARBONDATA-3705] Support create and load MV for spark datasource table
URL: https://github.com/apache/carbondata/pull/3625#discussion_r383447624
 
 

 ##########
 File path: mv/core/src/main/scala/org/apache/carbondata/mv/extension/MVHelper.scala
 ##########
 @@ -72,42 +69,41 @@ object MVHelper {
       dataMapSchema: DataMapSchema,
       queryString: String,
       ifNotExistsSet: Boolean = false): Unit = {
-    val dmProperties = dataMapSchema.getProperties.asScala
-    if (dmProperties.contains("streaming") && dmProperties("streaming").equalsIgnoreCase("true")) {
+    val properties = dataMapSchema.getProperties.asScala
+    if (properties.contains("streaming") && properties("streaming").equalsIgnoreCase("true")) {
       throw new MalformedCarbonCommandException(
         s"Materialized view does not support streaming"
       )
     }
     val mvUtil = new MVUtil
-    mvUtil.validateDMProperty(dmProperties)
-    val logicalPlan = dropDummyFunc(
-      MVParser.getMVPlan(queryString, sparkSession))
+    mvUtil.validateDMProperty(properties)
+    val queryPlan = dropDummyFunc(MVParser.getMVPlan(queryString, sparkSession))
     // if there is limit in MV ctas query string, throw exception, as its not a valid usecase
-    logicalPlan match {
+    queryPlan match {
       case Limit(_, _) =>
         throw new MalformedCarbonCommandException("Materialized view does not support the query " +
                                                   "with limit")
       case _ =>
     }
-    val selectTables = getTables(logicalPlan)
-    if (selectTables.isEmpty) {
+    val mainTables = getCatalogTables(queryPlan)
+    if (mainTables.isEmpty) {
       throw new MalformedCarbonCommandException(
         s"Non-Carbon table does not support creating MV datamap")
     }
-    val modularPlan = validateMVQuery(sparkSession, logicalPlan)
+    val modularPlan = validateMVQuery(sparkSession, queryPlan)
     val updatedQueryWithDb = modularPlan.asCompactSQL
-    val (timeSeriesColumn, granularity): (String, String) = validateMVTimeSeriesQuery(
-      logicalPlan,
-      dataMapSchema)
-    val fullRebuild = isFullReload(logicalPlan)
+    val (timeSeriesColumn, granularity) = validateMVTimeSeriesQuery(queryPlan, dataMapSchema)
+    val isQueryNeedFullRebuild = checkIsQueryNeedFullRebuild(queryPlan)
+    val isHasNonCarbonProvider = checkIsMainTableHasNonCarbonSource(mainTables)
 
 Review comment:
   just `hasNonCarbonProvider` would be fine i think

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [carbondata] asfgit closed pull request #3625: [CARBONDATA-3705] Support create and load MV for spark datasource table

Posted by GitBox <gi...@apache.org>.

asfgit closed pull request #3625: [CARBONDATA-3705] Support create and load MV for spark datasource table
URL: https://github.com/apache/carbondata/pull/3625
 
 
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [carbondata] CarbonDataQA1 commented on issue #3625: [CARBONDATA-3705] Support create and load MV for spark datasource table

Posted by GitBox <gi...@apache.org>.

CarbonDataQA1 commented on issue #3625: [CARBONDATA-3705] Support create and load MV for spark datasource table
URL: https://github.com/apache/carbondata/pull/3625#issuecomment-587505158
 
 
   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/2041/
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [carbondata] QiangCai commented on issue #3625: [CARBONDATA-3705] Support create and load MV for spark datasource table

Posted by GitBox <gi...@apache.org>.

QiangCai commented on issue #3625: [CARBONDATA-3705] Support create and load MV for spark datasource table
URL: https://github.com/apache/carbondata/pull/3625#issuecomment-590825798
 
 
   LGTM

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [carbondata] jackylk commented on a change in pull request #3625: [CARBONDATA-3705] Support create and load MV for spark datasource table

Posted by GitBox <gi...@apache.org>.

jackylk commented on a change in pull request #3625: [CARBONDATA-3705] Support create and load MV for spark datasource table
URL: https://github.com/apache/carbondata/pull/3625#discussion_r383694739
 
 

 ##########
 File path: mv/core/src/main/scala/org/apache/carbondata/mv/extension/MVHelper.scala
 ##########
 @@ -568,17 +573,18 @@ object MVHelper {
     generatePartitionerField(allPartitionColumn.toList, Seq.empty)
   }
 
-  private def inheritTablePropertiesFromMainTable(parentTable: CarbonTable,
+  private def inheritTablePropertiesFromMainTable(
+      parentTable: CarbonTable,
       fields: Seq[Field],
       fieldRelationMap: scala.collection.mutable.LinkedHashMap[Field, MVField],
       tableProperties: mutable.Map[String, String]): Unit = {
     var neworder = Seq[String]()
-    val parentOrder = parentTable.getSortColumns.asScala
+    val parentOrder = parentTable.getSortColumns().asScala
     parentOrder.foreach(parentcol =>
       fields.filter(col => fieldRelationMap(col).aggregateFunction.isEmpty &&
                            fieldRelationMap(col).columnTableRelationList.size == 1 &&
                            parentcol.equalsIgnoreCase(
-                             fieldRelationMap(col).columnTableRelationList(0).parentColumnName))
+                             fieldRelationMap(col).columnTableRelationList(0).columnName))
 
 Review comment:
   fixed

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [carbondata] jackylk commented on a change in pull request #3625: [CARBONDATA-3705] Support create and load MV for spark datasource table

Posted by GitBox <gi...@apache.org>.

jackylk commented on a change in pull request #3625: [CARBONDATA-3705] Support create and load MV for spark datasource table
URL: https://github.com/apache/carbondata/pull/3625#discussion_r382912976
 
 

 ##########
 File path: datamap/mv/core/src/test/scala/org/apache/carbondata/mv/rewrite/MVCreateTestCase.scala
 ##########
 @@ -104,6 +104,58 @@ class MVCreateTestCase extends QueryTest with BeforeAndAfterAll {
     sql(s"""LOAD DATA local inpath '$resourcesPath/data_big.csv' INTO TABLE fact_table6 OPTIONS('DELIMITER'= ',', 'QUOTECHAR'= '"')""")
   }
 
+  test("test create mv on parquet spark table") {
+    sql("drop materialized view if exists mv1")
+    sql("drop table if exists source")
+    sql("create table source using parquet as select * from fact_table1")
+    sql("create materialized view mv1 as select empname, deptname, avg(salary) from source group by empname, deptname")
+    val df = sql("select empname, avg(salary) from source group by empname")
 
 Review comment:
   ok, added

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [carbondata] jackylk commented on a change in pull request #3625: [CARBONDATA-3705] Support create and load MV for spark datasource table

Posted by GitBox <gi...@apache.org>.

jackylk commented on a change in pull request #3625: [CARBONDATA-3705] Support create and load MV for spark datasource table
URL: https://github.com/apache/carbondata/pull/3625#discussion_r383699181
 
 

 ##########
 File path: mv/core/src/main/scala/org/apache/carbondata/mv/extension/MVHelper.scala
 ##########
 @@ -568,17 +573,18 @@ object MVHelper {
     generatePartitionerField(allPartitionColumn.toList, Seq.empty)
   }
 
-  private def inheritTablePropertiesFromMainTable(parentTable: CarbonTable,
+  private def inheritTablePropertiesFromMainTable(
+      parentTable: CarbonTable,
       fields: Seq[Field],
       fieldRelationMap: scala.collection.mutable.LinkedHashMap[Field, MVField],
       tableProperties: mutable.Map[String, String]): Unit = {
     var neworder = Seq[String]()
-    val parentOrder = parentTable.getSortColumns.asScala
+    val parentOrder = parentTable.getSortColumns().asScala
 
 Review comment:
   fixed

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [carbondata] CarbonDataQA1 commented on issue #3625: [CARBONDATA-3705] Support create and load MV for spark datasource table

Posted by GitBox <gi...@apache.org>.

CarbonDataQA1 commented on issue #3625: [CARBONDATA-3705] Support create and load MV for spark datasource table
URL: https://github.com/apache/carbondata/pull/3625#issuecomment-589958619
 
 
   Build Success with Spark 2.4.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/400/
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [carbondata] CarbonDataQA1 commented on issue #3625: [CARBONDATA-3705] Support create and load MV for spark datasource table

Posted by GitBox <gi...@apache.org>.

CarbonDataQA1 commented on issue #3625: [CARBONDATA-3705] Support create and load MV for spark datasource table
URL: https://github.com/apache/carbondata/pull/3625#issuecomment-589965909
 
 
   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/2101/
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [carbondata] CarbonDataQA1 commented on issue #3625: [CARBONDATA-3705] Support create and load MV for spark datasource table

Posted by GitBox <gi...@apache.org>.

CarbonDataQA1 commented on issue #3625: [CARBONDATA-3705] Support create and load MV for spark datasource table
URL: https://github.com/apache/carbondata/pull/3625#issuecomment-586955255
 
 
   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/2020/
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [carbondata] CarbonDataQA1 commented on issue #3625: [CARBONDATA-3705] Support create and load MV for spark datasource table

Posted by GitBox <gi...@apache.org>.

CarbonDataQA1 commented on issue #3625: [CARBONDATA-3705] Support create and load MV for spark datasource table
URL: https://github.com/apache/carbondata/pull/3625#issuecomment-586933814
 
 
   Build Success with Spark 2.4.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/318/
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [carbondata] akashrn5 commented on a change in pull request #3625: [CARBONDATA-3705] Support create and load MV for spark datasource table

Posted by GitBox <gi...@apache.org>.

akashrn5 commented on a change in pull request #3625: [CARBONDATA-3705] Support create and load MV for spark datasource table
URL: https://github.com/apache/carbondata/pull/3625#discussion_r383463395
 
 

 ##########
 File path: mv/core/src/main/scala/org/apache/carbondata/mv/extension/MVHelper.scala
 ##########
 @@ -220,55 +209,55 @@ object MVHelper {
       } else {
         Seq()
       }
-      partitionerFields = getPartitionerFields(parentPartitionColumns, fieldRelationMap)
+      getPartitionerFields(parentPartitionColumns, fieldRelationMap)
+    } else {
+      Seq.empty
     }
 
-    var order = 0
     val columnOrderMap = new java.util.HashMap[Integer, String]()
     if (partitionerFields.nonEmpty) {
-      fields.foreach { field =>
-        columnOrderMap.put(order, field.column)
-        order += 1
+      fields.zipWithIndex.foreach { case (field, index) =>
+        columnOrderMap.put(index, field.column)
       }
     }
 
-    // TODO Use a proper DB
-    val tableIdentifier = TableIdentifier(
-      dataMapSchema.getDataMapName + "_table", selectTables.head.identifier.database)
+    val mvTableIdentifier = TableIdentifier(
+      dataMapSchema.getDataMapName + "_table", mainTables.head.identifier.database)
+
     // prepare table model of the collected tokens
-    val tableModel: TableModel = CarbonParserUtil.prepareTableModel(
+    val mvTableModel: TableModel = CarbonParserUtil.prepareTableModel(
       ifNotExistPresent = ifNotExistsSet,
-      CarbonParserUtil.convertDbNameToLowerCase(tableIdentifier.database),
-      tableIdentifier.table.toLowerCase,
+      CarbonParserUtil.convertDbNameToLowerCase(mvTableIdentifier.database),
+      mvTableIdentifier.table.toLowerCase,
       fields,
       partitionerFields,
       tableProperties,
       None,
       isAlterFlow = false,
       None)
 
-    val tablePath = if (dmProperties.contains("path")) {
-      dmProperties("path")
+    val mvTablePath = if (properties.contains("path")) {
+      properties("path")
     } else {
-      CarbonEnv.getTablePath(tableModel.databaseNameOp, tableModel.tableName)(sparkSession)
+      CarbonEnv.getTablePath(mvTableModel.databaseNameOp, mvTableModel.tableName)(sparkSession)
     }
-    CarbonCreateTableCommand(TableNewProcessor(tableModel),
-      tableModel.ifNotExistsSet, Some(tablePath), isVisible = false).run(sparkSession)
+    CarbonCreateTableCommand(TableNewProcessor(mvTableModel),
 
 Review comment:
   since we can give sort columns and other things in properties during create of noncarbon table, can we have tests some of them for Materialize views created for non carbon tables?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [carbondata] jackylk commented on a change in pull request #3625: [CARBONDATA-3705] Support create and load MV for spark datasource table

Posted by GitBox <gi...@apache.org>.

jackylk commented on a change in pull request #3625: [CARBONDATA-3705] Support create and load MV for spark datasource table
URL: https://github.com/apache/carbondata/pull/3625#discussion_r383699083
 
 

 ##########
 File path: mv/core/src/main/scala/org/apache/carbondata/mv/extension/MVHelper.scala
 ##########
 @@ -568,17 +573,18 @@ object MVHelper {
     generatePartitionerField(allPartitionColumn.toList, Seq.empty)
   }
 
-  private def inheritTablePropertiesFromMainTable(parentTable: CarbonTable,
+  private def inheritTablePropertiesFromMainTable(
 
 Review comment:
   yes, fixed

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services