You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@carbondata.apache.org by kumarvishal09 <gi...@git.apache.org> on 2017/12/06 14:54:33 UTC

[GitHub] carbondata pull request #1626: [CARBONDATA-1519][PreAgg-Timeseries] Support ...

GitHub user kumarvishal09 opened a pull request:

    https://github.com/apache/carbondata/pull/1626

    [CARBONDATA-1519][PreAgg-Timeseries] Support Query and Load on timeseries table

    Support Loading and query on time series pre aggregate table
    
     - [ ] Any interfaces changed?
     No
     - [ ] Any backward compatibility impacted?
     No
     - [ ] Document update required?
    No
     - [ ] Testing done
         Added UT test cases for timeseries table selection and data loading on timeseries table + validation timeseries table data
           
     - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. 
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/kumarvishal09/incubator-carbondata Branch_master_Timeseries_5-12_Query

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/1626.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1626
    
----
commit 31ff61fb36336d3cbbf518cde6716894eed17638
Author: kumarvishal <ku...@gmail.com>
Date:   2017-12-05T10:30:48Z

    added support for time series create table

commit 9e72145b862553ad34ddced5bfe1282e23561fdd
Author: kumarvishal <ku...@gmail.com>
Date:   2017-12-05T15:26:58Z

    Added support for timeseries query

----


---

[GitHub] carbondata issue #1626: [CARBONDATA-1519][PreAgg-Timeseries] Support Query a...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1626
  
    Build Success with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/595/



---

[GitHub] carbondata issue #1626: [CARBONDATA-1519][PreAgg-Timeseries] Support Query a...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1626
  
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1813/



---

[GitHub] carbondata issue #1626: [CARBONDATA-1519][PreAgg-Timeseries] Support Query a...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1626
  
    Build Success with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/581/



---

[GitHub] carbondata pull request #1626: [CARBONDATA-1519][PreAgg-Timeseries] Support ...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1626#discussion_r155279382
  
    --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/preaaggregate/PreAggregateUtil.scala ---
    @@ -493,4 +492,102 @@ object PreAggregateUtil {
         updatedPlan
       }
     
    +  /**
    +   * Below method will be used to get the select query when rollup policy is
    +   * applied in case of timeseries table
    +   * @param tableSchema
    +   *                    main data map schema
    +   * @param selectedDataMapSchema
    +   *                              selected data map schema for rollup
    +   * @return select query based on rolloup
    +   */
    +  def createTimeseriesSelectQueryForRollup(
    +      tableSchema: TableSchema,
    +      selectedDataMapSchema: AggregationDataMapSchema): String = {
    +    val aggregateColumns = scala.collection.mutable.ArrayBuffer.empty[String]
    +    val groupingExpressions = scala.collection.mutable.ArrayBuffer.empty[String]
    +    tableSchema.getListOfColumns.asScala.foreach {
    +      a => if (a.getAggFunction.nonEmpty) {
    +        aggregateColumns += s"${a.getAggFunction match {
    +          case "count" => "sum"
    +          case others@_ => others}}(${selectedDataMapSchema.getAggChildColByParent(
    +          a.getParentColumnTableRelations.get(0).getColumnName, a.getAggFunction).getColumnName})"
    +      } else if (a.getTimeSeriesFunction.nonEmpty) {
    +        groupingExpressions += s"timeseries(${
    +          selectedDataMapSchema
    +            .getNonAggNonTimeChildColBasedByParent(a.getParentColumnTableRelations.
    +              get(0).getColumnName).getColumnName
    +        } , '${ a.getTimeSeriesFunction }')"
    +      } else {
    +        groupingExpressions += selectedDataMapSchema
    +          .getNonAggNonTimeChildColBasedByParent(a.getParentColumnTableRelations.
    +            get(0).getColumnName).getColumnName
    +      }
    +    }
    +    s"select ${ groupingExpressions.mkString(",") },${ aggregateColumns.mkString(",")
    +    } from ${selectedDataMapSchema.getChildSchema.getTableName } " +
    +    s"group by ${ groupingExpressions.mkString(",") }"
    +  }
    +
    +  /**
    +   * Below method will be used to creating select query for timeseries
    +   * for lowest level for aggergation like second level, in that case it will
    +   * hit the maintable
    +   * @param tableSchema
    +   *                    data map schema
    +   * @param parentTableName
    +   *                        parent schema
    +   * @return select query for loading
    +   */
    +  def createTimeSeriesSelectQueryFromMain(tableSchema: TableSchema,
    +      parentTableName: String): String = {
    +    val aggregateColumns = scala.collection.mutable.ArrayBuffer.empty[String]
    +    val groupingExpressions = scala.collection.mutable.ArrayBuffer.empty[String]
    +    tableSchema.getListOfColumns.asScala.foreach {
    +      a =>
    +        if (a.getAggFunction.nonEmpty) {
    +          aggregateColumns +=
    +          s"${ a.getAggFunction }(${ a.getParentColumnTableRelations.get(0).getColumnName })"
    +        } else if (a.getTimeSeriesFunction.nonEmpty) {
    +          groupingExpressions +=
    +          s"timeseries(${ a.getParentColumnTableRelations.get(0).getColumnName },'${
    +            a
    +              .getTimeSeriesFunction
    +          }')"
    --- End diff --
    
    format properly


---

[GitHub] carbondata pull request #1626: [CARBONDATA-1519][PreAgg-Timeseries] Support ...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/carbondata/pull/1626


---

[GitHub] carbondata issue #1626: [CARBONDATA-1519][PreAgg-Timeseries] Support Query a...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1626
  
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1827/



---

[GitHub] carbondata pull request #1626: [CARBONDATA-1519][PreAgg-Timeseries] Support ...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1626#discussion_r155271321
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/preagg/AggregateTableSelector.java ---
    @@ -70,8 +70,15 @@ public AggregateTableSelector(QueryPlan queryPlan, CarbonTable parentTable) {
             AggregationDataMapSchema aggregationDataMapSchema = (AggregationDataMapSchema) dmSchema;
             isMatch = true;
             for (QueryColumn queryColumn : projectionColumn) {
    -          ColumnSchema columnSchemaByParentName = aggregationDataMapSchema
    -              .getNonAggChildColBasedByParent(queryColumn.getColumnSchema().getColumnName());
    +          ColumnSchema columnSchemaByParentName = null;
    --- End diff --
    
    Please refactor to remove duplicate code


---

[GitHub] carbondata pull request #1626: [CARBONDATA-1519][PreAgg-Timeseries] Support ...

Posted by kumarvishal09 <gi...@git.apache.org>.
Github user kumarvishal09 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1626#discussion_r155429834
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/metadata/schema/table/AggregationDataMapSchema.java ---
    @@ -210,4 +291,45 @@ private void fillAggFunctionColumns(List<ColumnSchema> listOfColumns) {
         }
       }
     
    +  public boolean isTimeseriesDataMap() {
    +    return isTimeseriesDataMap;
    +  }
    +
    +  /**
    +   * Below method is to support rollup during loading the data in pre aggregate table
    +   * In case of timeseries year level table data loading can be done using month level table or any
    +   * time series level below year level for example day,hour minute, second.
    +   * @TODO need to handle for pre aggregate table without timeseries
    +   *
    +   * @param aggregationDataMapSchema
    +   * @return whether aggregation data map can be selected or not
    +   */
    +  public boolean canSelectForRollup(AggregationDataMapSchema aggregationDataMapSchema) {
    +    List<ColumnSchema> listOfColumns = childSchema.getListOfColumns();
    --- End diff --
    
    handled in query preparation


---

[GitHub] carbondata pull request #1626: [CARBONDATA-1519][PreAgg-Timeseries] Support ...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1626#discussion_r155282425
  
    --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/hive/CarbonPreAggregateRules.scala ---
    @@ -321,6 +313,62 @@ case class CarbonPreAggregateQueryRules(sparkSession: SparkSession) extends Rule
         }
       }
     
    +  /**
    +   * Below method will be used to extract the query columns from
    +   * filter expression
    +   * @param filterExp
    +   *                  filter expression
    +   * @param set
    +   *             query column list
    +   * @param carbonTable
    +   *                    parent table
    +   * @param tableName
    +   *                  table name
    +   * @return isvalid filter expression for aggregate
    +   */
    +  def extractQueryColumnFromFilterExp(filterExp: Expression,
    +      set: scala.collection.mutable.HashSet[QueryColumn],
    +      carbonTable: CarbonTable, tableName: String): Boolean = {
    +    val newFilterList = scala.collection.mutable.HashMap.empty[AttributeReference, String]
    --- End diff --
    
    rename to map


---

[GitHub] carbondata pull request #1626: [CARBONDATA-1519][PreAgg-Timeseries] Support ...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1626#discussion_r155277594
  
    --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/preaaggregate/PreAggregateListeners.scala ---
    @@ -51,8 +63,26 @@ object LoadPostAggregateListener extends OperationEventListener {
               carbonLoadModel.getTableName, "false")
             val childTableName = dataMapSchema.getRelationIdentifier.getTableName
             val childDatabaseName = dataMapSchema.getRelationIdentifier.getDatabaseName
    -        val childDataFrame = sparkSession.sql(new CarbonSpark2SqlParser().addPreAggLoadFunction(
    -          s"${ dataMapSchema.getProperties.get("CHILD_SELECT QUERY") } ")).drop("preAggLoad")
    +        val childDataFrame =
    +          if (!dataMapSchema.asInstanceOf[AggregationDataMapSchema].isTimeseriesDataMap) {
    +            sparkSession.sql(new CarbonSpark2SqlParser().addPreAggLoadFunction(
    +              s"${ dataMapSchema.getProperties.get("CHILD_SELECT QUERY") } ")).drop("preAggLoad")
    +          } else {
    +            // for timeseries rollup policy
    +            val tableSelectedForRollup = PreAggregateUtil.getRollupDataMapNameForTimeSeries(list,
    +              dataMapSchema.asInstanceOf[AggregationDataMapSchema])
    +            // if non of the rollup data map is selected hit the maintable and prepare query
    +            val childQuery = if (!tableSelectedForRollup.isDefined) {
    --- End diff --
    
    use isEmpty


---

[GitHub] carbondata issue #1626: [CARBONDATA-1519][PreAgg-Timeseries] Support Query a...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1626
  
    Add PR description


---

[GitHub] carbondata issue #1626: [CARBONDATA-1519][PreAgg-Timeseries] Support Query a...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1626
  
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1831/



---

[GitHub] carbondata issue #1626: [CARBONDATA-1519][PreAgg-Timeseries] Support Query a...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1626
  
    retest this please


---

[GitHub] carbondata issue #1626: [CARBONDATA-1519][PreAgg-Timeseries] Support Query a...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1626
  
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2188/



---

[GitHub] carbondata issue #1626: [CARBONDATA-1519][PreAgg-Timeseries] Support Query a...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1626
  
    Build Failed with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/542/



---

[GitHub] carbondata pull request #1626: [CARBONDATA-1519][PreAgg-Timeseries] Support ...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1626#discussion_r155264101
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/metadata/schema/table/AggregationDataMapSchema.java ---
    @@ -74,7 +86,49 @@ public ColumnSchema getNonAggChildColBasedByParent(String columnName) {
           Iterator<ColumnSchema> iterator = columnSchemas.iterator();
           while (iterator.hasNext()) {
             ColumnSchema next = iterator.next();
    -        if (null == next.getAggFunction() || next.getAggFunction().isEmpty()) {
    +        if ((null == next.getAggFunction() || next.getAggFunction().isEmpty()) && null == next
    +            .getTimeSeriesFunction() || next.getTimeSeriesFunction().isEmpty()) {
    +          return next;
    +        }
    +      }
    +    }
    +    return null;
    +  }
    +
    +  /**
    +   * Below method will be used to get the columns on which aggregate function is not applied
    +   * @param columnName
    +   *                parent column name
    +   * @return child column schema
    +   */
    +  public ColumnSchema getNonAggNonTimeChildColBasedByParent(String columnName) {
    +    Set<ColumnSchema> columnSchemas = parentToNonAggChildMapping.get(columnName);
    +    if (null != columnSchemas) {
    +      Iterator<ColumnSchema> iterator = columnSchemas.iterator();
    +      while (iterator.hasNext()) {
    +        ColumnSchema next = iterator.next();
    +        if ((null == next.getAggFunction() || next.getAggFunction().isEmpty())) {
    +          return next;
    +        }
    +      }
    +    }
    +    return null;
    +  }
    +
    +  /**
    +   * Below method will be used to get the columns on which aggregate function is not applied
    --- End diff --
    
    change description


---

[GitHub] carbondata pull request #1626: [CARBONDATA-1519][PreAgg-Timeseries] Support ...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1626#discussion_r155275089
  
    --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/preaaggregate/PreAggregateListeners.scala ---
    @@ -38,8 +38,20 @@ object LoadPostAggregateListener extends OperationEventListener {
         val sparkSession = loadEvent.sparkSession
         val carbonLoadModel = loadEvent.carbonLoadModel
         val table = carbonLoadModel.getCarbonDataLoadSchema.getCarbonTable
    +    //    val parentTableName = table.getTableName
         if (table.hasDataMapSchema) {
    -      for (dataMapSchema: DataMapSchema <- table.getTableInfo.getDataMapSchemaList.asScala) {
    +      // getting all the aggergate datamap schema
    +      val aggregationDataMapList = table.getTableInfo.getDataMapSchemaList.asScala
    --- End diff --
    
    simplify `table.getTableInfo.getDataMapSchemaList.asScala.filter(_.isInstanceOf[AggregationDataMapSchema]).map(_.asInstanceOf)`


---

[GitHub] carbondata issue #1626: [CARBONDATA-1519][PreAgg-Timeseries] Support Query a...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1626
  
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2157/



---

[GitHub] carbondata issue #1626: [CARBONDATA-1519][PreAgg-Timeseries] Support Query a...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1626
  
    Build Success with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/588/



---

[GitHub] carbondata issue #1626: [CARBONDATA-1519][PreAgg-Timeseries] Support Query a...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1626
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1834/



---

[GitHub] carbondata issue #1626: [CARBONDATA-1519][PreAgg-Timeseries] Support Query a...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1626
  
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2203/



---

[GitHub] carbondata issue #1626: [CARBONDATA-1519][PreAgg-Timeseries] Support Query a...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1626
  
    Build Success with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/603/



---

[GitHub] carbondata pull request #1626: [CARBONDATA-1519][PreAgg-Timeseries] Support ...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1626#discussion_r155267822
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/metadata/schema/table/AggregationDataMapSchema.java ---
    @@ -210,4 +291,45 @@ private void fillAggFunctionColumns(List<ColumnSchema> listOfColumns) {
         }
       }
     
    +  public boolean isTimeseriesDataMap() {
    +    return isTimeseriesDataMap;
    +  }
    +
    +  /**
    +   * Below method is to support rollup during loading the data in pre aggregate table
    +   * In case of timeseries year level table data loading can be done using month level table or any
    +   * time series level below year level for example day,hour minute, second.
    +   * @TODO need to handle for pre aggregate table without timeseries
    +   *
    +   * @param aggregationDataMapSchema
    +   * @return whether aggregation data map can be selected or not
    +   */
    +  public boolean canSelectForRollup(AggregationDataMapSchema aggregationDataMapSchema) {
    +    List<ColumnSchema> listOfColumns = childSchema.getListOfColumns();
    --- End diff --
    
    please handle `dummy-measure` scenario


---

[GitHub] carbondata issue #1626: [CARBONDATA-1519][PreAgg-Timeseries] Support Query a...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1626
  
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1786/



---

[GitHub] carbondata issue #1626: [CARBONDATA-1519][PreAgg-Timeseries] Support Query a...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1626
  
    Build Failed with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/529/



---

[GitHub] carbondata issue #1626: [CARBONDATA-1519][PreAgg-Timeseries] Support Query a...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1626
  
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1819/



---

[GitHub] carbondata pull request #1626: [CARBONDATA-1519][PreAgg-Timeseries] Support ...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1626#discussion_r155279264
  
    --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/preaaggregate/PreAggregateUtil.scala ---
    @@ -493,4 +492,102 @@ object PreAggregateUtil {
         updatedPlan
       }
     
    +  /**
    +   * Below method will be used to get the select query when rollup policy is
    +   * applied in case of timeseries table
    +   * @param tableSchema
    +   *                    main data map schema
    +   * @param selectedDataMapSchema
    +   *                              selected data map schema for rollup
    +   * @return select query based on rolloup
    +   */
    +  def createTimeseriesSelectQueryForRollup(
    --- End diff --
    
    After compaction PR is merged , this duplicate function should be merged to there.


---

[GitHub] carbondata issue #1626: [CARBONDATA-1519][PreAgg-Timeseries] Support Query a...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1626
  
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1824/



---

[GitHub] carbondata issue #1626: [CARBONDATA-1519][PreAgg-Timeseries] Support Query a...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1626
  
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2160/



---

[GitHub] carbondata issue #1626: [CARBONDATA-1519][PreAgg-Timeseries] Support Query a...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1626
  
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2186/



---

[GitHub] carbondata issue #1626: [CARBONDATA-1519][PreAgg-Timeseries] Support Query a...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1626
  
    Build Success with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/597/



---

[GitHub] carbondata issue #1626: [CARBONDATA-1519][PreAgg-Timeseries] Support Query a...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1626
  
    Build Failed with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/535/



---

[GitHub] carbondata issue #1626: [CARBONDATA-1519][PreAgg-Timeseries] Support Query a...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1626
  
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2174/



---

[GitHub] carbondata issue #1626: [CARBONDATA-1519][PreAgg-Timeseries] Support Query a...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1626
  
    Build Success with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/608/



---

[GitHub] carbondata pull request #1626: [CARBONDATA-1519][PreAgg-Timeseries] Support ...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1626#discussion_r155276423
  
    --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/preaaggregate/PreAggregateListeners.scala ---
    @@ -51,8 +63,26 @@ object LoadPostAggregateListener extends OperationEventListener {
               carbonLoadModel.getTableName, "false")
             val childTableName = dataMapSchema.getRelationIdentifier.getTableName
             val childDatabaseName = dataMapSchema.getRelationIdentifier.getDatabaseName
    -        val childDataFrame = sparkSession.sql(new CarbonSpark2SqlParser().addPreAggLoadFunction(
    -          s"${ dataMapSchema.getProperties.get("CHILD_SELECT QUERY") } ")).drop("preAggLoad")
    +        val childDataFrame =
    +          if (!dataMapSchema.asInstanceOf[AggregationDataMapSchema].isTimeseriesDataMap) {
    --- End diff --
    
    no need to use asInstanceOf


---

[GitHub] carbondata issue #1626: [CARBONDATA-1519][PreAgg-Timeseries] Support Query a...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1626
  
    Build Failed with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/544/



---

[GitHub] carbondata pull request #1626: [CARBONDATA-1519][PreAgg-Timeseries] Support ...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1626#discussion_r155264012
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/metadata/schema/table/AggregationDataMapSchema.java ---
    @@ -74,7 +86,49 @@ public ColumnSchema getNonAggChildColBasedByParent(String columnName) {
           Iterator<ColumnSchema> iterator = columnSchemas.iterator();
           while (iterator.hasNext()) {
             ColumnSchema next = iterator.next();
    -        if (null == next.getAggFunction() || next.getAggFunction().isEmpty()) {
    +        if ((null == next.getAggFunction() || next.getAggFunction().isEmpty()) && null == next
    +            .getTimeSeriesFunction() || next.getTimeSeriesFunction().isEmpty()) {
    +          return next;
    +        }
    +      }
    +    }
    +    return null;
    +  }
    +
    +  /**
    +   * Below method will be used to get the columns on which aggregate function is not applied
    --- End diff --
    
    change the description


---

[GitHub] carbondata issue #1626: [CARBONDATA-1519][PreAgg-Timeseries] Support Query a...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1626
  
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2194/



---

[GitHub] carbondata issue #1626: [CARBONDATA-1519][PreAgg-Timeseries] Support Query a...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1626
  
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1778/



---

[GitHub] carbondata pull request #1626: [CARBONDATA-1519][PreAgg-Timeseries] Support ...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1626#discussion_r155263690
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/metadata/schema/table/AggregationDataMapSchema.java ---
    @@ -74,7 +86,49 @@ public ColumnSchema getNonAggChildColBasedByParent(String columnName) {
           Iterator<ColumnSchema> iterator = columnSchemas.iterator();
           while (iterator.hasNext()) {
             ColumnSchema next = iterator.next();
    -        if (null == next.getAggFunction() || next.getAggFunction().isEmpty()) {
    +        if ((null == next.getAggFunction() || next.getAggFunction().isEmpty()) && null == next
    --- End diff --
    
    please rename method to `getNonAggNonTimeChildColBasedByParent`


---

[GitHub] carbondata issue #1626: [CARBONDATA-1519][PreAgg-Timeseries] Support Query a...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1626
  
    Build Success with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/579/



---

[GitHub] carbondata pull request #1626: [CARBONDATA-1519][PreAgg-Timeseries] Support ...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1626#discussion_r155279540
  
    --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/preaaggregate/PreAggregateUtil.scala ---
    @@ -493,4 +492,102 @@ object PreAggregateUtil {
         updatedPlan
       }
     
    +  /**
    +   * Below method will be used to get the select query when rollup policy is
    +   * applied in case of timeseries table
    +   * @param tableSchema
    +   *                    main data map schema
    +   * @param selectedDataMapSchema
    +   *                              selected data map schema for rollup
    +   * @return select query based on rolloup
    +   */
    +  def createTimeseriesSelectQueryForRollup(
    +      tableSchema: TableSchema,
    +      selectedDataMapSchema: AggregationDataMapSchema): String = {
    +    val aggregateColumns = scala.collection.mutable.ArrayBuffer.empty[String]
    +    val groupingExpressions = scala.collection.mutable.ArrayBuffer.empty[String]
    +    tableSchema.getListOfColumns.asScala.foreach {
    +      a => if (a.getAggFunction.nonEmpty) {
    +        aggregateColumns += s"${a.getAggFunction match {
    +          case "count" => "sum"
    +          case others@_ => others}}(${selectedDataMapSchema.getAggChildColByParent(
    +          a.getParentColumnTableRelations.get(0).getColumnName, a.getAggFunction).getColumnName})"
    +      } else if (a.getTimeSeriesFunction.nonEmpty) {
    +        groupingExpressions += s"timeseries(${
    +          selectedDataMapSchema
    +            .getNonAggNonTimeChildColBasedByParent(a.getParentColumnTableRelations.
    +              get(0).getColumnName).getColumnName
    +        } , '${ a.getTimeSeriesFunction }')"
    +      } else {
    +        groupingExpressions += selectedDataMapSchema
    +          .getNonAggNonTimeChildColBasedByParent(a.getParentColumnTableRelations.
    +            get(0).getColumnName).getColumnName
    +      }
    +    }
    +    s"select ${ groupingExpressions.mkString(",") },${ aggregateColumns.mkString(",")
    +    } from ${selectedDataMapSchema.getChildSchema.getTableName } " +
    +    s"group by ${ groupingExpressions.mkString(",") }"
    +  }
    +
    +  /**
    +   * Below method will be used to creating select query for timeseries
    +   * for lowest level for aggergation like second level, in that case it will
    +   * hit the maintable
    +   * @param tableSchema
    +   *                    data map schema
    +   * @param parentTableName
    +   *                        parent schema
    +   * @return select query for loading
    +   */
    +  def createTimeSeriesSelectQueryFromMain(tableSchema: TableSchema,
    +      parentTableName: String): String = {
    +    val aggregateColumns = scala.collection.mutable.ArrayBuffer.empty[String]
    +    val groupingExpressions = scala.collection.mutable.ArrayBuffer.empty[String]
    +    tableSchema.getListOfColumns.asScala.foreach {
    +      a =>
    +        if (a.getAggFunction.nonEmpty) {
    +          aggregateColumns +=
    +          s"${ a.getAggFunction }(${ a.getParentColumnTableRelations.get(0).getColumnName })"
    +        } else if (a.getTimeSeriesFunction.nonEmpty) {
    +          groupingExpressions +=
    +          s"timeseries(${ a.getParentColumnTableRelations.get(0).getColumnName },'${
    +            a
    +              .getTimeSeriesFunction
    +          }')"
    +        } else {
    +          groupingExpressions += a.getParentColumnTableRelations.get(0).getColumnName
    +        }
    +    }
    +    s"select ${ groupingExpressions.mkString(",") },${
    +      aggregateColumns.mkString(",")
    +    } from ${ parentTableName } group by ${ groupingExpressions.mkString(",") }"
    +
    +  }
    +    /**
    +   * Below method will be used to select rollup table in case of
    +   * timeseries data map loading
    +   * @param list
    +   *             list of timeseries datamap
    +   * @param dataMapSchema
    +   *                      datamap schema
    +   * @return select table name
    +   */
    +  def getRollupDataMapNameForTimeSeries(
    +      list: scala.collection.mutable.ListBuffer[AggregationDataMapSchema],
    +      dataMapSchema: AggregationDataMapSchema): Option[AggregationDataMapSchema] = {
    +    if (list.isEmpty) {
    +      None
    +    } else {
    +      val rollupDataMapSchema = scala.collection.mutable.ListBuffer.empty[AggregationDataMapSchema]
    +      list.foreach{f =>
    +        if (dataMapSchema.canSelectForRollup(f)) {
    +          rollupDataMapSchema += f
    +        } }
    +      if(rollupDataMapSchema.isEmpty) {
    --- End diff --
    
    use `lastOption` instead of if else


---

[GitHub] carbondata issue #1626: [CARBONDATA-1519][PreAgg-Timeseries] Support Query a...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1626
  
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1801/



---

[GitHub] carbondata issue #1626: [CARBONDATA-1519][PreAgg-Timeseries] Support Query a...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1626
  
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1799/



---

[GitHub] carbondata pull request #1626: [CARBONDATA-1519][PreAgg-Timeseries] Support ...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1626#discussion_r155283460
  
    --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/hive/CarbonPreAggregateRules.scala ---
    @@ -725,6 +786,31 @@ case class CarbonPreAggregateQueryRules(sparkSession: SparkSession) extends Rule
             Alias(aggExp,
               name)(NamedExpression.newExprId,
               alias.qualifier).asInstanceOf[NamedExpression]
    +      case alias@Alias(expression: Expression, name) =>
    +        val updatedExp =
    +          if (expression.isInstanceOf[ScalaUDF] &&
    +              expression.asInstanceOf[ScalaUDF].function.getClass.getName.equalsIgnoreCase(
    +                "org.apache.spark.sql.execution.command.timeseries.TimeseriesUDf")) {
    +          expression.asInstanceOf[ScalaUDF].transform {
    +            case attr: AttributeReference =>
    +            val childAttributeReference = getChildAttributeReference(aggDataMapSchema,
    +              attr,
    +              attributes,
    +              timeseriesFunction =
    --- End diff --
    
    remove it


---

[GitHub] carbondata issue #1626: [CARBONDATA-1519][PreAgg-Timeseries] Support Query a...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1626
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1836/



---

[GitHub] carbondata issue #1626: [CARBONDATA-1519][PreAgg-Timeseries] Support Query a...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1626
  
    Build Failed with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/563/



---

[GitHub] carbondata pull request #1626: [CARBONDATA-1519][PreAgg-Timeseries] Support ...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1626#discussion_r155276144
  
    --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/preaaggregate/PreAggregateListeners.scala ---
    @@ -65,6 +95,10 @@ object LoadPostAggregateListener extends OperationEventListener {
                 dataFrame = Some(childDataFrame),
                 internalOptions = Map(CarbonCommonConstants.IS_INTERNAL_LOAD_CALL -> "true")).
                 run(sparkSession)
    +          if (dataMapSchema.isInstanceOf[AggregationDataMapSchema] &&
    --- End diff --
    
    remove this check


---

[GitHub] carbondata pull request #1626: [CARBONDATA-1519][PreAgg-Timeseries] Support ...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1626#discussion_r155278666
  
    --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/preaaggregate/PreAggregateUtil.scala ---
    @@ -493,4 +492,102 @@ object PreAggregateUtil {
         updatedPlan
       }
     
    +  /**
    +   * Below method will be used to get the select query when rollup policy is
    +   * applied in case of timeseries table
    +   * @param tableSchema
    +   *                    main data map schema
    +   * @param selectedDataMapSchema
    +   *                              selected data map schema for rollup
    +   * @return select query based on rolloup
    +   */
    +  def createTimeseriesSelectQueryForRollup(
    +      tableSchema: TableSchema,
    +      selectedDataMapSchema: AggregationDataMapSchema): String = {
    +    val aggregateColumns = scala.collection.mutable.ArrayBuffer.empty[String]
    +    val groupingExpressions = scala.collection.mutable.ArrayBuffer.empty[String]
    +    tableSchema.getListOfColumns.asScala.foreach {
    +      a => if (a.getAggFunction.nonEmpty) {
    --- End diff --
    
    move up a =>


---

[GitHub] carbondata pull request #1626: [CARBONDATA-1519][PreAgg-Timeseries] Support ...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1626#discussion_r155271792
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/preagg/TimeSeriesFunction.java ---
    @@ -21,20 +21,46 @@
      * enum for timeseries function
      */
     public enum TimeSeriesFunction {
    -  SECOND("second"),
    -  MINUTE("minute"),
    -  HOUR("hour"),
    -  DAY("day"),
    -  MONTH("month"),
    -  YEAR("year");
    +  SECOND("second", 0),
    +  MINUTE("minute", 1),
    +  HOUR("hour", 2),
    +  DAY("day", 3),
    +  MONTH("month", 4),
    +  YEAR("year", 5);
     
       private String name;
     
    -  TimeSeriesFunction(String name) {
    +  private int ordinal;
    +
    +  TimeSeriesFunction(String name, int ordinal) {
         this.name = name;
    +    this.ordinal = ordinal;
       }
     
       public String getName() {
         return name;
       }
    +
    +  public int getOrdinal() {
    +    return ordinal;
    +  }
    +
    +//   public static TimeSeriesFunction valueOf(String name) {
    --- End diff --
    
    remove commented code


---

[GitHub] carbondata pull request #1626: [CARBONDATA-1519][PreAgg-Timeseries] Support ...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1626#discussion_r155437464
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/metadata/schema/table/CarbonTable.java ---
    @@ -719,4 +719,34 @@ public long size() throws IOException {
         }
         return dataSize + indexSize;
       }
    +
    +  /**
    +   * Utility function to check whether table has timseries datamap or not
    +   * @param carbonTable
    +   * @return timeseries data map present
    +   */
    +  public static boolean hasTimeSeriesDataMap(CarbonTable carbonTable) {
    --- End diff --
    
    Please move them to utility


---

[GitHub] carbondata issue #1626: [CARBONDATA-1519][PreAgg-Timeseries] Support Query a...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1626
  
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2173/



---

[GitHub] carbondata issue #1626: [CARBONDATA-1519][PreAgg-Timeseries] Support Query a...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1626
  
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2202/



---

[GitHub] carbondata pull request #1626: [CARBONDATA-1519][PreAgg-Timeseries] Support ...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1626#discussion_r155270121
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/metadata/schema/table/CarbonTable.java ---
    @@ -152,6 +154,15 @@ public static CarbonTable buildFromTableInfo(TableInfo tableInfo) {
         }
         table.hasDataMapSchema =
    --- End diff --
    
    Please don't change the carbonTable for aggdatamap. Add a utility function to decide it has agg datamap or not


---

[GitHub] carbondata issue #1626: [CARBONDATA-1519][PreAgg-Timeseries] Support Query a...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1626
  
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2192/



---

[GitHub] carbondata issue #1626: [CARBONDATA-1519][PreAgg-Timeseries] Support Query a...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1626
  
    Build Failed with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/600/



---

[GitHub] carbondata pull request #1626: [CARBONDATA-1519][PreAgg-Timeseries] Support ...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1626#discussion_r155437483
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/metadata/schema/table/CarbonTable.java ---
    @@ -719,4 +719,34 @@ public long size() throws IOException {
         }
         return dataSize + indexSize;
       }
    +
    +  /**
    +   * Utility function to check whether table has timseries datamap or not
    +   * @param carbonTable
    +   * @return timeseries data map present
    +   */
    +  public static boolean hasTimeSeriesDataMap(CarbonTable carbonTable) {
    +    List<DataMapSchema> dataMapSchemaList = carbonTable.getTableInfo().getDataMapSchemaList();
    +    for (DataMapSchema dataMapSchema : dataMapSchemaList) {
    +      if (dataMapSchema instanceof AggregationDataMapSchema) {
    +        return ((AggregationDataMapSchema) dataMapSchema).isTimeseriesDataMap();
    +      }
    +    }
    +    return false;
    +  }
    +
    +  /**
    +   * Utility function to check whether table has timseries datamap or not
    +   * @param carbonTable
    +   * @return timeseries data map present
    +   */
    +  public static boolean hasAggregationDataMap(CarbonTable carbonTable) {
    --- End diff --
    
    Please move to utility


---

[GitHub] carbondata issue #1626: [CARBONDATA-1519][PreAgg-Timeseries] Support Query a...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1626
  
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2199/



---

[GitHub] carbondata pull request #1626: [CARBONDATA-1519][PreAgg-Timeseries] Support ...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1626#discussion_r155263571
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/metadata/schema/table/AggregationDataMapSchema.java ---
    @@ -74,7 +86,49 @@ public ColumnSchema getNonAggChildColBasedByParent(String columnName) {
           Iterator<ColumnSchema> iterator = columnSchemas.iterator();
           while (iterator.hasNext()) {
             ColumnSchema next = iterator.next();
    -        if (null == next.getAggFunction() || next.getAggFunction().isEmpty()) {
    +        if ((null == next.getAggFunction() || next.getAggFunction().isEmpty()) && null == next
    +            .getTimeSeriesFunction() || next.getTimeSeriesFunction().isEmpty()) {
    +          return next;
    +        }
    +      }
    +    }
    +    return null;
    +  }
    +
    +  /**
    +   * Below method will be used to get the columns on which aggregate function is not applied
    +   * @param columnName
    +   *                parent column name
    +   * @return child column schema
    +   */
    +  public ColumnSchema getNonAggNonTimeChildColBasedByParent(String columnName) {
    --- End diff --
    
    Please rename method to `getNonAggTimeChildColBasedByParent` 


---

[GitHub] carbondata pull request #1626: [CARBONDATA-1519][PreAgg-Timeseries] Support ...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1626#discussion_r155265299
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/metadata/schema/table/AggregationDataMapSchema.java ---
    @@ -125,6 +179,25 @@ public ColumnSchema getAggChildColByParent(String columnName,
         return null;
       }
     
    +  /**
    +   * Below method will be used to get the column schema based on parent column name
    +   * @param columName
    +   *                parent column name
    +   * @return child column schema
    +   */
    +  public ColumnSchema getTimeseriesChildColByParent(String columName, String timeseriesFunction) {
    --- End diff --
    
    I think it is duplicated to `getChildColByParentColName`, please remove it


---

[GitHub] carbondata issue #1626: [CARBONDATA-1519][PreAgg-Timeseries] Support Query a...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1626
  
    retest this please


---

[GitHub] carbondata issue #1626: [CARBONDATA-1519][PreAgg-Timeseries] Support Query a...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1626
  
    Build Success with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/565/



---

[GitHub] carbondata pull request #1626: [CARBONDATA-1519][PreAgg-Timeseries] Support ...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1626#discussion_r155282655
  
    --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/hive/CarbonPreAggregateRules.scala ---
    @@ -321,6 +313,62 @@ case class CarbonPreAggregateQueryRules(sparkSession: SparkSession) extends Rule
         }
       }
     
    +  /**
    +   * Below method will be used to extract the query columns from
    +   * filter expression
    +   * @param filterExp
    +   *                  filter expression
    +   * @param set
    +   *             query column list
    +   * @param carbonTable
    +   *                    parent table
    +   * @param tableName
    +   *                  table name
    +   * @return isvalid filter expression for aggregate
    +   */
    +  def extractQueryColumnFromFilterExp(filterExp: Expression,
    +      set: scala.collection.mutable.HashSet[QueryColumn],
    +      carbonTable: CarbonTable, tableName: String): Boolean = {
    +    val newFilterList = scala.collection.mutable.HashMap.empty[AttributeReference, String]
    --- End diff --
    
    Add comment why is this required


---

[GitHub] carbondata pull request #1626: [CARBONDATA-1519][PreAgg-Timeseries] Support ...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1626#discussion_r155282532
  
    --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/hive/CarbonPreAggregateRules.scala ---
    @@ -321,6 +313,62 @@ case class CarbonPreAggregateQueryRules(sparkSession: SparkSession) extends Rule
         }
       }
     
    +  /**
    +   * Below method will be used to extract the query columns from
    +   * filter expression
    +   * @param filterExp
    +   *                  filter expression
    +   * @param set
    +   *             query column list
    +   * @param carbonTable
    +   *                    parent table
    +   * @param tableName
    +   *                  table name
    +   * @return isvalid filter expression for aggregate
    +   */
    +  def extractQueryColumnFromFilterExp(filterExp: Expression,
    +      set: scala.collection.mutable.HashSet[QueryColumn],
    +      carbonTable: CarbonTable, tableName: String): Boolean = {
    +    val newFilterList = scala.collection.mutable.HashMap.empty[AttributeReference, String]
    +    var isValidPlan = true
    +    filterExp.transform {
    +      case attr: AttributeReference =>
    +        if (!newFilterList.get(attr).isDefined) {
    +          newFilterList.put(attr, null)
    +        }
    +        attr
    +      case udf@ScalaUDF(_, _, _, _) =>
    +        if (udf.function.getClass.getName
    +          .equalsIgnoreCase("org.apache.spark.sql.execution.command.timeseries.TimeseriesUDf") &&
    +            carbonTable.hasTimeSeriesDataMap) {
    +          newFilterList.put(udf.children(0).asInstanceOf[AttributeReference],
    +            udf.children(1).asInstanceOf[Literal].value.toString)
    +        } else {
    +          udf.transform {
    +            case attr: AttributeReference =>
    +              if (!newFilterList.get(attr).isDefined) {
    +                newFilterList.put(attr, null)
    +              }
    +              attr
    +          }
    +        }
    +        udf
    +    }
    +    newFilterList.foreach {
    +      f =>
    --- End diff --
    
    format properly


---

[GitHub] carbondata issue #1626: [CARBONDATA-1519][PreAgg-Timeseries] Support Query a...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1626
  
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2150/



---

[GitHub] carbondata pull request #1626: [CARBONDATA-1519][PreAgg-Timeseries] Support ...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1626#discussion_r155270004
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/metadata/schema/table/CarbonTable.java ---
    @@ -152,6 +154,15 @@ public static CarbonTable buildFromTableInfo(TableInfo tableInfo) {
         }
         table.hasDataMapSchema =
             null != tableInfo.getDataMapSchemaList() && tableInfo.getDataMapSchemaList().size() > 0;
    +    List<DataMapSchema> dataMapSchemaList = tableInfo.getDataMapSchemaList();
    +    for (DataMapSchema dataMapSchema : dataMapSchemaList) {
    +      if (dataMapSchema instanceof AggregationDataMapSchema) {
    +        if (!table.hasTimeSeriesDataMap) {
    --- End diff --
    
    Please don't change the carbonTable for timeseriesdatamap. Add an utility function to decide it has timeseries datamap or not


---

[GitHub] carbondata issue #1626: [CARBONDATA-1519][PreAgg-Timeseries] Support Query a...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1626
  
    Build Success with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/606/



---

[GitHub] carbondata issue #1626: [CARBONDATA-1519][PreAgg-Timeseries] Support Query a...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1626
  
    LGTM


---

[GitHub] carbondata issue #1626: [CARBONDATA-1519][PreAgg-Timeseries] Support Query a...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1626
  
    Build Failed with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/515/



---

[GitHub] carbondata pull request #1626: [CARBONDATA-1519][PreAgg-Timeseries] Support ...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1626#discussion_r155272388
  
    --- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/integration/spark/testsuite/timeseries/TestTimeseriesDataLoad.scala ---
    @@ -0,0 +1,73 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.carbondata.integration.spark.testsuite.timeseries
    +
    +import java.sql.Timestamp
    +
    +import org.apache.spark.sql.Row
    +import org.apache.spark.sql.test.util.QueryTest
    +import org.scalatest.BeforeAndAfterAll
    +
    +class TestTimeseriesDataLoad extends QueryTest with BeforeAndAfterAll {
    +
    +  override def beforeAll: Unit = {
    +    sql("drop table if exists mainTable")
    +    sql("CREATE TABLE mainTable(mytime timestamp, name string, age int) STORED BY 'org.apache.carbondata.format'")
    +    sql("create datamap agg0 on table mainTable using 'preaggregate' DMPROPERTIES ('timeseries.eventTime'='mytime', 'timeseries.hierarchy'='second=1,minute=1,hour=1,day=1,month=1,year=1') as select mytime, sum(age) from mainTable group by mytime")
    +    sql(s"LOAD DATA LOCAL INPATH 'D:/mydata.csv' into table mainTable")
    --- End diff --
    
    change path


---

[GitHub] carbondata issue #1626: [CARBONDATA-1519][PreAgg-Timeseries] Support Query a...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1626
  
    Build Failed with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/525/



---

[GitHub] carbondata pull request #1626: [CARBONDATA-1519][PreAgg-Timeseries] Support ...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1626#discussion_r155279800
  
    --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/timeseries/TimeSeriesUDF.scala ---
    @@ -0,0 +1,33 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.spark.sql.execution.command.timeseries
    +
    +import java.sql.Timestamp
    +
    +import org.apache.carbondata.core.preagg.TimeSeriesUDF
    +
    +/**
    + * Time series udf class
    + */
    +
    +class TimeseriesUDf extends Function2[Timestamp, String, Timestamp] with Serializable{
    --- End diff --
    
    Please rename this class to avoid duplicate names


---

[GitHub] carbondata issue #1626: [CARBONDATA-1519][PreAgg-Timeseries] Support Query a...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1626
  
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2167/



---