You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@carbondata.apache.org by akashrn5 <gi...@git.apache.org> on 2017/11/17 11:38:32 UTC

[GitHub] carbondata pull request #1524: [CARBONDATA-1762] Remove existing column leve...

GitHub user akashrn5 opened a pull request:

    https://github.com/apache/carbondata/pull/1524

    [CARBONDATA-1762] Remove existing column level dateformat and support dateformat, timestampformat in the load option

    (1) Remove column level dateformat option
    (2) Support dateformat and timestampformat in load options(table level)
    
    Be sure to do all of the following checklist to help us incorporate 
    your contribution quickly and easily:
    
     - [ ] Any interfaces changed?
     
     - [ ] Any backward compatibility impacted?
     
     - [X] Document update required?
             2 new load level properties are added. Document to be updated accordingly.
     - [X] Testing done
            UT Added
           
     - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. 
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/akashrn5/incubator-carbondata timeformat

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/1524.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1524
    
----
commit f7b253c672c4eed21466806cd4bc9990264a8a37
Author: akashrn5 <ak...@gmail.com>
Date:   2017-11-17T11:25:33Z

    [CARBONDATA-1762] Remove existing column level dateformat and support dateformat, timestampformat in the load option

----


---

[GitHub] carbondata pull request #1524: [CARBONDATA-1762] Remove existing column leve...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1524#discussion_r151673677
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/constants/CarbonLoadOptionConstants.java ---
    @@ -52,6 +52,14 @@
       public static final String CARBON_OPTIONS_DATEFORMAT =
           "carbon.options.dateformat";
       public static final String CARBON_OPTIONS_DATEFORMAT_DEFAULT = "";
    +
    +  /**
    +   * option to specify the load option
    +   */
    +  @CarbonProperty
    +  public static final String CARBON_OPTIONS_TIMESTAMPFORMAT =
    +          "carbon.options.dateformat";
    --- End diff --
    
    I think `carbon.options.dateformat ` should be `carbon.options.timestampformat`


---

[GitHub] carbondata issue #1524: [CARBONDATA-1762] Remove existing column level datef...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1524
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1266/



---

[GitHub] carbondata issue #1524: [CARBONDATA-1762] Remove existing column level datef...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1524
  
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1241/



---

[GitHub] carbondata issue #1524: [CARBONDATA-1762] Remove existing column level datef...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1524
  
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1233/



---

[GitHub] carbondata pull request #1524: [CARBONDATA-1762] Remove existing column leve...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1524#discussion_r151674183
  
    --- Diff: integration/spark-common/src/main/scala/org/apache/carbondata/spark/load/ValidateUtil.scala ---
    @@ -17,35 +17,32 @@
     
     package org.apache.carbondata.spark.load
     
    -import scala.collection.JavaConverters._
    +import java.text.SimpleDateFormat
     
    -import org.apache.carbondata.core.constants.CarbonCommonConstants
     import org.apache.carbondata.core.metadata.schema.table.CarbonTable
    -import org.apache.carbondata.processing.loading.model.CarbonLoadModel
     import org.apache.carbondata.processing.loading.sort.SortScopeOptions
     import org.apache.carbondata.spark.exception.MalformedCarbonCommandException
     
     object ValidateUtil {
    -  def validateDateFormat(dateFormat: String, table: CarbonTable, tableName: String): Unit = {
    -    val dimensions = table.getDimensionByTableName(tableName).asScala
    +
    +  /**
    +   * validates both timestamp and date for illegal values
    +   *
    +   * @param optionValue
    +   * @param optionName
    +   */
    +  def validateDateTimeFormat(optionValue: String, optionName: String): Unit = {
    --- End diff --
    
    give proper names to parameters


---

[GitHub] carbondata pull request #1524: [CARBONDATA-1762] Remove existing column leve...

Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1524#discussion_r151831621
  
    --- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/dataload/TestLoadDataWithDiffTimestampFormat.scala ---
    @@ -75,55 +76,19 @@ class TestLoadDataWithDiffTimestampFormat extends QueryTest with BeforeAndAfterA
           assert(false)
         } catch {
           case ex: MalformedCarbonCommandException =>
    -        assertResult(ex.getMessage)("Error: Option DateFormat is not provided for Column date.")
    +        assertResult(ex.getMessage)("Error: Wrong option: date is provided for option DateFormat")
           case _: Throwable=> assert(false)
         }
     
         try {
           sql(s"""
                LOAD DATA LOCAL INPATH '$resourcesPath/timeStampFormatData1.csv' into table t3
    -           OPTIONS('dateformat' = 'fasfdas:yyyy/MM/dd')
    --- End diff --
    
    why delete these testcase? `dateformat` is still a valid loading option, right?


---

[GitHub] carbondata issue #1524: [CARBONDATA-1762] Remove existing column level datef...

Posted by dhatchayani <gi...@git.apache.org>.
Github user dhatchayani commented on the issue:

    https://github.com/apache/carbondata/pull/1524
  
    retest this please


---

[GitHub] carbondata pull request #1524: [CARBONDATA-1762] Remove existing column leve...

Posted by dhatchayani <gi...@git.apache.org>.
Github user dhatchayani commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1524#discussion_r151832053
  
    --- Diff: integration/spark-common/src/main/scala/org/apache/carbondata/spark/load/ValidateUtil.scala ---
    @@ -17,35 +17,32 @@
     
     package org.apache.carbondata.spark.load
     
    -import scala.collection.JavaConverters._
    +import java.text.SimpleDateFormat
     
    -import org.apache.carbondata.core.constants.CarbonCommonConstants
     import org.apache.carbondata.core.metadata.schema.table.CarbonTable
    -import org.apache.carbondata.processing.loading.model.CarbonLoadModel
     import org.apache.carbondata.processing.loading.sort.SortScopeOptions
     import org.apache.carbondata.spark.exception.MalformedCarbonCommandException
     
     object ValidateUtil {
    -  def validateDateFormat(dateFormat: String, table: CarbonTable, tableName: String): Unit = {
    -    val dimensions = table.getDimensionByTableName(tableName).asScala
    +
    +  /**
    +   * validates both timestamp and date for illegal values
    --- End diff --
    
    this validates both timestamp and date format


---

[GitHub] carbondata pull request #1524: [CARBONDATA-1762] Remove existing column leve...

Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1524#discussion_r151831369
  
    --- Diff: integration/spark2/src/main/scala/org/apache/carbondata/spark/rdd/CarbonDataRDDFactory.scala ---
    @@ -799,18 +799,19 @@ object CarbonDataRDDFactory {
           throw new DataLoadingException("Partition column not found.")
         }
     
    -    val dateFormatMap = CarbonDataProcessorUtil.getDateFormatMap(carbonLoadModel.getDateFormat)
    -    val specificFormat = Option(dateFormatMap.get(partitionColumn.toLowerCase))
    -    val timeStampFormat = if (specificFormat.isDefined) {
    -      new SimpleDateFormat(specificFormat.get)
    -    } else {
    -      val timestampFormatString = CarbonProperties.getInstance().getProperty(CarbonCommonConstants
    -        .CARBON_TIMESTAMP_FORMAT, CarbonCommonConstants.CARBON_TIMESTAMP_DEFAULT_FORMAT)
    -      new SimpleDateFormat(timestampFormatString)
    -    }
    +    val specificTimestampFormat = carbonLoadModel.getTimestampformat
    +    val specificDateFormat = carbonLoadModel.getDateFormat
    +    val timeStampFormat =
    +      if (specificTimestampFormat != null && !specificTimestampFormat.trim.isEmpty) {
    +        new SimpleDateFormat(specificTimestampFormat)
    +      } else {
    +        val timestampFormatString = CarbonProperties.getInstance().getProperty(CarbonCommonConstants
    --- End diff --
    
    move `CarbonCommonConstants` to next line, give one parameter in one line


---

[GitHub] carbondata pull request #1524: [CARBONDATA-1762] Remove existing column leve...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1524#discussion_r151674289
  
    --- Diff: integration/spark2/src/main/scala/org/apache/carbondata/spark/rdd/CarbonDataRDDFactory.scala ---
    @@ -799,18 +799,19 @@ object CarbonDataRDDFactory {
           throw new DataLoadingException("Partition column not found.")
         }
     
    -    val dateFormatMap = CarbonDataProcessorUtil.getDateFormatMap(carbonLoadModel.getDateFormat)
    -    val specificFormat = Option(dateFormatMap.get(partitionColumn.toLowerCase))
    -    val timeStampFormat = if (specificFormat.isDefined) {
    -      new SimpleDateFormat(specificFormat.get)
    +    val specificTimestampFormat = carbonLoadModel.getTimestampformat
    +    val specificDateFormat = carbonLoadModel.getDateFormat
    +    val timeStampFormat = if (specificTimestampFormat != null &&
    +                              !specificTimestampFormat.trim.isEmpty) {
    --- End diff --
    
    format it properly.


---

[GitHub] carbondata pull request #1524: [CARBONDATA-1762] Remove existing column leve...

Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1524#discussion_r151831633
  
    --- Diff: integration/spark-common/src/main/scala/org/apache/carbondata/spark/load/ValidateUtil.scala ---
    @@ -17,35 +17,32 @@
     
     package org.apache.carbondata.spark.load
     
    -import scala.collection.JavaConverters._
    +import java.text.SimpleDateFormat
     
    -import org.apache.carbondata.core.constants.CarbonCommonConstants
     import org.apache.carbondata.core.metadata.schema.table.CarbonTable
    -import org.apache.carbondata.processing.loading.model.CarbonLoadModel
     import org.apache.carbondata.processing.loading.sort.SortScopeOptions
     import org.apache.carbondata.spark.exception.MalformedCarbonCommandException
     
     object ValidateUtil {
    -  def validateDateFormat(dateFormat: String, table: CarbonTable, tableName: String): Unit = {
    -    val dimensions = table.getDimensionByTableName(tableName).asScala
    +
    +  /**
    +   * validates both timestamp and date for illegal values
    --- End diff --
    
    validate timestamp format and date format. please change


---

[GitHub] carbondata issue #1524: [CARBONDATA-1762] Remove existing column level datef...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1524
  
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1231/



---

[GitHub] carbondata pull request #1524: [CARBONDATA-1762] Remove existing column leve...

Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1524#discussion_r151831351
  
    --- Diff: integration/spark-common/src/main/scala/org/apache/carbondata/spark/load/ValidateUtil.scala ---
    @@ -17,35 +17,32 @@
     
     package org.apache.carbondata.spark.load
     
    -import scala.collection.JavaConverters._
    +import java.text.SimpleDateFormat
     
    -import org.apache.carbondata.core.constants.CarbonCommonConstants
     import org.apache.carbondata.core.metadata.schema.table.CarbonTable
    -import org.apache.carbondata.processing.loading.model.CarbonLoadModel
     import org.apache.carbondata.processing.loading.sort.SortScopeOptions
     import org.apache.carbondata.spark.exception.MalformedCarbonCommandException
     
     object ValidateUtil {
    -  def validateDateFormat(dateFormat: String, table: CarbonTable, tableName: String): Unit = {
    -    val dimensions = table.getDimensionByTableName(tableName).asScala
    +
    +  /**
    +   * validates both timestamp and date for illegal values
    +   *
    +   * @param dateTimeLoadFormat
    +   * @param dateTimeLoadOption
    +   */
    +  def validateDateTimeFormat(dateTimeLoadFormat: String, dateTimeLoadOption: String): Unit = {
         // allowing empty value to be configured for dateformat option.
    -    if (dateFormat != null && dateFormat.trim != "") {
    -        val dateFormats: Array[String] = dateFormat.split(CarbonCommonConstants.COMMA)
    -        for (singleDateFormat <- dateFormats) {
    -          val dateFormatSplits: Array[String] = singleDateFormat.split(":", 2)
    -          val columnName = dateFormatSplits(0).trim.toLowerCase
    -          if (!dimensions.exists(_.getColName.equals(columnName))) {
    -            throw new MalformedCarbonCommandException("Error: Wrong Column Name " +
    -              dateFormatSplits(0) +
    -              " is provided in Option DateFormat.")
    -          }
    -          if (dateFormatSplits.length < 2 || dateFormatSplits(1).trim.isEmpty) {
    -            throw new MalformedCarbonCommandException("Error: Option DateFormat is not provided " +
    -              "for " + "Column " + dateFormatSplits(0) +
    -              ".")
    -          }
    -        }
    +    if (dateTimeLoadFormat != null && dateTimeLoadFormat.trim != "") {
    +      try {
    +        new SimpleDateFormat(dateTimeLoadFormat)
           }
    +      catch {
    --- End diff --
    
    move to previous line


---

[GitHub] carbondata pull request #1524: [CARBONDATA-1762] Remove existing column leve...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/carbondata/pull/1524


---

[GitHub] carbondata issue #1524: [CARBONDATA-1762] Remove existing column level datef...

Posted by dhatchayani <gi...@git.apache.org>.
Github user dhatchayani commented on the issue:

    https://github.com/apache/carbondata/pull/1524
  
    Retest this please


---

[GitHub] carbondata pull request #1524: [CARBONDATA-1762] Remove existing column leve...

Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1524#discussion_r151831713
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/constants/CarbonLoadOptionConstants.java ---
    @@ -52,6 +52,14 @@
       public static final String CARBON_OPTIONS_DATEFORMAT =
           "carbon.options.dateformat";
       public static final String CARBON_OPTIONS_DATEFORMAT_DEFAULT = "";
    +
    +  /**
    +   * option to specify the load option
    +   */
    +  @CarbonProperty
    +  public static final String CARBON_OPTIONS_TIMESTAMPFORMAT =
    --- End diff --
    
    please add comment for these two options to explain how it should be used: `CARBON_OPTIONS_DATEFORMAT` and `CARBON_OPTIONS_TIMESTAMPFORMAT`


---

[GitHub] carbondata issue #1524: [CARBONDATA-1762] Remove existing column level datef...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1524
  
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1252/



---

[GitHub] carbondata issue #1524: [CARBONDATA-1762] Remove existing column level datef...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1524
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1261/



---

[GitHub] carbondata issue #1524: [CARBONDATA-1762] Remove existing column level datef...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1524
  
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1236/



---

[GitHub] carbondata issue #1524: [CARBONDATA-1762] Remove existing column level datef...

Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on the issue:

    https://github.com/apache/carbondata/pull/1524
  
    LGTM


---

[GitHub] carbondata pull request #1524: [CARBONDATA-1762] Remove existing column leve...

Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1524#discussion_r151831356
  
    --- Diff: integration/spark-common/src/main/scala/org/apache/carbondata/spark/load/ValidateUtil.scala ---
    @@ -17,35 +17,32 @@
     
     package org.apache.carbondata.spark.load
     
    -import scala.collection.JavaConverters._
    +import java.text.SimpleDateFormat
     
    -import org.apache.carbondata.core.constants.CarbonCommonConstants
     import org.apache.carbondata.core.metadata.schema.table.CarbonTable
    -import org.apache.carbondata.processing.loading.model.CarbonLoadModel
     import org.apache.carbondata.processing.loading.sort.SortScopeOptions
     import org.apache.carbondata.spark.exception.MalformedCarbonCommandException
     
     object ValidateUtil {
    -  def validateDateFormat(dateFormat: String, table: CarbonTable, tableName: String): Unit = {
    -    val dimensions = table.getDimensionByTableName(tableName).asScala
    +
    +  /**
    +   * validates both timestamp and date for illegal values
    +   *
    +   * @param dateTimeLoadFormat
    +   * @param dateTimeLoadOption
    +   */
    +  def validateDateTimeFormat(dateTimeLoadFormat: String, dateTimeLoadOption: String): Unit = {
         // allowing empty value to be configured for dateformat option.
    -    if (dateFormat != null && dateFormat.trim != "") {
    -        val dateFormats: Array[String] = dateFormat.split(CarbonCommonConstants.COMMA)
    -        for (singleDateFormat <- dateFormats) {
    -          val dateFormatSplits: Array[String] = singleDateFormat.split(":", 2)
    -          val columnName = dateFormatSplits(0).trim.toLowerCase
    -          if (!dimensions.exists(_.getColName.equals(columnName))) {
    -            throw new MalformedCarbonCommandException("Error: Wrong Column Name " +
    -              dateFormatSplits(0) +
    -              " is provided in Option DateFormat.")
    -          }
    -          if (dateFormatSplits.length < 2 || dateFormatSplits(1).trim.isEmpty) {
    -            throw new MalformedCarbonCommandException("Error: Option DateFormat is not provided " +
    -              "for " + "Column " + dateFormatSplits(0) +
    -              ".")
    -          }
    -        }
    +    if (dateTimeLoadFormat != null && dateTimeLoadFormat.trim != "") {
    +      try {
    +        new SimpleDateFormat(dateTimeLoadFormat)
           }
    +      catch {
    +        case _: IllegalArgumentException =>
    +          throw new MalformedCarbonCommandException("Error: Wrong option: " + dateTimeLoadFormat +
    --- End diff --
    
    use $ for variable


---