Posted to issues@carbondata.apache.org by kunal642 <gi...@git.apache.org> on 2018/05/28 07:56:47 UTC

[GitHub] carbondata pull request #2347: [WIP] Added support for logical type

GitHub user kunal642 opened a pull request:

    https://github.com/apache/carbondata/pull/2347

    [WIP] Added support for logical type

    Be sure to do all of the following checklist to help us incorporate 
    your contribution quickly and easily:
    
     - [ ] Any interfaces changed?
     
     - [ ] Any backward compatibility impacted?
     
     - [ ] Document update required?
    
     - [ ] Testing done
            Please provide details on 
            - Whether new unit test cases have been added or why no new tests are required?
            - How it is tested? Please attach test report.
            - Is it a performance related change? Please attach the performance test report.
            - Any additional information to help reviewers in testing this change.
           
     - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. 
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/kunal642/carbondata avro_logical_type_support

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/2347.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2347
    
----
commit 135719aa1e3f69497d9b12a50b27d2cd3787ffce
Author: kunal642 <ku...@...>
Date:   2018-05-28T06:11:59Z

    added support for logical type

----


---

[GitHub] carbondata issue #2347: [WIP] Added support for logical type

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2347
  
    Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4959/



---

[GitHub] carbondata issue #2347: [WIP] Added support for logical type

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2347
  
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6121/



---

[GitHub] carbondata issue #2347: [CARBONDATA-2554] Added support for logical type

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2347
  
    Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4988/



---

[GitHub] carbondata issue #2347: [CARBONDATA-2554] Added support for logical type

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2347
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6176/



---

[GitHub] carbondata issue #2347: [CARBONDATA-2554] Added support for logical type

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2347
  
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4999/



---

[GitHub] carbondata issue #2347: [CARBONDATA-2554] Added support for logical type

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2347
  
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6161/



---

[GitHub] carbondata pull request #2347: [CARBONDATA-2554] Added support for logical t...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/carbondata/pull/2347


---

[GitHub] carbondata issue #2347: [CARBONDATA-2554] Added support for logical type

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2347
  
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5144/



---

[GitHub] carbondata issue #2347: [CARBONDATA-2554] Added support for logical type

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2347
  
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5143/



---

[GitHub] carbondata issue #2347: [CARBONDATA-2554] Added support for logical type

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2347
  
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5140/



---

[GitHub] carbondata pull request #2347: [CARBONDATA-2554] Added support for logical t...

Posted by sounakr <gi...@git.apache.org>.
Github user sounakr commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2347#discussion_r191396937
  
    --- Diff: processing/src/main/java/org/apache/carbondata/processing/datatypes/PrimitiveDataType.java ---
    @@ -316,15 +321,32 @@ public int getSurrogateIndex() {
               if (!this.carbonDimension.getUseActualData()) {
                 byte[] value = null;
                 if (isDirectDictionary) {
    -              int surrogateKey = dictionaryGenerator.getOrGenerateKey(parsedValue);
    +              int surrogateKey;
    +              if (dictionaryGenerator instanceof DirectDictionary
    --- End diff --
    
    Rectify the indentation.


---

[GitHub] carbondata issue #2347: [CARBONDATA-2554] Added support for logical type

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2347
  
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6154/



---

[GitHub] carbondata pull request #2347: [CARBONDATA-2554] Added support for logical t...

Posted by sounakr <gi...@git.apache.org>.
Github user sounakr commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2347#discussion_r191396859
  
    --- Diff: processing/src/main/java/org/apache/carbondata/processing/datatypes/PrimitiveDataType.java ---
    @@ -288,7 +288,12 @@ public int getSurrogateIndex() {
               logHolder.setReason(message);
             }
           } else {
    -        surrogateKey = dictionaryGenerator.getOrGenerateKey(parsedValue);
    +        if (dictionaryGenerator instanceof DirectDictionary
    +            && input instanceof Long) {
    --- End diff --
    
    Rectify the indentation.


---

[GitHub] carbondata issue #2347: [CARBONDATA-2554] Added support for logical type

Posted by kumarvishal09 <gi...@git.apache.org>.
Github user kumarvishal09 commented on the issue:

    https://github.com/apache/carbondata/pull/2347
  
    LGTM


---

[GitHub] carbondata issue #2347: [WIP] Added support for logical type

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2347
  
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5117/



---

[GitHub] carbondata issue #2347: [CARBONDATA-2554] Added support for logical type

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2347
  
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5154/



---

[GitHub] carbondata issue #2347: [CARBONDATA-2554] Added support for logical type

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2347
  
    Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4992/



---

[GitHub] carbondata issue #2347: [WIP] Added support for logical type

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2347
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6141/



---

[GitHub] carbondata issue #2347: [CARBONDATA-2554] Added support for logical type

Posted by kunal642 <gi...@git.apache.org>.
Github user kunal642 commented on the issue:

    https://github.com/apache/carbondata/pull/2347
  
    retest this please


---

[GitHub] carbondata issue #2347: [CARBONDATA-2554] Added support for logical type

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2347
  
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5133/



---

[GitHub] carbondata issue #2347: [CARBONDATA-2554] Added support for logical type

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2347
  
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5151/



---

[GitHub] carbondata pull request #2347: [CARBONDATA-2554] Added support for logical t...

Posted by sounakr <gi...@git.apache.org>.
Github user sounakr commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2347#discussion_r191386255
  
    --- Diff: processing/src/main/java/org/apache/carbondata/processing/datatypes/PrimitiveDataType.java ---
    @@ -288,7 +288,12 @@ public int getSurrogateIndex() {
               logHolder.setReason(message);
             }
           } else {
    -        surrogateKey = dictionaryGenerator.getOrGenerateKey(parsedValue);
    +        if (dictionaryGenerator instanceof DirectDictionary
    --- End diff --
    
      @Override public void writeByteArray(Object input, DataOutputStream dataOutputStream,
          BadRecordLogHolder logHolder) throws IOException, DictionaryGenerationException {
        String parsedValue =
            input == null ? null : DataTypeUtil.parseValue(input.toString(), carbonDimension);
    
    If the input is a Long, does it still need to be converted with toString and parsed?


---

[GitHub] carbondata issue #2347: [CARBONDATA-2554] Added support for logical type

Posted by sounakr <gi...@git.apache.org>.
Github user sounakr commented on the issue:

    https://github.com/apache/carbondata/pull/2347
  
    LGTM


---

[GitHub] carbondata issue #2347: [WIP] Added support for logical type

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2347
  
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5112/



---

[GitHub] carbondata issue #2347: [CARBONDATA-2554] Added support for logical type

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2347
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6179/



---

[GitHub] carbondata pull request #2347: [CARBONDATA-2554] Added support for logical t...

Posted by sounakr <gi...@git.apache.org>.
Github user sounakr commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2347#discussion_r191398287
  
    --- Diff: processing/src/main/java/org/apache/carbondata/processing/datatypes/PrimitiveDataType.java ---
    @@ -316,15 +321,32 @@ public int getSurrogateIndex() {
               if (!this.carbonDimension.getUseActualData()) {
                 byte[] value = null;
                 if (isDirectDictionary) {
    -              int surrogateKey = dictionaryGenerator.getOrGenerateKey(parsedValue);
    +              int surrogateKey;
    +              if (dictionaryGenerator instanceof DirectDictionary
    +                  && input instanceof Long) {
    +                surrogateKey = ((DirectDictionary) dictionaryGenerator).generateKey((long) input);
    +              } else {
    +                surrogateKey = dictionaryGenerator.getOrGenerateKey(parsedValue);
    +              }
                   if (surrogateKey == CarbonCommonConstants.INVALID_SURROGATE_KEY) {
                     value = new byte[0];
                   } else {
                     value = ByteUtil.toBytes(surrogateKey);
                   }
                 } else {
    -              value = DataTypeUtil.getBytesBasedOnDataTypeForNoDictionaryColumn(parsedValue,
    -                  this.carbonDimension.getDataType(), dateFormat);
    +              if (this.carbonDimension.getDataType().equals(DataTypes.DATE)
    +                  || this.carbonDimension.getDataType().equals(DataTypes.TIMESTAMP)
    +                  && input instanceof Long) {
    --- End diff --
    
    Add a comment explaining in which case the input will be a Long.
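
A side note on the shape of the condition in the quoted diff: in Java, `&&` binds more tightly than `||`, so `isDate || isTimestamp && isLong` is evaluated as `isDate || (isTimestamp && isLong)`. The thread does not say which grouping was intended. The minimal sketch below only illustrates the difference; the class and method names are hypothetical and the booleans stand in for the `getDataType().equals(...)` and `instanceof Long` checks.

```java
public class PrecedenceSketch {

  // Same shape as the quoted condition: && is applied before ||,
  // so the Long check only guards the timestamp branch.
  static boolean asWritten(boolean isDate, boolean isTimestamp, boolean isLong) {
    return isDate || isTimestamp && isLong;
  }

  // Explicit grouping: the Long check guards both the date and timestamp branches.
  static boolean grouped(boolean isDate, boolean isTimestamp, boolean isLong) {
    return (isDate || isTimestamp) && isLong;
  }
}
```

For a DATE column whose input is not a Long, the two forms disagree: the first returns true and the second returns false.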


---

[GitHub] carbondata pull request #2347: [CARBONDATA-2554] Added support for logical t...

Posted by sounakr <gi...@git.apache.org>.
Github user sounakr commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2347#discussion_r191399826
  
    --- Diff: processing/src/main/java/org/apache/carbondata/processing/loading/steps/InputProcessorStepWithNoConverterImpl.java ---
    @@ -313,7 +315,22 @@ private CarbonRowBatch getBatch() {
                   throw new CarbonDataLoadingException("Loading Exception", e);
                 }
               } else {
    -            newData[i] = data[orderOfData[i]];
    +            DataType dataType = dataFields[i].getColumn().getDataType();
    +            if (dataType == DataTypes.DATE && data[orderOfData[i]] instanceof Long) {
    +              DirectDictionaryGenerator directDictionaryGenerator =
    --- End diff --
    
    Why is a new directDictionaryGenerator object needed every time? It could be a member variable of InputProcessorStepWithNoConverterImpl that is initialized only once.
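
The suggestion above can be sketched roughly as follows. This is not the CarbonData code: `DateKeyGenerator` is a hypothetical stand-in for a direct dictionary generator, its days-since-epoch key is an assumption made only for illustration, and the instantiation counter exists purely to make the reuse observable.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class GeneratorReuseSketch {

  /** Hypothetical stand-in for a direct dictionary generator; counts how often it is created. */
  static final class DateKeyGenerator {
    static final AtomicInteger CREATED = new AtomicInteger();

    DateKeyGenerator() {
      CREATED.incrementAndGet();
    }

    // Assumed key scheme for this sketch: whole days since the epoch.
    int generateKey(long epochMillis) {
      return (int) Math.floorDiv(epochMillis, 86_400_000L);
    }
  }

  // Member variable: created once per step instance and reused for every row.
  private final DateKeyGenerator dateGenerator = new DateKeyGenerator();

  int[] convertBatch(long[] epochMillis) {
    int[] keys = new int[epochMillis.length];
    for (int i = 0; i < epochMillis.length; i++) {
      // No per-row allocation; the same generator handles each value.
      keys[i] = dateGenerator.generateKey(epochMillis[i]);
    }
    return keys;
  }
}
```

Holding the generator as a field keeps the per-row path free of object creation, which is the point of the review comment.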


---

[GitHub] carbondata issue #2347: [WIP] Added support for logical type

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2347
  
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6129/



---

[GitHub] carbondata issue #2347: [CARBONDATA-2554] Added support for logical type

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2347
  
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6167/



---

[GitHub] carbondata issue #2347: [CARBONDATA-2554] Added support for logical type

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2347
  
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5015/



---

[GitHub] carbondata issue #2347: [WIP] Added support for logical type

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2347
  
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4978/



---

[GitHub] carbondata pull request #2347: [CARBONDATA-2554] Added support for logical t...

Posted by sounakr <gi...@git.apache.org>.
Github user sounakr commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2347#discussion_r191404125
  
    --- Diff: store/sdk/src/main/java/org/apache/carbondata/sdk/file/AvroCarbonWriter.java ---
    @@ -221,13 +255,22 @@ private static Field prepareFields(Schema.Field avroField) {
     
       private static StructField prepareSubFields(String FieldName, Schema childSchema) {
         Schema.Type type = childSchema.getType();
    +    LogicalType logicalType = childSchema.getLogicalType();
         switch (type) {
           case BOOLEAN:
             return new StructField(FieldName, DataTypes.BOOLEAN);
           case INT:
    -        return new StructField(FieldName, DataTypes.INT);
    +        if (logicalType == null) {
    +          return new StructField(FieldName, DataTypes.INT);
    --- End diff --
    
    Keep these checks in sync with the logical type checks in avroFieldToObject.


---

[GitHub] carbondata issue #2347: [WIP] Added support for logical type

Posted by kunal642 <gi...@git.apache.org>.
Github user kunal642 commented on the issue:

    https://github.com/apache/carbondata/pull/2347
  
    retest this please


---

[GitHub] carbondata issue #2347: [CARBONDATA-2554] Added support for logical type

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2347
  
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6169/



---

[GitHub] carbondata issue #2347: [CARBONDATA-2554] Added support for logical type

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2347
  
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5018/



---

[GitHub] carbondata issue #2347: [CARBONDATA-2554] Added support for logical type

Posted by sounakr <gi...@git.apache.org>.
Github user sounakr commented on the issue:

    https://github.com/apache/carbondata/pull/2347
  
    @kunal642 Please check whether all of these logical types can be supported in the current PR:
       1. time-millis
       2. time-micros
       3. duration


---

[GitHub] carbondata pull request #2347: [CARBONDATA-2554] Added support for logical t...

Posted by KanakaKumar <gi...@git.apache.org>.
Github user KanakaKumar commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2347#discussion_r191310300
  
    --- Diff: store/sdk/src/main/java/org/apache/carbondata/sdk/file/AvroCarbonWriter.java ---
    @@ -177,13 +188,22 @@ private static Field prepareFields(Schema.Field avroField) {
         String FieldName = avroField.name();
         Schema childSchema = avroField.schema();
         Schema.Type type = childSchema.getType();
    +    LogicalType logicalType = childSchema.getLogicalType();
         switch (type) {
           case BOOLEAN:
             return new Field(FieldName, DataTypes.BOOLEAN);
           case INT:
    -        return new Field(FieldName, DataTypes.INT);
    +        if (logicalType == null) {
    --- End diff --
    
    Need to check whether the logical type is Date or something else.


---

[GitHub] carbondata issue #2347: [CARBONDATA-2554] Added support for logical type

Posted by kunal642 <gi...@git.apache.org>.
Github user kunal642 commented on the issue:

    https://github.com/apache/carbondata/pull/2347
  
    @kumarvishal09 Please review


---

[GitHub] carbondata issue #2347: [WIP] Added support for logical type

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2347
  
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4967/



---

[GitHub] carbondata issue #2347: [CARBONDATA-2554] Added support for logical type

Posted by kunal642 <gi...@git.apache.org>.
Github user kunal642 commented on the issue:

    https://github.com/apache/carbondata/pull/2347
  
    @ravipesala Please review


---

[GitHub] carbondata issue #2347: [CARBONDATA-2554] Added support for logical type

Posted by kunal642 <gi...@git.apache.org>.
Github user kunal642 commented on the issue:

    https://github.com/apache/carbondata/pull/2347
  
    retest this please


---

[GitHub] carbondata issue #2347: [CARBONDATA-2554] Added support for logical type

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2347
  
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6150/



---

[GitHub] carbondata pull request #2347: [CARBONDATA-2554] Added support for logical t...

Posted by sounakr <gi...@git.apache.org>.
Github user sounakr commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2347#discussion_r191403062
  
    --- Diff: store/sdk/src/main/java/org/apache/carbondata/sdk/file/AvroCarbonWriter.java ---
    @@ -177,13 +198,26 @@ private static Field prepareFields(Schema.Field avroField) {
         String FieldName = avroField.name();
         Schema childSchema = avroField.schema();
         Schema.Type type = childSchema.getType();
    +    LogicalType logicalType = childSchema.getLogicalType();
         switch (type) {
           case BOOLEAN:
             return new Field(FieldName, DataTypes.BOOLEAN);
           case INT:
    -        return new Field(FieldName, DataTypes.INT);
    +        if (logicalType instanceof LogicalTypes.Date) {
    +          return new Field(FieldName, DataTypes.DATE);
    +        } else {
    +          LOGGER.warn("Unsupported logical type. Considering Data Type as INT for " + childSchema
    +              .getName());
    +          return new Field(FieldName, DataTypes.INT);
    +        }
           case LONG:
    -        return new Field(FieldName, DataTypes.LONG);
    +        if (logicalType instanceof LogicalTypes.TimestampMillis) {
    --- End diff --
    
    Don't we also have to check the TimestampMicros and Timestamp logical types?
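
A minimal sketch of the kind of mapping being discussed, assuming the fallback behaviour visible in the diff (unsupported logical types keep the underlying primitive type). To stay self-contained it uses plain strings in place of Avro's `Schema.Type` and `LogicalType` classes; `CarbonType` and `map` are hypothetical names, and mapping both `timestamp-millis` and `timestamp-micros` to TIMESTAMP is an assumption, not something the thread confirms.

```java
public class LogicalTypeMappingSketch {

  /** Hypothetical stand-in for the Carbon data types touched by the diff. */
  enum CarbonType { INT, LONG, DATE, TIMESTAMP }

  // avroType is the primitive type name; logicalTypeName is the Avro logical type
  // name from the schema (for example "date"), or null when none is declared.
  static CarbonType map(String avroType, String logicalTypeName) {
    switch (avroType) {
      case "int":
        // Only the "date" logical type changes the result; anything else stays INT.
        return "date".equals(logicalTypeName) ? CarbonType.DATE : CarbonType.INT;
      case "long":
        // Assumption: both timestamp precisions map to TIMESTAMP.
        if ("timestamp-millis".equals(logicalTypeName)
            || "timestamp-micros".equals(logicalTypeName)) {
          return CarbonType.TIMESTAMP;
        }
        return CarbonType.LONG;
      default:
        throw new IllegalArgumentException("Type not covered by this sketch: " + avroType);
    }
  }
}
```

Note that `timestamp-micros` counts microseconds rather than milliseconds, so a real implementation would probably also need to account for the unit; the thread does not say how.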


---

[GitHub] carbondata pull request #2347: [CARBONDATA-2554] Added support for logical t...

Posted by KanakaKumar <gi...@git.apache.org>.
Github user KanakaKumar commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2347#discussion_r191310013
  
    --- Diff: store/sdk/src/main/java/org/apache/carbondata/sdk/file/AvroCarbonWriter.java ---
    @@ -88,9 +90,18 @@
       private Object avroFieldToObject(Schema.Field avroField, Object fieldValue) {
         Object out;
         Schema.Type type = avroField.schema().getType();
    +    LogicalType logicalType = avroField.schema().getLogicalType();
         switch (type) {
    -      case BOOLEAN:
           case INT:
    +        if (logicalType != null) {
    --- End diff --
    
    Consider converting to a date only if the logical type is Date.
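
The difference between the `logicalType != null` check in the diff and the check suggested above can be sketched as follows. This is an illustration under assumptions, not the PR's code: the logical type is represented by its Avro name as a string, `convertInt` is a hypothetical helper, and returning a `java.time.LocalDate` is chosen only to make the conversion visible (an Avro `date` value is a count of days since the Unix epoch).

```java
import java.time.LocalDate;

public class DateOnlyConversionSketch {

  // Converts only when the logical type is exactly "date"; any other logical type
  // (or none at all) leaves the int value untouched.
  static Object convertInt(int value, String logicalTypeName) {
    if ("date".equals(logicalTypeName)) {
      return LocalDate.ofEpochDay(value);
    }
    return value;
  }
}
```

With a bare non-null check, an int carrying `time-millis` would also be treated as a date; the explicit comparison avoids that.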


---