You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@carbondata.apache.org by ajantha-bhat <gi...@git.apache.org> on 2018/05/11 09:04:28 UTC

[GitHub] carbondata pull request #2299: [CARBONDATA-2472] Fixed:NonTransactional tabl...

GitHub user ajantha-bhat opened a pull request:

    https://github.com/apache/carbondata/pull/2299

    [CARBONDATA-2472] Fixed:NonTransactional table performance degarde issue induced by PR #2273

    
    
    
    Be sure to do all of the following checklist to help us incorporate 
    your contribution quickly and easily:
    
     - [ ] Any interfaces changed?
     
     - [ ] Any backward compatibility impacted?
     
     - [ ] Document update required?
    
     - [ ] Testing done
            Please provide details on 
            - Whether new unit test cases have been added or why no new tests are required?
            - How it is tested? Please attach test report.
            - Is it a performance related change? Please attach the performance test report.
            - Any additional information to help reviewers in testing this change.
           
     - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. 
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ajantha-bhat/carbondata unmanaged_table

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/2299.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2299
    
----
commit 0846033a4ea91bf6e97a8a32dcbf853c2d98fdb0
Author: ajantha-bhat <aj...@...>
Date:   2018-05-11T08:58:46Z

    [CARBONDATA-2472] Fixed:NonTransactional table performance degarde issue induced by PR #2273

----


---

[GitHub] carbondata pull request #2299: [CARBONDATA-2472] Fixed:Refactor NonTransacti...

Posted by ajantha-bhat <gi...@git.apache.org>.
Github user ajantha-bhat commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2299#discussion_r188627098
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/util/DataTypeUtil.java ---
    @@ -401,6 +401,11 @@ public static Object getDataDataTypeForNoDictionaryColumn(String dimensionValue,
         } else if (actualDataType == DataTypes.LONG) {
           return ByteUtil.toBytes((Long) dimensionValue);
         } else if (actualDataType == DataTypes.TIMESTAMP) {
    +      if (dimensionValue instanceof String) {
    --- End diff --
    
    This fix is removed from this PR as it is not applicable


---

[GitHub] carbondata issue #2299: [CARBONDATA-2472] Fixed:Refactor NonTransactional ta...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2299
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5894/



---

[GitHub] carbondata pull request #2299: [CARBONDATA-2472] Fixed:Refactor NonTransacti...

Posted by ajantha-bhat <gi...@git.apache.org>.
Github user ajantha-bhat commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2299#discussion_r188626030
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletDataMapFactory.java ---
    @@ -115,13 +123,55 @@ public DataMapBuilder createBuilder(Segment segment, String shardName) {
         Set<TableBlockIndexUniqueIdentifier> tableBlockIndexUniqueIdentifiers =
             segmentMap.get(segment.getSegmentNo());
         if (tableBlockIndexUniqueIdentifiers == null) {
    +      CarbonTable carbonTable = this.getCarbonTable();
    +      if (!carbonTable.getTableInfo().isTransactionalTable()) {
    +        // For NonTransactional table, compare the schema of all index files with inferred schema.
    +        // If there is a mismatch throw exception. As all files must be of same schema.
    +        validateSchemaForNewTranscationalTableFiles(segment, carbonTable);
    +      }
           tableBlockIndexUniqueIdentifiers =
               BlockletDataMapUtil.getTableBlockUniqueIdentifiers(segment);
           segmentMap.put(segment.getSegmentNo(), tableBlockIndexUniqueIdentifiers);
         }
         return tableBlockIndexUniqueIdentifiers;
       }
     
    +  private void validateSchemaForNewTranscationalTableFiles(Segment segment, CarbonTable carbonTable)
    +      throws IOException {
    +    SchemaConverter schemaConverter = new ThriftWrapperSchemaConverterImpl();
    +    Map<String, String> indexFiles = segment.getCommittedIndexFile();
    +    for (Map.Entry<String, String> indexFileEntry : indexFiles.entrySet()) {
    +      Path indexFile = new Path(indexFileEntry.getKey());
    +      org.apache.carbondata.format.TableInfo tableInfo = CarbonUtil.inferSchemaFromIndexFile(
    +          indexFile.toString(), carbonTable.getTableName());
    +      TableInfo wrapperTableInfo = schemaConverter.fromExternalToWrapperTableInfo(
    +          tableInfo, identifier.getDatabaseName(),
    +          identifier.getTableName(),
    +          identifier.getTablePath());
    +      List<ColumnSchema> indexFileColumnList =
    +          wrapperTableInfo.getFactTable().getListOfColumns();
    +      List<ColumnSchema> tableColumnList =
    +          carbonTable.getTableInfo().getFactTable().getListOfColumns();
    +      if (!compareColumnSchemaList(indexFileColumnList, tableColumnList)) {
    +        LOG.error("Schema of " + indexFile.getName()
    +            + " doesn't match with the table's schema");
    +        throw new IOException("All the files doesn't have same schema. "
    +            + "Unsupported operation on nonTransactional table. Check logs.");
    +      }
    +    }
    +  }
    +
    +  private boolean compareColumnSchemaList(List<ColumnSchema> indexFileColumnList,
    --- End diff --
    
    @xuchuanyin : Changed  the method name as suggested


---

[GitHub] carbondata issue #2299: [CARBONDATA-2472] Fixed:NonTransactional table perfo...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2299
  
    Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4679/



---

[GitHub] carbondata issue #2299: [CARBONDATA-2472] Fixed:NonTransactional table perfo...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2299
  
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4682/



---

[GitHub] carbondata issue #2299: [CARBONDATA-2472] Fixed:NonTransactional table perfo...

Posted by ajantha-bhat <gi...@git.apache.org>.
Github user ajantha-bhat commented on the issue:

    https://github.com/apache/carbondata/pull/2299
  
    retest this please 


---

[GitHub] carbondata issue #2299: [CARBONDATA-2472] Fixed:Refactor NonTransactional ta...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2299
  
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4753/



---

[GitHub] carbondata issue #2299: [CARBONDATA-2472] Fixed:NonTransactional table perfo...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2299
  
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4888/



---

[GitHub] carbondata issue #2299: [CARBONDATA-2472] Fixed:Refactor NonTransactional ta...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2299
  
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4740/



---

[GitHub] carbondata issue #2299: [CARBONDATA-2472] Fixed:Refactor NonTransactional ta...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2299
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5908/



---

[GitHub] carbondata issue #2299: [CARBONDATA-2472] Fixed:Refactor NonTransactional ta...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2299
  
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4941/



---

[GitHub] carbondata pull request #2299: [CARBONDATA-2472] Fixed:NonTransactional tabl...

Posted by gvramana <gi...@git.apache.org>.
Github user gvramana commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2299#discussion_r187843922
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/util/DataTypeUtil.java ---
    @@ -401,6 +401,11 @@ public static Object getDataDataTypeForNoDictionaryColumn(String dimensionValue,
         } else if (actualDataType == DataTypes.LONG) {
           return ByteUtil.toBytes((Long) dimensionValue);
         } else if (actualDataType == DataTypes.TIMESTAMP) {
    +      if (dimensionValue instanceof String) {
    --- End diff --
    
    Avro supports timestamp as logical type. So need not support this, as avro supports specific data type


---

[GitHub] carbondata issue #2299: [CARBONDATA-2472] Fixed:NonTransactional table perfo...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2299
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5838/



---

[GitHub] carbondata issue #2299: [CARBONDATA-2472] Fixed:Refactor NonTransactional ta...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2299
  
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4929/



---

[GitHub] carbondata pull request #2299: [CARBONDATA-2472] Fixed:NonTransactional tabl...

Posted by xuchuanyin <gi...@git.apache.org>.
Github user xuchuanyin commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2299#discussion_r187768312
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletDataMapFactory.java ---
    @@ -115,13 +123,55 @@ public DataMapBuilder createBuilder(Segment segment, String shardName) {
         Set<TableBlockIndexUniqueIdentifier> tableBlockIndexUniqueIdentifiers =
             segmentMap.get(segment.getSegmentNo());
         if (tableBlockIndexUniqueIdentifiers == null) {
    +      CarbonTable carbonTable = this.getCarbonTable();
    +      if (!carbonTable.getTableInfo().isTransactionalTable()) {
    +        // For NonTransactional table, compare the schema of all index files with inferred schema.
    +        // If there is a mismatch throw exception. As all files must be of same schema.
    +        validateSchemaForNewTranscationalTableFiles(segment, carbonTable);
    +      }
           tableBlockIndexUniqueIdentifiers =
               BlockletDataMapUtil.getTableBlockUniqueIdentifiers(segment);
           segmentMap.put(segment.getSegmentNo(), tableBlockIndexUniqueIdentifiers);
         }
         return tableBlockIndexUniqueIdentifiers;
       }
     
    +  private void validateSchemaForNewTranscationalTableFiles(Segment segment, CarbonTable carbonTable)
    +      throws IOException {
    +    SchemaConverter schemaConverter = new ThriftWrapperSchemaConverterImpl();
    +    Map<String, String> indexFiles = segment.getCommittedIndexFile();
    +    for (Map.Entry<String, String> indexFileEntry : indexFiles.entrySet()) {
    +      Path indexFile = new Path(indexFileEntry.getKey());
    +      org.apache.carbondata.format.TableInfo tableInfo = CarbonUtil.inferSchemaFromIndexFile(
    +          indexFile.toString(), carbonTable.getTableName());
    +      TableInfo wrapperTableInfo = schemaConverter.fromExternalToWrapperTableInfo(
    +          tableInfo, identifier.getDatabaseName(),
    +          identifier.getTableName(),
    +          identifier.getTablePath());
    +      List<ColumnSchema> indexFileColumnList =
    +          wrapperTableInfo.getFactTable().getListOfColumns();
    +      List<ColumnSchema> tableColumnList =
    +          carbonTable.getTableInfo().getFactTable().getListOfColumns();
    +      if (!compareColumnSchemaList(indexFileColumnList, tableColumnList)) {
    +        LOG.error("Schema of " + indexFile.getName()
    +            + " doesn't match with the table's schema");
    +        throw new IOException("All the files doesn't have same schema. "
    +            + "Unsupported operation on nonTransactional table. Check logs.");
    +      }
    +    }
    +  }
    +
    +  private boolean compareColumnSchemaList(List<ColumnSchema> indexFileColumnList,
    +      List<ColumnSchema> tableColumnList) {
    +    if (indexFileColumnList.size() != tableColumnList.size()) {
    +      return false;
    --- End diff --
    
    Can you add a log here and line170 to tell the reason.


---

[GitHub] carbondata pull request #2299: [CARBONDATA-2472] Fixed:NonTransactional tabl...

Posted by gvramana <gi...@git.apache.org>.
Github user gvramana commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2299#discussion_r187842996
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletDataMapFactory.java ---
    @@ -115,13 +123,55 @@ public DataMapBuilder createBuilder(Segment segment, String shardName) {
         Set<TableBlockIndexUniqueIdentifier> tableBlockIndexUniqueIdentifiers =
             segmentMap.get(segment.getSegmentNo());
         if (tableBlockIndexUniqueIdentifiers == null) {
    +      CarbonTable carbonTable = this.getCarbonTable();
    +      if (!carbonTable.getTableInfo().isTransactionalTable()) {
    +        // For NonTransactional table, compare the schema of all index files with inferred schema.
    +        // If there is a mismatch throw exception. As all files must be of same schema.
    +        validateSchemaForNewTranscationalTableFiles(segment, carbonTable);
    +      }
           tableBlockIndexUniqueIdentifiers =
               BlockletDataMapUtil.getTableBlockUniqueIdentifiers(segment);
           segmentMap.put(segment.getSegmentNo(), tableBlockIndexUniqueIdentifiers);
         }
         return tableBlockIndexUniqueIdentifiers;
       }
     
    +  private void validateSchemaForNewTranscationalTableFiles(Segment segment, CarbonTable carbonTable)
    +      throws IOException {
    +    SchemaConverter schemaConverter = new ThriftWrapperSchemaConverterImpl();
    +    Map<String, String> indexFiles = segment.getCommittedIndexFile();
    +    for (Map.Entry<String, String> indexFileEntry : indexFiles.entrySet()) {
    +      Path indexFile = new Path(indexFileEntry.getKey());
    +      org.apache.carbondata.format.TableInfo tableInfo = CarbonUtil.inferSchemaFromIndexFile(
    +          indexFile.toString(), carbonTable.getTableName());
    +      TableInfo wrapperTableInfo = schemaConverter.fromExternalToWrapperTableInfo(
    +          tableInfo, identifier.getDatabaseName(),
    +          identifier.getTableName(),
    +          identifier.getTablePath());
    +      List<ColumnSchema> indexFileColumnList =
    +          wrapperTableInfo.getFactTable().getListOfColumns();
    +      List<ColumnSchema> tableColumnList =
    +          carbonTable.getTableInfo().getFactTable().getListOfColumns();
    +      if (!compareColumnSchemaList(indexFileColumnList, tableColumnList)) {
    +        LOG.error("Schema of " + indexFile.getName()
    +            + " doesn't match with the table's schema");
    +        throw new IOException("All the files doesn't have same schema. "
    +            + "Unsupported operation on nonTransactional table. Check logs.");
    +      }
    +    }
    +  }
    +
    +  private boolean compareColumnSchemaList(List<ColumnSchema> indexFileColumnList,
    --- End diff --
    
    1) Add validation logic the place where read is happening. otherwise files will be read 2 two times.
    2) All the blocklet index loading distribution logics also will work fine if handled while reading.


---

[GitHub] carbondata pull request #2299: [CARBONDATA-2472] Fixed:NonTransactional tabl...

Posted by xuchuanyin <gi...@git.apache.org>.
Github user xuchuanyin commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2299#discussion_r187768276
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletDataMapFactory.java ---
    @@ -115,13 +123,55 @@ public DataMapBuilder createBuilder(Segment segment, String shardName) {
         Set<TableBlockIndexUniqueIdentifier> tableBlockIndexUniqueIdentifiers =
             segmentMap.get(segment.getSegmentNo());
         if (tableBlockIndexUniqueIdentifiers == null) {
    +      CarbonTable carbonTable = this.getCarbonTable();
    +      if (!carbonTable.getTableInfo().isTransactionalTable()) {
    +        // For NonTransactional table, compare the schema of all index files with inferred schema.
    +        // If there is a mismatch throw exception. As all files must be of same schema.
    +        validateSchemaForNewTranscationalTableFiles(segment, carbonTable);
    +      }
           tableBlockIndexUniqueIdentifiers =
               BlockletDataMapUtil.getTableBlockUniqueIdentifiers(segment);
           segmentMap.put(segment.getSegmentNo(), tableBlockIndexUniqueIdentifiers);
         }
         return tableBlockIndexUniqueIdentifiers;
       }
     
    +  private void validateSchemaForNewTranscationalTableFiles(Segment segment, CarbonTable carbonTable)
    +      throws IOException {
    +    SchemaConverter schemaConverter = new ThriftWrapperSchemaConverterImpl();
    +    Map<String, String> indexFiles = segment.getCommittedIndexFile();
    +    for (Map.Entry<String, String> indexFileEntry : indexFiles.entrySet()) {
    +      Path indexFile = new Path(indexFileEntry.getKey());
    --- End diff --
    
    It seems it's unnecessary to new a path object.


---

[GitHub] carbondata pull request #2299: [CARBONDATA-2472] Fixed:Refactor NonTransacti...

Posted by ajantha-bhat <gi...@git.apache.org>.
Github user ajantha-bhat commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2299#discussion_r188624043
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletDataMapFactory.java ---
    @@ -115,13 +123,55 @@ public DataMapBuilder createBuilder(Segment segment, String shardName) {
         Set<TableBlockIndexUniqueIdentifier> tableBlockIndexUniqueIdentifiers =
             segmentMap.get(segment.getSegmentNo());
         if (tableBlockIndexUniqueIdentifiers == null) {
    +      CarbonTable carbonTable = this.getCarbonTable();
    +      if (!carbonTable.getTableInfo().isTransactionalTable()) {
    +        // For NonTransactional table, compare the schema of all index files with inferred schema.
    +        // If there is a mismatch throw exception. As all files must be of same schema.
    +        validateSchemaForNewTranscationalTableFiles(segment, carbonTable);
    +      }
           tableBlockIndexUniqueIdentifiers =
               BlockletDataMapUtil.getTableBlockUniqueIdentifiers(segment);
           segmentMap.put(segment.getSegmentNo(), tableBlockIndexUniqueIdentifiers);
         }
         return tableBlockIndexUniqueIdentifiers;
       }
     
    +  private void validateSchemaForNewTranscationalTableFiles(Segment segment, CarbonTable carbonTable)
    +      throws IOException {
    +    SchemaConverter schemaConverter = new ThriftWrapperSchemaConverterImpl();
    +    Map<String, String> indexFiles = segment.getCommittedIndexFile();
    +    for (Map.Entry<String, String> indexFileEntry : indexFiles.entrySet()) {
    +      Path indexFile = new Path(indexFileEntry.getKey());
    --- End diff --
    
    In below logs, we need to print just the index file name. Hence path object getName() is used.


---

[GitHub] carbondata issue #2299: [CARBONDATA-2472] Fixed:NonTransactional table perfo...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2299
  
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4883/



---

[GitHub] carbondata pull request #2299: [CARBONDATA-2472] Fixed:NonTransactional tabl...

Posted by xuchuanyin <gi...@git.apache.org>.
Github user xuchuanyin commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2299#discussion_r187768333
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletDataMapFactory.java ---
    @@ -115,13 +123,55 @@ public DataMapBuilder createBuilder(Segment segment, String shardName) {
         Set<TableBlockIndexUniqueIdentifier> tableBlockIndexUniqueIdentifiers =
             segmentMap.get(segment.getSegmentNo());
         if (tableBlockIndexUniqueIdentifiers == null) {
    +      CarbonTable carbonTable = this.getCarbonTable();
    +      if (!carbonTable.getTableInfo().isTransactionalTable()) {
    +        // For NonTransactional table, compare the schema of all index files with inferred schema.
    +        // If there is a mismatch throw exception. As all files must be of same schema.
    +        validateSchemaForNewTranscationalTableFiles(segment, carbonTable);
    +      }
           tableBlockIndexUniqueIdentifiers =
               BlockletDataMapUtil.getTableBlockUniqueIdentifiers(segment);
           segmentMap.put(segment.getSegmentNo(), tableBlockIndexUniqueIdentifiers);
         }
         return tableBlockIndexUniqueIdentifiers;
       }
     
    +  private void validateSchemaForNewTranscationalTableFiles(Segment segment, CarbonTable carbonTable)
    +      throws IOException {
    +    SchemaConverter schemaConverter = new ThriftWrapperSchemaConverterImpl();
    +    Map<String, String> indexFiles = segment.getCommittedIndexFile();
    +    for (Map.Entry<String, String> indexFileEntry : indexFiles.entrySet()) {
    +      Path indexFile = new Path(indexFileEntry.getKey());
    +      org.apache.carbondata.format.TableInfo tableInfo = CarbonUtil.inferSchemaFromIndexFile(
    +          indexFile.toString(), carbonTable.getTableName());
    +      TableInfo wrapperTableInfo = schemaConverter.fromExternalToWrapperTableInfo(
    +          tableInfo, identifier.getDatabaseName(),
    +          identifier.getTableName(),
    +          identifier.getTablePath());
    +      List<ColumnSchema> indexFileColumnList =
    +          wrapperTableInfo.getFactTable().getListOfColumns();
    +      List<ColumnSchema> tableColumnList =
    +          carbonTable.getTableInfo().getFactTable().getListOfColumns();
    +      if (!compareColumnSchemaList(indexFileColumnList, tableColumnList)) {
    +        LOG.error("Schema of " + indexFile.getName()
    +            + " doesn't match with the table's schema");
    +        throw new IOException("All the files doesn't have same schema. "
    +            + "Unsupported operation on nonTransactional table. Check logs.");
    +      }
    +    }
    +  }
    +
    +  private boolean compareColumnSchemaList(List<ColumnSchema> indexFileColumnList,
    --- End diff --
    
    please optimize the method name, such as `isColumnSchemaSame`


---

[GitHub] carbondata issue #2299: [CARBONDATA-2472] Fixed:NonTransactional table perfo...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2299
  
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4685/



---

[GitHub] carbondata pull request #2299: [CARBONDATA-2472] Fixed:NonTransactional tabl...

Posted by xuchuanyin <gi...@git.apache.org>.
Github user xuchuanyin commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2299#discussion_r187768121
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/datamap/DataMapStoreManager.java ---
    @@ -297,7 +297,7 @@ public TableDataMap registerDataMap(CarbonTable table,
             dataMapSchema, dataMapFactory, blockletDetailsFetcher, segmentPropertiesFetcher);
     
         tableIndices.add(dataMap);
    -    allDataMaps.put(tableUniqueName, tableIndices);
    +    allDataMaps.put(tableUniqueName.toLowerCase(), tableIndices);
    --- End diff --
    
    The name is already lowercased here, no need to convert it here. If it is not, there maybe other bugs that cause it.



---

[GitHub] carbondata pull request #2299: [CARBONDATA-2472] Fixed:Refactor NonTransacti...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/carbondata/pull/2299


---

[GitHub] carbondata issue #2299: [CARBONDATA-2472] Fixed:Refactor NonTransactional ta...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2299
  
    LGTM


---

[GitHub] carbondata issue #2299: [CARBONDATA-2472] Fixed:Refactor NonTransactional ta...

Posted by ajantha-bhat <gi...@git.apache.org>.
Github user ajantha-bhat commented on the issue:

    https://github.com/apache/carbondata/pull/2299
  
    @ravipesala : please review


---

[GitHub] carbondata pull request #2299: [CARBONDATA-2472] Fixed:NonTransactional tabl...

Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2299#discussion_r187762940
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/datamap/DataMapStoreManager.java ---
    @@ -297,7 +297,7 @@ public TableDataMap registerDataMap(CarbonTable table,
             dataMapSchema, dataMapFactory, blockletDetailsFetcher, segmentPropertiesFetcher);
     
         tableIndices.add(dataMap);
    -    allDataMaps.put(tableUniqueName, tableIndices);
    +    allDataMaps.put(tableUniqueName.toLowerCase(), tableIndices);
    --- End diff --
    
    Is this a bug? It seems not related to this PR


---

[GitHub] carbondata issue #2299: [CARBONDATA-2472] Fixed:NonTransactional table perfo...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2299
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5841/



---

[GitHub] carbondata issue #2299: [CARBONDATA-2472] Fixed:NonTransactional table perfo...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2299
  
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5835/



---

[GitHub] carbondata issue #2299: [CARBONDATA-2472] Fixed:NonTransactional table perfo...

Posted by ajantha-bhat <gi...@git.apache.org>.
Github user ajantha-bhat commented on the issue:

    https://github.com/apache/carbondata/pull/2299
  
    retest this please


---

[GitHub] carbondata pull request #2299: [CARBONDATA-2472] Fixed:Refactor NonTransacti...

Posted by ajantha-bhat <gi...@git.apache.org>.
Github user ajantha-bhat commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2299#discussion_r188625272
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletDataMapFactory.java ---
    @@ -115,13 +123,55 @@ public DataMapBuilder createBuilder(Segment segment, String shardName) {
         Set<TableBlockIndexUniqueIdentifier> tableBlockIndexUniqueIdentifiers =
             segmentMap.get(segment.getSegmentNo());
         if (tableBlockIndexUniqueIdentifiers == null) {
    +      CarbonTable carbonTable = this.getCarbonTable();
    +      if (!carbonTable.getTableInfo().isTransactionalTable()) {
    +        // For NonTransactional table, compare the schema of all index files with inferred schema.
    +        // If there is a mismatch throw exception. As all files must be of same schema.
    +        validateSchemaForNewTranscationalTableFiles(segment, carbonTable);
    +      }
           tableBlockIndexUniqueIdentifiers =
               BlockletDataMapUtil.getTableBlockUniqueIdentifiers(segment);
           segmentMap.put(segment.getSegmentNo(), tableBlockIndexUniqueIdentifiers);
         }
         return tableBlockIndexUniqueIdentifiers;
       }
     
    +  private void validateSchemaForNewTranscationalTableFiles(Segment segment, CarbonTable carbonTable)
    +      throws IOException {
    +    SchemaConverter schemaConverter = new ThriftWrapperSchemaConverterImpl();
    +    Map<String, String> indexFiles = segment.getCommittedIndexFile();
    +    for (Map.Entry<String, String> indexFileEntry : indexFiles.entrySet()) {
    +      Path indexFile = new Path(indexFileEntry.getKey());
    +      org.apache.carbondata.format.TableInfo tableInfo = CarbonUtil.inferSchemaFromIndexFile(
    +          indexFile.toString(), carbonTable.getTableName());
    +      TableInfo wrapperTableInfo = schemaConverter.fromExternalToWrapperTableInfo(
    +          tableInfo, identifier.getDatabaseName(),
    +          identifier.getTableName(),
    +          identifier.getTablePath());
    +      List<ColumnSchema> indexFileColumnList =
    +          wrapperTableInfo.getFactTable().getListOfColumns();
    +      List<ColumnSchema> tableColumnList =
    +          carbonTable.getTableInfo().getFactTable().getListOfColumns();
    +      if (!compareColumnSchemaList(indexFileColumnList, tableColumnList)) {
    +        LOG.error("Schema of " + indexFile.getName()
    +            + " doesn't match with the table's schema");
    +        throw new IOException("All the files doesn't have same schema. "
    +            + "Unsupported operation on nonTransactional table. Check logs.");
    +      }
    +    }
    +  }
    +
    +  private boolean compareColumnSchemaList(List<ColumnSchema> indexFileColumnList,
    +      List<ColumnSchema> tableColumnList) {
    +    if (indexFileColumnList.size() != tableColumnList.size()) {
    +      return false;
    --- End diff --
    
    Added detail logs in this flow


---

[GitHub] carbondata pull request #2299: [CARBONDATA-2472] Fixed:Refactor NonTransacti...

Posted by ajantha-bhat <gi...@git.apache.org>.
Github user ajantha-bhat commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2299#discussion_r188625173
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/datamap/DataMapStoreManager.java ---
    @@ -297,7 +297,7 @@ public TableDataMap registerDataMap(CarbonTable table,
             dataMapSchema, dataMapFactory, blockletDetailsFetcher, segmentPropertiesFetcher);
     
         tableIndices.add(dataMap);
    -    allDataMaps.put(tableUniqueName, tableIndices);
    +    allDataMaps.put(tableUniqueName.toLowerCase(), tableIndices);
    --- End diff --
    
    I found the actual bug, for external table, during table creation itself table name and db name was not converted to lowercase, because of this if table name is not in lowercase, when drop table, blocklet datamap was not getting cleared. 


---