Posted to commits@carbondata.apache.org by ra...@apache.org on 2019/04/02 02:41:20 UTC

[carbondata] branch branch-1.5 updated (441edbb -> 6a57b4b)

This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a change to branch branch-1.5
in repository https://gitbox.apache.org/repos/asf/carbondata.git.


    from 441edbb  [maven-release-plugin] prepare for next development iteration
     new d55164f  [CARBONDATA-3287]Remove the validation for same schema in a location and fix drop datamap issue
     new 3df5a2f  [CARBONDATA-3284] [CARBONDATA-3285] Workaround for Create-PreAgg Datamap Fail & Sort-Columns Fix
     new 9e6b544  [CARBONDATA-3278] Remove duplicate code to get filter string of date/timestamp
     new 459ad23  [CARBONDATA-3107] Optimize error/exception coding for better debugging
     new 1c4c6dc  [CARBONDATA-2447] Block update operation on range/list/hash partition table
     new eef1eb1  [CARBONDATA-3299] Desc Formatted Issue Fixed
     new f6e1c2e  [CARBONDATA-3276] Compacting table that do not exist should modify the message of MalformedCarbonCommandException
     new 02cc2b2  [CARBONDATA-3298]Removed Log Message for Already Deleted Segments
     new fdb48d0  [CARBONDATA-3305] Support show metacache command to list the cache sizes for all tables
     new 4eefe52  [CARBONDATA-3305] Added DDL to drop cache for a table
     new ffa7730  [CARBONDATA-3281] Add validation for the size of the LRU cache
     new 6fa9bd2  [CARBONDATA-3301]Fix inserting null values to Array<date> columns in carbon file format data load
     new 7e83df1  [LOG] Optimize the logs of CarbonProperties
     new 46fc6c5  [CARBONDATA-3307] Fix Performance Issue in No Sort
     new 9ed8184  [CARBONDATA-3297] Fix that the IndexoutOfBoundsException when creating table and dropping table are at the same time
     new bfdff7f  [CARBONDATA-3300] Fixed ClassNotFoundException when using UDF in spark-shell
     new ded8885  [DOC] Update the doc of "Show DataMap"
     new 271fd55  [CARBONDATA-3304] Distinguish the thread names created by thread pool of CarbonThreadFactory
     new 79d91fe  [CARBONDATA-3313] count(*) is not invalidating the invalid segments cache
     new d78eed4  [CARBONDATA-3315] Fix for Range Filter failing with two between clauses as children of OR expression
     new 3f6a853  [CARBONDATA-3314] Fix for Index Cache Size in SHOW METACACHE DDL
     new 6975346  [CARBONDATA-3317] Fix NPE when execute 'show segments' command for stream table
     new bfc912a  [CARBONDATA-3311] support presto 0.217 #3142
     new 4364473  [TestCase][HOTFIX] Added drop database in beforeEach to avoid exception
     new 15f13ad  [CARBONDATA-3302] [Spark-Integration] code cleaning related to CarbonCreateTable command
     new c4f32dd  [CARBONDATA-3318] Added PreAgg & Bloom Event-Listener for ShowCacheCommmand
     new ef8001e  [CARBONDATA-3293] Prune datamaps improvement for count(*)
     new fbea5c6  [CARBONDATA-3321] Improved Single/Concurrent query Performance
     new 976e48a  [DOC] Fix the spell mistake of enable.unsafe.in.query.processing
     new 0f6ab06  [CARBONDATA-3322] [CARBONDATA-3323] Added check for invalid tables in ShowCacheCommand & Standard output on ShowCacheCommand on table
     new b081014  [CARBONDATA-3320]fix number of partitions issue in describe formatted and drop partition issue
     new 236b5e1  [CARBONDATA-3329] Fixed deadlock issue during failed query
     new f4141cb  [CARBONDATA-3328]Fixed performance issue with merge small files distribution
     new cde80f2  [HOTFIX][DOC] Optimize quick-start-guide.md and dml-of-carbondata.md
     new e565d1f  [CARBONDATA-3319][TestCase]Added condition to check if datamap exist or not before caching
     new 09c598f  [CARBONDATA-3330] Fix Invalid Exception while clearing datamap from SDK carbon reader
     new d52ef32  [CARBONDATA-3333]Fixed No Sort Store Size issue and Compatibility issue after alter added column done in 1.1 and load in 1.5
     new 8ae260f  [Document] update doc about presto version support details #3163
     new 81e9714  [CARBONDATA-3332] Blocked concurrent compaction and update/delete
     new 6a2c072  [CARBONDATA-3335]Fixed load and compaction failure after alter done in older version
     new 6a57b4b  [HOTFIX]Fixed data map loading issue when number of segments are high

The 41 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 .../carbondata/core/cache/CacheProvider.java       |   4 +
 .../carbondata/core/cache/CarbonLRUCache.java      |  59 +++-
 .../cache/dictionary/ForwardDictionaryCache.java   |   2 +-
 .../cache/dictionary/ReverseDictionaryCache.java   |   2 +-
 .../core/constants/CarbonCommonConstants.java      |   5 +
 .../constants/CarbonCommonConstantsInternal.java   |   2 +
 .../core/datamap/DataMapStoreManager.java          |  39 ++-
 .../carbondata/core/datamap/DataMapUtil.java       |   2 +-
 .../core/datamap/DistributableDataMapFormat.java   |   1 +
 .../apache/carbondata/core/datamap/Segment.java    |  24 +-
 .../carbondata/core/datamap/TableDataMap.java      | 118 +++++--
 .../carbondata/core/datamap/dev/DataMap.java       |  19 +-
 .../datamap/dev/cgdatamap/CoarseGrainDataMap.java  |  18 +-
 .../datamap/dev/fgdatamap/FineGrainDataMap.java    |  17 +-
 .../carbondata/core/datastore/TableSpec.java       |  85 ++++-
 .../block/SegmentPropertiesAndSchemaHolder.java    |  62 ++--
 .../UnsafeAbstractDimensionDataChunkStore.java     |  10 +-
 .../filesystem/AbstractDFSCarbonFile.java          |  14 +-
 .../datastore/filesystem/AlluxioCarbonFile.java    |   2 +-
 .../core/datastore/filesystem/HDFSCarbonFile.java  |   2 +-
 .../core/datastore/filesystem/LocalCarbonFile.java |   2 +-
 .../core/datastore/filesystem/S3CarbonFile.java    |   2 +-
 .../datastore/filesystem/ViewFSCarbonFile.java     |   2 +-
 .../core/datastore/impl/FileFactory.java           |  47 ++-
 .../client/NonSecureDictionaryClient.java          |   2 +-
 .../client/NonSecureDictionaryClientHandler.java   |   4 +-
 .../generator/TableDictionaryGenerator.java        |   2 +-
 .../server/NonSecureDictionaryServerHandler.java   |   2 +-
 .../service/AbstractDictionaryServer.java          |   8 +-
 .../core/indexstore/BlockletDataMapIndexStore.java |   4 +-
 .../core/indexstore/BlockletDetailInfo.java        |   2 +-
 .../core/indexstore/ExtendedBlocklet.java          |  97 ++++--
 .../core/indexstore/SegmentPropertiesFetcher.java  |   3 +
 .../TableBlockIndexUniqueIdentifier.java           |   5 +-
 .../core/indexstore/UnsafeMemoryDMStore.java       | 161 ++++++---
 .../indexstore/blockletindex/BlockDataMap.java     | 157 +++++----
 .../indexstore/blockletindex/BlockletDataMap.java  |  28 +-
 .../blockletindex/BlockletDataMapFactory.java      |   9 +-
 .../blockletindex/BlockletDataMapRowIndexes.java   |  14 +-
 .../carbondata/core/indexstore/row/DataMapRow.java |  12 +-
 .../core/indexstore/row/UnsafeDataMapRow.java      | 217 ++-----------
 .../core/indexstore/schema/CarbonRowSchema.java    |   8 +
 .../core/indexstore/schema/SchemaGenerator.java    |  72 ++++
 .../timestamp/DateDirectDictionaryGenerator.java   |   4 +-
 .../apache/carbondata/core/locks/LockUsage.java    |   1 +
 .../carbondata/core/locks/ZookeeperInit.java       |   2 +-
 .../core/memory/UnsafeMemoryManager.java           |   3 +-
 .../carbondata/core/metadata/SegmentFileStore.java |  21 +-
 .../core/metadata/blocklet/BlockletInfo.java       |   4 +-
 .../core/metadata/schema/table/CarbonTable.java    |  29 +-
 .../carbondata/core/mutate/CarbonUpdateUtil.java   |  14 +-
 .../core/reader/CarbonDeleteFilesDataReader.java   |  14 +-
 .../scan/executor/impl/AbstractQueryExecutor.java  |  22 +-
 .../scan/expression/RangeExpressionEvaluator.java  |  30 +-
 .../carbondata/core/scan/filter/FilterUtil.java    |  16 +-
 .../carbondata/core/scan/model/QueryModel.java     |  64 ++--
 .../core/scan/result/BlockletScannedResult.java    |   6 +-
 .../AbstractDetailQueryResultIterator.java         |   2 +-
 .../core/statusmanager/LoadMetadataDetails.java    |   6 +-
 .../core/statusmanager/SegmentStatusManager.java   |   6 +-
 .../carbondata/core/util/BlockletDataMapUtil.java  |  33 +-
 .../carbondata/core/util/CarbonProperties.java     | 134 ++++----
 .../apache/carbondata/core/util/CarbonUtil.java    |  46 ++-
 .../apache/carbondata/core/util/DataTypeUtil.java  |   8 +-
 .../carbondata/core/util/DeleteLoadFolders.java    |   4 -
 .../core/util/ObjectSerializationUtil.java         |   7 +-
 .../carbondata/core/util/path/HDFSLeaseUtils.java  |   2 +-
 .../apache/carbondata/hadoop/CarbonInputSplit.java | 361 +++++++++++++++------
 .../hadoop/internal/ObjectArrayWritable.java       |   0
 .../carbondata/hadoop/internal/index/Block.java    |   0
 .../carbondata/core/cache/CarbonLRUCacheTest.java  |   7 +
 .../datastore/filesystem/HDFSCarbonFileTest.java   |   2 +-
 .../core/load/LoadMetadataDetailsUnitTest.java     |   2 +-
 .../datamap/bloom/BloomCacheKeyValue.java          |   2 +-
 .../datamap/bloom/BloomCoarseGrainDataMap.java     |  41 +--
 .../bloom/BloomCoarseGrainDataMapFactory.java      |  11 +-
 .../datamap/lucene/LuceneFineGrainDataMap.java     |   6 +-
 .../lucene/LuceneFineGrainDataMapFactory.java      |   2 +-
 docs/datamap/datamap-management.md                 |   1 +
 docs/ddl-of-carbondata.md                          |  43 +++
 docs/dml-of-carbondata.md                          |  24 +-
 docs/faq.md                                        |  18 +
 docs/presto-guide.md                               |  33 +-
 docs/quick-start-guide.md                          |   6 +-
 docs/usecases.md                                   |   4 +-
 .../carbondata/hadoop/CarbonMultiBlockSplit.java   |  23 +-
 .../carbondata/hadoop/CarbonRecordReader.java      |   4 +-
 .../hadoop/api/CarbonFileInputFormat.java          |   6 +-
 .../carbondata/hadoop/api/CarbonInputFormat.java   |  55 +---
 .../hadoop/api/CarbonTableInputFormat.java         | 117 +++----
 .../hadoop/api/CarbonTableOutputFormat.java        |   3 +-
 .../hadoop/util/CarbonVectorizedRecordReader.java  |   2 +-
 .../hive/CarbonDictionaryDecodeReadSupport.java    |   2 +-
 .../carbondata/hive/MapredCarbonInputFormat.java   |   2 +-
 integration/presto/pom.xml                         |   2 +-
 .../presto/CarbondataConnectorFactory.java         |   7 +-
 .../apache/carbondata/presto/CarbondataModule.java |   8 +-
 .../carbondata/presto/CarbondataSplitManager.java  |   2 +-
 .../presto/impl/CarbonLocalInputSplit.java         |   4 +-
 .../carbondata/presto/impl/CarbonTableReader.java  |  12 +-
 .../PrestoTestNonTransactionalTableFiles.scala     |   2 +-
 .../cluster/sdv/generated/AlterTableTestCase.scala |  10 +-
 .../cluster/sdv/generated/QueriesBVATestCase.scala |   4 +-
 ...teTableUsingSparkCarbonFileFormatTestCase.scala |   7 +-
 .../datasource/SparkCarbonDataSourceTestCase.scala |  19 +-
 .../carbondata/cluster/sdv/suite/SDVSuites.scala   |   2 +-
 .../sql/common/util/DataSourceTestUtil.scala}      |  88 ++---
 ...ryWithColumnMetCacheAndCacheLevelProperty.scala |   5 +-
 .../dblocation/DBLocationCarbonTableTestCase.scala |  26 +-
 .../detailquery/RangeFilterTestCase.scala          |  38 +++
 .../TestAllDataTypeForPartitionTable.scala         |   4 +-
 .../partition/TestDDLForPartitionTable.scala       |  13 +-
 .../partition/TestUpdateForPartitionTable.scala    |  71 ++++
 ...StandardPartitionWithPreaggregateTestCase.scala |  22 ++
 .../sql/commands/TestCarbonDropCacheCommand.scala  | 200 ++++++++++++
 .../sql/commands/TestCarbonShowCacheCommand.scala  | 233 +++++++++++++
 .../client/SecureDictionaryClientHandler.java      |   4 +-
 .../server/SecureDictionaryServerHandler.java      |   2 +-
 .../org/apache/carbondata/spark/util/Util.java     |   3 +-
 .../org/apache/carbondata/api/CarbonStore.scala    |  10 +-
 .../{CleanFilesEvents.scala => CacheEvents.scala}  |  25 +-
 .../org/apache/carbondata/events/Events.scala      |  14 +
 .../carbondata/spark/rdd/CarbonMergerRDD.scala     |   4 +-
 .../carbondata/spark/rdd/CarbonScanRDD.scala       |  13 +-
 .../apache/carbondata/spark/util/CommonUtil.scala  |  39 ++-
 .../spark/sql/catalyst/CarbonDDLSqlParser.scala    |   1 +
 .../vectorreader/VectorizedCarbonRecordReader.java |   4 +-
 .../datasources/SparkCarbonFileFormat.scala        |   6 +-
 .../datasource/SparkCarbonDataSourceTest.scala     |  19 +-
 .../spark/rdd/CarbonDataRDDFactory.scala           |  48 +--
 .../org/apache/spark/sql/CarbonCountStar.scala     |   2 +-
 .../spark/sql/CarbonDatasourceHadoopRelation.scala |   6 +-
 .../scala/org/apache/spark/sql/CarbonEnv.scala     |   5 +
 .../sql/execution/command/cache/CacheUtil.scala    | 114 +++++++
 .../command/cache/CarbonDropCacheCommand.scala     |  66 ++++
 .../command/cache/CarbonShowCacheCommand.scala     | 225 +++++++++++++
 .../command/cache/DropCacheEventListeners.scala    | 121 +++++++
 .../command/cache/ShowCacheEventListeners.scala    | 126 +++++++
 .../CarbonAlterTableCompactionCommand.scala        |  55 ++--
 .../command/management/CarbonLoadDataCommand.scala |   2 +-
 .../mutation/CarbonProjectForDeleteCommand.scala   |  18 +-
 .../mutation/CarbonProjectForUpdateCommand.scala   |  99 +++---
 .../command/mutation/DeleteExecution.scala         |   2 +-
 .../schema/CarbonAlterTableRenameCommand.scala     |   2 -
 .../command/table/CarbonCreateTableCommand.scala   |   3 +-
 .../table/CarbonDescribeFormattedCommand.scala     |  39 ++-
 .../command/table/CarbonDropTableCommand.scala     |  32 +-
 .../spark/sql/execution/strategy/DDLStrategy.scala |  12 +-
 .../spark/sql/hive/CarbonFileMetastore.scala       |  37 ++-
 .../spark/sql/parser/CarbonSpark2SqlParser.scala   |  18 +-
 .../BloomCoarseGrainDataMapFunctionSuite.scala     |   2 +-
 .../register/TestRegisterCarbonTable.scala         |  24 +-
 .../processing/datamap/DataMapWriterListener.java  |   2 +-
 .../processing/datatypes/PrimitiveDataType.java    |   7 +-
 .../loading/AbstractDataLoadProcessorStep.java     |   2 +-
 .../loading/TableProcessingOperations.java         |   3 +-
 .../loading/converter/impl/RowConverterImpl.java   |   4 +-
 .../loading/model/CarbonLoadModelBuilder.java      |   2 +-
 .../loading/parser/impl/JsonRowParser.java         |   2 +-
 .../loading/row/IntermediateSortTempRow.java       |   8 +
 .../loading/sort/SortStepRowHandler.java           |   7 +-
 .../sort/impl/ParallelReadMergeSorterImpl.java     |   5 +-
 ...ParallelReadMergeSorterWithColumnRangeImpl.java |   2 +-
 .../UnsafeBatchParallelReadMergeSorterImpl.java    |   4 +-
 .../impl/UnsafeParallelReadMergeSorterImpl.java    |   5 +-
 ...ParallelReadMergeSorterWithColumnRangeImpl.java |   2 +-
 .../loading/sort/unsafe/UnsafeCarbonRowPage.java   |   8 +-
 .../loading/sort/unsafe/UnsafeSortDataRows.java    |  23 +-
 .../holder/UnsafeSortTempFileChunkHolder.java      |  26 +-
 .../merger/UnsafeIntermediateFileMerger.java       |   2 +-
 .../unsafe/merger/UnsafeIntermediateMerger.java    |   7 +-
 .../UnsafeSingleThreadFinalSortFilesMerger.java    |   4 +-
 .../CarbonRowDataWriterProcessorStepImpl.java      |  10 +-
 .../steps/DataWriterBatchProcessorStepImpl.java    |   4 +-
 .../loading/steps/DataWriterProcessorStepImpl.java |   2 +-
 .../loading/steps/InputProcessorStepImpl.java      |   2 +-
 .../processing/merger/CarbonCompactionUtil.java    |   8 +-
 .../processing/merger/CarbonDataMergerUtil.java    |   4 +-
 .../merger/CompactionResultSortProcessor.java      |  24 +-
 .../merger/RowResultMergerProcessor.java           |   2 +-
 .../partition/spliter/RowResultProcessor.java      |   6 +-
 .../DummyRowUpdater.java}                          |  25 +-
 .../processing/sort/SchemaBasedRowUpdater.java     |  91 ++++++
 .../SortTempRowUpdater.java}                       |  27 +-
 .../sortdata/SingleThreadFinalSortFilesMerger.java |   4 +-
 .../processing/sort/sortdata/SortDataRows.java     |   5 +-
 .../sort/sortdata/SortIntermediateFileMerger.java  |   3 +-
 .../processing/sort/sortdata/SortParameters.java   |  54 +++
 .../sort/sortdata/SortTempFileChunkHolder.java     |  28 +-
 .../processing/sort/sortdata/TableFieldStat.java   |  16 +
 .../store/CarbonFactDataHandlerColumnar.java       |  12 +-
 .../store/writer/AbstractFactDataWriter.java       |   6 +-
 .../processing/util/CarbonLoaderUtil.java          |   6 +-
 .../apache/carbondata/store/LocalCarbonStore.java  |   4 +-
 .../carbondata/sdk/file/CarbonReaderTest.java      |  38 +--
 .../java/org/apache/carbondata/tool/CarbonCli.java |   4 +-
 196 files changed, 3663 insertions(+), 1430 deletions(-)
 rename {hadoop => core}/src/main/java/org/apache/carbondata/hadoop/CarbonInputSplit.java (56%)
 rename {hadoop => core}/src/main/java/org/apache/carbondata/hadoop/internal/ObjectArrayWritable.java (100%)
 rename {hadoop => core}/src/main/java/org/apache/carbondata/hadoop/internal/index/Block.java (100%)
 copy integration/{spark-datasource/src/test/scala/org/apache/spark/sql/carbondata/datasource/TestUtil.scala => spark-common-cluster-test/src/test/scala/org/apache/spark/sql/common/util/DataSourceTestUtil.scala} (67%)
 create mode 100644 integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/partition/TestUpdateForPartitionTable.scala
 create mode 100644 integration/spark-common-test/src/test/scala/org/apache/carbondata/sql/commands/TestCarbonDropCacheCommand.scala
 create mode 100644 integration/spark-common-test/src/test/scala/org/apache/carbondata/sql/commands/TestCarbonShowCacheCommand.scala
 copy integration/spark-common/src/main/scala/org/apache/carbondata/events/{CleanFilesEvents.scala => CacheEvents.scala} (72%)
 create mode 100644 integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/cache/CacheUtil.scala
 create mode 100644 integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/cache/CarbonDropCacheCommand.scala
 create mode 100644 integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/cache/CarbonShowCacheCommand.scala
 create mode 100644 integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/cache/DropCacheEventListeners.scala
 create mode 100644 integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/cache/ShowCacheEventListeners.scala
 copy processing/src/main/java/org/apache/carbondata/processing/{loading/sort/unsafe/holder/SortTempChunkHolder.java => sort/DummyRowUpdater.java} (54%)
 create mode 100644 processing/src/main/java/org/apache/carbondata/processing/sort/SchemaBasedRowUpdater.java
 copy processing/src/main/java/org/apache/carbondata/processing/{loading/sort/unsafe/holder/SortTempChunkHolder.java => sort/SortTempRowUpdater.java} (53%)


[carbondata] 19/41: [CARBONDATA-3313] count(*) is not invalidating the invalid segments cache

Posted by ra...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.5
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit 79d91fe3c469a3cfe7b017c64a9f0812d0f80ce8
Author: dhatchayani <dh...@gmail.com>
AuthorDate: Tue Mar 12 14:36:54 2019 +0530

    [CARBONDATA-3313] count(*) is not invalidating the invalid segments cache
    
    Problem:
    If any segment is deleted, the next query has to clear/invalidate the datamap cache for those invalid segments. However, count(*) did not consider the invalid segments when clearing the datamap cache.

    Solution:
    In the count(*) flow, before clearing the datamap cache, check for the invalid segments of that table and add them to the list of segments to be cleaned.
    
    This closes #3144
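
    For illustration only, a minimal, self-contained sketch of the idea in plain Java
    (the class and method names below are hypothetical, not the CarbonData API): a
    per-segment index cache has to drop the entries of segments reported as invalid
    before any query, including count(*), is answered.

        import java.util.HashMap;
        import java.util.Map;
        import java.util.Set;

        class SegmentIndexCacheSketch {
          // segment id -> cached index metadata for that segment
          private final Map<String, Object> indexBySegment = new HashMap<>();

          void put(String segmentId, Object index) {
            indexBySegment.put(segmentId, index);
          }

          // Called at the start of every query, including count(*):
          // remove the cached index of every segment reported as invalid (deleted).
          void invalidate(Set<String> invalidSegmentIds) {
            indexBySegment.keySet().removeAll(invalidSegmentIds);
          }

          int cachedSegmentCount() {
            return indexBySegment.size();
          }
        }

    The actual change in CarbonTableInputFormat below follows the same pattern: the
    invalid segments are added to the list of segments whose datamap cache is cleared.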
---
 .../hadoop/api/CarbonTableInputFormat.java         |  2 ++
 .../sql/commands/TestCarbonShowCacheCommand.scala  | 23 ++++++++++++++++++++++
 2 files changed, 25 insertions(+)

diff --git a/hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableInputFormat.java b/hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableInputFormat.java
index c56b1db..281143b 100644
--- a/hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableInputFormat.java
+++ b/hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableInputFormat.java
@@ -617,6 +617,8 @@ public class CarbonTableInputFormat<T> extends CarbonInputFormat<T> {
         toBeCleanedSegments.add(eachSegment);
       }
     }
+    // remove entry in the segment index if there are invalid segments
+    toBeCleanedSegments.addAll(allSegments.getInvalidSegments());
     if (toBeCleanedSegments.size() > 0) {
       DataMapStoreManager.getInstance()
           .clearInvalidSegments(getOrCreateCarbonTable(job.getConfiguration()),
diff --git a/integration/spark-common-test/src/test/scala/org/apache/carbondata/sql/commands/TestCarbonShowCacheCommand.scala b/integration/spark-common-test/src/test/scala/org/apache/carbondata/sql/commands/TestCarbonShowCacheCommand.scala
index e999fc7..69c5f7e 100644
--- a/integration/spark-common-test/src/test/scala/org/apache/carbondata/sql/commands/TestCarbonShowCacheCommand.scala
+++ b/integration/spark-common-test/src/test/scala/org/apache/carbondata/sql/commands/TestCarbonShowCacheCommand.scala
@@ -110,6 +110,28 @@ class TestCarbonShowCacheCommand extends QueryTest with BeforeAndAfterAll {
     sql("select workgroupcategoryname,count(empname) as count from cache_4 group by workgroupcategoryname").collect()
   }
 
+  test("test drop cache invalidation in case of invalid segments"){
+    sql(s"CREATE TABLE empTable(empno int, empname String, designation String, " +
+        s"doj Timestamp, workgroupcategory int, workgroupcategoryname String, deptno int, " +
+        s"deptname String, projectcode int, projectjoindate Timestamp, projectenddate Timestamp," +
+        s"attendance int, utilization int, salary int) stored by 'carbondata'")
+    sql(s"LOAD DATA INPATH '$resourcesPath/data.csv' INTO TABLE empTable")
+    sql(s"LOAD DATA INPATH '$resourcesPath/data.csv' INTO TABLE empTable")
+    sql(s"LOAD DATA INPATH '$resourcesPath/data.csv' INTO TABLE empTable")
+    sql("select count(*) from empTable").show()
+    var showCache = sql("SHOW METACACHE on table empTable").collect()
+    assert(showCache(0).get(2).toString.equalsIgnoreCase("3/3 index files cached"))
+    sql("delete from table empTable where segment.id in(0)").show()
+    // check whether count(*) query invalidates the cache for the invalid segments
+    sql("select count(*) from empTable").show()
+    showCache = sql("SHOW METACACHE on table empTable").collect()
+    assert(showCache(0).get(2).toString.equalsIgnoreCase("2/2 index files cached"))
+    sql("delete from table empTable where segment.id in(1)").show()
+    // check whether select * query invalidates the cache for the invalid segments
+    sql("select * from empTable").show()
+    showCache = sql("SHOW METACACHE on table empTable").collect()
+    assert(showCache(0).get(2).toString.equalsIgnoreCase("1/1 index files cached"))
+  }
 
   override protected def afterAll(): Unit = {
     sql("use default").collect()
@@ -122,6 +144,7 @@ class TestCarbonShowCacheCommand extends QueryTest with BeforeAndAfterAll {
     sql("DROP TABLE IF EXISTS cache_db.cache_3")
     sql("DROP TABLE IF EXISTS default.cache_4")
     sql("DROP TABLE IF EXISTS default.cache_5")
+    sql("DROP TABLE IF EXISTS empTable")
   }
 
   test("show cache") {


[carbondata] 14/41: [CARBONDATA-3307] Fix Performance Issue in No Sort

Posted by ra...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.5
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit 46fc6c550ec34faaeebbba792edc3efd77e1f701
Author: shivamasn <sh...@gmail.com>
AuthorDate: Wed Mar 6 19:03:01 2019 +0530

    [CARBONDATA-3307] Fix Performance Issue in No Sort
    
    When a table is created without sort_columns and data is loaded into it, more carbondata files are
    generated than expected, because the number of carbondata files depends on the number of threads
    launched: each thread initialises its own writer and writes its own data.

    Now the same writer instance is passed to all the threads, so all the threads write their data to the same file.
    
    This closes #3140
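
    For illustration only, a minimal sketch of the shared-writer pattern described above,
    in plain Java with hypothetical names (RowWriter, load) rather than the actual
    CarbonData classes: every worker thread receives the same writer instance and the
    writes are serialized with a single lock, so only one output file is produced.

        import java.util.List;
        import java.util.concurrent.ExecutorService;
        import java.util.concurrent.Executors;
        import java.util.concurrent.TimeUnit;

        class SharedWriterSketch {
          interface RowWriter { void write(Object row); }

          private static final Object LOCK = new Object();

          static void load(List<List<Object>> partitions, RowWriter writer)
              throws InterruptedException {
            ExecutorService pool = Executors.newFixedThreadPool(partitions.size());
            for (List<Object> partition : partitions) {
              pool.submit(() -> {
                for (Object row : partition) {
                  synchronized (LOCK) {   // one shared writer, guarded writes
                    writer.write(row);
                  }
                }
              });
            }
            pool.shutdown();
            pool.awaitTermination(1, TimeUnit.MINUTES);
          }
        }

    In the patch below the same idea is applied to CarbonFactHandler: one handler is
    created up front, handed to every DataWriterRunnable, and addDataToStore is called
    inside a synchronized block.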
---
 .../CarbonRowDataWriterProcessorStepImpl.java      | 61 ++++++++++------------
 1 file changed, 29 insertions(+), 32 deletions(-)

diff --git a/processing/src/main/java/org/apache/carbondata/processing/loading/steps/CarbonRowDataWriterProcessorStepImpl.java b/processing/src/main/java/org/apache/carbondata/processing/loading/steps/CarbonRowDataWriterProcessorStepImpl.java
index f976abe..184248c 100644
--- a/processing/src/main/java/org/apache/carbondata/processing/loading/steps/CarbonRowDataWriterProcessorStepImpl.java
+++ b/processing/src/main/java/org/apache/carbondata/processing/loading/steps/CarbonRowDataWriterProcessorStepImpl.java
@@ -18,9 +18,7 @@ package org.apache.carbondata.processing.loading.steps;
 
 import java.io.IOException;
 import java.util.Iterator;
-import java.util.List;
 import java.util.Map;
-import java.util.concurrent.CopyOnWriteArrayList;
 import java.util.concurrent.ExecutorService;
 import java.util.concurrent.Executors;
 import java.util.concurrent.Future;
@@ -83,16 +81,17 @@ public class CarbonRowDataWriterProcessorStepImpl extends AbstractDataLoadProces
 
   private Map<String, LocalDictionaryGenerator> localDictionaryGeneratorMap;
 
-  private List<CarbonFactHandler> carbonFactHandlers;
+  private CarbonFactHandler dataHandler;
 
   private ExecutorService executorService = null;
 
+  private static final Object lock = new Object();
+
   public CarbonRowDataWriterProcessorStepImpl(CarbonDataLoadConfiguration configuration,
       AbstractDataLoadProcessorStep child) {
     super(configuration, child);
     this.localDictionaryGeneratorMap =
         CarbonUtil.getLocalDictionaryModel(configuration.getTableSpec().getCarbonTable());
-    this.carbonFactHandlers = new CopyOnWriteArrayList<>();
   }
 
   @Override public void initialize() throws IOException {
@@ -129,20 +128,31 @@ public class CarbonRowDataWriterProcessorStepImpl extends AbstractDataLoadProces
           .recordDictionaryValue2MdkAdd2FileTime(CarbonTablePath.DEPRECATED_PARTITION_ID,
               System.currentTimeMillis());
 
+      // Creating an instance of CarbonFactHandler that will be passed to all the threads
+      String[] storeLocation = getStoreLocation();
+      DataMapWriterListener listener = getDataMapWriterListener(0);
+      CarbonFactDataHandlerModel model = CarbonFactDataHandlerModel
+          .createCarbonFactDataHandlerModel(configuration, storeLocation, 0, 0, listener);
+      model.setColumnLocalDictGenMap(localDictionaryGeneratorMap);
+      dataHandler = CarbonFactHandlerFactory.createCarbonFactHandler(model);
+      dataHandler.initialise();
+
       if (iterators.length == 1) {
-        doExecute(iterators[0], 0);
+        doExecute(iterators[0], 0, dataHandler);
       } else {
         executorService = Executors.newFixedThreadPool(iterators.length,
             new CarbonThreadFactory("NoSortDataWriterPool:" + configuration.getTableIdentifier()
                 .getCarbonTableIdentifier().getTableName()));
         Future[] futures = new Future[iterators.length];
         for (int i = 0; i < iterators.length; i++) {
-          futures[i] = executorService.submit(new DataWriterRunnable(iterators[i], i));
+          futures[i] = executorService.submit(new DataWriterRunnable(iterators[i], i, dataHandler));
         }
         for (Future future : futures) {
           future.get();
         }
       }
+      finish(dataHandler, 0);
+      dataHandler = null;
     } catch (CarbonDataWriterException e) {
       LOGGER.error("Failed for table: " + tableName + " in DataWriterProcessorStepImpl", e);
       throw new CarbonDataLoadingException(
@@ -157,31 +167,15 @@ public class CarbonRowDataWriterProcessorStepImpl extends AbstractDataLoadProces
     return null;
   }
 
-  private void doExecute(Iterator<CarbonRowBatch> iterator, int iteratorIndex) throws IOException {
-    String[] storeLocation = getStoreLocation();
-    DataMapWriterListener listener = getDataMapWriterListener(0);
-    CarbonFactDataHandlerModel model = CarbonFactDataHandlerModel.createCarbonFactDataHandlerModel(
-        configuration, storeLocation, 0, iteratorIndex, listener);
-    model.setColumnLocalDictGenMap(localDictionaryGeneratorMap);
-    CarbonFactHandler dataHandler = null;
+  private void doExecute(Iterator<CarbonRowBatch> iterator, int iteratorIndex,
+      CarbonFactHandler dataHandler) throws IOException {
     boolean rowsNotExist = true;
     while (iterator.hasNext()) {
       if (rowsNotExist) {
         rowsNotExist = false;
-        dataHandler = CarbonFactHandlerFactory.createCarbonFactHandler(model);
-        this.carbonFactHandlers.add(dataHandler);
-        dataHandler.initialise();
       }
       processBatch(iterator.next(), dataHandler, iteratorIndex);
     }
-    try {
-      if (!rowsNotExist) {
-        finish(dataHandler, iteratorIndex);
-      }
-    } finally {
-      carbonFactHandlers.remove(dataHandler);
-    }
-
 
   }
 
@@ -306,7 +300,9 @@ public class CarbonRowDataWriterProcessorStepImpl extends AbstractDataLoadProces
       while (batch.hasNext()) {
         CarbonRow row = batch.next();
         CarbonRow converted = convertRow(row);
-        dataHandler.addDataToStore(converted);
+        synchronized (lock) {
+          dataHandler.addDataToStore(converted);
+        }
         readCounter[iteratorIndex]++;
       }
       writeCounter[iteratorIndex] += batch.getSize();
@@ -320,15 +316,18 @@ public class CarbonRowDataWriterProcessorStepImpl extends AbstractDataLoadProces
 
     private Iterator<CarbonRowBatch> iterator;
     private int iteratorIndex = 0;
+    private CarbonFactHandler dataHandler = null;
 
-    DataWriterRunnable(Iterator<CarbonRowBatch> iterator, int iteratorIndex) {
+    DataWriterRunnable(Iterator<CarbonRowBatch> iterator, int iteratorIndex,
+        CarbonFactHandler dataHandler) {
       this.iterator = iterator;
       this.iteratorIndex = iteratorIndex;
+      this.dataHandler = dataHandler;
     }
 
     @Override public void run() {
       try {
-        doExecute(this.iterator, iteratorIndex);
+        doExecute(this.iterator, iteratorIndex, dataHandler);
       } catch (IOException e) {
         LOGGER.error(e.getMessage(), e);
         throw new RuntimeException(e);
@@ -342,11 +341,9 @@ public class CarbonRowDataWriterProcessorStepImpl extends AbstractDataLoadProces
       if (null != executorService) {
         executorService.shutdownNow();
       }
-      if (null != this.carbonFactHandlers && !this.carbonFactHandlers.isEmpty()) {
-        for (CarbonFactHandler carbonFactHandler : this.carbonFactHandlers) {
-          carbonFactHandler.finish();
-          carbonFactHandler.closeHandler();
-        }
+      if (null != dataHandler) {
+        dataHandler.finish();
+        dataHandler.closeHandler();
       }
     }
   }


[carbondata] 38/41: [Document] update doc about presto version support details #3163

Posted by ra...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.5
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit 8ae260fcd2cf457a16377ea6f50b51fbec10b264
Author: ajantha-bhat <aj...@gmail.com>
AuthorDate: Mon Mar 25 17:03:14 2019 +0800

    [Document] update doc about presto version support details #3163
    
    update doc about presto version support details
    
    This closes #3163
---
 docs/presto-guide.md | 33 ++++++++++++++++++++++-----------
 1 file changed, 22 insertions(+), 11 deletions(-)

diff --git a/docs/presto-guide.md b/docs/presto-guide.md
index 6bb8196..c5bcd76 100644
--- a/docs/presto-guide.md
+++ b/docs/presto-guide.md
@@ -27,26 +27,37 @@ This tutorial provides a quick introduction to using current integration/presto
 ## Presto Multinode Cluster Setup for Carbondata
 ### Installing Presto
 
-  1. Download the 0.210 version of Presto using:
+To know which version of presto is supported by this version of carbon, visit
+https://github.com/apache/carbondata/blob/master/integration/presto/pom.xml
+and look for ```<presto.version>```
+
+_Example:_ 
+  `<presto.version>0.217</presto.version>`
+This means the current version of carbon supports presto version 0.217.
+
+_Note:_
+Currently carbondata supports only one version of presto and cannot handle multiple versions at the same time. If you wish to use an older version of presto, use an older version of carbon (an older branch, say branch-1.5, and check the supported presto version in its pom.xml file in integration/presto/).
+
+  1. Download that version of Presto (say 0.217) using the command below:
   ```
-  wget https://repo1.maven.org/maven2/com/facebook/presto/presto-server/0.210/presto-server-0.210.tar.gz
+  wget https://repo1.maven.org/maven2/com/facebook/presto/presto-server/0.217/presto-server-0.217.tar.gz
   ```
 
-  2. Extract Presto tar file: `tar zxvf presto-server-0.210.tar.gz`.
+  2. Extract Presto tar file: `tar zxvf presto-server-0.217.tar.gz`.
 
-  3. Download the Presto CLI for the coordinator and name it presto.
+  3. Download the Presto CLI of the same presto server version (say 0.217) for the coordinator and name it presto.
 
   ```
-    wget https://repo1.maven.org/maven2/com/facebook/presto/presto-cli/0.210/presto-cli-0.210-executable.jar
+    wget https://repo1.maven.org/maven2/com/facebook/presto/presto-cli/0.217/presto-cli-0.217-executable.jar
 
-    mv presto-cli-0.210-executable.jar presto
+    mv presto-cli-0.217-executable.jar presto
 
     chmod +x presto
   ```
 
  ### Create Configuration Files
 
-  1. Create `etc` folder in presto-server-0.210 directory.
+  1. Create `etc` folder in presto-server-0.217 directory.
   2. Create `config.properties`, `jvm.config`, `log.properties`, and `node.properties` files.
   3. Install uuid to generate a node.id.
 
@@ -137,12 +148,12 @@ Then, `query.max-memory=<30GB * number of nodes>`.
 ### Start Presto Server on all nodes
 
 ```
-./presto-server-0.210/bin/launcher start
+./presto-server-0.217/bin/launcher start
 ```
 To run it as a background process.
 
 ```
-./presto-server-0.210/bin/launcher run
+./presto-server-0.217/bin/launcher run
 ```
 To run it in foreground.
 
@@ -165,7 +176,7 @@ Now you can use the Presto CLI on the coordinator to query data sources in the c
 ## Presto Single Node Setup for Carbondata
 
 ### Config presto server
-* Download presto server (0.210 is suggested and supported) : https://repo1.maven.org/maven2/com/facebook/presto/presto-server/
+* Download presto server (0.217 is suggested and supported) : https://repo1.maven.org/maven2/com/facebook/presto/presto-server/
 * Finish presto configuration following https://prestodb.io/docs/current/installation/deployment.html.
   A configuration example:
   
@@ -271,7 +282,7 @@ Load data statement in Spark can be used to create carbondata tables. And then y
 carbondata files.
 
 ### Query carbondata in CLI of presto
-* Download presto cli client of version 0.210 : https://repo1.maven.org/maven2/com/facebook/presto/presto-cli
+* Download presto cli client of version 0.217 : https://repo1.maven.org/maven2/com/facebook/presto/presto-cli
 
 * Start CLI:
   


[carbondata] 40/41: [CARBONDATA-3335]Fixed load and compaction failure after alter done in older version

Posted by ra...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.5
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit 6a2c072f5031488b82f981f23b34623d214ef0d5
Author: kumarvishal09 <ku...@gmail.com>
AuthorDate: Fri Mar 29 06:32:04 2019 +0530

    [CARBONDATA-3335]Fixed load and compaction failure after alter done in older version
    
    No Sort load/compaction is failing in the latest version when an alter was done in an older version.
    This is because the sort step's output is ordered by sort columns, while the writer expects schema order.
    This PR handles it by updating the sort output to schema order; it was tested with 3.5 billion records and performance is the same.
    
    This closes #3168
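
    For illustration only, a tiny self-contained Java sketch of the reordering that the
    new SortTempRowUpdater implementations perform (names here are hypothetical): each
    value produced in sort-column order is placed at its recorded actual position, so the
    row ends up in schema order before it reaches the writer.

        import java.util.Arrays;

        class SchemaOrderRemapSketch {
          // actualPosition[i] = schema position of the i-th value in sort-column order
          static Object[] toSchemaOrder(Object[] sortOrdered, int[] actualPosition) {
            Object[] schemaOrdered = new Object[sortOrdered.length];
            for (int i = 0; i < sortOrdered.length; i++) {
              schemaOrdered[actualPosition[i]] = sortOrdered[i];
            }
            return schemaOrdered;
          }

          public static void main(String[] args) {
            // schema order: c1, c2, c3; sort_columns = c3, c1 -> sort order: c3, c1, c2
            Object[] sortOrdered = {"v3", "v1", "v2"};
            int[] actualPosition = {2, 0, 1};
            // prints [v1, v2, v3]
            System.out.println(Arrays.toString(toSchemaOrder(sortOrdered, actualPosition)));
          }
        }

    TableSpec in the patch computes exactly such dictDimActualPosition /
    noDictDimActualPosition arrays, and SchemaBasedRowUpdater applies them.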
---
 .../carbondata/core/datastore/TableSpec.java       | 93 ++++++++++++++++------
 .../loading/row/IntermediateSortTempRow.java       |  8 ++
 .../loading/sort/SortStepRowHandler.java           |  7 +-
 .../loading/sort/unsafe/UnsafeCarbonRowPage.java   |  8 +-
 .../holder/UnsafeSortTempFileChunkHolder.java      | 16 +++-
 .../processing/sort/DummyRowUpdater.java           | 40 ++++++++++
 .../processing/sort/SchemaBasedRowUpdater.java     | 91 +++++++++++++++++++++
 .../processing/sort/SortTempRowUpdater.java        | 40 ++++++++++
 .../processing/sort/sortdata/SortParameters.java   | 54 +++++++++++++
 .../sort/sortdata/SortTempFileChunkHolder.java     | 15 +++-
 .../processing/sort/sortdata/TableFieldStat.java   | 16 ++++
 .../carbondata/processing/store/TablePage.java     | 49 ++----------
 12 files changed, 364 insertions(+), 73 deletions(-)

diff --git a/core/src/main/java/org/apache/carbondata/core/datastore/TableSpec.java b/core/src/main/java/org/apache/carbondata/core/datastore/TableSpec.java
index 002104a..d0b8b3c 100644
--- a/core/src/main/java/org/apache/carbondata/core/datastore/TableSpec.java
+++ b/core/src/main/java/org/apache/carbondata/core/datastore/TableSpec.java
@@ -50,6 +50,12 @@ public class TableSpec {
 
   private CarbonTable carbonTable;
 
+  private boolean isUpdateDictDim;
+
+  private boolean isUpdateNoDictDims;
+  private int[] dictDimActualPosition;
+  private int[] noDictDimActualPosition;
+
   public TableSpec(CarbonTable carbonTable) {
     this.carbonTable = carbonTable;
     List<CarbonDimension> dimensions =
@@ -71,10 +77,12 @@ public class TableSpec {
   }
 
   private void addDimensions(List<CarbonDimension> dimensions) {
-    List<DimensionSpec> sortDimSpec = new ArrayList<>();
-    List<DimensionSpec> noSortDimSpec = new ArrayList<>();
+    List<DimensionSpec> dictSortDimSpec = new ArrayList<>();
+    List<DimensionSpec> noSortDictDimSpec = new ArrayList<>();
     List<DimensionSpec> noSortNoDictDimSpec = new ArrayList<>();
-    List<DimensionSpec> sortNoDictDimSpec = new ArrayList<>();
+    List<DimensionSpec> noDictSortDimSpec = new ArrayList<>();
+    List<DimensionSpec> dictDimensionSpec = new ArrayList<>();
+    int dimIndex = 0;
     DimensionSpec spec;
     short actualPosition = 0;
     // sort step's output is based on sort column order i.e sort columns data will be present
@@ -83,40 +91,61 @@ public class TableSpec {
       CarbonDimension dimension = dimensions.get(i);
       if (dimension.isComplex()) {
         spec = new DimensionSpec(ColumnType.COMPLEX, dimension, actualPosition++);
+        dimensionSpec[dimIndex++] = spec;
+        noDictionaryDimensionSpec.add(spec);
+        noSortNoDictDimSpec.add(spec);
       } else if (dimension.getDataType() == DataTypes.TIMESTAMP && !dimension
           .isDirectDictionaryEncoding()) {
         spec = new DimensionSpec(ColumnType.PLAIN_VALUE, dimension, actualPosition++);
+        dimensionSpec[dimIndex++] = spec;
+        noDictionaryDimensionSpec.add(spec);
+        if (dimension.isSortColumn()) {
+          noDictSortDimSpec.add(spec);
+        } else {
+          noSortNoDictDimSpec.add(spec);
+        }
       } else if (dimension.isDirectDictionaryEncoding()) {
         spec = new DimensionSpec(ColumnType.DIRECT_DICTIONARY, dimension, actualPosition++);
+        dimensionSpec[dimIndex++] = spec;
+        dictDimensionSpec.add(spec);
+        if (dimension.isSortColumn()) {
+          dictSortDimSpec.add(spec);
+        } else {
+          noSortDictDimSpec.add(spec);
+        }
       } else if (dimension.isGlobalDictionaryEncoding()) {
         spec = new DimensionSpec(ColumnType.GLOBAL_DICTIONARY, dimension, actualPosition++);
-      } else {
-        spec = new DimensionSpec(ColumnType.PLAIN_VALUE, dimension, actualPosition++);
-      }
-      if (dimension.isSortColumn()) {
-        sortDimSpec.add(spec);
-        if (!dimension.isDirectDictionaryEncoding() && !dimension.isGlobalDictionaryEncoding()
-            || spec.getColumnType() == ColumnType.COMPLEX) {
-          sortNoDictDimSpec.add(spec);
+        dimensionSpec[dimIndex++] = spec;
+        dictDimensionSpec.add(spec);
+        if (dimension.isSortColumn()) {
+          dictSortDimSpec.add(spec);
+        } else {
+          noSortDictDimSpec.add(spec);
         }
       } else {
-        noSortDimSpec.add(spec);
-        if (!dimension.isDirectDictionaryEncoding() && !dimension.isGlobalDictionaryEncoding()
-            || spec.getColumnType() == ColumnType.COMPLEX) {
+        spec = new DimensionSpec(ColumnType.PLAIN_VALUE, dimension, actualPosition++);
+        dimensionSpec[dimIndex++] = spec;
+        noDictionaryDimensionSpec.add(spec);
+        if (dimension.isSortColumn()) {
+          noDictSortDimSpec.add(spec);
+        } else {
           noSortNoDictDimSpec.add(spec);
         }
       }
     }
-    // combine the result
-    final DimensionSpec[] sortDimensionSpecs =
-        sortDimSpec.toArray(new DimensionSpec[sortDimSpec.size()]);
-    final DimensionSpec[] noSortDimensionSpecs =
-        noSortDimSpec.toArray(new DimensionSpec[noSortDimSpec.size()]);
-    System.arraycopy(sortDimensionSpecs, 0, dimensionSpec, 0, sortDimensionSpecs.length);
-    System.arraycopy(noSortDimensionSpecs, 0, dimensionSpec, sortDimensionSpecs.length,
-        noSortDimensionSpecs.length);
-    noDictionaryDimensionSpec.addAll(sortNoDictDimSpec);
-    noDictionaryDimensionSpec.addAll(noSortNoDictDimSpec);
+    noDictSortDimSpec.addAll(noSortNoDictDimSpec);
+    dictSortDimSpec.addAll(noSortDictDimSpec);
+
+    this.dictDimActualPosition = new int[dictSortDimSpec.size()];
+    this.noDictDimActualPosition = new int[noDictSortDimSpec.size()];
+    for (int i = 0; i < dictDimActualPosition.length; i++) {
+      dictDimActualPosition[i] = dictSortDimSpec.get(i).getActualPostion();
+    }
+    for (int i = 0; i < noDictDimActualPosition.length; i++) {
+      noDictDimActualPosition[i] = noDictSortDimSpec.get(i).getActualPostion();
+    }
+    isUpdateNoDictDims = !noDictSortDimSpec.equals(noDictionaryDimensionSpec);
+    isUpdateDictDim = !dictSortDimSpec.equals(dictDimensionSpec);
   }
 
   private void addMeasures(List<CarbonMeasure> measures) {
@@ -126,6 +155,22 @@ public class TableSpec {
     }
   }
 
+  public int[] getDictDimActualPosition() {
+    return dictDimActualPosition;
+  }
+
+  public int[] getNoDictDimActualPosition() {
+    return noDictDimActualPosition;
+  }
+
+  public boolean isUpdateDictDim() {
+    return isUpdateDictDim;
+  }
+
+  public boolean isUpdateNoDictDims() {
+    return isUpdateNoDictDims;
+  }
+
   /**
    * No dictionary and complex dimensions of the table
    *
diff --git a/processing/src/main/java/org/apache/carbondata/processing/loading/row/IntermediateSortTempRow.java b/processing/src/main/java/org/apache/carbondata/processing/loading/row/IntermediateSortTempRow.java
index 844e45e..0207752 100644
--- a/processing/src/main/java/org/apache/carbondata/processing/loading/row/IntermediateSortTempRow.java
+++ b/processing/src/main/java/org/apache/carbondata/processing/loading/row/IntermediateSortTempRow.java
@@ -64,4 +64,12 @@ public class IntermediateSortTempRow {
   public byte[] getNoSortDimsAndMeasures() {
     return noSortDimsAndMeasures;
   }
+
+  public void setNoDictData(Object[] noDictSortDims) {
+    this.noDictSortDims = noDictSortDims;
+  }
+
+  public void setDictData(int[] dictData) {
+    this.dictSortDims = dictData;
+  }
 }
diff --git a/processing/src/main/java/org/apache/carbondata/processing/loading/sort/SortStepRowHandler.java b/processing/src/main/java/org/apache/carbondata/processing/loading/sort/SortStepRowHandler.java
index fa12dcc..8a0f8ea 100644
--- a/processing/src/main/java/org/apache/carbondata/processing/loading/sort/SortStepRowHandler.java
+++ b/processing/src/main/java/org/apache/carbondata/processing/loading/sort/SortStepRowHandler.java
@@ -35,6 +35,7 @@ import org.apache.carbondata.core.util.DataTypeUtil;
 import org.apache.carbondata.core.util.NonDictionaryUtil;
 import org.apache.carbondata.core.util.ReUsableByteArrayDataOutputStream;
 import org.apache.carbondata.processing.loading.row.IntermediateSortTempRow;
+import org.apache.carbondata.processing.sort.SortTempRowUpdater;
 import org.apache.carbondata.processing.sort.sortdata.SortParameters;
 import org.apache.carbondata.processing.sort.sortdata.TableFieldStat;
 
@@ -78,6 +79,8 @@ public class SortStepRowHandler implements Serializable {
 
   private boolean[] noDictNoSortColMapping;
 
+  private SortTempRowUpdater sortTempRowUpdater;
+
   /**
    * constructor
    * @param tableFieldStat table field stat
@@ -108,6 +111,7 @@ public class SortStepRowHandler implements Serializable {
     for (int i = 0; i < noDictNoSortDataTypes.length; i++) {
       noDictNoSortColMapping[i] = DataTypeUtil.isPrimitiveColumn(noDictNoSortDataTypes[i]);
     }
+    this.sortTempRowUpdater = tableFieldStat.getSortTempRowUpdater();
   }
 
   /**
@@ -167,8 +171,7 @@ public class SortStepRowHandler implements Serializable {
       for (int idx = 0; idx < this.measureCnt; idx++) {
         measures[idx] = row[this.measureIdx[idx]];
       }
-
-      NonDictionaryUtil.prepareOutObj(holder, dictDims, nonDictArray, measures);
+      sortTempRowUpdater.updateOutputRow(holder, dictDims, nonDictArray, measures);
     } catch (Exception e) {
       throw new RuntimeException("Problem while converting row to 3 parts", e);
     }
diff --git a/processing/src/main/java/org/apache/carbondata/processing/loading/sort/unsafe/UnsafeCarbonRowPage.java b/processing/src/main/java/org/apache/carbondata/processing/loading/sort/unsafe/UnsafeCarbonRowPage.java
index 21403b0..6cf1a25 100644
--- a/processing/src/main/java/org/apache/carbondata/processing/loading/sort/unsafe/UnsafeCarbonRowPage.java
+++ b/processing/src/main/java/org/apache/carbondata/processing/loading/sort/unsafe/UnsafeCarbonRowPage.java
@@ -28,6 +28,7 @@ import org.apache.carbondata.core.memory.UnsafeSortMemoryManager;
 import org.apache.carbondata.core.util.ReUsableByteArrayDataOutputStream;
 import org.apache.carbondata.processing.loading.row.IntermediateSortTempRow;
 import org.apache.carbondata.processing.loading.sort.SortStepRowHandler;
+import org.apache.carbondata.processing.sort.SortTempRowUpdater;
 import org.apache.carbondata.processing.sort.sortdata.TableFieldStat;
 
 /**
@@ -50,6 +51,8 @@ public class UnsafeCarbonRowPage {
   private SortStepRowHandler sortStepRowHandler;
   private boolean convertNoSortFields;
 
+  private SortTempRowUpdater sortTempRowUpdater;
+
   public UnsafeCarbonRowPage(TableFieldStat tableFieldStat, MemoryBlock memoryBlock,
       String taskId) {
     this.tableFieldStat = tableFieldStat;
@@ -60,6 +63,7 @@ public class UnsafeCarbonRowPage {
     // TODO Only using 98% of space for safe side.May be we can have different logic.
     sizeToBeUsed = dataBlock.size() - (dataBlock.size() * 5) / 100;
     this.managerType = MemoryManagerType.UNSAFE_MEMORY_MANAGER;
+    this.sortTempRowUpdater = tableFieldStat.getSortTempRowUpdater();
   }
 
   public int addRow(Object[] row,
@@ -93,8 +97,10 @@ public class UnsafeCarbonRowPage {
    */
   public IntermediateSortTempRow getRow(long address) {
     if (convertNoSortFields) {
-      return sortStepRowHandler
+      IntermediateSortTempRow intermediateSortTempRow = sortStepRowHandler
           .readRowFromMemoryWithNoSortFieldConvert(dataBlock.getBaseObject(), address);
+      this.sortTempRowUpdater.updateSortTempRow(intermediateSortTempRow);
+      return intermediateSortTempRow;
     } else {
       return sortStepRowHandler
           .readFromMemoryWithoutNoSortFieldConvert(dataBlock.getBaseObject(), address);
diff --git a/processing/src/main/java/org/apache/carbondata/processing/loading/sort/unsafe/holder/UnsafeSortTempFileChunkHolder.java b/processing/src/main/java/org/apache/carbondata/processing/loading/sort/unsafe/holder/UnsafeSortTempFileChunkHolder.java
index 04cab70..7fcfc0e 100644
--- a/processing/src/main/java/org/apache/carbondata/processing/loading/sort/unsafe/holder/UnsafeSortTempFileChunkHolder.java
+++ b/processing/src/main/java/org/apache/carbondata/processing/loading/sort/unsafe/holder/UnsafeSortTempFileChunkHolder.java
@@ -34,6 +34,7 @@ import org.apache.carbondata.core.util.CarbonProperties;
 import org.apache.carbondata.core.util.CarbonUtil;
 import org.apache.carbondata.processing.loading.row.IntermediateSortTempRow;
 import org.apache.carbondata.processing.loading.sort.SortStepRowHandler;
+import org.apache.carbondata.processing.sort.SortTempRowUpdater;
 import org.apache.carbondata.processing.sort.exception.CarbonSortKeyAndGroupByException;
 import org.apache.carbondata.processing.sort.sortdata.IntermediateSortTempRowComparator;
 import org.apache.carbondata.processing.sort.sortdata.SortParameters;
@@ -98,6 +99,8 @@ public class UnsafeSortTempFileChunkHolder implements SortTempChunkHolder {
   private SortStepRowHandler sortStepRowHandler;
   private Comparator<IntermediateSortTempRow> comparator;
   private boolean convertNoSortFields;
+
+  private SortTempRowUpdater sortTempRowUpdater;
   /**
    * Constructor to initialize
    */
@@ -113,6 +116,7 @@ public class UnsafeSortTempFileChunkHolder implements SortTempChunkHolder {
     comparator = new IntermediateSortTempRowComparator(parameters.getNoDictionarySortColumn(),
         parameters.getNoDictDataType());
     this.convertNoSortFields = convertNoSortFields;
+    this.sortTempRowUpdater = tableFieldStat.getSortTempRowUpdater();
     initialize();
   }
 
@@ -168,7 +172,11 @@ public class UnsafeSortTempFileChunkHolder implements SortTempChunkHolder {
     } else {
       try {
         if (convertNoSortFields) {
-          this.returnRow = sortStepRowHandler.readWithNoSortFieldConvert(stream);
+          IntermediateSortTempRow intermediateSortTempRow =
+              sortStepRowHandler.readWithNoSortFieldConvert(stream);
+          sortTempRowUpdater
+              .updateSortTempRow(intermediateSortTempRow);
+          this.returnRow = intermediateSortTempRow;
         } else {
           this.returnRow = sortStepRowHandler.readWithoutNoSortFieldConvert(stream);
         }
@@ -220,7 +228,11 @@ public class UnsafeSortTempFileChunkHolder implements SortTempChunkHolder {
     IntermediateSortTempRow[] holders = new IntermediateSortTempRow[expected];
     for (int i = 0; i < expected; i++) {
       if (convertNoSortFields) {
-        holders[i] = sortStepRowHandler.readWithNoSortFieldConvert(stream);
+        IntermediateSortTempRow intermediateSortTempRow =
+            sortStepRowHandler.readWithNoSortFieldConvert(stream);
+        sortTempRowUpdater
+            .updateSortTempRow(intermediateSortTempRow);
+        holders[i] = intermediateSortTempRow;
       } else {
         holders[i] = sortStepRowHandler.readWithoutNoSortFieldConvert(stream);
       }
diff --git a/processing/src/main/java/org/apache/carbondata/processing/sort/DummyRowUpdater.java b/processing/src/main/java/org/apache/carbondata/processing/sort/DummyRowUpdater.java
new file mode 100644
index 0000000..f86a89c
--- /dev/null
+++ b/processing/src/main/java/org/apache/carbondata/processing/sort/DummyRowUpdater.java
@@ -0,0 +1,40 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.processing.sort;
+
+import org.apache.carbondata.core.datastore.row.WriteStepRowUtil;
+import org.apache.carbondata.processing.loading.row.IntermediateSortTempRow;
+
+/**
+ * This class will be used when the order is not changed, so there is no need to update the row
+ */
+public class DummyRowUpdater implements SortTempRowUpdater {
+
+  private static final long serialVersionUID = 5989093890994039617L;
+
+  @Override public void updateSortTempRow(IntermediateSortTempRow intermediateSortTempRow) {
+    // DO NOTHING
+  }
+
+  @Override public void updateOutputRow(Object[] out, int[] dimArray, Object[] noDictArray,
+      Object[] measureArray) {
+    out[WriteStepRowUtil.DICTIONARY_DIMENSION] = dimArray;
+    out[WriteStepRowUtil.NO_DICTIONARY_AND_COMPLEX] = noDictArray;
+    out[WriteStepRowUtil.MEASURE] = measureArray;
+  }
+}
diff --git a/processing/src/main/java/org/apache/carbondata/processing/sort/SchemaBasedRowUpdater.java b/processing/src/main/java/org/apache/carbondata/processing/sort/SchemaBasedRowUpdater.java
new file mode 100644
index 0000000..2fca803
--- /dev/null
+++ b/processing/src/main/java/org/apache/carbondata/processing/sort/SchemaBasedRowUpdater.java
@@ -0,0 +1,91 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.processing.sort;
+
+import org.apache.carbondata.core.datastore.row.WriteStepRowUtil;
+import org.apache.carbondata.processing.loading.row.IntermediateSortTempRow;
+
+/**
+ * Below class will be used to update the sort output row based on schema order during final merge.
+ * This is required because in older versions (e.g. 1.1) alter add column was supported
+ * only with sort columns, and the sort step returns the data based on
+ * sort column order (sort columns first), whereas the writer step understands the format based on
+ * schema order, so we need to rearrange it based on schema order.
+ */
+public class SchemaBasedRowUpdater implements SortTempRowUpdater {
+
+  private static final long serialVersionUID = -8864989617597611912L;
+
+  private boolean isUpdateDictDims;
+
+  private boolean isUpdateNonDictDims;
+
+  private int[] dictDimActualPosition;
+
+  private int[] noDictActualPosition;
+
+  public SchemaBasedRowUpdater(int[] dictDimActualPosition, int[] noDictActualPosition,
+      boolean isUpdateDictDims, boolean isUpdateNonDictDims) {
+    this.dictDimActualPosition = dictDimActualPosition;
+    this.noDictActualPosition = noDictActualPosition;
+    this.isUpdateDictDims = isUpdateDictDims;
+    this.isUpdateNonDictDims = isUpdateNonDictDims;
+  }
+
+  @Override public void updateSortTempRow(IntermediateSortTempRow intermediateSortTempRow) {
+    int[] dictSortDims = intermediateSortTempRow.getDictSortDims();
+    if (isUpdateDictDims) {
+      int[] dimArrayNew = new int[intermediateSortTempRow.getDictSortDims().length];
+      for (int i = 0; i < dictSortDims.length; i++) {
+        dimArrayNew[dictDimActualPosition[i]] = dictSortDims[i];
+      }
+      dictSortDims = dimArrayNew;
+    }
+    Object[] noDictSortDims = intermediateSortTempRow.getNoDictSortDims();
+    if (isUpdateNonDictDims) {
+      Object[] noDictArrayNew = new Object[noDictSortDims.length];
+      for (int i = 0; i < noDictArrayNew.length; i++) {
+        noDictArrayNew[noDictActualPosition[i]] = noDictSortDims[i];
+      }
+      noDictSortDims = noDictArrayNew;
+    }
+    intermediateSortTempRow.setDictData(dictSortDims);
+    intermediateSortTempRow.setNoDictData(noDictSortDims);
+  }
+
+  @Override public void updateOutputRow(Object[] out, int[] dimArray, Object[] noDictArray,
+      Object[] measureArray) {
+    if (isUpdateDictDims) {
+      int[] dimArrayNew = new int[dimArray.length];
+      for (int i = 0; i < dimArray.length; i++) {
+        dimArrayNew[dictDimActualPosition[i]] = dimArray[i];
+      }
+      dimArray = dimArrayNew;
+    }
+    if (isUpdateNonDictDims) {
+      Object[] noDictArrayNew = new Object[noDictArray.length];
+      for (int i = 0; i < noDictArrayNew.length; i++) {
+        noDictArrayNew[noDictActualPosition[i]] = noDictArray[i];
+      }
+      noDictArray = noDictArrayNew;
+    }
+    out[WriteStepRowUtil.DICTIONARY_DIMENSION] = dimArray;
+    out[WriteStepRowUtil.NO_DICTIONARY_AND_COMPLEX] = noDictArray;
+    out[WriteStepRowUtil.MEASURE] = measureArray;
+  }
+}
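
To make the reordering above concrete, here is a minimal, self-contained sketch of how an actual-position array maps values from sort-column order back to schema order. The array contents are invented for illustration; only the assignment pattern mirrors SchemaBasedRowUpdater.updateOutputRow.

// Illustration only: dictDimActualPosition[i] is the schema-order slot for the
// dictionary value currently sitting at sort-order index i.
public class ActualPositionSketch {
  public static void main(String[] args) {
    int[] sortOrderDims = {10, 20, 30};        // dictionary values as the sort step emits them
    int[] dictDimActualPosition = {2, 0, 1};   // assumed mapping: sort index -> schema position

    int[] schemaOrderDims = new int[sortOrderDims.length];
    for (int i = 0; i < sortOrderDims.length; i++) {
      schemaOrderDims[dictDimActualPosition[i]] = sortOrderDims[i];
    }
    // prints [20, 30, 10]: the writer step can now read the row in schema order
    System.out.println(java.util.Arrays.toString(schemaOrderDims));
  }
}

The same pattern is applied to the no-dictionary array, and DummyRowUpdater is used when no reordering is needed.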
diff --git a/processing/src/main/java/org/apache/carbondata/processing/sort/SortTempRowUpdater.java b/processing/src/main/java/org/apache/carbondata/processing/sort/SortTempRowUpdater.java
new file mode 100644
index 0000000..7b3fd4b
--- /dev/null
+++ b/processing/src/main/java/org/apache/carbondata/processing/sort/SortTempRowUpdater.java
@@ -0,0 +1,40 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.processing.sort;
+
+import java.io.Serializable;
+
+import org.apache.carbondata.processing.loading.row.IntermediateSortTempRow;
+
+/**
+ * Below interface is used to update the sort output row to schema order during the final merge.
+ * This is required because in older versions (e.g. 1.1) alter add column was supported
+ * only with sort columns, and the sort step returns data in
+ * sort column order (sort columns first), whereas the writer step expects the row
+ * in schema order, so the row must be rearranged accordingly.
+ */
+public interface SortTempRowUpdater extends Serializable {
+
+  /**
+   * @param intermediateSortTempRow row read from the sort temp file, in sort column order
+   */
+  void updateSortTempRow(IntermediateSortTempRow intermediateSortTempRow);
+
+  void updateOutputRow(Object[] out, int[] dimArray,
+      Object[] noDictArray, Object[] measureArray);
+}
diff --git a/processing/src/main/java/org/apache/carbondata/processing/sort/sortdata/SortParameters.java b/processing/src/main/java/org/apache/carbondata/processing/sort/sortdata/SortParameters.java
index 6fec8dc..ffc7416 100644
--- a/processing/src/main/java/org/apache/carbondata/processing/sort/sortdata/SortParameters.java
+++ b/processing/src/main/java/org/apache/carbondata/processing/sort/sortdata/SortParameters.java
@@ -22,6 +22,7 @@ import java.util.Map;
 
 import org.apache.carbondata.common.logging.LogServiceFactory;
 import org.apache.carbondata.core.constants.CarbonCommonConstants;
+import org.apache.carbondata.core.datastore.TableSpec;
 import org.apache.carbondata.core.metadata.CarbonTableIdentifier;
 import org.apache.carbondata.core.metadata.datatype.DataType;
 import org.apache.carbondata.core.metadata.schema.table.CarbonTable;
@@ -144,6 +145,14 @@ public class SortParameters implements Serializable {
    */
   private CarbonTable carbonTable;
 
+  private boolean isUpdateDictDims;
+
+  private boolean isUpdateNonDictDims;
+
+  private int[] dictDimActualPosition;
+
+  private int[] noDictActualPosition;
+
   public SortParameters getCopy() {
     SortParameters parameters = new SortParameters();
     parameters.tempFileLocation = tempFileLocation;
@@ -178,6 +187,10 @@ public class SortParameters implements Serializable {
     parameters.batchSortSizeinMb = batchSortSizeinMb;
     parameters.rangeId = rangeId;
     parameters.carbonTable = carbonTable;
+    parameters.isUpdateDictDims = isUpdateDictDims;
+    parameters.isUpdateNonDictDims = isUpdateNonDictDims;
+    parameters.dictDimActualPosition = dictDimActualPosition;
+    parameters.noDictActualPosition = noDictActualPosition;
     return parameters;
   }
 
@@ -473,6 +486,10 @@ public class SortParameters implements Serializable {
         .getNoDictSortAndNoSortDataTypes(configuration.getTableSpec().getCarbonTable());
     parameters.setNoDictSortDataType(noDictSortAndNoSortDataTypes.get("noDictSortDataTypes"));
     parameters.setNoDictNoSortDataType(noDictSortAndNoSortDataTypes.get("noDictNoSortDataTypes"));
+    parameters.setNoDictActualPosition(configuration.getTableSpec().getNoDictDimActualPosition());
+    parameters.setDictDimActualPosition(configuration.getTableSpec().getDictDimActualPosition());
+    parameters.setUpdateDictDims(configuration.getTableSpec().isUpdateDictDim());
+    parameters.setUpdateNonDictDims(configuration.getTableSpec().isUpdateNoDictDims());
     return parameters;
   }
 
@@ -556,6 +573,11 @@ public class SortParameters implements Serializable {
     parameters.setNoDictNoSortDataType(noDictSortAndNoSortDataTypes.get("noDictNoSortDataTypes"));
     parameters.setNoDictionarySortColumn(CarbonDataProcessorUtil
         .getNoDictSortColMapping(parameters.getCarbonTable()));
+    TableSpec tableSpec = new TableSpec(carbonTable);
+    parameters.setNoDictActualPosition(tableSpec.getNoDictDimActualPosition());
+    parameters.setDictDimActualPosition(tableSpec.getDictDimActualPosition());
+    parameters.setUpdateDictDims(tableSpec.isUpdateDictDim());
+    parameters.setUpdateNonDictDims(tableSpec.isUpdateNoDictDims());
     return parameters;
   }
 
@@ -590,4 +612,36 @@ public class SortParameters implements Serializable {
   public void setSortColumn(boolean[] sortColumn) {
     this.sortColumn = sortColumn;
   }
+
+  public boolean isUpdateDictDims() {
+    return isUpdateDictDims;
+  }
+
+  public void setUpdateDictDims(boolean updateDictDims) {
+    isUpdateDictDims = updateDictDims;
+  }
+
+  public boolean isUpdateNonDictDims() {
+    return isUpdateNonDictDims;
+  }
+
+  public void setUpdateNonDictDims(boolean updateNonDictDims) {
+    isUpdateNonDictDims = updateNonDictDims;
+  }
+
+  public int[] getDictDimActualPosition() {
+    return dictDimActualPosition;
+  }
+
+  public void setDictDimActualPosition(int[] dictDimActualPosition) {
+    this.dictDimActualPosition = dictDimActualPosition;
+  }
+
+  public int[] getNoDictActualPosition() {
+    return noDictActualPosition;
+  }
+
+  public void setNoDictActualPosition(int[] noDictActualPosition) {
+    this.noDictActualPosition = noDictActualPosition;
+  }
 }
diff --git a/processing/src/main/java/org/apache/carbondata/processing/sort/sortdata/SortTempFileChunkHolder.java b/processing/src/main/java/org/apache/carbondata/processing/sort/sortdata/SortTempFileChunkHolder.java
index 2ae90fa..9e9bac1 100644
--- a/processing/src/main/java/org/apache/carbondata/processing/sort/sortdata/SortTempFileChunkHolder.java
+++ b/processing/src/main/java/org/apache/carbondata/processing/sort/sortdata/SortTempFileChunkHolder.java
@@ -35,6 +35,7 @@ import org.apache.carbondata.core.util.CarbonThreadFactory;
 import org.apache.carbondata.core.util.CarbonUtil;
 import org.apache.carbondata.processing.loading.row.IntermediateSortTempRow;
 import org.apache.carbondata.processing.loading.sort.SortStepRowHandler;
+import org.apache.carbondata.processing.sort.SortTempRowUpdater;
 import org.apache.carbondata.processing.sort.exception.CarbonSortKeyAndGroupByException;
 
 import org.apache.log4j.Logger;
@@ -46,6 +47,7 @@ public class SortTempFileChunkHolder implements Comparable<SortTempFileChunkHold
    */
   private static final Logger LOGGER =
       LogServiceFactory.getLogService(SortTempFileChunkHolder.class.getName());
+  private SortTempRowUpdater sortTempRowUpdater;
 
   /**
    * temp file
@@ -106,6 +108,7 @@ public class SortTempFileChunkHolder implements Comparable<SortTempFileChunkHold
     this.comparator =
         new IntermediateSortTempRowComparator(tableFieldStat.getIsSortColNoDictFlags(),
             tableFieldStat.getNoDictDataType());
+    this.sortTempRowUpdater = tableFieldStat.getSortTempRowUpdater();
   }
 
   /**
@@ -180,7 +183,11 @@ public class SortTempFileChunkHolder implements Comparable<SortTempFileChunkHold
     } else {
       try {
         if (convertToActualField) {
-          this.returnRow = sortStepRowHandler.readWithNoSortFieldConvert(stream);
+          IntermediateSortTempRow intermediateSortTempRow =
+              sortStepRowHandler.readWithNoSortFieldConvert(stream);
+          this.sortTempRowUpdater
+              .updateSortTempRow(intermediateSortTempRow);
+          this.returnRow = intermediateSortTempRow;
         } else {
           this.returnRow = sortStepRowHandler.readWithoutNoSortFieldConvert(stream);
         }
@@ -230,7 +237,11 @@ public class SortTempFileChunkHolder implements Comparable<SortTempFileChunkHold
     IntermediateSortTempRow[] holders = new IntermediateSortTempRow[expected];
     for (int i = 0; i < expected; i++) {
       if (convertToActualField) {
-        holders[i] = sortStepRowHandler.readWithNoSortFieldConvert(stream);
+        IntermediateSortTempRow intermediateSortTempRow =
+            sortStepRowHandler.readWithNoSortFieldConvert(stream);
+        this.sortTempRowUpdater
+            .updateSortTempRow(intermediateSortTempRow);
+        holders[i] = intermediateSortTempRow;
       } else {
         holders[i] = sortStepRowHandler.readWithoutNoSortFieldConvert(stream);
       }
diff --git a/processing/src/main/java/org/apache/carbondata/processing/sort/sortdata/TableFieldStat.java b/processing/src/main/java/org/apache/carbondata/processing/sort/sortdata/TableFieldStat.java
index ef92bbc..9553bc9 100644
--- a/processing/src/main/java/org/apache/carbondata/processing/sort/sortdata/TableFieldStat.java
+++ b/processing/src/main/java/org/apache/carbondata/processing/sort/sortdata/TableFieldStat.java
@@ -24,6 +24,9 @@ import java.util.Objects;
 import org.apache.carbondata.core.metadata.datatype.DataType;
 import org.apache.carbondata.core.metadata.encoder.Encoding;
 import org.apache.carbondata.core.metadata.schema.table.column.CarbonDimension;
+import org.apache.carbondata.processing.sort.DummyRowUpdater;
+import org.apache.carbondata.processing.sort.SchemaBasedRowUpdater;
+import org.apache.carbondata.processing.sort.SortTempRowUpdater;
 
 /**
  * This class is used to hold field information for a table during data loading. These information
@@ -66,6 +69,8 @@ public class TableFieldStat implements Serializable {
   // indices for measure columns
   private int[] measureIdx;
 
+  private SortTempRowUpdater sortTempRowUpdater;
+
   public TableFieldStat(SortParameters sortParameters) {
     int noDictDimCnt = sortParameters.getNoDictionaryCount();
     int dictDimCnt = sortParameters.getDimColCount() - noDictDimCnt;
@@ -141,6 +146,13 @@ public class TableFieldStat implements Serializable {
     for (int i = 0; i < measureCnt; i++) {
       measureIdx[i] = base + i;
     }
+    if (sortParameters.isUpdateDictDims() || sortParameters.isUpdateNonDictDims()) {
+      this.sortTempRowUpdater = new SchemaBasedRowUpdater(sortParameters.getDictDimActualPosition(),
+          sortParameters.getNoDictActualPosition(), sortParameters.isUpdateDictDims(),
+          sortParameters.isUpdateNonDictDims());
+    } else {
+      this.sortTempRowUpdater = new DummyRowUpdater();
+    }
   }
 
   public int getDictSortDimCnt() {
@@ -241,4 +253,8 @@ public class TableFieldStat implements Serializable {
   public DataType[] getNoDictDataType() {
     return noDictDataType;
   }
+
+  public SortTempRowUpdater getSortTempRowUpdater() {
+    return sortTempRowUpdater;
+  }
 }
\ No newline at end of file
diff --git a/processing/src/main/java/org/apache/carbondata/processing/store/TablePage.java b/processing/src/main/java/org/apache/carbondata/processing/store/TablePage.java
index 5687549..7cc8932 100644
--- a/processing/src/main/java/org/apache/carbondata/processing/store/TablePage.java
+++ b/processing/src/main/java/org/apache/carbondata/processing/store/TablePage.java
@@ -22,6 +22,7 @@ import java.io.DataOutputStream;
 import java.io.IOException;
 import java.nio.ByteBuffer;
 import java.util.ArrayList;
+import java.util.Arrays;
 import java.util.HashMap;
 import java.util.List;
 import java.util.Map;
@@ -392,14 +393,12 @@ public class TablePage {
   private EncodedColumnPage[] encodeAndCompressDimensions()
       throws KeyGenException, IOException, MemoryException {
     List<EncodedColumnPage> encodedDimensions = new ArrayList<>();
-    EncodedColumnPage[][] complexColumnPages =
-        new EncodedColumnPage[complexDimensionPages.length][];
+    List<EncodedColumnPage> encodedComplexDimensions = new ArrayList<>();
     TableSpec tableSpec = model.getTableSpec();
     int dictIndex = 0;
     int noDictIndex = 0;
     int complexDimIndex = 0;
     int numDimensions = tableSpec.getNumDimensions();
-    int totalComplexColumnSize = 0;
     for (int i = 0; i < numDimensions; i++) {
       ColumnPageEncoder columnPageEncoder;
       EncodedColumnPage encodedPage;
@@ -435,51 +434,17 @@ public class TablePage {
           break;
         case COMPLEX:
           EncodedColumnPage[] encodedPages = ColumnPageEncoder.encodeComplexColumn(
-              complexDimensionPages[complexDimIndex]);
-          complexColumnPages[complexDimIndex] = encodedPages;
-          totalComplexColumnSize += encodedPages.length;
-          complexDimIndex++;
+              complexDimensionPages[complexDimIndex++]);
+          encodedComplexDimensions.addAll(Arrays.asList(encodedPages));
           break;
         default:
           throw new IllegalArgumentException("unsupported dimension type:" + spec
               .getColumnType());
       }
     }
-    // below code is to combine the list based on actual order present in carbon table
-    // in case of older version(eg:1.1) alter add column was supported only with sort columns
-    // and sort step will return the data based on sort column order(sort columns first)
-    // so arranging the column pages based on schema is required otherwise query will
-    // either give wrong result(for string columns) or throw exception in case of non string
-    // column as reading is based on schema order
-    int complexEncodedPageIndex = 0;
-    int normalEncodedPageIndex  = 0;
-    int currentPosition = 0;
-    EncodedColumnPage[] combinedList =
-        new EncodedColumnPage[encodedDimensions.size() + totalComplexColumnSize];
-    for (int i = 0; i < numDimensions; i++) {
-      TableSpec.DimensionSpec spec = tableSpec.getDimensionSpec(i);
-      switch (spec.getColumnType()) {
-        case GLOBAL_DICTIONARY:
-        case DIRECT_DICTIONARY:
-        case PLAIN_VALUE:
-          // add the dimension based on actual postion
-          // current position is considered as complex column will have multiple children
-          combinedList[currentPosition + spec.getActualPostion()] =
-              encodedDimensions.get(normalEncodedPageIndex++);
-          break;
-        case COMPLEX:
-          EncodedColumnPage[] complexColumnPage = complexColumnPages[complexEncodedPageIndex++];
-          for (int j = 0; j < complexColumnPage.length; j++) {
-            combinedList[currentPosition + spec.getActualPostion() + j] = complexColumnPage[j];
-          }
-          // as for complex type 1 position is already considered, so subtract -1
-          currentPosition += complexColumnPage.length - 1;
-          break;
-        default:
-          throw new IllegalArgumentException("unsupported dimension type:" + spec.getColumnType());
-      }
-    }
-    return combinedList;
+
+    encodedDimensions.addAll(encodedComplexDimensions);
+    return encodedDimensions.toArray(new EncodedColumnPage[encodedDimensions.size()]);
   }
 
   /**


[carbondata] 39/41: [CARBONDATA-3332] Blocked concurrent compaction and update/delete

Posted by ra...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.5
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit 81e971465ae1f6d9a957f3c3f1f4363e4b837805
Author: kunal642 <ku...@gmail.com>
AuthorDate: Wed Mar 27 14:44:11 2019 +0530

    [CARBONDATA-3332] Blocked concurrent compaction and update/delete
    
    Problem:
    When update and compaction run concurrently, the update tries to modify the contents of segment 0 while compaction has already marked that segment as COMPACTED. The compacted segment is of no use to the update command, so looking up its segment info in the map throws a key-not-found error.
    
    Solution:
    Block concurrent update/delete and compaction operations.
    
    This closes #3166
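
As a rough Java sketch of the locking order this patch introduces (transliterated from the Scala commands changed below; the helper method and its parameters are illustrative, and the unlock handling is simplified, since the real command hands the compaction lock over to the compaction threads):

import org.apache.carbondata.core.exception.ConcurrentOperationException;
import org.apache.carbondata.core.locks.CarbonLockFactory;
import org.apache.carbondata.core.locks.ICarbonLock;
import org.apache.carbondata.core.locks.LockUsage;
import org.apache.carbondata.core.metadata.schema.table.CarbonTable;

public final class CompactionLockSketch {
  // Compaction takes the new table-level update lock before the compaction lock,
  // so an in-progress update/delete blocks compaction; update/delete take the
  // compaction lock in turn, so a running compaction blocks them as well.
  static void runCompactionGuarded(CarbonTable carbonTable, Runnable startCompaction)
      throws Exception {
    ICarbonLock updateLock = CarbonLockFactory.getCarbonLockObj(
        carbonTable.getAbsoluteTableIdentifier(), LockUsage.UPDATE_LOCK);
    ICarbonLock compactionLock = CarbonLockFactory.getCarbonLockObj(
        carbonTable.getAbsoluteTableIdentifier(), LockUsage.COMPACTION_LOCK);
    try {
      if (!updateLock.lockWithRetries(3, 3)) {
        // fail fast instead of compacting segments that an update is rewriting
        throw new ConcurrentOperationException(carbonTable, "update", "compaction");
      }
      if (compactionLock.lockWithRetries()) {
        startCompaction.run();
      }
    } finally {
      // simplified: release both locks here
      compactionLock.unlock();
      updateLock.unlock();
    }
  }
}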
---
 .../apache/carbondata/core/locks/LockUsage.java    |  1 +
 .../spark/rdd/CarbonDataRDDFactory.scala           | 48 +++++++-----
 .../CarbonAlterTableCompactionCommand.scala        | 55 +++++++------
 .../mutation/CarbonProjectForDeleteCommand.scala   | 18 ++++-
 .../mutation/CarbonProjectForUpdateCommand.scala   | 90 ++++++++++++----------
 5 files changed, 123 insertions(+), 89 deletions(-)

diff --git a/core/src/main/java/org/apache/carbondata/core/locks/LockUsage.java b/core/src/main/java/org/apache/carbondata/core/locks/LockUsage.java
index b16c3f1..14907c5 100644
--- a/core/src/main/java/org/apache/carbondata/core/locks/LockUsage.java
+++ b/core/src/main/java/org/apache/carbondata/core/locks/LockUsage.java
@@ -36,5 +36,6 @@ public class LockUsage {
   public static final String STREAMING_LOCK = "streaming.lock";
   public static final String DATAMAP_STATUS_LOCK = "datamapstatus.lock";
   public static final String CONCURRENT_LOAD_LOCK = "concurrentload.lock";
+  public static final String UPDATE_LOCK = "update.lock";
 
 }
diff --git a/integration/spark2/src/main/scala/org/apache/carbondata/spark/rdd/CarbonDataRDDFactory.scala b/integration/spark2/src/main/scala/org/apache/carbondata/spark/rdd/CarbonDataRDDFactory.scala
index 8268379..8a04887 100644
--- a/integration/spark2/src/main/scala/org/apache/carbondata/spark/rdd/CarbonDataRDDFactory.scala
+++ b/integration/spark2/src/main/scala/org/apache/carbondata/spark/rdd/CarbonDataRDDFactory.scala
@@ -54,6 +54,7 @@ import org.apache.carbondata.core.datastore.compression.CompressorFactory
 import org.apache.carbondata.core.datastore.filesystem.CarbonFile
 import org.apache.carbondata.core.datastore.impl.FileFactory
 import org.apache.carbondata.core.dictionary.server.DictionaryServer
+import org.apache.carbondata.core.exception.ConcurrentOperationException
 import org.apache.carbondata.core.locks.{CarbonLockFactory, ICarbonLock, LockUsage}
 import org.apache.carbondata.core.metadata.{CarbonTableIdentifier, ColumnarFormatVersion, SegmentFileStore}
 import org.apache.carbondata.core.metadata.datatype.DataTypes
@@ -867,27 +868,34 @@ object CarbonDataRDDFactory {
         val lock = CarbonLockFactory.getCarbonLockObj(
           carbonTable.getAbsoluteTableIdentifier,
           LockUsage.COMPACTION_LOCK)
-
-        if (lock.lockWithRetries()) {
-          LOGGER.info("Acquired the compaction lock.")
-          try {
-            startCompactionThreads(sqlContext,
-              carbonLoadModel,
-              storeLocation,
-              compactionModel,
-              lock,
-              compactedSegments,
-              operationContext
-            )
-          } catch {
-            case e: Exception =>
-              LOGGER.error(s"Exception in start compaction thread. ${ e.getMessage }")
-              lock.unlock()
-              throw e
+        val updateLock = CarbonLockFactory.getCarbonLockObj(carbonTable
+          .getAbsoluteTableIdentifier, LockUsage.UPDATE_LOCK)
+        try {
+          if (updateLock.lockWithRetries(3, 3)) {
+            if (lock.lockWithRetries()) {
+              LOGGER.info("Acquired the compaction lock.")
+              startCompactionThreads(sqlContext,
+                carbonLoadModel,
+                storeLocation,
+                compactionModel,
+                lock,
+                compactedSegments,
+                operationContext
+              )
+            } else {
+              LOGGER.error("Not able to acquire the compaction lock for table " +
+                           s"${ carbonLoadModel.getDatabaseName }.${ carbonLoadModel.getTableName}")
+            }
+          } else {
+            throw new ConcurrentOperationException(carbonTable, "update", "compaction")
           }
-        } else {
-          LOGGER.error("Not able to acquire the compaction lock for table " +
-                       s"${ carbonLoadModel.getDatabaseName }.${ carbonLoadModel.getTableName}")
+        } catch {
+          case e: Exception =>
+            LOGGER.error(s"Exception in start compaction thread.", e)
+            lock.unlock()
+            throw e
+        } finally {
+          updateLock.unlock()
         }
       }
     }
diff --git a/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonAlterTableCompactionCommand.scala b/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonAlterTableCompactionCommand.scala
index 419fa16..9d8bf90 100644
--- a/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonAlterTableCompactionCommand.scala
+++ b/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonAlterTableCompactionCommand.scala
@@ -288,31 +288,38 @@ case class CarbonAlterTableCompactionCommand(
       val lock = CarbonLockFactory.getCarbonLockObj(
         carbonTable.getAbsoluteTableIdentifier,
         LockUsage.COMPACTION_LOCK)
-
-      if (lock.lockWithRetries()) {
-        LOGGER.info("Acquired the compaction lock for table" +
-                    s" ${ carbonLoadModel.getDatabaseName }.${ carbonLoadModel.getTableName }")
-        try {
-          CarbonDataRDDFactory.startCompactionThreads(
-            sqlContext,
-            carbonLoadModel,
-            storeLocation,
-            compactionModel,
-            lock,
-            compactedSegments,
-            operationContext
-          )
-        } catch {
-          case e: Exception =>
-            LOGGER.error(s"Exception in start compaction thread. ${ e.getMessage }")
-            lock.unlock()
-            throw e
+      val updateLock = CarbonLockFactory.getCarbonLockObj(carbonTable
+        .getAbsoluteTableIdentifier, LockUsage.UPDATE_LOCK)
+      try {
+        if (updateLock.lockWithRetries(3, 3)) {
+          if (lock.lockWithRetries()) {
+            LOGGER.info("Acquired the compaction lock for table" +
+                        s" ${ carbonLoadModel.getDatabaseName }.${ carbonLoadModel.getTableName }")
+            CarbonDataRDDFactory.startCompactionThreads(
+              sqlContext,
+              carbonLoadModel,
+              storeLocation,
+              compactionModel,
+              lock,
+              compactedSegments,
+              operationContext
+            )
+          } else {
+            LOGGER.error(s"Not able to acquire the compaction lock for table" +
+                         s" ${ carbonLoadModel.getDatabaseName }.${ carbonLoadModel.getTableName }")
+            CarbonException.analysisException(
+              "Table is already locked for compaction. Please try after some time.")
+          }
+        } else {
+          throw new ConcurrentOperationException(carbonTable, "update", "compaction")
         }
-      } else {
-        LOGGER.error(s"Not able to acquire the compaction lock for table" +
-                     s" ${ carbonLoadModel.getDatabaseName }.${ carbonLoadModel.getTableName }")
-        CarbonException.analysisException(
-          "Table is already locked for compaction. Please try after some time.")
+      } catch {
+        case e: Exception =>
+          LOGGER.error(s"Exception in start compaction thread.", e)
+          lock.unlock()
+          throw e
+      } finally {
+        updateLock.unlock()
       }
     }
   }
diff --git a/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/mutation/CarbonProjectForDeleteCommand.scala b/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/mutation/CarbonProjectForDeleteCommand.scala
index 70a4350..709260e 100644
--- a/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/mutation/CarbonProjectForDeleteCommand.scala
+++ b/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/mutation/CarbonProjectForDeleteCommand.scala
@@ -51,10 +51,6 @@ private[sql] case class CarbonProjectForDeleteCommand(
       throw new MalformedCarbonCommandException("Unsupported operation on non transactional table")
     }
 
-    if (SegmentStatusManager.isCompactionInProgress(carbonTable)) {
-      throw new ConcurrentOperationException(carbonTable, "compaction", "data delete")
-    }
-
     if (SegmentStatusManager.isLoadInProgressInTable(carbonTable)) {
       throw new ConcurrentOperationException(carbonTable, "loading", "data delete")
     }
@@ -77,10 +73,22 @@ private[sql] case class CarbonProjectForDeleteCommand(
     val metadataLock = CarbonLockFactory
       .getCarbonLockObj(carbonTable.getAbsoluteTableIdentifier,
         LockUsage.METADATA_LOCK)
+    val compactionLock = CarbonLockFactory
+      .getCarbonLockObj(carbonTable.getAbsoluteTableIdentifier,
+        LockUsage.COMPACTION_LOCK)
+    val updateLock = CarbonLockFactory
+      .getCarbonLockObj(carbonTable.getAbsoluteTableIdentifier,
+        LockUsage.UPDATE_LOCK)
     var lockStatus = false
     try {
       lockStatus = metadataLock.lockWithRetries()
       if (lockStatus) {
+        if (!compactionLock.lockWithRetries(3, 3)) {
+          throw new ConcurrentOperationException(carbonTable, "compaction", "delete")
+        }
+        if (!updateLock.lockWithRetries(3, 3)) {
+          throw new ConcurrentOperationException(carbonTable, "update/delete", "delete")
+        }
         LOGGER.info("Successfully able to get the table metadata file lock")
       } else {
         throw new Exception("Table is locked for deletion. Please try after some time")
@@ -134,6 +142,8 @@ private[sql] case class CarbonProjectForDeleteCommand(
       if (lockStatus) {
         CarbonLockUtil.fileUnlock(metadataLock, LockUsage.METADATA_LOCK)
       }
+      updateLock.unlock()
+      compactionLock.unlock()
     }
     Seq.empty
   }
diff --git a/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/mutation/CarbonProjectForUpdateCommand.scala b/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/mutation/CarbonProjectForUpdateCommand.scala
index e4abae1..705ba4b 100644
--- a/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/mutation/CarbonProjectForUpdateCommand.scala
+++ b/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/mutation/CarbonProjectForUpdateCommand.scala
@@ -79,9 +79,6 @@ private[sql] case class CarbonProjectForUpdateCommand(
     if (!carbonTable.getTableInfo.isTransactionalTable) {
       throw new MalformedCarbonCommandException("Unsupported operation on non transactional table")
     }
-    if (SegmentStatusManager.isCompactionInProgress(carbonTable)) {
-      throw new ConcurrentOperationException(carbonTable, "compaction", "data update")
-    }
     if (SegmentStatusManager.isLoadInProgressInTable(carbonTable)) {
       throw new ConcurrentOperationException(carbonTable, "loading", "data update")
     }
@@ -99,6 +96,10 @@ private[sql] case class CarbonProjectForUpdateCommand(
     val metadataLock = CarbonLockFactory
       .getCarbonLockObj(carbonTable.getAbsoluteTableIdentifier,
         LockUsage.METADATA_LOCK)
+    val compactionLock = CarbonLockFactory.getCarbonLockObj(carbonTable
+      .getAbsoluteTableIdentifier, LockUsage.COMPACTION_LOCK)
+    val updateLock = CarbonLockFactory.getCarbonLockObj(carbonTable.getAbsoluteTableIdentifier,
+      LockUsage.UPDATE_LOCK)
     var lockStatus = false
     // get the current time stamp which should be same for delete and update.
     val currentTime = CarbonUpdateUtil.readCurrentTime
@@ -113,45 +114,51 @@ private[sql] case class CarbonProjectForUpdateCommand(
       else {
         throw new Exception("Table is locked for updation. Please try after some time")
       }
-      // Get RDD.
 
-      dataSet = if (isPersistEnabled) {
-        Dataset.ofRows(sparkSession, plan).persist(StorageLevel.fromString(
-          CarbonProperties.getInstance.getUpdateDatasetStorageLevel()))
-      }
-      else {
-        Dataset.ofRows(sparkSession, plan)
-      }
       val executionErrors = new ExecutionErrors(FailureCauses.NONE, "")
-
-
-      // handle the clean up of IUD.
-      CarbonUpdateUtil.cleanUpDeltaFiles(carbonTable, false)
-
-      // do delete operation.
-      val segmentsToBeDeleted = DeleteExecution.deleteDeltaExecution(
-        databaseNameOp,
-        tableName,
-        sparkSession,
-        dataSet.rdd,
-        currentTime + "",
-        isUpdateOperation = true,
-        executionErrors)
-
-      if (executionErrors.failureCauses != FailureCauses.NONE) {
-        throw new Exception(executionErrors.errorMsg)
+      if (updateLock.lockWithRetries(3, 3)) {
+        if (compactionLock.lockWithRetries(3, 3)) {
+          // Get RDD.
+          dataSet = if (isPersistEnabled) {
+            Dataset.ofRows(sparkSession, plan).persist(StorageLevel.fromString(
+              CarbonProperties.getInstance.getUpdateDatasetStorageLevel()))
+          }
+          else {
+            Dataset.ofRows(sparkSession, plan)
+          }
+
+          // handle the clean up of IUD.
+          CarbonUpdateUtil.cleanUpDeltaFiles(carbonTable, false)
+
+          // do delete operation.
+          val segmentsToBeDeleted = DeleteExecution.deleteDeltaExecution(
+            databaseNameOp,
+            tableName,
+            sparkSession,
+            dataSet.rdd,
+            currentTime + "",
+            isUpdateOperation = true,
+            executionErrors)
+
+          if (executionErrors.failureCauses != FailureCauses.NONE) {
+            throw new Exception(executionErrors.errorMsg)
+          }
+
+          // do update operation.
+          performUpdate(dataSet,
+            databaseNameOp,
+            tableName,
+            plan,
+            sparkSession,
+            currentTime,
+            executionErrors,
+            segmentsToBeDeleted)
+        } else {
+          throw new ConcurrentOperationException(carbonTable, "compaction", "update")
+        }
+      } else {
+        throw new ConcurrentOperationException(carbonTable, "update/delete", "update")
       }
-
-      // do update operation.
-      performUpdate(dataSet,
-        databaseNameOp,
-        tableName,
-        plan,
-        sparkSession,
-        currentTime,
-        executionErrors,
-        segmentsToBeDeleted)
-
       if (executionErrors.failureCauses != FailureCauses.NONE) {
         throw new Exception(executionErrors.errorMsg)
       }
@@ -185,11 +192,12 @@ private[sql] case class CarbonProjectForUpdateCommand(
           sys.error("Update operation failed. " + e.getCause.getMessage)
         }
         sys.error("Update operation failed. please check logs.")
-    }
-    finally {
+    } finally {
       if (null != dataSet && isPersistEnabled) {
         dataSet.unpersist()
       }
+      updateLock.unlock()
+      compactionLock.unlock()
       if (lockStatus) {
         CarbonLockUtil.fileUnlock(metadataLock, LockUsage.METADATA_LOCK)
       }


[carbondata] 35/41: [CARBONDATA-3319][TestCase]Added condition to check if datamap exist or not before caching

Posted by ra...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.5
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit e565d1f5b12c348b91be43403ce45d45ecc7bf08
Author: Aryan-Khaitan <ar...@gmail.com>
AuthorDate: Tue Mar 19 14:42:39 2019 +0530

    [CARBONDATA-3319][TestCase]Added condition to check if datamap
    exist or not before caching
    
    1. Syntax error: a closing bracket was missing in QueriesBVATestCase.scala.
    2. In TestCreateTableUsingSparkCarbonFileFormat.scala,
    clearDataMapCache checks whether the datamap exists before
    deleting it. A new utility class has been added to apply the
    ReplaceRule, since no ReplaceRule was applied earlier for SDV.
    3. The test cases partitioning by date use an old style of
    partitioning and have therefore been ignored.
    4. The complex delimiters have been corrected to "\001" instead of "$".
    
    This closes #3153
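
For item 4, a small illustrative snippet of the corrected input format: level-1 children of a complex value are joined with '\001' rather than '$'. The column layout and values below are assumptions made for illustration; only the delimiter itself comes from the patch.

public class ComplexDelimiterSketch {
  public static void main(String[] args) {
    String level1Delimiter = "\001";          // corrected level-1 complex-type delimiter
    String[] row = {
        "name0",                              // plain string column
        "0" + level1Delimiter + "0.012",      // complex column with two child values
        "0" + level1Delimiter + "0"           // another complex column
    };
    for (String field : row) {
      // make the non-printable delimiter visible for demonstration
      System.out.println(field.replace(level1Delimiter, "<\\001>"));
    }
  }
}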
---
 .../cluster/sdv/generated/QueriesBVATestCase.scala |   4 +-
 ...teTableUsingSparkCarbonFileFormatTestCase.scala |   7 +-
 .../datasource/SparkCarbonDataSourceTestCase.scala |  19 +--
 .../spark/sql/common/util/DataSourceTestUtil.scala | 144 +++++++++++++++++++++
 .../TestAllDataTypeForPartitionTable.scala         |   4 +-
 5 files changed, 162 insertions(+), 16 deletions(-)

diff --git a/integration/spark-common-cluster-test/src/test/scala/org/apache/carbondata/cluster/sdv/generated/QueriesBVATestCase.scala b/integration/spark-common-cluster-test/src/test/scala/org/apache/carbondata/cluster/sdv/generated/QueriesBVATestCase.scala
index 130fe08..11c705d 100644
--- a/integration/spark-common-cluster-test/src/test/scala/org/apache/carbondata/cluster/sdv/generated/QueriesBVATestCase.scala
+++ b/integration/spark-common-cluster-test/src/test/scala/org/apache/carbondata/cluster/sdv/generated/QueriesBVATestCase.scala
@@ -10697,8 +10697,8 @@ class QueriesBVATestCase extends QueryTest with BeforeAndAfterAll {
   //PushUP_FILTER_test_boundary_TC194
   test("PushUP_FILTER_test_boundary_TC194", Include) {
 
-    checkAnswer(s"""select min(c2_Bigint),max(c2_Bigint),sum(c2_Bigint),avg(c2_Bigint) , count(c2_Bigint), variance(c2_Bigint) from (select c2_Bigint from Test_Boundary where sin(c1_int)=0.18796200317975467 or sin(c1_int)=-0.18796200317975467 order by c2_Bigint""",
-      s"""select min(c2_Bigint),max(c2_Bigint),sum(c2_Bigint),avg(c2_Bigint) , count(c2_Bigint), variance(c2_Bigint) from (select c2_Bigint from Test_Boundary_hive where sin(c1_int)=0.18796200317975467 or sin(c1_int)=-0.18796200317975467 order by c2_Bigint""", "QueriesBVATestCase_PushUP_FILTER_test_boundary_TC194")
+    checkAnswer(s"""select min(c2_Bigint),max(c2_Bigint),sum(c2_Bigint),avg(c2_Bigint) , count(c2_Bigint), variance(c2_Bigint) from (select c2_Bigint from Test_Boundary where sin(c1_int)=0.18796200317975467 or sin(c1_int)=-0.18796200317975467 order by c2_Bigint)""",
+      s"""select min(c2_Bigint),max(c2_Bigint),sum(c2_Bigint),avg(c2_Bigint) , count(c2_Bigint), variance(c2_Bigint) from (select c2_Bigint from Test_Boundary_hive where sin(c1_int)=0.18796200317975467 or sin(c1_int)=-0.18796200317975467 order by c2_Bigint)""", "QueriesBVATestCase_PushUP_FILTER_test_boundary_TC194")
 
   }
 
diff --git a/integration/spark-common-cluster-test/src/test/scala/org/apache/carbondata/cluster/sdv/generated/datasource/CreateTableUsingSparkCarbonFileFormatTestCase.scala b/integration/spark-common-cluster-test/src/test/scala/org/apache/carbondata/cluster/sdv/generated/datasource/CreateTableUsingSparkCarbonFileFormatTestCase.scala
index b96fe10..ecf9ff4 100644
--- a/integration/spark-common-cluster-test/src/test/scala/org/apache/carbondata/cluster/sdv/generated/datasource/CreateTableUsingSparkCarbonFileFormatTestCase.scala
+++ b/integration/spark-common-cluster-test/src/test/scala/org/apache/carbondata/cluster/sdv/generated/datasource/CreateTableUsingSparkCarbonFileFormatTestCase.scala
@@ -23,7 +23,8 @@ import java.util.{Date, Random}
 
 import org.apache.commons.io.FileUtils
 import org.apache.commons.lang.RandomStringUtils
-import org.scalatest.BeforeAndAfterAll
+import org.scalatest.{BeforeAndAfterAll, FunSuite}
+import org.apache.spark.sql.common.util.DataSourceTestUtil._
 import org.apache.spark.util.SparkUtil
 import org.apache.carbondata.core.datastore.filesystem.CarbonFile
 import org.apache.carbondata.core.datastore.impl.FileFactory
@@ -37,8 +38,8 @@ import org.apache.carbondata.core.constants.CarbonCommonConstants
 import org.apache.carbondata.core.datamap.DataMapStoreManager
 import org.apache.carbondata.core.metadata.AbsoluteTableIdentifier
 
-class CreateTableUsingSparkCarbonFileFormatTestCase extends QueryTest with BeforeAndAfterAll {
-
+class CreateTableUsingSparkCarbonFileFormatTestCase extends FunSuite with BeforeAndAfterAll {
+  import spark._
   override def beforeAll(): Unit = {
     sql("DROP TABLE IF EXISTS sdkOutputTable")
   }
diff --git a/integration/spark-common-cluster-test/src/test/scala/org/apache/carbondata/cluster/sdv/generated/datasource/SparkCarbonDataSourceTestCase.scala b/integration/spark-common-cluster-test/src/test/scala/org/apache/carbondata/cluster/sdv/generated/datasource/SparkCarbonDataSourceTestCase.scala
index 8f41ba7..abdced7 100644
--- a/integration/spark-common-cluster-test/src/test/scala/org/apache/carbondata/cluster/sdv/generated/datasource/SparkCarbonDataSourceTestCase.scala
+++ b/integration/spark-common-cluster-test/src/test/scala/org/apache/carbondata/cluster/sdv/generated/datasource/SparkCarbonDataSourceTestCase.scala
@@ -25,10 +25,10 @@ import org.apache.avro.file.DataFileWriter
 import org.apache.avro.generic.{GenericDatumReader, GenericDatumWriter, GenericRecord}
 import org.apache.avro.io.{DecoderFactory, Encoder}
 import org.apache.spark.sql.{AnalysisException, Row}
-import org.apache.spark.sql.common.util.QueryTest
+import org.apache.spark.sql.common.util.DataSourceTestUtil._
 import org.apache.spark.sql.test.TestQueryExecutor
 import org.junit.Assert
-import org.scalatest.BeforeAndAfterAll
+import org.scalatest.{BeforeAndAfterAll,FunSuite}
 
 import org.apache.carbondata.core.datamap.DataMapStoreManager
 import org.apache.carbondata.core.datastore.impl.FileFactory
@@ -37,7 +37,8 @@ import org.apache.carbondata.core.metadata.datatype.{DataTypes, StructField}
 import org.apache.carbondata.hadoop.testutil.StoreCreator
 import org.apache.carbondata.sdk.file.{CarbonWriter, Field, Schema}
 
-class SparkCarbonDataSourceTestCase extends QueryTest with BeforeAndAfterAll {
+class SparkCarbonDataSourceTestCase extends FunSuite with BeforeAndAfterAll {
+  import spark._
 
   val warehouse1 = s"${TestQueryExecutor.projectPath}/integration/spark-datasource/target/warehouse"
 
@@ -616,7 +617,7 @@ class SparkCarbonDataSourceTestCase extends QueryTest with BeforeAndAfterAll {
       "double, HQ_DEPOSIT double) row format delimited fields terminated by ',' collection items " +
       "terminated by '$'")
     val sourceFile = FileFactory
-      .getPath(s"$resourcesPath" + "../../../../../spark-datasource/src/test/resources/Array.csv")
+      .getPath(s"$resource" + "../../../../../spark-datasource/src/test/resources/Array.csv")
       .toString
     sql(s"load data local inpath '$sourceFile' into table array_com_hive")
     sql(
@@ -643,7 +644,7 @@ class SparkCarbonDataSourceTestCase extends QueryTest with BeforeAndAfterAll {
       "terminated by '$' map keys terminated by '&'")
     val sourceFile = FileFactory
       .getPath(
-        s"$resourcesPath" + "../../../../../spark-datasource/src/test/resources/structofarray.csv")
+        s"$resource" + "../../../../../spark-datasource/src/test/resources/structofarray.csv")
       .toString
     sql(s"load data local inpath '$sourceFile' into table STRUCT_OF_ARRAY_com_hive")
     sql(
@@ -890,7 +891,7 @@ class SparkCarbonDataSourceTestCase extends QueryTest with BeforeAndAfterAll {
 
       var i = 0
       while (i < 11) {
-        val array = Array[String](s"name$i", s"$i" + "$" + s"$i.${ i }12")
+        val array = Array[String](s"name$i", s"$i" + "\001" + s"$i.${ i }12")
         writer.write(array)
         i += 1
       }
@@ -992,8 +993,8 @@ class SparkCarbonDataSourceTestCase extends QueryTest with BeforeAndAfterAll {
       var i = 0
       while (i < 10) {
         val array = Array[String](s"name$i",
-          s"$i" + "$" + s"${ i * 2 }",
-          s"${ i / 2 }" + "$" + s"${ i / 3 }")
+          s"$i" + "\001" + s"${ i * 2 }",
+          s"${ i / 2 }" + "\001" + s"${ i / 3 }")
         writer.write(array)
         i += 1
       }
@@ -1273,7 +1274,7 @@ class SparkCarbonDataSourceTestCase extends QueryTest with BeforeAndAfterAll {
       " Timestamp,deliveryDate timestamp,deliverycharge double)row format delimited FIELDS " +
       "terminated by ',' LINES terminated by '\n' stored as textfile")
     val sourceFile = FileFactory
-      .getPath(s"$resourcesPath" +
+      .getPath(s"$resource" +
                "../../../../../spark-datasource/src/test/resources/vardhandaterestruct.csv")
       .toString
     sql(s"load data local inpath '$sourceFile' into table fileformat_drop_hive")
diff --git a/integration/spark-common-cluster-test/src/test/scala/org/apache/spark/sql/common/util/DataSourceTestUtil.scala b/integration/spark-common-cluster-test/src/test/scala/org/apache/spark/sql/common/util/DataSourceTestUtil.scala
new file mode 100644
index 0000000..8a5b154
--- /dev/null
+++ b/integration/spark-common-cluster-test/src/test/scala/org/apache/spark/sql/common/util/DataSourceTestUtil.scala
@@ -0,0 +1,144 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.sql.common.util
+
+import java.io.File
+
+import scala.collection.JavaConverters._
+
+
+import org.apache.spark.sql.carbondata.execution.datasources.CarbonFileIndexReplaceRule
+import org.apache.spark.sql.{DataFrame, Row, SparkSession}
+import org.apache.spark.sql.catalyst.plans.logical
+import org.apache.spark.sql.catalyst.util.sideBySide
+
+import org.apache.carbondata.core.constants.CarbonCommonConstants
+import org.apache.carbondata.core.datastore.impl.FileFactory
+import org.apache.carbondata.core.util.CarbonProperties
+
+
+object DataSourceTestUtil {
+
+  val rootPath = new File(this.getClass.getResource("/").getPath
+                          + "../../../..").getCanonicalPath
+  val warehouse1 = FileFactory.getPath(s"$rootPath/integration/spark-datasource/target/warehouse")
+    .toString
+  val resource = s"$rootPath/integration/spark-datasource/src/test/resources"
+  val metaStoreDB1 = s"$rootPath/integration/spark-datasource/target"
+  val spark = SparkSession
+    .builder()
+    .enableHiveSupport()
+    .master("local")
+    .config("spark.sql.warehouse.dir", warehouse1)
+    .config("spark.driver.host", "localhost")
+    .config("spark.sql.crossJoin.enabled", "true")
+    .config("spark.sql.hive.caseSensitiveInferenceMode", "INFER_AND_SAVE")
+    .getOrCreate()
+  spark.sparkContext.setLogLevel("ERROR")
+  if (!spark.sparkContext.version.startsWith("2.1")) {
+    spark.experimental.extraOptimizations = Seq(new CarbonFileIndexReplaceRule)
+  }
+  CarbonProperties.getInstance()
+    .addProperty(CarbonCommonConstants.CARBON_MINMAX_ALLOWED_BYTE_COUNT, "40")
+
+  def checkAnswer(df: DataFrame, expectedAnswer: java.util.List[Row]): Unit = {
+    checkAnswer(df, expectedAnswer.asScala)
+  }
+
+  def checkExistence(df: DataFrame, exists: Boolean, keywords: String*) {
+    val outputs = df.collect().map(_.mkString).mkString
+    for (key <- keywords) {
+      if (exists) {
+        assert(outputs.contains(key), s"Failed for $df ($key doesn't exist in result)")
+      } else {
+        assert(!outputs.contains(key), s"Failed for $df ($key existed in the result)")
+      }
+    }
+  }
+
+  def checkAnswer(df: DataFrame, expectedAnswer: DataFrame): Unit = {
+    checkAnswer(df, expectedAnswer.collect())
+  }
+
+  /**
+   * Runs the plan and makes sure the answer matches the expected result.
+   * If there was exception during the execution or the contents of the DataFrame does not
+   * match the expected result, an error message will be returned. Otherwise, a [[None]] will
+   * be returned.
+   *
+   * @param df             the [[DataFrame]] to be executed
+   * @param expectedAnswer the expected result in a [[Seq]] of [[Row]]s.
+   */
+  def checkAnswer(df: DataFrame, expectedAnswer: Seq[Row]): Unit = {
+    val isSorted = df.logicalPlan.collect { case s: logical.Sort => s }.nonEmpty
+
+    def prepareAnswer(answer: Seq[Row]): Seq[Row] = {
+      // Converts data to types that we can do equality comparison using Scala collections.
+      // For BigDecimal type, the Scala type has a better definition of equality test (similar to
+      // Java's java.math.BigDecimal.compareTo).
+      // For binary arrays, we convert it to Seq to avoid of calling java.util.Arrays.equals for
+      // equality test.
+      val converted: Seq[Row] = answer.map { s =>
+        Row.fromSeq(s.toSeq.map {
+          case d: java.math.BigDecimal => BigDecimal(d)
+          case b: Array[Byte] => b.toSeq
+          case d: Double =>
+            if (!d.isInfinite && !d.isNaN) {
+              var bd = BigDecimal(d)
+              bd = bd.setScale(5, BigDecimal.RoundingMode.UP)
+              bd.doubleValue()
+            }
+            else {
+              d
+            }
+          case o => o
+        })
+      }
+      if (!isSorted) converted.sortBy(_.toString()) else converted
+    }
+
+    val sparkAnswer = try df.collect().toSeq catch {
+      case e: Exception =>
+        val errorMessage =
+          s"""
+             |Exception thrown while executing query:
+             |${ df.queryExecution }
+             |== Exception ==
+             |$e
+             |${ org.apache.spark.sql.catalyst.util.stackTraceToString(e) }
+          """.stripMargin
+        return Some(errorMessage)
+    }
+
+    if (prepareAnswer(expectedAnswer) != prepareAnswer(sparkAnswer)) {
+      val errorMessage =
+        s"""
+           |Results do not match for query:
+           |${ df.queryExecution }
+           |== Results ==
+           |${
+          sideBySide(
+            s"== Correct Answer - ${ expectedAnswer.size } ==" +:
+            prepareAnswer(expectedAnswer).map(_.toString()),
+            s"== Spark Answer - ${ sparkAnswer.size } ==" +:
+            prepareAnswer(sparkAnswer).map(_.toString())).mkString("\n")
+        }
+      """.stripMargin
+      assert(false, errorMessage)
+    }
+  }
+}
\ No newline at end of file
diff --git a/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/partition/TestAllDataTypeForPartitionTable.scala b/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/partition/TestAllDataTypeForPartitionTable.scala
index 54586c2..82c6d48 100644
--- a/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/partition/TestAllDataTypeForPartitionTable.scala
+++ b/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/partition/TestAllDataTypeForPartitionTable.scala
@@ -328,7 +328,7 @@ class TestAllDataTypeForPartitionTable extends QueryTest with BeforeAndAfterAll
       Seq(Row(32767, 2147483647, 9223372036854775807L, 2147483648.1, 9223372036854775808.1, BigDecimal("9223372036854775808.1234"), Timestamp.valueOf("2017-06-13 23:59:59"), Date.valueOf("2017-06-13"), "abc3", "abcd3", "abcde3", new mutable.WrappedArray.ofRef[String](Array("a", "b", "c", "3")), Row("a", "b", "3"))))
   }
 
-  test("allTypeTable_hash_date") {
+  ignore("allTypeTable_hash_date") {
     val tableName = "allTypeTable_hash_date"
 
     sql(
@@ -1096,7 +1096,7 @@ class TestAllDataTypeForPartitionTable extends QueryTest with BeforeAndAfterAll
       Seq(Row(32767, 2147483647, 9223372036854775807L, 2147483648.1, 9223372036854775808.1, BigDecimal("9223372036854775808.1234"), Timestamp.valueOf("2017-06-13 23:59:59"), Date.valueOf("2017-06-13"), "abc3", "abcd3", "abcde3", new mutable.WrappedArray.ofRef[String](Array("a", "b", "c", "3")), Row("a", "b", "3"))))
   }
 
-  test("allTypeTable_range_date") {
+  ignore("allTypeTable_range_date") {
     val tableName = "allTypeTable_range_date"
 
     sql(


[carbondata] 34/41: [HOTFIX][DOC] Optimize quick-start-guide.md and dml-of-carbondata.md

Posted by ra...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.5
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit cde80f28618c3b187dc3c4a63dd53cb5c9ee5b0b
Author: Zhang Zhichao <44...@qq.com>
AuthorDate: Sun Mar 17 23:29:06 2019 +0800

    [HOTFIX][DOC] Optimize quick-start-guide.md and dml-of-carbondata.md
    
    Add a note on using Spark + Hive 1.1.X to the 'quick-start-guide.md' file.
    Separate the 'load data' and 'insert into' SQL examples to avoid ambiguity.
    Remove the 'LOAD DATA LOCAL' syntax, as the 'LOCAL' keyword is currently invalid.
    
    This closes #3151
---
 docs/dml-of-carbondata.md | 24 +++++++++++++++---------
 docs/quick-start-guide.md |  6 ++++--
 2 files changed, 19 insertions(+), 11 deletions(-)

diff --git a/docs/dml-of-carbondata.md b/docs/dml-of-carbondata.md
index f89c49a..6ec0520 100644
--- a/docs/dml-of-carbondata.md
+++ b/docs/dml-of-carbondata.md
@@ -35,10 +35,13 @@ CarbonData DML statements are documented here,which includes:
   This command is used to load csv files to carbondata, OPTIONS are not mandatory for data loading process. 
 
   ```
-  LOAD DATA [LOCAL] INPATH 'folder_path' 
+  LOAD DATA INPATH 'folder_path'
   INTO TABLE [db_name.]table_name 
   OPTIONS(property_name=property_value, ...)
   ```
+  **NOTE**:
+    * Use the 'file://' prefix to indicate a local input file path; this is supported only in local mode.
+    * In cluster mode, upload all input files to a distributed file system first, for example 'hdfs://' for HDFS.
 
   **Supported Properties:**
 
@@ -232,7 +235,7 @@ CarbonData DML statements are documented here,which includes:
    Example:
 
    ```
-   LOAD DATA local inpath '/opt/rawdata/data.csv' INTO table carbontable
+   LOAD DATA inpath '/opt/rawdata/data.csv' INTO table carbontable
    options('DELIMITER'=',', 'QUOTECHAR'='"','COMMENTCHAR'='#',
    'HEADER'='false',
    'FILEHEADER'='empno,empname,designation,doj,workgroupcategory,
@@ -350,17 +353,19 @@ CarbonData DML statements are documented here,which includes:
   This command allows you to load data using static partition.
 
   ```
-  LOAD DATA [LOCAL] INPATH 'folder_path' 
+  LOAD DATA INPATH 'folder_path'
   INTO TABLE [db_name.]table_name PARTITION (partition_spec) 
-  OPTIONS(property_name=property_value, ...)    
-  INSERT INTO INTO TABLE [db_name.]table_name PARTITION (partition_spec) <SELECT STATEMENT>
+  OPTIONS(property_name=property_value, ...)
+
+  INSERT INTO TABLE [db_name.]table_name PARTITION (partition_spec) <SELECT STATEMENT>
   ```
 
   Example:
   ```
-  LOAD DATA LOCAL INPATH '${env:HOME}/staticinput.csv'
+  LOAD DATA INPATH '${env:HOME}/staticinput.csv'
   INTO TABLE locationTable
-  PARTITION (country = 'US', state = 'CA')  
+  PARTITION (country = 'US', state = 'CA')
+
   INSERT INTO TABLE locationTable
   PARTITION (country = 'US', state = 'AL')
   SELECT <columns list excluding partition columns> FROM another_user
@@ -372,8 +377,9 @@ CarbonData DML statements are documented here,which includes:
 
   Example:
   ```
-  LOAD DATA LOCAL INPATH '${env:HOME}/staticinput.csv'
-  INTO TABLE locationTable          
+  LOAD DATA INPATH '${env:HOME}/staticinput.csv'
+  INTO TABLE locationTable
+
   INSERT INTO TABLE locationTable
   SELECT <columns list excluding partition columns> FROM another_user
   ```
diff --git a/docs/quick-start-guide.md b/docs/quick-start-guide.md
index 244a9ae..316fa26 100644
--- a/docs/quick-start-guide.md
+++ b/docs/quick-start-guide.md
@@ -241,7 +241,9 @@ mv carbondata.tar.gz carbonlib/
 --executor-cores 2
 ```
 
-**NOTE**: Make sure you have permissions for CarbonData JARs and files through which driver and executor will start.
+**NOTE**:
+ - Make sure you have permissions for CarbonData JARs and files through which driver and executor will start.
+ - If using Spark + Hive 1.1.X, add the carbondata assembly jar and the carbondata-hive jar to the 'spark.sql.hive.metastore.jars' parameter in the spark-defaults.conf file.
 
 
 
@@ -485,4 +487,4 @@ select * from carbon_table;
 
 **Note :** Create Tables and data loads should be done before executing queries as we can not create carbon table from this interface.
 
-```
\ No newline at end of file
+```
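
The Spark + Hive 1.1.X note above refers to a spark-defaults.conf entry; the following is only an illustrative sketch with placeholder jar names and paths, and the property name is the only part taken from the documentation change.

# spark-defaults.conf (illustrative; substitute the jar names/paths of your deployment)
spark.sql.hive.metastore.jars  /opt/spark/carbonlib/<carbondata-assembly>.jar:/opt/spark/carbonlib/<carbondata-hive>.jar:<existing hive metastore classpath>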


[carbondata] 23/41: [CARBONDATA-3311] support presto 0.217 #3142

Posted by ra...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.5
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit bfc912a1b3548de3be6d43dac733ba5c9221cdb8
Author: ajantha-bhat <aj...@gmail.com>
AuthorDate: Fri Mar 8 14:47:36 2019 +0800

    [CARBONDATA-3311] support presto 0.217 #3142
    
    Support the latest version of Presto. Please refer to the Presto release notes for more details:
    the presto-hive interfaces have changed and a Hive analyzer has been added.
    
    This closes #3142
---
 integration/presto/pom.xml                                        | 2 +-
 .../org/apache/carbondata/presto/CarbondataConnectorFactory.java  | 7 +++++--
 .../main/java/org/apache/carbondata/presto/CarbondataModule.java  | 8 +++++---
 .../java/org/apache/carbondata/presto/CarbondataSplitManager.java | 2 +-
 4 files changed, 12 insertions(+), 7 deletions(-)

diff --git a/integration/presto/pom.xml b/integration/presto/pom.xml
index 91221d6..5253677 100644
--- a/integration/presto/pom.xml
+++ b/integration/presto/pom.xml
@@ -31,7 +31,7 @@
   <packaging>presto-plugin</packaging>
 
   <properties>
-    <presto.version>0.210</presto.version>
+    <presto.version>0.217</presto.version>
     <httpcore.version>4.4.9</httpcore.version>
     <dev.path>${basedir}/../../dev</dev.path>
     <jacoco.append>true</jacoco.append>
diff --git a/integration/presto/src/main/java/org/apache/carbondata/presto/CarbondataConnectorFactory.java b/integration/presto/src/main/java/org/apache/carbondata/presto/CarbondataConnectorFactory.java
index 1dd5176..eefcc5c 100755
--- a/integration/presto/src/main/java/org/apache/carbondata/presto/CarbondataConnectorFactory.java
+++ b/integration/presto/src/main/java/org/apache/carbondata/presto/CarbondataConnectorFactory.java
@@ -29,6 +29,7 @@ import org.apache.carbondata.hadoop.api.CarbonTableInputFormat;
 import org.apache.carbondata.hadoop.api.CarbonTableOutputFormat;
 import org.apache.carbondata.presto.impl.CarbonTableConfig;
 
+import com.facebook.presto.hive.HiveAnalyzeProperties;
 import com.facebook.presto.hive.HiveConnector;
 import com.facebook.presto.hive.HiveConnectorFactory;
 import com.facebook.presto.hive.HiveMetadataFactory;
@@ -86,7 +87,7 @@ public class CarbondataConnectorFactory extends HiveConnectorFactory {
   private final ClassLoader classLoader;
 
   public CarbondataConnectorFactory(String connectorName, ClassLoader classLoader) {
-    super(connectorName, classLoader, null);
+    super(connectorName, classLoader, Optional.empty());
     this.classLoader = requireNonNull(classLoader, "classLoader is null");
   }
 
@@ -132,6 +133,8 @@ public class CarbondataConnectorFactory extends HiveConnectorFactory {
       HiveSessionProperties hiveSessionProperties =
           injector.getInstance(HiveSessionProperties.class);
       HiveTableProperties hiveTableProperties = injector.getInstance(HiveTableProperties.class);
+      HiveAnalyzeProperties hiveAnalyzeProperties =
+          injector.getInstance(HiveAnalyzeProperties.class);
       ConnectorAccessControl accessControl =
           new PartitionsAwareAccessControl(injector.getInstance(ConnectorAccessControl.class));
       Set<Procedure> procedures = injector.getInstance(Key.get(new TypeLiteral<Set<Procedure>>() {
@@ -144,7 +147,7 @@ public class CarbondataConnectorFactory extends HiveConnectorFactory {
           new ClassLoaderSafeNodePartitioningProvider(connectorDistributionProvider, classLoader),
           ImmutableSet.of(), procedures, hiveSessionProperties.getSessionProperties(),
           HiveSchemaProperties.SCHEMA_PROPERTIES, hiveTableProperties.getTableProperties(),
-          accessControl, classLoader);
+          hiveAnalyzeProperties.getAnalyzeProperties(), accessControl, classLoader);
     } catch (Exception e) {
       throwIfUnchecked(e);
       throw new RuntimeException(e);
diff --git a/integration/presto/src/main/java/org/apache/carbondata/presto/CarbondataModule.java b/integration/presto/src/main/java/org/apache/carbondata/presto/CarbondataModule.java
index 1f63b98..98bacf0 100755
--- a/integration/presto/src/main/java/org/apache/carbondata/presto/CarbondataModule.java
+++ b/integration/presto/src/main/java/org/apache/carbondata/presto/CarbondataModule.java
@@ -31,6 +31,7 @@ import com.facebook.presto.hive.HadoopDirectoryLister;
 import com.facebook.presto.hive.HdfsConfiguration;
 import com.facebook.presto.hive.HdfsConfigurationUpdater;
 import com.facebook.presto.hive.HdfsEnvironment;
+import com.facebook.presto.hive.HiveAnalyzeProperties;
 import com.facebook.presto.hive.HiveClientConfig;
 import com.facebook.presto.hive.HiveClientModule;
 import com.facebook.presto.hive.HiveCoercionPolicy;
@@ -55,6 +56,7 @@ import com.facebook.presto.hive.LocationService;
 import com.facebook.presto.hive.NamenodeStats;
 import com.facebook.presto.hive.OrcFileWriterConfig;
 import com.facebook.presto.hive.OrcFileWriterFactory;
+import com.facebook.presto.hive.ParquetFileWriterConfig;
 import com.facebook.presto.hive.PartitionUpdate;
 import com.facebook.presto.hive.RcFileFileWriterFactory;
 import com.facebook.presto.hive.TableParameterCodec;
@@ -63,7 +65,6 @@ import com.facebook.presto.hive.TypeTranslator;
 import com.facebook.presto.hive.orc.DwrfPageSourceFactory;
 import com.facebook.presto.hive.orc.OrcPageSourceFactory;
 import com.facebook.presto.hive.parquet.ParquetPageSourceFactory;
-import com.facebook.presto.hive.parquet.ParquetRecordCursorProvider;
 import com.facebook.presto.hive.rcfile.RcFilePageSourceFactory;
 import com.facebook.presto.spi.connector.ConnectorNodePartitioningProvider;
 import com.facebook.presto.spi.connector.ConnectorPageSinkProvider;
@@ -107,6 +108,7 @@ public class CarbondataModule extends HiveClientModule {
 
     binder.bind(HiveSessionProperties.class).in(Scopes.SINGLETON);
     binder.bind(HiveTableProperties.class).in(Scopes.SINGLETON);
+    binder.bind(HiveAnalyzeProperties.class).in(Scopes.SINGLETON);
 
     binder.bind(NamenodeStats.class).in(Scopes.SINGLETON);
     newExporter(binder).export(NamenodeStats.class)
@@ -114,8 +116,6 @@ public class CarbondataModule extends HiveClientModule {
 
     Multibinder<HiveRecordCursorProvider> recordCursorProviderBinder =
         newSetBinder(binder, HiveRecordCursorProvider.class);
-    recordCursorProviderBinder.addBinding().to(ParquetRecordCursorProvider.class)
-        .in(Scopes.SINGLETON);
     recordCursorProviderBinder.addBinding().to(GenericHiveRecordCursorProvider.class)
         .in(Scopes.SINGLETON);
 
@@ -164,6 +164,8 @@ public class CarbondataModule extends HiveClientModule {
     fileWriterFactoryBinder.addBinding().to(OrcFileWriterFactory.class).in(Scopes.SINGLETON);
     fileWriterFactoryBinder.addBinding().to(RcFileFileWriterFactory.class).in(Scopes.SINGLETON);
     binder.bind(CarbonTableReader.class).in(Scopes.SINGLETON);
+
+    configBinder(binder).bindConfig(ParquetFileWriterConfig.class);
   }
 
 }
diff --git a/integration/presto/src/main/java/org/apache/carbondata/presto/CarbondataSplitManager.java b/integration/presto/src/main/java/org/apache/carbondata/presto/CarbondataSplitManager.java
index 0902058..50dcdc8 100755
--- a/integration/presto/src/main/java/org/apache/carbondata/presto/CarbondataSplitManager.java
+++ b/integration/presto/src/main/java/org/apache/carbondata/presto/CarbondataSplitManager.java
@@ -143,7 +143,7 @@ public class CarbondataSplitManager extends HiveSplitManager {
         cSplits.add(new HiveSplit(schemaTableName.getSchemaName(), schemaTableName.getTableName(),
             schemaTableName.getTableName(), "", 0, 0, 0, properties, new ArrayList(),
             getHostAddresses(split.getLocations()), OptionalInt.empty(), false, predicate,
-            new HashMap<>(), Optional.empty()));
+            new HashMap<>(), Optional.empty(), false));
       }
 
       statisticRecorder.logStatisticsAsTableDriver();


[carbondata] 31/41: [CARBONDATA-3320]fix number of partitions issue in describe formatted and drop partition issue

Posted by ra...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.5
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit b0810149d99c5d051efebaa6ba362775f97ed387
Author: akashrn5 <ak...@gmail.com>
AuthorDate: Wed Mar 20 14:59:10 2019 +0530

    [CARBONDATA-3320]fix number of partitions issue in describe formatted and drop partition issue
    
    Problem:
    For Hive native partitions, the number of partitions shown in DESCRIBE FORMATTED is always zero,
    and when a partition is dropped the empty partition directories are not deleted.
    
    Solution:
    In DESCRIBE FORMATTED, get the partition list from the session catalog and report its size.
    During clean files, take the parent of the dropped partition and delete each directory that is left empty.
    
    This closes #3156
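
    To illustrate the second part of the fix, the sketch below removes the now-empty parent
    directories of a dropped partition using plain java.nio.file instead of CarbonData's
    FileFactory/CarbonFile API. The class, its names and the directory layout mentioned in main are
    illustrative only; they are not part of the patch.

        import java.io.IOException;
        import java.nio.file.DirectoryStream;
        import java.nio.file.Files;
        import java.nio.file.Path;
        import java.nio.file.Paths;

        public class EmptyPartitionCleaner {

          // Walks upwards from the dropped partition directory, e.g. .../year=2015/month=2/day=5,
          // deleting each directory that is left empty and stopping at the first non-empty parent.
          static void deleteEmptyParents(Path dir) throws IOException {
            Path current = dir;
            while (current != null && Files.isDirectory(current) && isEmpty(current)) {
              Files.delete(current);
              current = current.getParent();
            }
          }

          private static boolean isEmpty(Path dir) throws IOException {
            try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
              return !stream.iterator().hasNext();
            }
          }

          public static void main(String[] args) throws IOException {
            // Hypothetical usage: pass the directory of the dropped partition.
            deleteEmptyParents(Paths.get(args[0]));
          }
        }

    The first part of the fix simply asks the Spark session catalog for the partition list
    (sessionState.catalog.listPartitions) and reports its size in DESCRIBE FORMATTED, as shown in
    the CarbonDescribeFormattedCommand change below.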
---
 .../carbondata/core/metadata/SegmentFileStore.java | 21 +++++++++++++++++----
 .../partition/TestDDLForPartitionTable.scala       | 13 ++++++++++++-
 ...StandardPartitionWithPreaggregateTestCase.scala | 22 ++++++++++++++++++++++
 .../table/CarbonDescribeFormattedCommand.scala     | 19 ++++++++++++++++++-
 4 files changed, 69 insertions(+), 6 deletions(-)

diff --git a/core/src/main/java/org/apache/carbondata/core/metadata/SegmentFileStore.java b/core/src/main/java/org/apache/carbondata/core/metadata/SegmentFileStore.java
index 1e1e303..224b230 100644
--- a/core/src/main/java/org/apache/carbondata/core/metadata/SegmentFileStore.java
+++ b/core/src/main/java/org/apache/carbondata/core/metadata/SegmentFileStore.java
@@ -845,10 +845,7 @@ public class SegmentFileStore {
           }
         }
         CarbonFile path = FileFactory.getCarbonFile(location.getParent().toString());
-        if (path.listFiles().length == 0) {
-          FileFactory.deleteAllCarbonFilesOfDir(
-              FileFactory.getCarbonFile(location.getParent().toString()));
-        }
+        deleteEmptyPartitionFolders(path);
       } else {
         Path location = new Path(entry.getKey()).getParent();
         // delete the segment folder
@@ -861,6 +858,22 @@ public class SegmentFileStore {
     }
   }
 
+  /**
+   * This method deletes the directories recursively if there are no files under corresponding
+   * folder.
+   * Ex: If partition folder is year=2015, month=2,day=5 and drop partition is day=5, it will delete
+   * till year partition folder if there are no other folder or files present under each folder till
+   * year partition
+   */
+  private static void deleteEmptyPartitionFolders(CarbonFile path) {
+    if (path != null && path.listFiles().length == 0) {
+      FileFactory.deleteAllCarbonFilesOfDir(path);
+      Path parentsLocation = new Path(path.getAbsolutePath()).getParent();
+      deleteEmptyPartitionFolders(
+          FileFactory.getCarbonFile(parentsLocation.toString()));
+    }
+  }
+
   private static boolean pathExistsInPartitionSpec(List<PartitionSpec> partitionSpecs,
       Path partitionPath) {
     for (PartitionSpec spec : partitionSpecs) {
diff --git a/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/partition/TestDDLForPartitionTable.scala b/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/partition/TestDDLForPartitionTable.scala
index d5673bf..7322b95 100644
--- a/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/partition/TestDDLForPartitionTable.scala
+++ b/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/partition/TestDDLForPartitionTable.scala
@@ -380,16 +380,27 @@ class TestDDLForPartitionTable  extends QueryTest with BeforeAndAfterAll {
           |  'RANGE_INFO'='2017-06-11 00:00:02')
         """.stripMargin)
     }
-
     assert(exceptionMessage.getMessage
       .contains("Range info must define a valid range.Please check again!"))
   }
 
+  test("test number of partitions for default partition") {
+    sql("drop table if exists desc")
+    sql("create table desc(name string) partitioned by (num int) stored by 'carbondata'")
+    sql("insert into desc select 'abc',3")
+    sql("insert into desc select 'abc',5")
+    val descFormatted1 = sql("describe formatted desc").collect
+    descFormatted1.find(_.get(0).toString.contains("Number of Partitions")) match {
+      case Some(row) => assert(row.get(1).toString.contains("2"))
+    }
+  }
+
   override def afterAll = {
     dropTable
   }
 
   def dropTable = {
+    sql("drop table if exists desc")
     sql("drop table if exists hashTable")
     sql("drop table if exists rangeTable")
     sql("drop table if exists listTable")
diff --git a/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/standardpartition/StandardPartitionWithPreaggregateTestCase.scala b/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/standardpartition/StandardPartitionWithPreaggregateTestCase.scala
index 84c07c4..c3d3456 100644
--- a/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/standardpartition/StandardPartitionWithPreaggregateTestCase.scala
+++ b/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/standardpartition/StandardPartitionWithPreaggregateTestCase.scala
@@ -26,6 +26,8 @@ import org.apache.spark.sql.{CarbonDatasourceHadoopRelation, CarbonEnv, Row}
 import org.apache.spark.sql.test.util.QueryTest
 import org.scalatest.BeforeAndAfterAll
 
+import org.apache.carbondata.core.datastore.impl.FileFactory
+
 class StandardPartitionWithPreaggregateTestCase extends QueryTest with BeforeAndAfterAll {
 
   val testData = s"$resourcesPath/sample.csv"
@@ -225,6 +227,26 @@ class StandardPartitionWithPreaggregateTestCase extends QueryTest with BeforeAnd
     checkAnswer(sql("select * from partitionone_p1"), Seq(Row("k",2014,2014,1,2), Row("k",2015,2015,2,3)))
   }
 
+  test("test drop partition directory") {
+    sql("drop table if exists droppartition")
+    sql(
+      """
+        | CREATE TABLE if not exists droppartition (empname String)
+        | PARTITIONED BY (year int, month int,day int)
+        | STORED BY 'org.apache.carbondata.format'
+      """.stripMargin)
+    sql("insert into droppartition values('k',2014,1,1)")
+    sql("insert into droppartition values('k',2015,2,3)")
+    sql("alter table droppartition drop partition(year=2015,month=2,day=3)")
+    sql("clean files for table droppartition")
+    val table = CarbonEnv.getCarbonTable(Option("partition_preaggregate"), "droppartition")(sqlContext.sparkSession)
+    val tablePath = table.getTablePath
+    val carbonFiles = FileFactory.getCarbonFile(tablePath).listFiles().filter{
+      file => file.getName.equalsIgnoreCase("year=2015")
+    }
+    assert(carbonFiles.length == 0)
+  }
+
   test("test data with filter query") {
     sql("drop table if exists partitionone")
     sql(
diff --git a/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/table/CarbonDescribeFormattedCommand.scala b/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/table/CarbonDescribeFormattedCommand.scala
index 7468ece..e2a2451 100644
--- a/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/table/CarbonDescribeFormattedCommand.scala
+++ b/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/table/CarbonDescribeFormattedCommand.scala
@@ -32,6 +32,7 @@ import org.apache.spark.sql.hive.CarbonRelation
 import org.apache.carbondata.common.Strings
 import org.apache.carbondata.core.constants.{CarbonCommonConstants, CarbonLoadOptionConstants}
 import org.apache.carbondata.core.metadata.datatype.DataTypes
+import org.apache.carbondata.core.metadata.schema.PartitionInfo
 import org.apache.carbondata.core.metadata.schema.partition.PartitionType
 import org.apache.carbondata.core.metadata.schema.table.CarbonTable
 import org.apache.carbondata.core.util.{CarbonProperties, CarbonUtil}
@@ -185,7 +186,7 @@ private[sql] case class CarbonDescribeFormattedCommand(
         ("Partition Columns",
           partitionInfo.getColumnSchemaList.asScala.map {
             col => s"${col.getColumnName}:${col.getDataType.getName}"}.mkString(", "), ""),
-        ("Number of Partitions", partitionInfo.getNumPartitions.toString, ""),
+        ("Number of Partitions", getNumberOfPartitions(carbonTable, sparkSession), ""),
         ("Partitions Ids", partitionInfo.getPartitionIds.asScala.mkString(","), "")
       )
       if (partitionInfo.getPartitionType == PartitionType.RANGE) {
@@ -239,6 +240,22 @@ private[sql] case class CarbonDescribeFormattedCommand(
     results.map{case (c1, c2, c3) => Row(c1, c2, c3)}
   }
 
+  /**
+   * This method returns the number of partitions based on the partition type
+   */
+  private def getNumberOfPartitions(carbonTable: CarbonTable,
+      sparkSession: SparkSession): String = {
+    val partitionType = carbonTable.getPartitionInfo.getPartitionType
+    partitionType match {
+      case PartitionType.NATIVE_HIVE =>
+        sparkSession.sessionState.catalog
+          .listPartitions(new TableIdentifier(carbonTable.getTableName,
+            Some(carbonTable.getDatabaseName))).size.toString
+      case _ =>
+        carbonTable.getPartitionInfo.getNumPartitions.toString
+    }
+  }
+
   private def getLocalDictDesc(
       carbonTable: CarbonTable,
       tblProps: Map[String, String]): Seq[(String, String, String)] = {


[carbondata] 13/41: [LOG] Optimize the logs of CarbonProperties

Posted by ra...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.5
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit 7e83df1d16ee0db6d586d294d665166cf943c1ae
Author: qiuchenjian <80...@qq.com>
AuthorDate: Sat Feb 9 17:10:17 2019 +0800

    [LOG] Optimize the logs of CarbonProperties
    
    This closes #3123
---
 .../carbondata/core/util/CarbonProperties.java     | 132 ++++++++++++---------
 1 file changed, 74 insertions(+), 58 deletions(-)

diff --git a/core/src/main/java/org/apache/carbondata/core/util/CarbonProperties.java b/core/src/main/java/org/apache/carbondata/core/util/CarbonProperties.java
index ad27045..004a51e 100644
--- a/core/src/main/java/org/apache/carbondata/core/util/CarbonProperties.java
+++ b/core/src/main/java/org/apache/carbondata/core/util/CarbonProperties.java
@@ -344,10 +344,11 @@ public final class CarbonProperties {
             CARBON_SCHEDULER_MIN_REGISTERED_RESOURCES_RATIO_DEFAULT);
       }
     } catch (NumberFormatException e) {
-      LOGGER.warn("The value \"" + value
-          + "\" configured for key " + CARBON_SCHEDULER_MIN_REGISTERED_RESOURCES_RATIO
-          + "\" is invalid. Using the default value \""
-          + CARBON_SCHEDULER_MIN_REGISTERED_RESOURCES_RATIO_DEFAULT);
+      LOGGER.warn(String.format("The value \"%s\" configured for key  \"%s\" is invalid. " +
+                      "Using the default value \"%s\"",
+              value,
+              CARBON_SCHEDULER_MIN_REGISTERED_RESOURCES_RATIO,
+              CARBON_SCHEDULER_MIN_REGISTERED_RESOURCES_RATIO_DEFAULT));
       carbonProperties.setProperty(CARBON_SCHEDULER_MIN_REGISTERED_RESOURCES_RATIO,
           CARBON_SCHEDULER_MIN_REGISTERED_RESOURCES_RATIO_DEFAULT);
     }
@@ -363,10 +364,8 @@ public final class CarbonProperties {
     try {
       new SimpleDateFormat(dateFormat);
     } catch (Exception e) {
-      LOGGER.warn("The value \"" + dateFormat + "\" configured for key "
-          + key
-          + "\" is invalid. Using the default value \""
-          + key);
+      LOGGER.warn(String.format("The value \"%s\" configured for key \"%s\" is invalid. " +
+              "Using the default value \"%s\"",dateFormat, key, key));
       carbonProperties.setProperty(key, defaultValue);
     }
   }
@@ -430,9 +429,10 @@ public final class CarbonProperties {
         carbonProperties.getProperty(ENABLE_VECTOR_READER);
     boolean isValidBooleanValue = CarbonUtil.validateBoolean(vectorReaderStr);
     if (!isValidBooleanValue) {
-      LOGGER.warn("The enable vector reader value \"" + vectorReaderStr
-          + "\" is invalid. Using the default value \""
-          + CarbonCommonConstants.ENABLE_VECTOR_READER_DEFAULT);
+      LOGGER.warn(String.format("The enable vector reader value \"%s\" is invalid. " +
+                      "Using the default value \"%s\"",
+              vectorReaderStr,
+              CarbonCommonConstants.ENABLE_VECTOR_READER_DEFAULT));
       carbonProperties.setProperty(ENABLE_VECTOR_READER,
           CarbonCommonConstants.ENABLE_VECTOR_READER_DEFAULT);
     }
@@ -443,9 +443,8 @@ public final class CarbonProperties {
         carbonProperties.getProperty(CARBON_CUSTOM_BLOCK_DISTRIBUTION);
     boolean isValidBooleanValue = CarbonUtil.validateBoolean(customBlockDistributionStr);
     if (!isValidBooleanValue) {
-      LOGGER.warn("The custom block distribution value \"" + customBlockDistributionStr
-          + "\" is invalid. Using the default value \""
-          + false);
+      LOGGER.warn(String.format("The custom block distribution value \"%s\" is invalid. " +
+              "Using the default value \"false\"",customBlockDistributionStr));
       carbonProperties.setProperty(CARBON_CUSTOM_BLOCK_DISTRIBUTION, "false");
     }
   }
@@ -458,9 +457,10 @@ public final class CarbonProperties {
             || carbonTaskDistribution.equalsIgnoreCase(CARBON_TASK_DISTRIBUTION_BLOCK)
             || carbonTaskDistribution.equalsIgnoreCase(CARBON_TASK_DISTRIBUTION_CUSTOM));
     if (!isValid) {
-      LOGGER.warn("The carbon task distribution value \"" + carbonTaskDistribution
-          + "\" is invalid. Using the default value \""
-          + CarbonCommonConstants.CARBON_TASK_DISTRIBUTION_DEFAULT);
+      LOGGER.warn(String.format("The carbon task distribution value \"%s\" is invalid. " +
+                      "Using the default value \"%s\"",
+              carbonTaskDistribution,
+              CarbonCommonConstants.CARBON_TASK_DISTRIBUTION_DEFAULT));
       carbonProperties.setProperty(CARBON_TASK_DISTRIBUTION,
           CarbonCommonConstants.CARBON_TASK_DISTRIBUTION_DEFAULT);
     }
@@ -470,9 +470,11 @@ public final class CarbonProperties {
     String unSafeSortStr = carbonProperties.getProperty(ENABLE_UNSAFE_SORT);
     boolean isValidBooleanValue = CarbonUtil.validateBoolean(unSafeSortStr);
     if (!isValidBooleanValue) {
-      LOGGER.warn("The enable unsafe sort value \"" + unSafeSortStr
-          + "\" is invalid. Using the default value \""
-          + CarbonCommonConstants.ENABLE_UNSAFE_SORT_DEFAULT);
+      LOGGER.warn(String.format("The enable unsafe sort value \"%s\" is invalid. " +
+                      "Using the default value \"%s\"",
+              unSafeSortStr,
+              CarbonCommonConstants.ENABLE_UNSAFE_SORT_DEFAULT
+      ));
       carbonProperties.setProperty(ENABLE_UNSAFE_SORT,
           CarbonCommonConstants.ENABLE_UNSAFE_SORT_DEFAULT);
     }
@@ -482,9 +484,10 @@ public final class CarbonProperties {
     String value = carbonProperties.getProperty(ENABLE_OFFHEAP_SORT);
     boolean isValidBooleanValue = CarbonUtil.validateBoolean(value);
     if (!isValidBooleanValue) {
-      LOGGER.warn("The enable off heap sort value \"" + value
-          + "\" is invalid. Using the default value \""
-          + CarbonCommonConstants.ENABLE_OFFHEAP_SORT_DEFAULT);
+      LOGGER.warn(String.format("The enable off heap sort value \"%s\" is invalid. " +
+                      "Using the default value \"%s\"",
+              value,
+              CarbonCommonConstants.ENABLE_OFFHEAP_SORT_DEFAULT));
       carbonProperties.setProperty(ENABLE_OFFHEAP_SORT,
           CarbonCommonConstants.ENABLE_OFFHEAP_SORT_DEFAULT);
     }
@@ -561,9 +564,10 @@ public final class CarbonProperties {
         carbonProperties.getProperty(ENABLE_AUTO_HANDOFF);
     boolean isValid = CarbonUtil.validateBoolean(enableAutoHandoffStr);
     if (!isValid) {
-      LOGGER.warn("The enable auto handoff value \"" + enableAutoHandoffStr
-          + "\" is invalid. Using the default value \""
-          + CarbonCommonConstants.ENABLE_AUTO_HANDOFF_DEFAULT);
+      LOGGER.warn(String.format("The enable auto handoff value \"%s\" is invalid. " +
+                      "Using the default value \"%s\"",
+              enableAutoHandoffStr,
+              CarbonCommonConstants.ENABLE_AUTO_HANDOFF_DEFAULT));
       carbonProperties.setProperty(ENABLE_AUTO_HANDOFF,
           CarbonCommonConstants.ENABLE_AUTO_HANDOFF_DEFAULT);
     }
@@ -579,22 +583,24 @@ public final class CarbonProperties {
     try {
       short numberOfPagePerBlockletColumn = Short.parseShort(numberOfPagePerBlockletColumnString);
       if (numberOfPagePerBlockletColumn < CarbonV3DataFormatConstants.BLOCKLET_SIZE_IN_MB_MIN) {
-        LOGGER.info("Blocklet Size Configured value \"" + numberOfPagePerBlockletColumnString
-            + "\" is invalid. Using the default value \""
-            + CarbonV3DataFormatConstants.BLOCKLET_SIZE_IN_MB_DEFAULT_VALUE);
+        LOGGER.info(String.format("Blocklet Size Configured value \"%s\" is invalid. " +
+                        "Using the default value \"%s\"",
+                numberOfPagePerBlockletColumnString,
+                CarbonV3DataFormatConstants.BLOCKLET_SIZE_IN_MB_DEFAULT_VALUE));
         carbonProperties.setProperty(BLOCKLET_SIZE_IN_MB,
             CarbonV3DataFormatConstants.BLOCKLET_SIZE_IN_MB_DEFAULT_VALUE);
       }
     } catch (NumberFormatException e) {
-      LOGGER.info("Blocklet Size Configured value \"" + numberOfPagePerBlockletColumnString
-          + "\" is invalid. Using the default value \""
-          + CarbonV3DataFormatConstants.BLOCKLET_SIZE_IN_MB_DEFAULT_VALUE);
+      LOGGER.info(String.format("Blocklet Size Configured value \"%s\" is invalid. " +
+                      "Using the default value \"%s\"",
+              numberOfPagePerBlockletColumnString,
+              CarbonV3DataFormatConstants.BLOCKLET_SIZE_IN_MB_DEFAULT_VALUE));
       carbonProperties.setProperty(BLOCKLET_SIZE_IN_MB,
           CarbonV3DataFormatConstants.BLOCKLET_SIZE_IN_MB_DEFAULT_VALUE);
     }
-    LOGGER.info("Blocklet Size Configured value is \"" + carbonProperties
+    LOGGER.info(String.format("Blocklet Size Configured value is \"%s\"", carbonProperties
         .getProperty(BLOCKLET_SIZE_IN_MB,
-            CarbonV3DataFormatConstants.BLOCKLET_SIZE_IN_MB_DEFAULT_VALUE));
+            CarbonV3DataFormatConstants.BLOCKLET_SIZE_IN_MB_DEFAULT_VALUE)));
   }
 
   /**
@@ -660,15 +666,19 @@ public final class CarbonProperties {
 
       if (sortSize < CarbonCommonConstants.SORT_SIZE_MIN_VAL) {
         LOGGER.info(
-            "The batch size value \"" + sortSizeStr + "\" is invalid. Using the default value \""
-                + CarbonCommonConstants.SORT_SIZE_DEFAULT_VAL);
+            String.format("The batch size value \"%s\" is invalid. " +
+                            "Using the default value \"%s\"",
+                    sortSizeStr,
+                    CarbonCommonConstants.SORT_SIZE_DEFAULT_VAL));
         carbonProperties.setProperty(SORT_SIZE,
             CarbonCommonConstants.SORT_SIZE_DEFAULT_VAL);
       }
     } catch (NumberFormatException e) {
       LOGGER.info(
-          "The batch size value \"" + sortSizeStr + "\" is invalid. Using the default value \""
-              + CarbonCommonConstants.SORT_SIZE_DEFAULT_VAL);
+          String.format("The batch size value \"%s\" is invalid. " +
+                          "Using the default value \"%s\"",
+                  sortSizeStr,
+                  CarbonCommonConstants.SORT_SIZE_DEFAULT_VAL));
       carbonProperties.setProperty(SORT_SIZE,
           CarbonCommonConstants.SORT_SIZE_DEFAULT_VAL);
     }
@@ -1304,7 +1314,7 @@ public final class CarbonProperties {
           .setProperty(CarbonCommonConstants.UNSAFE_WORKING_MEMORY_IN_MB, unsafeWorkingMemory + "");
     } catch (NumberFormatException e) {
       LOGGER.warn("The specified value for property "
-          + CarbonCommonConstants.UNSAFE_WORKING_MEMORY_IN_MB_DEFAULT + "is invalid.");
+          + CarbonCommonConstants.UNSAFE_WORKING_MEMORY_IN_MB_DEFAULT + " is invalid.");
     }
   }
 
@@ -1314,10 +1324,10 @@ public final class CarbonProperties {
       unsafeSortStorageMemory = Integer.parseInt(carbonProperties
           .getProperty(CarbonCommonConstants.CARBON_SORT_STORAGE_INMEMORY_IN_MB));
     } catch (NumberFormatException e) {
-      LOGGER.warn("The specified value for property "
-          + CarbonCommonConstants.CARBON_SORT_STORAGE_INMEMORY_IN_MB + "is invalid."
-          + " Taking the default value."
-          + CarbonCommonConstants.CARBON_SORT_STORAGE_INMEMORY_IN_MB_DEFAULT);
+      LOGGER.warn(String.format("The specified value for property %s is invalid."
+          + " Taking the default value.%s",
+              CarbonCommonConstants.CARBON_SORT_STORAGE_INMEMORY_IN_MB,
+              CarbonCommonConstants.CARBON_SORT_STORAGE_INMEMORY_IN_MB_DEFAULT));
       unsafeSortStorageMemory = CarbonCommonConstants.CARBON_SORT_STORAGE_INMEMORY_IN_MB_DEFAULT;
     }
     if (unsafeSortStorageMemory
@@ -1338,9 +1348,10 @@ public final class CarbonProperties {
         CarbonCommonConstants.ENABLE_QUERY_STATISTICS_DEFAULT);
     boolean isValidBooleanValue = CarbonUtil.validateBoolean(enableQueryStatistics);
     if (!isValidBooleanValue) {
-      LOGGER.warn("The enable query statistics value \"" + enableQueryStatistics
-          + "\" is invalid. Using the default value \""
-          + CarbonCommonConstants.ENABLE_QUERY_STATISTICS_DEFAULT);
+      LOGGER.warn(String.format("The enable query statistics value \"%s\" is invalid. " +
+                      "Using the default value \"%s\"",
+              enableQueryStatistics,
+              CarbonCommonConstants.ENABLE_QUERY_STATISTICS_DEFAULT));
       carbonProperties.setProperty(CarbonCommonConstants.ENABLE_QUERY_STATISTICS,
           CarbonCommonConstants.ENABLE_QUERY_STATISTICS_DEFAULT);
     }
@@ -1505,18 +1516,20 @@ public final class CarbonProperties {
       int spillPercentage = Integer.parseInt(spillPercentageStr);
       if (spillPercentage > 100 || spillPercentage < 0) {
         LOGGER.info(
-            "The sort memory spill percentage value \"" + spillPercentageStr +
-                "\" is invalid. Using the default value \""
-                + CarbonLoadOptionConstants.CARBON_LOAD_SORT_MEMORY_SPILL_PERCENTAGE_DEFAULT);
+            String.format("The sort memory spill percentage value \"%s\" is invalid. " +
+                            "Using the default value \"%s\"",
+                    spillPercentageStr,
+                    CarbonLoadOptionConstants.CARBON_LOAD_SORT_MEMORY_SPILL_PERCENTAGE_DEFAULT));
         carbonProperties.setProperty(
             CARBON_LOAD_SORT_MEMORY_SPILL_PERCENTAGE,
             CarbonLoadOptionConstants.CARBON_LOAD_SORT_MEMORY_SPILL_PERCENTAGE_DEFAULT);
       }
     } catch (NumberFormatException e) {
       LOGGER.info(
-          "The sort memory spill percentage value \"" + spillPercentageStr +
-              "\" is invalid. Using the default value \""
-              + CarbonLoadOptionConstants.CARBON_LOAD_SORT_MEMORY_SPILL_PERCENTAGE_DEFAULT);
+          String.format("The sort memory spill percentage value \"%s\" is invalid. " +
+                          "Using the default value \"%s\"",
+                  spillPercentageStr,
+                  CarbonLoadOptionConstants.CARBON_LOAD_SORT_MEMORY_SPILL_PERCENTAGE_DEFAULT));
       carbonProperties.setProperty(
           CARBON_LOAD_SORT_MEMORY_SPILL_PERCENTAGE,
           CarbonLoadOptionConstants.CARBON_LOAD_SORT_MEMORY_SPILL_PERCENTAGE_DEFAULT);
@@ -1535,9 +1548,11 @@ public final class CarbonProperties {
       if (allowedCharactersLimit < CarbonCommonConstants.CARBON_MINMAX_ALLOWED_BYTE_COUNT_MIN
           || allowedCharactersLimit
           > CarbonCommonConstants.CARBON_MINMAX_ALLOWED_BYTE_COUNT_MAX) {
-        LOGGER.info("The min max byte limit for string type value \"" + allowedCharactersLimit
-            + "\" is invalid. Using the default value \""
-            + CarbonCommonConstants.CARBON_MINMAX_ALLOWED_BYTE_COUNT_DEFAULT);
+        LOGGER.info(String.format("The min max byte limit for " +
+                        "string type value \"%s\" is invalid. " +
+                        "Using the default value \"%s\"",
+                allowedCharactersLimit,
+                CarbonCommonConstants.CARBON_MINMAX_ALLOWED_BYTE_COUNT_DEFAULT));
         carbonProperties.setProperty(CARBON_MINMAX_ALLOWED_BYTE_COUNT,
             CarbonCommonConstants.CARBON_MINMAX_ALLOWED_BYTE_COUNT_DEFAULT);
       } else {
@@ -1547,9 +1562,10 @@ public final class CarbonProperties {
             .setProperty(CARBON_MINMAX_ALLOWED_BYTE_COUNT, allowedCharactersLimit + "");
       }
     } catch (NumberFormatException e) {
-      LOGGER.info("The min max byte limit for string type value \"" + allowedCharactersLimit
-          + "\" is invalid. Using the default value \""
-          + CarbonCommonConstants.CARBON_MINMAX_ALLOWED_BYTE_COUNT_DEFAULT);
+      LOGGER.info(String.format("The min max byte limit for string type value \"%s\" is invalid. " +
+                      "Using the default value \"%s\"",
+              allowedCharactersLimit,
+              CarbonCommonConstants.CARBON_MINMAX_ALLOWED_BYTE_COUNT_DEFAULT));
       carbonProperties.setProperty(CARBON_MINMAX_ALLOWED_BYTE_COUNT,
           CarbonCommonConstants.CARBON_MINMAX_ALLOWED_BYTE_COUNT_DEFAULT);
     }


[carbondata] 15/41: [CARBONDATA-3297] Fix that the IndexoutOfBoundsException when creating table and dropping table are at the same time

Posted by ra...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.5
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit 9ed81844eb46a965566855a6bd71ac21bf48a12f
Author: qiuchenjian <80...@qq.com>
AuthorDate: Wed Feb 20 17:16:34 2019 +0800

    [CARBONDATA-3297] Fix that the IndexoutOfBoundsException when creating table and dropping table are at the same time
    
    [Problem]
    An IndexOutOfBoundsException is thrown when a table is created and another table is dropped at the same time.
    
    [Solution]
    The carbonTables field in MetaData is an ArrayBuffer, which is not thread-safe, so concurrent
    create-table and drop-table operations can corrupt it and throw this exception.
    
    Use a read-write lock to guarantee thread safety.
    
    This closes #3130
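
    The sketch below shows the same locking pattern in plain Java (the patch itself is Scala):
    a ReentrantReadWriteLock guarding a non-thread-safe list, so that lookups can run concurrently
    while create/drop operations get exclusive access. CachedTable and the method names are
    placeholders, not CarbonData classes.

        import java.util.ArrayList;
        import java.util.List;
        import java.util.Optional;
        import java.util.concurrent.locks.ReentrantReadWriteLock;

        public class TableCache {
          static class CachedTable {
            final String name;
            CachedTable(String name) {
              this.name = name;
            }
          }

          private final List<CachedTable> tables = new ArrayList<>();
          private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

          // Writers (create table / drop table) take the write lock.
          void add(CachedTable table) {
            lock.writeLock().lock();
            try {
              tables.add(table);
            } finally {
              lock.writeLock().unlock();
            }
          }

          void remove(String name) {
            lock.writeLock().lock();
            try {
              tables.removeIf(t -> t.name.equalsIgnoreCase(name));
            } finally {
              lock.writeLock().unlock();
            }
          }

          // Readers (metadata lookups) take the read lock and may run concurrently.
          Optional<CachedTable> find(String name) {
            lock.readLock().lock();
            try {
              return tables.stream().filter(t -> t.name.equalsIgnoreCase(name)).findFirst();
            } finally {
              lock.readLock().unlock();
            }
          }
        }

    Releasing the lock in a finally block is the usual way to make sure it is not leaked when the
    guarded operation throws.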
---
 .../spark/sql/hive/CarbonFileMetastore.scala       | 37 +++++++++++++++++++---
 1 file changed, 33 insertions(+), 4 deletions(-)

diff --git a/integration/spark2/src/main/scala/org/apache/spark/sql/hive/CarbonFileMetastore.scala b/integration/spark2/src/main/scala/org/apache/spark/sql/hive/CarbonFileMetastore.scala
index c1be154..ea3bba8 100644
--- a/integration/spark2/src/main/scala/org/apache/spark/sql/hive/CarbonFileMetastore.scala
+++ b/integration/spark2/src/main/scala/org/apache/spark/sql/hive/CarbonFileMetastore.scala
@@ -19,6 +19,7 @@ package org.apache.spark.sql.hive
 
 import java.io.IOException
 import java.net.URI
+import java.util.concurrent.locks.{Lock, ReentrantReadWriteLock}
 
 import scala.collection.mutable.ArrayBuffer
 
@@ -43,7 +44,8 @@ import org.apache.carbondata.core.fileoperations.FileWriteOperation
 import org.apache.carbondata.core.metadata.{AbsoluteTableIdentifier, CarbonMetadata, CarbonTableIdentifier}
 import org.apache.carbondata.core.metadata.converter.ThriftWrapperSchemaConverterImpl
 import org.apache.carbondata.core.metadata.schema
-import org.apache.carbondata.core.metadata.schema.{table, SchemaReader}
+import org.apache.carbondata.core.metadata.schema.SchemaReader
+import org.apache.carbondata.core.metadata.schema.table
 import org.apache.carbondata.core.metadata.schema.table.CarbonTable
 import org.apache.carbondata.core.util.{CarbonProperties, CarbonUtil}
 import org.apache.carbondata.core.util.path.CarbonTablePath
@@ -53,9 +55,16 @@ import org.apache.carbondata.format.{SchemaEvolutionEntry, TableInfo}
 import org.apache.carbondata.spark.util.CarbonSparkUtil
 
 case class MetaData(var carbonTables: ArrayBuffer[CarbonTable]) {
+  // use to lock the carbonTables
+  val lock : ReentrantReadWriteLock = new ReentrantReadWriteLock
+  val readLock: Lock = lock.readLock()
+  val writeLock: Lock = lock.writeLock()
+
   // clear the metadata
   def clear(): Unit = {
+    writeLock.lock()
     carbonTables.clear()
+    writeLock.unlock()
   }
 }
 
@@ -192,9 +201,12 @@ class CarbonFileMetastore extends CarbonMetaStore {
    * @return
    */
   def getTableFromMetadataCache(database: String, tableName: String): Option[CarbonTable] = {
-    metadata.carbonTables
+    metadata.readLock.lock()
+    val ret = metadata.carbonTables
       .find(table => table.getDatabaseName.equalsIgnoreCase(database) &&
         table.getTableName.equalsIgnoreCase(tableName))
+    metadata.readLock.unlock()
+    ret
   }
 
   def tableExists(
@@ -270,11 +282,14 @@ class CarbonFileMetastore extends CarbonMetaStore {
       }
     }
 
+
     wrapperTableInfo.map { tableInfo =>
       CarbonMetadata.getInstance().removeTable(tableUniqueName)
       CarbonMetadata.getInstance().loadTableMetadata(tableInfo)
       val carbonTable = CarbonMetadata.getInstance().getCarbonTable(tableUniqueName)
+      metadata.writeLock.lock()
       metadata.carbonTables += carbonTable
+      metadata.writeLock.unlock()
       carbonTable
     }
   }
@@ -413,8 +428,11 @@ class CarbonFileMetastore extends CarbonMetaStore {
     CarbonMetadata.getInstance.removeTable(tableInfo.getTableUniqueName)
     removeTableFromMetadata(identifier.getDatabaseName, identifier.getTableName)
     CarbonMetadata.getInstance().loadTableMetadata(tableInfo)
+    metadata.writeLock.lock()
     metadata.carbonTables +=
       CarbonMetadata.getInstance().getCarbonTable(identifier.getTableUniqueName)
+    metadata.writeLock.unlock()
+    metadata.carbonTables
   }
 
   /**
@@ -427,7 +445,9 @@ class CarbonFileMetastore extends CarbonMetaStore {
     val carbonTableToBeRemoved: Option[CarbonTable] = getTableFromMetadataCache(dbName, tableName)
     carbonTableToBeRemoved match {
       case Some(carbonTable) =>
+        metadata.writeLock.lock()
         metadata.carbonTables -= carbonTable
+        metadata.writeLock.unlock()
       case None =>
         if (LOGGER.isDebugEnabled) {
           LOGGER.debug(s"No entry for table $tableName in database $dbName")
@@ -443,10 +463,12 @@ class CarbonFileMetastore extends CarbonMetaStore {
     val carbonTable = CarbonMetadata.getInstance().getCarbonTable(
       wrapperTableInfo.getTableUniqueName)
     for (i <- metadata.carbonTables.indices) {
+      metadata.writeLock.lock()
       if (wrapperTableInfo.getTableUniqueName.equals(
         metadata.carbonTables(i).getTableUniqueName)) {
         metadata.carbonTables(i) = carbonTable
       }
+      metadata.writeLock.unlock()
     }
   }
 
@@ -579,12 +601,14 @@ class CarbonFileMetastore extends CarbonMetaStore {
         FileFactory.getCarbonFile(timestampFile, timestampFileType).getLastModifiedTime
       if (!(lastModifiedTime ==
             tableModifiedTimeStore.get(CarbonCommonConstants.DATABASE_DEFAULT_NAME))) {
+        metadata.writeLock.lock()
         metadata.carbonTables = metadata.carbonTables.filterNot(
           table => table.getTableName.equalsIgnoreCase(tableIdentifier.table) &&
                    table.getDatabaseName
                      .equalsIgnoreCase(tableIdentifier.database
                        .getOrElse(SparkSession.getActiveSession.get.sessionState.catalog
                          .getCurrentDatabase)))
+        metadata.writeLock.unlock()
         updateSchemasUpdatedTime(lastModifiedTime)
         isRefreshed = true
       }
@@ -594,8 +618,13 @@ class CarbonFileMetastore extends CarbonMetaStore {
 
   override def isReadFromHiveMetaStore: Boolean = false
 
-  override def listAllTables(sparkSession: SparkSession): Seq[CarbonTable] =
-    metadata.carbonTables
+  override def listAllTables(sparkSession: SparkSession): Seq[CarbonTable] = {
+    metadata.readLock.lock
+    val ret = metadata.carbonTables.clone()
+    metadata.readLock.unlock
+    ret
+  }
+
 
   override def getThriftTableInfo(carbonTable: CarbonTable): TableInfo = {
     val tableMetadataFile = CarbonTablePath.getSchemaFilePath(carbonTable.getTablePath)


[carbondata] 01/41: [CARBONDATA-3287]Remove the validation for same schema in a location and fix drop datamap issue

Posted by ra...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.5
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit d55164f39acd8438bc6105cbb1b18fbe33122bbe
Author: akashrn5 <ak...@gmail.com>
AuthorDate: Mon Feb 4 16:07:02 2019 +0530

    [CARBONDATA-3287]Remove the validation for same schema in a location and fix drop datamap issue
    
    ### Why this PR?
    Currently there is a validation that fails the query when two carbondata files in a location have different schemas. There is no need to fail here; this also matches Parquet's behaviour.
    
    Instead of failing, we can read the latest schema from the latest carbondata file in the given location and read all the files based on that schema when producing the query output. For columns that are not present in some data files, the rows from those files will return null for the new columns.
    
    Schemas are still not merged; the behaviour stays the same except that the latest schema is taken.
    
    ### Points to Observe
    1. If one data file has columns a, b and c and a second file has columns a, b, c, d and e, the table is created with whichever column set is latest (3 or 5 columns). This applies when the user does not specify a schema; if a schema is specified, the table is created with that schema.
    2. The only validation kept is that if the same column name appears in the data files at the location with different datatypes, the query fails.
    3. When the first query is fired, a datamap is created for the table. If another data file introduces a new column, the datamap is not updated because the table name stays the same, so when the column list differs the datamap is dropped and created again.
    
    This closes #3121
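
    A small illustration of the "take the latest schema" idea, using plain java.io rather than
    CarbonData's FileFactory/CarbonFile: recursively scan a directory tree and return the most
    recently modified file with a given extension. The class name and the .carbonindex extension
    used in main are assumptions for the example, not taken from the patch.

        import java.io.File;

        public class LatestIndexFileFinder {

          // Returns the most recently modified file under dir (recursing into subdirectories)
          // whose name ends with the given extension, or null if none is found.
          static File findLatest(File dir, String extension) {
            File latest = null;
            File[] children = dir.listFiles();
            if (children == null) {
              return null;
            }
            for (File child : children) {
              File candidate;
              if (child.isDirectory()) {
                candidate = findLatest(child, extension);
              } else if (child.getName().endsWith(extension)) {
                candidate = child;
              } else {
                candidate = null;
              }
              if (candidate != null
                  && (latest == null || candidate.lastModified() > latest.lastModified())) {
                latest = candidate;
              }
            }
            return latest;
          }

          public static void main(String[] args) {
            File latest = findLatest(new File(args[0]), ".carbonindex");
            System.out.println(latest == null ? "no index file found" : latest.getAbsolutePath());
          }
        }

    The table schema is then inferred from this latest index file, and the cached datamaps are
    cleared when their column list no longer matches the table's columns, as the
    DataMapStoreManager change below does.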
---
 .../core/datamap/DataMapStoreManager.java          | 12 +++++++-
 .../carbondata/core/datamap/TableDataMap.java      | 16 +++++++---
 .../carbondata/core/datamap/dev/DataMap.java       |  4 +--
 .../datamap/dev/cgdatamap/CoarseGrainDataMap.java  |  4 +--
 .../datamap/dev/fgdatamap/FineGrainDataMap.java    |  4 +--
 .../indexstore/blockletindex/BlockDataMap.java     |  8 ++---
 .../core/metadata/schema/table/CarbonTable.java    | 27 +++++++++++------
 .../scan/executor/impl/AbstractQueryExecutor.java  | 22 ++++++++------
 .../carbondata/core/scan/model/QueryModel.java     | 34 ++++++++++++++++------
 .../carbondata/core/util/BlockletDataMapUtil.java  | 33 ++++++++++++---------
 .../apache/carbondata/core/util/CarbonUtil.java    | 18 +++++++++---
 .../PrestoTestNonTransactionalTableFiles.scala     |  2 +-
 .../datasource/SparkCarbonDataSourceTest.scala     |  8 ++---
 13 files changed, 126 insertions(+), 66 deletions(-)

diff --git a/core/src/main/java/org/apache/carbondata/core/datamap/DataMapStoreManager.java b/core/src/main/java/org/apache/carbondata/core/datamap/DataMapStoreManager.java
index baf4739..c5cf55d 100644
--- a/core/src/main/java/org/apache/carbondata/core/datamap/DataMapStoreManager.java
+++ b/core/src/main/java/org/apache/carbondata/core/datamap/DataMapStoreManager.java
@@ -52,6 +52,7 @@ import org.apache.carbondata.core.util.ThreadLocalSessionInfo;
 import static org.apache.carbondata.core.metadata.schema.datamap.DataMapClassProvider.MV;
 import static org.apache.carbondata.core.metadata.schema.datamap.DataMapClassProvider.PREAGGREGATE;
 
+import org.apache.commons.collections.CollectionUtils;
 import org.apache.hadoop.fs.Path;
 import org.apache.log4j.Logger;
 
@@ -322,6 +323,15 @@ public final class DataMapStoreManager {
         tableIndices = allDataMaps.get(tableUniqueName);
       }
     }
+    // in case of fileformat or sdk, when table is dropped or schema is changed the datamaps are
+    // not cleared, they need to be cleared by using API, so compare the columns, if not same, clear
+    // the datamaps on that table
+    if (allDataMaps.size() > 0 && !CollectionUtils.isEmpty(allDataMaps.get(tableUniqueName))
+        && !allDataMaps.get(tableUniqueName).get(0).getTable().getTableInfo().getFactTable()
+        .getListOfColumns().equals(table.getTableInfo().getFactTable().getListOfColumns())) {
+      clearDataMaps(tableUniqueName);
+      tableIndices = null;
+    }
     TableDataMap dataMap = null;
     if (tableIndices != null) {
       dataMap = getTableDataMap(dataMapSchema.getDataMapName(), tableIndices);
@@ -422,7 +432,7 @@ public final class DataMapStoreManager {
       blockletDetailsFetcher = getBlockletDetailsFetcher(table);
     }
     segmentPropertiesFetcher = (SegmentPropertiesFetcher) blockletDetailsFetcher;
-    TableDataMap dataMap = new TableDataMap(table.getAbsoluteTableIdentifier(),
+    TableDataMap dataMap = new TableDataMap(table,
         dataMapSchema, dataMapFactory, blockletDetailsFetcher, segmentPropertiesFetcher);
 
     tableIndices.add(dataMap);
diff --git a/core/src/main/java/org/apache/carbondata/core/datamap/TableDataMap.java b/core/src/main/java/org/apache/carbondata/core/datamap/TableDataMap.java
index 86390e8..0d46fd8 100644
--- a/core/src/main/java/org/apache/carbondata/core/datamap/TableDataMap.java
+++ b/core/src/main/java/org/apache/carbondata/core/datamap/TableDataMap.java
@@ -43,6 +43,7 @@ import org.apache.carbondata.core.indexstore.ExtendedBlocklet;
 import org.apache.carbondata.core.indexstore.PartitionSpec;
 import org.apache.carbondata.core.indexstore.SegmentPropertiesFetcher;
 import org.apache.carbondata.core.metadata.AbsoluteTableIdentifier;
+import org.apache.carbondata.core.metadata.schema.table.CarbonTable;
 import org.apache.carbondata.core.metadata.schema.table.DataMapSchema;
 import org.apache.carbondata.core.scan.expression.Expression;
 import org.apache.carbondata.core.scan.filter.resolver.FilterResolverIntf;
@@ -64,6 +65,8 @@ import org.apache.log4j.Logger;
 @InterfaceAudience.Internal
 public final class TableDataMap extends OperationEventListener {
 
+  private CarbonTable table;
+
   private AbsoluteTableIdentifier identifier;
 
   private DataMapSchema dataMapSchema;
@@ -80,10 +83,11 @@ public final class TableDataMap extends OperationEventListener {
   /**
    * It is called to initialize and load the required table datamap metadata.
    */
-  TableDataMap(AbsoluteTableIdentifier identifier, DataMapSchema dataMapSchema,
+  TableDataMap(CarbonTable table, DataMapSchema dataMapSchema,
       DataMapFactory dataMapFactory, BlockletDetailsFetcher blockletDetailsFetcher,
       SegmentPropertiesFetcher segmentPropertiesFetcher) {
-    this.identifier = identifier;
+    this.identifier = table.getAbsoluteTableIdentifier();
+    this.table = table;
     this.dataMapSchema = dataMapSchema;
     this.dataMapFactory = dataMapFactory;
     this.blockletDetailsFetcher = blockletDetailsFetcher;
@@ -115,8 +119,8 @@ public final class TableDataMap extends OperationEventListener {
       } else {
         segmentProperties = segmentPropertiesFetcher.getSegmentProperties(segment);
         for (DataMap dataMap : dataMaps.get(segment)) {
-          pruneBlocklets
-              .addAll(dataMap.prune(filterExp, segmentProperties, partitions, identifier));
+          pruneBlocklets.addAll(dataMap
+              .prune(filterExp, segmentProperties, partitions, table));
         }
       }
       blocklets.addAll(addSegmentId(
@@ -126,6 +130,10 @@ public final class TableDataMap extends OperationEventListener {
     return blocklets;
   }
 
+  public CarbonTable getTable() {
+    return table;
+  }
+
   /**
    * Pass the valid segments and prune the datamap using filter expression
    *
diff --git a/core/src/main/java/org/apache/carbondata/core/datamap/dev/DataMap.java b/core/src/main/java/org/apache/carbondata/core/datamap/dev/DataMap.java
index f31b7f3..c52cc41 100644
--- a/core/src/main/java/org/apache/carbondata/core/datamap/dev/DataMap.java
+++ b/core/src/main/java/org/apache/carbondata/core/datamap/dev/DataMap.java
@@ -24,7 +24,7 @@ import org.apache.carbondata.core.datastore.block.SegmentProperties;
 import org.apache.carbondata.core.indexstore.Blocklet;
 import org.apache.carbondata.core.indexstore.PartitionSpec;
 import org.apache.carbondata.core.memory.MemoryException;
-import org.apache.carbondata.core.metadata.AbsoluteTableIdentifier;
+import org.apache.carbondata.core.metadata.schema.table.CarbonTable;
 import org.apache.carbondata.core.scan.expression.Expression;
 import org.apache.carbondata.core.scan.filter.resolver.FilterResolverIntf;
 
@@ -52,7 +52,7 @@ public interface DataMap<T extends Blocklet> {
    * blocklets where these filters can exist.
    */
   List<T> prune(Expression filter, SegmentProperties segmentProperties,
-      List<PartitionSpec> partitions, AbsoluteTableIdentifier identifier) throws IOException;
+      List<PartitionSpec> partitions, CarbonTable carbonTable) throws IOException;
 
   // TODO Move this method to Abstract class
   /**
diff --git a/core/src/main/java/org/apache/carbondata/core/datamap/dev/cgdatamap/CoarseGrainDataMap.java b/core/src/main/java/org/apache/carbondata/core/datamap/dev/cgdatamap/CoarseGrainDataMap.java
index fc1f104..b4af9d9 100644
--- a/core/src/main/java/org/apache/carbondata/core/datamap/dev/cgdatamap/CoarseGrainDataMap.java
+++ b/core/src/main/java/org/apache/carbondata/core/datamap/dev/cgdatamap/CoarseGrainDataMap.java
@@ -25,7 +25,7 @@ import org.apache.carbondata.core.datamap.dev.DataMap;
 import org.apache.carbondata.core.datastore.block.SegmentProperties;
 import org.apache.carbondata.core.indexstore.Blocklet;
 import org.apache.carbondata.core.indexstore.PartitionSpec;
-import org.apache.carbondata.core.metadata.AbsoluteTableIdentifier;
+import org.apache.carbondata.core.metadata.schema.table.CarbonTable;
 import org.apache.carbondata.core.scan.expression.Expression;
 
 /**
@@ -37,7 +37,7 @@ public abstract class CoarseGrainDataMap implements DataMap<Blocklet> {
 
   @Override
   public List<Blocklet> prune(Expression expression, SegmentProperties segmentProperties,
-      List<PartitionSpec> partitions, AbsoluteTableIdentifier identifier) throws IOException {
+      List<PartitionSpec> partitions, CarbonTable carbonTable) throws IOException {
     throw new UnsupportedOperationException("Filter expression not supported");
   }
 
diff --git a/core/src/main/java/org/apache/carbondata/core/datamap/dev/fgdatamap/FineGrainDataMap.java b/core/src/main/java/org/apache/carbondata/core/datamap/dev/fgdatamap/FineGrainDataMap.java
index a6732a6..03b2bfb 100644
--- a/core/src/main/java/org/apache/carbondata/core/datamap/dev/fgdatamap/FineGrainDataMap.java
+++ b/core/src/main/java/org/apache/carbondata/core/datamap/dev/fgdatamap/FineGrainDataMap.java
@@ -24,7 +24,7 @@ import org.apache.carbondata.common.annotations.InterfaceStability;
 import org.apache.carbondata.core.datamap.dev.DataMap;
 import org.apache.carbondata.core.datastore.block.SegmentProperties;
 import org.apache.carbondata.core.indexstore.PartitionSpec;
-import org.apache.carbondata.core.metadata.AbsoluteTableIdentifier;
+import org.apache.carbondata.core.metadata.schema.table.CarbonTable;
 import org.apache.carbondata.core.scan.expression.Expression;
 
 /**
@@ -36,7 +36,7 @@ public abstract class FineGrainDataMap implements DataMap<FineGrainBlocklet> {
 
   @Override
   public List<FineGrainBlocklet> prune(Expression filter, SegmentProperties segmentProperties,
-      List<PartitionSpec> partitions, AbsoluteTableIdentifier identifier) throws IOException {
+      List<PartitionSpec> partitions, CarbonTable carbonTable) throws IOException {
     throw new UnsupportedOperationException("Filter expression not supported");
   }
 
diff --git a/core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockDataMap.java b/core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockDataMap.java
index e29dfef..a7818c2 100644
--- a/core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockDataMap.java
+++ b/core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockDataMap.java
@@ -45,7 +45,6 @@ import org.apache.carbondata.core.indexstore.row.DataMapRow;
 import org.apache.carbondata.core.indexstore.row.DataMapRowImpl;
 import org.apache.carbondata.core.indexstore.schema.CarbonRowSchema;
 import org.apache.carbondata.core.memory.MemoryException;
-import org.apache.carbondata.core.metadata.AbsoluteTableIdentifier;
 import org.apache.carbondata.core.metadata.blocklet.DataFileFooter;
 import org.apache.carbondata.core.metadata.blocklet.index.BlockletIndex;
 import org.apache.carbondata.core.metadata.blocklet.index.BlockletMinMaxIndex;
@@ -708,17 +707,18 @@ public class BlockDataMap extends CoarseGrainDataMap
 
   @Override
   public List<Blocklet> prune(Expression expression, SegmentProperties properties,
-      List<PartitionSpec> partitions, AbsoluteTableIdentifier identifier) throws IOException {
+      List<PartitionSpec> partitions, CarbonTable carbonTable) throws IOException {
     FilterResolverIntf filterResolverIntf = null;
     if (expression != null) {
       QueryModel.FilterProcessVO processVO =
           new QueryModel.FilterProcessVO(properties.getDimensions(), properties.getMeasures(),
               new ArrayList<CarbonDimension>());
-      QueryModel.processFilterExpression(processVO, expression, null, null);
+      QueryModel.processFilterExpression(processVO, expression, null, null, carbonTable);
       // Optimize Filter Expression and fit RANGE filters is conditions apply.
       FilterOptimizer rangeFilterOptimizer = new RangeFilterOptmizer(expression);
       rangeFilterOptimizer.optimizeFilter();
-      filterResolverIntf = CarbonTable.resolveFilter(expression, identifier);
+      filterResolverIntf =
+          CarbonTable.resolveFilter(expression, carbonTable.getAbsoluteTableIdentifier());
     }
     return prune(filterResolverIntf, properties, partitions);
   }
diff --git a/core/src/main/java/org/apache/carbondata/core/metadata/schema/table/CarbonTable.java b/core/src/main/java/org/apache/carbondata/core/metadata/schema/table/CarbonTable.java
index c4adab4..8ed781a 100644
--- a/core/src/main/java/org/apache/carbondata/core/metadata/schema/table/CarbonTable.java
+++ b/core/src/main/java/org/apache/carbondata/core/metadata/schema/table/CarbonTable.java
@@ -246,7 +246,7 @@ public class CarbonTable implements Serializable {
       String tableName,
       Configuration configuration) throws IOException {
     TableInfo tableInfoInfer = CarbonUtil.buildDummyTableInfo(tablePath, "null", "null");
-    CarbonFile carbonFile = getFirstIndexFile(FileFactory.getCarbonFile(tablePath, configuration));
+    CarbonFile carbonFile = getLatestIndexFile(FileFactory.getCarbonFile(tablePath, configuration));
     if (carbonFile == null) {
       throw new RuntimeException("Carbon index file not exists.");
     }
@@ -265,22 +265,31 @@ public class CarbonTable implements Serializable {
     return CarbonTable.buildFromTableInfo(tableInfoInfer);
   }
 
-  private static CarbonFile getFirstIndexFile(CarbonFile tablePath) {
+  private static CarbonFile getLatestIndexFile(CarbonFile tablePath) {
     CarbonFile[] carbonFiles = tablePath.listFiles();
+    CarbonFile latestCarbonIndexFile = null;
+    long latestIndexFileTimestamp = 0L;
     for (CarbonFile carbonFile : carbonFiles) {
-      if (carbonFile.isDirectory()) {
+      if (carbonFile.getName().endsWith(CarbonTablePath.INDEX_FILE_EXT)
+          && carbonFile.getLastModifiedTime() > latestIndexFileTimestamp) {
+        latestCarbonIndexFile = carbonFile;
+        latestIndexFileTimestamp = carbonFile.getLastModifiedTime();
+      } else if (carbonFile.isDirectory()) {
         // if the list has directories that doesn't contain index files,
         // continue checking other files/directories in the list.
-        if (getFirstIndexFile(carbonFile) == null) {
+        if (getLatestIndexFile(carbonFile) == null) {
           continue;
         } else {
-          return getFirstIndexFile(carbonFile);
+          return getLatestIndexFile(carbonFile);
         }
-      } else if (carbonFile.getName().endsWith(CarbonTablePath.INDEX_FILE_EXT)) {
-        return carbonFile;
       }
     }
-    return null;
+    if (latestCarbonIndexFile != null) {
+      return latestCarbonIndexFile;
+    } else {
+      // returning null only if the path doesn't have index files.
+      return null;
+    }
   }
 
   public static CarbonTable buildDummyTable(String tablePath) throws IOException {
@@ -1058,7 +1067,7 @@ public class CarbonTable implements Serializable {
         new QueryModel.FilterProcessVO(getDimensionByTableName(getTableName()),
             getMeasureByTableName(getTableName()), getImplicitDimensionByTableName(getTableName()));
     QueryModel.processFilterExpression(processVO, filterExpression, isFilterDimensions,
-        isFilterMeasures);
+        isFilterMeasures, this);
 
     if (null != filterExpression) {
       // Optimize Filter Expression and fit RANGE filters is conditions apply.
diff --git a/core/src/main/java/org/apache/carbondata/core/scan/executor/impl/AbstractQueryExecutor.java b/core/src/main/java/org/apache/carbondata/core/scan/executor/impl/AbstractQueryExecutor.java
index ab7c577..f81a3dc 100644
--- a/core/src/main/java/org/apache/carbondata/core/scan/executor/impl/AbstractQueryExecutor.java
+++ b/core/src/main/java/org/apache/carbondata/core/scan/executor/impl/AbstractQueryExecutor.java
@@ -273,12 +273,15 @@ public abstract class AbstractQueryExecutor<E> implements QueryExecutor<E> {
     if (queryModel.getTable().isTransactionalTable()) {
       return;
     }
-    // First validate the schema of the carbondata file
-    boolean sameColumnSchemaList = BlockletDataMapUtil.isSameColumnSchemaList(columnsInTable,
-        queryModel.getTable().getTableInfo().getFactTable().getListOfColumns());
+    // First validate the schema of the carbondata file if the same column name have different
+    // datatype
+    boolean sameColumnSchemaList = BlockletDataMapUtil
+        .isSameColumnAndDifferentDatatypeInSchema(columnsInTable,
+            queryModel.getTable().getTableInfo().getFactTable().getListOfColumns());
     if (!sameColumnSchemaList) {
-      LOGGER.error("Schema of " + filePath + " doesn't match with the table's schema");
-      throw new IOException("All the files doesn't have same schema. "
+      LOGGER.error("Datatype of the common columns present in " + filePath + " doesn't match with"
+          + "the column's datatype in table schema");
+      throw new IOException("All common columns present in the files doesn't have same datatype. "
           + "Unsupported operation on nonTransactional table. Check logs.");
     }
     List<ProjectionDimension> dimensions = queryModel.getProjectionDimensions();
@@ -331,10 +334,11 @@ public abstract class AbstractQueryExecutor<E> implements QueryExecutor<E> {
   private void createFilterExpression(QueryModel queryModel, SegmentProperties properties) {
     Expression expression = queryModel.getFilterExpression();
     if (expression != null) {
-      QueryModel.FilterProcessVO processVO =
-          new QueryModel.FilterProcessVO(properties.getDimensions(), properties.getMeasures(),
-              new ArrayList<CarbonDimension>());
-      QueryModel.processFilterExpression(processVO, expression, null, null);
+      QueryModel.FilterProcessVO processVO = new QueryModel.FilterProcessVO(
+          properties.getDimensions(),
+          properties.getMeasures(),
+          new ArrayList<CarbonDimension>());
+      QueryModel.processFilterExpression(processVO, expression, null, null, queryModel.getTable());
       // Optimize Filter Expression and fit RANGE filters is conditions apply.
       FilterOptimizer rangeFilterOptimizer = new RangeFilterOptmizer(expression);
       rangeFilterOptimizer.optimizeFilter();
diff --git a/core/src/main/java/org/apache/carbondata/core/scan/model/QueryModel.java b/core/src/main/java/org/apache/carbondata/core/scan/model/QueryModel.java
index d7dcee0..d6017f5 100644
--- a/core/src/main/java/org/apache/carbondata/core/scan/model/QueryModel.java
+++ b/core/src/main/java/org/apache/carbondata/core/scan/model/QueryModel.java
@@ -145,29 +145,33 @@ public class QueryModel {
   }
 
   public static void processFilterExpression(FilterProcessVO processVO, Expression filterExpression,
-      final boolean[] isFilterDimensions, final boolean[] isFilterMeasures) {
+      final boolean[] isFilterDimensions, final boolean[] isFilterMeasures,
+      CarbonTable carbonTable) {
     if (null != filterExpression) {
       if (null != filterExpression.getChildren() && filterExpression.getChildren().size() == 0) {
         if (filterExpression instanceof ConditionalExpression) {
           List<ColumnExpression> listOfCol =
               ((ConditionalExpression) filterExpression).getColumnList();
           for (ColumnExpression expression : listOfCol) {
-            setDimAndMsrColumnNode(processVO, expression, isFilterDimensions, isFilterMeasures);
+            setDimAndMsrColumnNode(processVO, expression, isFilterDimensions, isFilterMeasures,
+                carbonTable);
           }
         }
       }
       for (Expression expression : filterExpression.getChildren()) {
         if (expression instanceof ColumnExpression) {
           setDimAndMsrColumnNode(processVO, (ColumnExpression) expression, isFilterDimensions,
-              isFilterMeasures);
+              isFilterMeasures, carbonTable);
         } else if (expression instanceof UnknownExpression) {
           UnknownExpression exp = ((UnknownExpression) expression);
           List<ColumnExpression> listOfColExpression = exp.getAllColumnList();
           for (ColumnExpression col : listOfColExpression) {
-            setDimAndMsrColumnNode(processVO, col, isFilterDimensions, isFilterMeasures);
+            setDimAndMsrColumnNode(processVO, col, isFilterDimensions, isFilterMeasures,
+                carbonTable);
           }
         } else {
-          processFilterExpression(processVO, expression, isFilterDimensions, isFilterMeasures);
+          processFilterExpression(processVO, expression, isFilterDimensions, isFilterMeasures,
+              carbonTable);
         }
       }
     }
@@ -184,7 +188,7 @@ public class QueryModel {
   }
 
   private static void setDimAndMsrColumnNode(FilterProcessVO processVO, ColumnExpression col,
-      boolean[] isFilterDimensions, boolean[] isFilterMeasures) {
+      boolean[] isFilterDimensions, boolean[] isFilterMeasures, CarbonTable table) {
     CarbonDimension dim;
     CarbonMeasure msr;
     String columnName;
@@ -209,13 +213,25 @@ public class QueryModel {
       if (null != isFilterMeasures) {
         isFilterMeasures[msr.getOrdinal()] = true;
       }
-    } else {
+    } else if (null != CarbonUtil.findDimension(processVO.getImplicitDimensions(), columnName)) {
       // check if this is an implicit dimension
-      dim = CarbonUtil
-          .findDimension(processVO.getImplicitDimensions(), columnName);
+      dim = CarbonUtil.findDimension(processVO.getImplicitDimensions(), columnName);
       col.setCarbonColumn(dim);
       col.setDimension(dim);
       col.setDimension(true);
+    } else {
+      // in case of sdk or fileformat, there can be chance that each carbondata file may have
+      // different schema, so every segment properties will have dims and measures based on
+      // corresponding segment. So the filter column may not be present in it. so generate the
+      // dimension and measure from the carbontable
+      CarbonDimension dimension =
+          table.getDimensionByName(table.getTableName(), col.getColumnName());
+      CarbonMeasure measure = table.getMeasureByName(table.getTableName(), col.getColumnName());
+      col.setDimension(dimension);
+      col.setMeasure(measure);
+      col.setCarbonColumn(dimension == null ? measure : dimension);
+      col.setDimension(null != dimension);
+      col.setMeasure(null != measure);
     }
   }
 
diff --git a/core/src/main/java/org/apache/carbondata/core/util/BlockletDataMapUtil.java b/core/src/main/java/org/apache/carbondata/core/util/BlockletDataMapUtil.java
index b81bc75..68aad72 100644
--- a/core/src/main/java/org/apache/carbondata/core/util/BlockletDataMapUtil.java
+++ b/core/src/main/java/org/apache/carbondata/core/util/BlockletDataMapUtil.java
@@ -109,10 +109,10 @@ public class BlockletDataMapUtil {
         isTransactionalTable);
     for (DataFileFooter footer : indexInfo) {
       if ((!isTransactionalTable) && (tableColumnList.size() != 0) &&
-          !isSameColumnSchemaList(footer.getColumnInTable(), tableColumnList)) {
-        LOG.error("Schema of " + identifier.getIndexFileName()
-            + " doesn't match with the table's schema");
-        throw new IOException("All the files doesn't have same schema. "
+          !isSameColumnAndDifferentDatatypeInSchema(footer.getColumnInTable(), tableColumnList)) {
+        LOG.error("Datatype of the common columns present in " + identifier.getIndexFileName()
+            + " doesn't match with the column's datatype in table schema");
+        throw new IOException("All common columns present in the files doesn't have same datatype. "
             + "Unsupported operation on nonTransactional table. Check logs.");
       }
       if ((tableColumnList != null) && (tableColumnList.size() == 0)) {
@@ -252,16 +252,23 @@ public class BlockletDataMapUtil {
     return true;
   }
 
-  public static boolean isSameColumnSchemaList(List<ColumnSchema> indexFileColumnList,
-      List<ColumnSchema> tableColumnList) {
-    if (indexFileColumnList.size() != tableColumnList.size()) {
-      LOG.error("Index file's column size is " + indexFileColumnList.size()
-          + " but table's column size is " + tableColumnList.size());
-      return false;
-    }
+  /**
+   * This method validates whether the schema present in index and table contains the same column
+   * name but with different dataType.
+   */
+  public static boolean isSameColumnAndDifferentDatatypeInSchema(
+      List<ColumnSchema> indexFileColumnList, List<ColumnSchema> tableColumnList) {
     for (int i = 0; i < tableColumnList.size(); i++) {
-      if (!tableColumnList.contains(indexFileColumnList.get(i))) {
-        return false;
+      for (int j = 0; j < indexFileColumnList.size(); j++) {
+        if (indexFileColumnList.get(j).getColumnName()
+            .equalsIgnoreCase(tableColumnList.get(i).getColumnName()) && !indexFileColumnList.get(j)
+            .getDataType().getName()
+            .equalsIgnoreCase(tableColumnList.get(i).getDataType().getName())) {
+          LOG.error("Datatype of the Column " + indexFileColumnList.get(j).getColumnName()
+              + " present in index file, is not same as datatype of the column with same name"
+              + "present in table");
+          return false;
+        }
       }
     }
     return true;
diff --git a/core/src/main/java/org/apache/carbondata/core/util/CarbonUtil.java b/core/src/main/java/org/apache/carbondata/core/util/CarbonUtil.java
index 3fb54f0..2b1cd6e 100644
--- a/core/src/main/java/org/apache/carbondata/core/util/CarbonUtil.java
+++ b/core/src/main/java/org/apache/carbondata/core/util/CarbonUtil.java
@@ -2188,9 +2188,14 @@ public final class CarbonUtil {
     CarbonFile segment = FileFactory.getCarbonFile(path, configuration);
 
     CarbonFile[] dataFiles = segment.listFiles();
+    CarbonFile latestCarbonFile = null;
+    long latestDatafileTimestamp = 0L;
+    // get the latest carbondatafile to get the latest schema in the folder
     for (CarbonFile dataFile : dataFiles) {
-      if (dataFile.getName().endsWith(CarbonCommonConstants.FACT_FILE_EXT)) {
-        return dataFile.getAbsolutePath();
+      if (dataFile.getName().endsWith(CarbonCommonConstants.FACT_FILE_EXT)
+          && dataFile.getLastModifiedTime() > latestDatafileTimestamp) {
+        latestCarbonFile = dataFile;
+        latestDatafileTimestamp = dataFile.getLastModifiedTime();
       } else if (dataFile.isDirectory()) {
         // if the list has directories that doesn't contain data files,
         // continue checking other files/directories in the list.
@@ -2201,8 +2206,13 @@ public final class CarbonUtil {
         }
       }
     }
-    //returning null only if the path doesn't have data files.
-    return null;
+
+    if (latestCarbonFile != null) {
+      return latestCarbonFile.getAbsolutePath();
+    } else {
+      //returning null only if the path doesn't have data files.
+      return null;
+    }
   }
 
   /**
diff --git a/integration/presto/src/test/scala/org/apache/carbondata/presto/integrationtest/PrestoTestNonTransactionalTableFiles.scala b/integration/presto/src/test/scala/org/apache/carbondata/presto/integrationtest/PrestoTestNonTransactionalTableFiles.scala
index bdee4a1..97691d6 100644
--- a/integration/presto/src/test/scala/org/apache/carbondata/presto/integrationtest/PrestoTestNonTransactionalTableFiles.scala
+++ b/integration/presto/src/test/scala/org/apache/carbondata/presto/integrationtest/PrestoTestNonTransactionalTableFiles.scala
@@ -272,7 +272,7 @@ class PrestoTestNonTransactionalTableFiles extends FunSuiteLike with BeforeAndAf
           .executeQuery("select count(*) as RESULT from files ")
       }
     assert(exception.getMessage()
-      .contains("All the files doesn't have same schema"))
+      .contains("All common columns present in the files doesn't have same datatype"))
     cleanTestData()
   }
 
diff --git a/integration/spark-datasource/src/test/scala/org/apache/spark/sql/carbondata/datasource/SparkCarbonDataSourceTest.scala b/integration/spark-datasource/src/test/scala/org/apache/spark/sql/carbondata/datasource/SparkCarbonDataSourceTest.scala
index 329a250..fa37548 100644
--- a/integration/spark-datasource/src/test/scala/org/apache/spark/sql/carbondata/datasource/SparkCarbonDataSourceTest.scala
+++ b/integration/spark-datasource/src/test/scala/org/apache/spark/sql/carbondata/datasource/SparkCarbonDataSourceTest.scala
@@ -1225,12 +1225,8 @@ class SparkCarbonDataSourceTest extends FunSuite with BeforeAndAfterAll {
     assert(spark.sql("select * from sdkout").collect().length == 5)
     buildTestDataOtherDataType(5, null, warehouse1+"/sdk1", 2)
     spark.sql("refresh table sdkout")
-    intercept[Exception] {
-      spark.sql("select * from sdkout").show()
-    }
-    intercept[Exception] {
-      spark.sql("select * from sdkout where salary=100").show()
-    }
+    assert(spark.sql("select * from sdkout").count() == 10)
+    assert(spark.sql("select * from sdkout where salary=100").count() == 1)
     FileFactory.deleteAllFilesOfDir(new File(warehouse1+"/sdk1"))
   }
 


[carbondata] 06/41: [CARBONDATA-3299] Desc Formatted Issue Fixed

Posted by ra...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.5
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit eef1eb1b7c4f2756d3ce2350532cc669f80bce9b
Author: shivamasn <sh...@gmail.com>
AuthorDate: Fri Feb 15 18:48:29 2019 +0530

    [CARBONDATA-3299] Desc Formatted Issue Fixed
    
    After changing the carbon properties related to compaction, the changed value is not reflected in the DESC FORMATTED command.
    
    This closes #3127
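
For readers skimming the diff below, here is a minimal, self-contained Java sketch of the lookup order the fix applies: a table-level property wins, otherwise the currently configured carbon property, otherwise the hard-coded default. The key names and values are made up for illustration only; the real constants live in CarbonCommonConstants.

    import java.util.HashMap;
    import java.util.Map;

    public final class PropertyFallbackDemo {
      // Resolve in the order used by the fix: table property first, then the
      // currently configured system-level carbon property, then the default.
      static String resolve(Map<String, String> tblProps, Map<String, String> carbonProps,
          String tableKey, String systemKey, String defaultValue) {
        return tblProps.getOrDefault(tableKey, carbonProps.getOrDefault(systemKey, defaultValue));
      }

      public static void main(String[] args) {
        Map<String, String> tblProps = new HashMap<>();
        Map<String, String> carbonProps = new HashMap<>();
        carbonProps.put("carbon.major.compaction.size", "2048"); // changed at runtime
        // No table-level override, so DESC FORMATTED should now report 2048, not 1024.
        System.out.println(resolve(tblProps, carbonProps,
            "major_compaction_size", "carbon.major.compaction.size", "1024"));
      }
    }
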
---
 .../table/CarbonDescribeFormattedCommand.scala       | 20 +++++++++++++++-----
 1 file changed, 15 insertions(+), 5 deletions(-)

diff --git a/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/table/CarbonDescribeFormattedCommand.scala b/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/table/CarbonDescribeFormattedCommand.scala
index 1a1473b..7468ece 100644
--- a/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/table/CarbonDescribeFormattedCommand.scala
+++ b/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/table/CarbonDescribeFormattedCommand.scala
@@ -148,19 +148,29 @@ private[sql] case class CarbonDescribeFormattedCommand(
       ("## Compaction Information", "", ""),
       (CarbonCommonConstants.TABLE_MAJOR_COMPACTION_SIZE.toUpperCase,
         tblProps.getOrElse(CarbonCommonConstants.TABLE_MAJOR_COMPACTION_SIZE,
-        CarbonCommonConstants.DEFAULT_CARBON_MAJOR_COMPACTION_SIZE), ""),
+          CarbonProperties.getInstance()
+            .getProperty(CarbonCommonConstants.CARBON_MAJOR_COMPACTION_SIZE,
+              CarbonCommonConstants.DEFAULT_CARBON_MAJOR_COMPACTION_SIZE)), ""),
       (CarbonCommonConstants.TABLE_AUTO_LOAD_MERGE.toUpperCase,
         tblProps.getOrElse(CarbonCommonConstants.TABLE_AUTO_LOAD_MERGE,
-        CarbonCommonConstants.DEFAULT_ENABLE_AUTO_LOAD_MERGE), ""),
+          CarbonProperties.getInstance()
+            .getProperty(CarbonCommonConstants.ENABLE_AUTO_LOAD_MERGE,
+              CarbonCommonConstants.DEFAULT_ENABLE_AUTO_LOAD_MERGE)), ""),
       (CarbonCommonConstants.TABLE_COMPACTION_LEVEL_THRESHOLD.toUpperCase,
         tblProps.getOrElse(CarbonCommonConstants.TABLE_COMPACTION_LEVEL_THRESHOLD,
-        CarbonCommonConstants.DEFAULT_SEGMENT_LEVEL_THRESHOLD), ""),
+          CarbonProperties.getInstance()
+            .getProperty(CarbonCommonConstants.COMPACTION_SEGMENT_LEVEL_THRESHOLD,
+              CarbonCommonConstants.DEFAULT_SEGMENT_LEVEL_THRESHOLD)), ""),
       (CarbonCommonConstants.TABLE_COMPACTION_PRESERVE_SEGMENTS.toUpperCase,
         tblProps.getOrElse(CarbonCommonConstants.TABLE_COMPACTION_PRESERVE_SEGMENTS,
-        CarbonCommonConstants.DEFAULT_PRESERVE_LATEST_SEGMENTS_NUMBER), ""),
+          CarbonProperties.getInstance()
+            .getProperty(CarbonCommonConstants.PRESERVE_LATEST_SEGMENTS_NUMBER,
+              CarbonCommonConstants.DEFAULT_PRESERVE_LATEST_SEGMENTS_NUMBER)), ""),
       (CarbonCommonConstants.TABLE_ALLOWED_COMPACTION_DAYS.toUpperCase,
         tblProps.getOrElse(CarbonCommonConstants.TABLE_ALLOWED_COMPACTION_DAYS,
-        CarbonCommonConstants.DEFAULT_DAYS_ALLOWED_TO_COMPACT), "")
+          CarbonProperties.getInstance()
+            .getProperty(CarbonCommonConstants.DAYS_ALLOWED_TO_COMPACT,
+              CarbonCommonConstants.DEFAULT_DAYS_ALLOWED_TO_COMPACT)), "")
     )
 
     //////////////////////////////////////////////////////////////////////////////


[carbondata] 32/41: [CARBONDATA-3329] Fixed deadlock issue during failed query

Posted by ra...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.5
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit 236b5e1421e9da665471bc22da78fc94ef198e61
Author: kunal642 <ku...@gmail.com>
AuthorDate: Fri Mar 22 14:02:07 2019 +0530

    [CARBONDATA-3329] Fixed deadlock issue during failed query
    
    Problem: When a query fails, SparkExecuteStatementOperation.logError triggers a call to CarbonDatasourceHadoopRelation.toString, which tries to extract the carbon table from the relation. That path acquires a lock on HiveExternalCatalog and then on the logger. But SparkExecuteStatementOperation.logError has already acquired the lock on the logger and internally expects a lock on HiveExternalCatalog, which leads to a deadlock.
    
    This closes #3158
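
A minimal, self-contained sketch of the lock-order inversion described above; this is not CarbonData code, and the two plain monitors merely stand in for the logger lock and the HiveExternalCatalog lock. Started together, the two threads are expected to deadlock.

    public final class LockInversionDemo {
      private static final Object LOGGER_LOCK = new Object();
      private static final Object CATALOG_LOCK = new Object();

      public static void main(String[] args) {
        Thread queryFailurePath = new Thread(() -> {
          synchronized (LOGGER_LOCK) {      // logError(...) holds the logger lock
            sleep(100);
            synchronized (CATALOG_LOCK) {   // toString() then needs the catalog lock
              System.out.println("logged with table details");
            }
          }
        });
        Thread metadataPath = new Thread(() -> {
          synchronized (CATALOG_LOCK) {     // a catalog operation holds the catalog lock
            sleep(100);
            synchronized (LOGGER_LOCK) {    // and then tries to log
              System.out.println("catalog operation logged");
            }
          }
        });
        queryFailurePath.start();
        metadataPath.start();
      }

      private static void sleep(long millis) {
        try { Thread.sleep(millis); } catch (InterruptedException ignored) { }
      }
    }
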
---
 .../scala/org/apache/spark/sql/CarbonDatasourceHadoopRelation.scala | 6 ++----
 .../org/apache/spark/sql/execution/command/cache/CacheUtil.scala    | 6 ++++++
 2 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/integration/spark2/src/main/scala/org/apache/spark/sql/CarbonDatasourceHadoopRelation.scala b/integration/spark2/src/main/scala/org/apache/spark/sql/CarbonDatasourceHadoopRelation.scala
index 672508f..57dd356 100644
--- a/integration/spark2/src/main/scala/org/apache/spark/sql/CarbonDatasourceHadoopRelation.scala
+++ b/integration/spark2/src/main/scala/org/apache/spark/sql/CarbonDatasourceHadoopRelation.scala
@@ -56,8 +56,6 @@ case class CarbonDatasourceHadoopRelation(
     paths.head,
     CarbonEnv.getDatabaseName(caseInsensitiveMap.get("dbname"))(sparkSession),
     caseInsensitiveMap("tablename"))
-  lazy val databaseName: String = carbonTable.getDatabaseName
-  lazy val tableName: String = carbonTable.getTableName
   CarbonSession.updateSessionInfoToCurrentThread(sparkSession)
 
   @transient lazy val carbonRelation: CarbonRelation =
@@ -198,8 +196,8 @@ case class CarbonDatasourceHadoopRelation(
   override def unhandledFilters(filters: Array[Filter]): Array[Filter] = new Array[Filter](0)
 
   override def toString: String = {
-    "CarbonDatasourceHadoopRelation [ " + "Database name :" + databaseName +
-    ", " + "Table name :" + tableName + ", Schema :" + tableSchema + " ]"
+    "CarbonDatasourceHadoopRelation [ " + "Database name :" + identifier.getDatabaseName +
+    ", " + "Table name :" + identifier.getTableName + ", Schema :" + tableSchema + " ]"
   }
 
   override def sizeInBytes: Long = carbonRelation.sizeInBytes
diff --git a/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/cache/CacheUtil.scala b/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/cache/CacheUtil.scala
index 615d8e0..18402e9 100644
--- a/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/cache/CacheUtil.scala
+++ b/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/cache/CacheUtil.scala
@@ -44,6 +44,9 @@ object CacheUtil {
       CarbonDataMergerUtil.getValidSegmentList(absoluteTableIdentifier).asScala.flatMap {
         segment =>
           segment.getCommittedIndexFile.keySet().asScala
+      }.map { indexFile =>
+        indexFile.replace(CarbonCommonConstants.WINDOWS_FILE_SEPARATOR,
+          CarbonCommonConstants.FILE_SEPARATOR)
       }.toList
     } else {
       val tablePath = carbonTable.getTablePath
@@ -53,6 +56,9 @@ object CacheUtil {
         load =>
           val seg = new Segment(load.getLoadName, null, readCommittedScope)
           seg.getCommittedIndexFile.keySet().asScala
+      }.map { indexFile =>
+        indexFile.replace(CarbonCommonConstants.WINDOWS_FILE_SEPARATOR,
+          CarbonCommonConstants.FILE_SEPARATOR)
       }.toList
     }
   }


[carbondata] 20/41: [CARBONDATA-3315] Fix for Range Filter failing with two between clauses as children of OR expression

Posted by ra...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.5
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit d78eed4befc06b9793a2164eb7276032b9ff0a6c
Author: manishnalla1994 <ma...@gmail.com>
AuthorDate: Tue Mar 12 18:58:43 2019 +0530

    [CARBONDATA-3315] Fix for Range Filter failing with two between clauses as children of OR expression
    
    Problem: A range filter with an OR expression whose children are two BETWEEN clauses fails, because while evaluating the OR we combine both sub-trees into one expression and then evaluate it, which is wrong.
    
    Solution: Instead of combining them, treat the two children of the OR as separate expressions and evaluate them independently.
    
    This closes #3145
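
A tiny, self-contained illustration (not CarbonData code) of why collapsing the two BETWEEN children of an OR into a single range is incorrect in general; a disjoint pair of ranges makes the defect obvious.

    public final class OrRangeMergeDemo {
      public static void main(String[] args) {
        int[] values = {0, 1, 2, 3, 4, 5, 6, 7};
        // Predicate: c3 BETWEEN 1 AND 2 OR c3 BETWEEN 5 AND 6
        for (int v : values) {
          // Correct: evaluate each BETWEEN child separately and OR the results.
          boolean correct = (v >= 1 && v <= 2) || (v >= 5 && v <= 6);
          // Wrong: collapsing both children into one range [min, max] = [1, 6].
          boolean merged = v >= 1 && v <= 6;
          if (correct != merged) {
            System.out.println("value " + v + " is wrongly accepted by the merged range");
          }
        }
      }
    }
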
---
 .../scan/expression/RangeExpressionEvaluator.java  | 30 +++++++++++++----
 .../detailquery/RangeFilterTestCase.scala          | 38 ++++++++++++++++++++++
 2 files changed, 61 insertions(+), 7 deletions(-)

diff --git a/core/src/main/java/org/apache/carbondata/core/scan/expression/RangeExpressionEvaluator.java b/core/src/main/java/org/apache/carbondata/core/scan/expression/RangeExpressionEvaluator.java
index c47d5ff..f131c92 100644
--- a/core/src/main/java/org/apache/carbondata/core/scan/expression/RangeExpressionEvaluator.java
+++ b/core/src/main/java/org/apache/carbondata/core/scan/expression/RangeExpressionEvaluator.java
@@ -208,6 +208,13 @@ public class RangeExpressionEvaluator {
     return filterExpressionMap;
   }
 
+  private void evaluateOrExpression(Expression currentNode, Expression orExpChild) {
+    Map<String, List<FilterModificationNode>> filterExpressionMapNew =
+        new HashMap<>(CarbonCommonConstants.DEFAULT_COLLECTION_SIZE);
+    fillExpressionMap(filterExpressionMapNew, orExpChild, currentNode);
+    replaceWithRangeExpression(filterExpressionMapNew);
+    filterExpressionMapNew.clear();
+  }
 
   private void fillExpressionMap(Map<String, List<FilterModificationNode>> filterExpressionMap,
       Expression currentNode, Expression parentNode) {
@@ -222,13 +229,22 @@ public class RangeExpressionEvaluator {
         && eligibleForRangeExpConv(currentNode))) {
       addFilterExpressionMap(filterExpressionMap, currentNode, parentNode);
     }
-
-    for (Expression exp : currentNode.getChildren()) {
-      if (null != exp) {
-        fillExpressionMap(filterExpressionMap, exp, currentNode);
-        if (exp instanceof OrExpression) {
-          replaceWithRangeExpression(filterExpressionMap);
-          filterExpressionMap.clear();
+    // In case of Or Exp we have to evaluate both the subtrees of expression separately
+    // else it will combine the results of both the subtrees into one expression
+    // which wont give us correct result
+    if (currentNode instanceof OrExpression) {
+      Expression leftChild = ((OrExpression) currentNode).left;
+      Expression rightChild = ((OrExpression) currentNode).right;
+      if (null != leftChild) {
+        evaluateOrExpression(currentNode, leftChild);
+      }
+      if (null != rightChild) {
+        evaluateOrExpression(currentNode, rightChild);
+      }
+    } else {
+      for (Expression exp : currentNode.getChildren()) {
+        if (null != exp) {
+          fillExpressionMap(filterExpressionMap, exp, currentNode);
         }
       }
     }
diff --git a/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/detailquery/RangeFilterTestCase.scala b/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/detailquery/RangeFilterTestCase.scala
index cd7f7fc..e9a83ec 100644
--- a/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/detailquery/RangeFilterTestCase.scala
+++ b/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/detailquery/RangeFilterTestCase.scala
@@ -39,6 +39,20 @@ class RangeFilterTestCase extends QueryTest with BeforeAndAfterAll {
     sql("drop table if exists DICTIONARY_CARBON_6")
     sql("drop table if exists NO_DICTIONARY_CARBON_7")
     sql("drop table if exists NO_DICTIONARY_HIVE_8")
+    sql("drop table if exists carbontest")
+    sql("drop table if exists carbontest_hive")
+    sql(
+      "create table carbontest(c1 string, c2 string, c3 int) stored by 'carbondata' tblproperties" +
+      "('sort_columns'='c3')")
+    (0 to 10).foreach { index =>
+      sql(s"insert into carbontest select '$index','$index',$index")
+    }
+    sql(
+      "create table carbontest_hive(c1 string, c2 string, c3 int) row format delimited fields " +
+      "terminated by ',' tblproperties('sort_columns'='c3')")
+    (0 to 10).foreach { index =>
+      sql(s"insert into carbontest_hive select '$index','$index',$index")
+    }
 
     sql("CREATE TABLE NO_DICTIONARY_HIVE_1 (CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION " +
         "string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint," +
@@ -587,7 +601,31 @@ class RangeFilterTestCase extends QueryTest with BeforeAndAfterAll {
       sql("select empname from NO_DICTIONARY_HIVE_8 where empname <= '107'"))
   }
 
+  test("Range filter with two between clauses") {
+
+    checkAnswer(sql("select * from carbontest where c3 between 2 and 3 or c3 between 3 and 4"),
+      sql("select * from carbontest_hive where c3 between 2 and 3 or c3 between 3 and 4"))
+
+    checkAnswer(sql("select * from carbontest where c3 >= 2 and c3 <= 3 or c3 >= 3 and c3 <= 4"),
+      sql("select * from carbontest_hive where c3 >= 2 and c3 <= 3 or c3 >= 3 and c3 <= 4"))
+
+    checkAnswer(sql(
+      "select * from carbontest where (c3 between 2 and 3 or c3 between 3 and 4) and (c3 between " +
+      "2 and 4 or c3 between 4 and 5)"),
+      sql(
+        "select * from carbontest_hive where (c3 between 2 and 3 or c3 between 3 and 4) and (c3 " +
+        "between 2 and 4 or c3 between 4 and 5)"))
+
+    checkAnswer(sql(
+      "select * from carbontest where c3 >= 2 and c3 <= 5 and (c3 between 1 and 3 or c3 between 3" +
+      " and 6)"),
+      sql("select * from carbontest_hive where c3 >= 2 and c3 <= 5 and (c3 between 1 and 3 or c3 " +
+        "between 3 and 6)"))
+  }
+
   override def afterAll {
+    sql("drop table if exists carbontest")
+    sql("drop table if exists carbontest_hive")
     sql("drop table if exists filtertestTable")
     sql("drop table if exists NO_DICTIONARY_HIVE_1")
     sql("drop table if exists NO_DICTIONARY_CARBON_1")


[carbondata] 24/41: [TestCase][HOTFIX] Added drop database in beforeEach to avoid exception

Posted by ra...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.5
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit 43644738a86087020e59074eaa51f0939fa07678
Author: yatingudwani <ya...@gmail.com>
AuthorDate: Mon Mar 18 18:59:28 2019 +0530

    [TestCase][HOTFIX] Added drop database in beforeEach to avoid exception
    
    Problem: Sometimes, while running these test suites locally, some test
    cases fail with a "database already exists" exception.
    
    Solution: Add a drop database command in beforeEach so that each test
    case can create a fresh database.
    
    This closes #3152
---
 .../dblocation/DBLocationCarbonTableTestCase.scala | 26 +++++++---------------
 .../register/TestRegisterCarbonTable.scala         | 24 +++++++-------------
 2 files changed, 16 insertions(+), 34 deletions(-)

diff --git a/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/dblocation/DBLocationCarbonTableTestCase.scala b/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/dblocation/DBLocationCarbonTableTestCase.scala
index 7b80c72..50fb8c5 100644
--- a/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/dblocation/DBLocationCarbonTableTestCase.scala
+++ b/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/dblocation/DBLocationCarbonTableTestCase.scala
@@ -21,12 +21,12 @@ import org.apache.carbondata.core.datastore.impl.FileFactory
 import org.apache.carbondata.core.util.{CarbonProperties, CarbonUtil}
 import org.apache.spark.sql.{AnalysisException, CarbonEnv, Row}
 import org.apache.spark.sql.test.util.QueryTest
-import org.scalatest.BeforeAndAfterAll
+import org.scalatest.BeforeAndAfterEach
 
 /**
  *
  */
-class DBLocationCarbonTableTestCase extends QueryTest with BeforeAndAfterAll {
+class DBLocationCarbonTableTestCase extends QueryTest with BeforeAndAfterEach {
 
   def getMdtFileAndType() = {
     // if mdt file path is configured then take configured path else take default path
@@ -40,13 +40,14 @@ class DBLocationCarbonTableTestCase extends QueryTest with BeforeAndAfterAll {
 
   }
 
-  override def beforeAll {
+  override def beforeEach {
     sql("drop database if exists carbon cascade")
+    sql("drop database if exists carbon1 cascade")
+    sql("drop database if exists carbon2 cascade")
   }
 
   //TODO fix this test case
   test("Update operation on carbon table with insert into") {
-    sql("drop database if exists carbon2 cascade")
     sql(s"create database carbon2 location '$dblocation'")
     sql("use carbon2")
     sql("""create table carbontable (c1 string,c2 int,c3 string,c5 string) STORED BY 'org.apache.carbondata.format'""")
@@ -78,7 +79,6 @@ class DBLocationCarbonTableTestCase extends QueryTest with BeforeAndAfterAll {
   }
 
   test("create table and load data") {
-    sql("drop database if exists carbon cascade")
     sql(s"create database carbon location '$dblocation'")
     sql("use carbon")
     sql("""create table carbon.carbontable (c1 string,c2 int,c3 string,c5 string) STORED BY 'org.apache.carbondata.format'""")
@@ -87,7 +87,6 @@ class DBLocationCarbonTableTestCase extends QueryTest with BeforeAndAfterAll {
   }
 
   test("create table and insert data") {
-    sql("drop database if exists carbon cascade")
     sql(s"create database carbon location '$dblocation'")
     sql("use carbon")
     sql("""create table carbon.carbontable (c1 string,c2 int,c3 string,c5 string) STORED BY 'org.apache.carbondata.format'""")
@@ -97,7 +96,6 @@ class DBLocationCarbonTableTestCase extends QueryTest with BeforeAndAfterAll {
   }
 
   test("create table and 2 times data load") {
-    sql("drop database if exists carbon cascade")
     sql(s"create database carbon location '$dblocation'")
     sql("use carbon")
     sql("""create table carbon.carbontable (c1 string,c2 int,c3 string,c5 string) STORED BY 'org.apache.carbondata.format'""")
@@ -109,7 +107,6 @@ class DBLocationCarbonTableTestCase extends QueryTest with BeforeAndAfterAll {
 
 
   test("Update operation on carbon table") {
-    sql("drop database if exists carbon1 cascade")
     sql(s"create database carbon1 location '$dblocation'")
     sql("use carbon1")
     sql(
@@ -129,7 +126,6 @@ class DBLocationCarbonTableTestCase extends QueryTest with BeforeAndAfterAll {
   }
 
   test("Delete operation on carbon table") {
-    sql("drop database if exists carbon1 cascade")
     sql(s"create database carbon1 location '$dblocation'")
     sql("use carbon1")
     sql("""create table carbon1.carbontable (c1 string,c2 int,c3 string,c5 string) STORED BY 'org.apache.carbondata.format'""")
@@ -145,7 +141,6 @@ class DBLocationCarbonTableTestCase extends QueryTest with BeforeAndAfterAll {
   }
 
   test("Alter table add column test") {
-    sql("drop database if exists carbon cascade")
     sql(s"create database carbon location '$dblocation'")
     sql("use carbon")
     sql("""create table carbon.carbontable (c1 string,c2 int,c3 string,c5 string) STORED BY 'org.apache.carbondata.format'""")
@@ -161,7 +156,6 @@ class DBLocationCarbonTableTestCase extends QueryTest with BeforeAndAfterAll {
   }
 
   test("Alter table change column datatype test") {
-    sql("drop database if exists carbon cascade")
     sql(s"create database carbon location '$dblocation'")
     sql("use carbon")
     sql("""create table carbon.carbontable (c1 string,c2 int,c3 string,c5 string) STORED BY 'org.apache.carbondata.format'""")
@@ -176,7 +170,6 @@ class DBLocationCarbonTableTestCase extends QueryTest with BeforeAndAfterAll {
   }
 
   test("Alter table change dataType with sort column after adding measure column test"){
-    sql("drop database if exists carbon cascade")
     sql(s"create database carbon location '$dblocation'")
     sql("use carbon")
     sql(
@@ -196,7 +189,6 @@ class DBLocationCarbonTableTestCase extends QueryTest with BeforeAndAfterAll {
   }
 
   test("Alter table change dataType with sort column after adding date datatype with default value test"){
-    sql("drop database if exists carbon cascade")
     sql(s"create database carbon location '$dblocation'")
     sql("use carbon")
     sql(
@@ -216,7 +208,6 @@ class DBLocationCarbonTableTestCase extends QueryTest with BeforeAndAfterAll {
   }
 
   test("Alter table change dataType with sort column after adding dimension column with default value test"){
-    sql("drop database if exists carbon cascade")
     sql(s"create database carbon location '$dblocation'")
     sql("use carbon")
     sql(
@@ -236,7 +227,6 @@ class DBLocationCarbonTableTestCase extends QueryTest with BeforeAndAfterAll {
   }
 
   test("Alter table change dataType with sort column after rename test"){
-    sql("drop database if exists carbon cascade")
     sql(s"create database carbon location '$dblocation'")
     sql("use carbon")
     sql(
@@ -258,7 +248,6 @@ class DBLocationCarbonTableTestCase extends QueryTest with BeforeAndAfterAll {
   }
 
   test("Alter table drop column test") {
-    sql("drop database if exists carbon cascade")
     sql(s"create database carbon location '$dblocation'")
     sql("use carbon")
     sql("""create table carbon.carbontable (c1 string,c2 int,c3 string,c5 string) STORED BY 'org.apache.carbondata.format'""")
@@ -273,7 +262,6 @@ class DBLocationCarbonTableTestCase extends QueryTest with BeforeAndAfterAll {
   }
 
   test("test mdt file path with configured paths") {
-    sql("drop database if exists carbon cascade")
     sql(s"create database carbon location '$dblocation'")
     sql("use carbon")
     CarbonProperties.getInstance()
@@ -298,11 +286,13 @@ class DBLocationCarbonTableTestCase extends QueryTest with BeforeAndAfterAll {
            CarbonEnv.getInstance(sqlContext.sparkSession).carbonMetaStore.isReadFromHiveMetaStore)
   }
 
-  override def afterAll {
+  override def afterEach {
     CarbonProperties.getInstance()
       .addProperty(CarbonCommonConstants.CARBON_UPDATE_SYNC_FOLDER,
         CarbonCommonConstants.CARBON_UPDATE_SYNC_FOLDER_DEFAULT)
     sql("use default")
     sql("drop database if exists carbon cascade")
+    sql("drop database if exists carbon1 cascade")
+    sql("drop database if exists carbon2 cascade")
   }
 }
diff --git a/integration/spark2/src/test/scala/org/apache/spark/carbondata/register/TestRegisterCarbonTable.scala b/integration/spark2/src/test/scala/org/apache/spark/carbondata/register/TestRegisterCarbonTable.scala
index e4e7d92..ccdee35 100644
--- a/integration/spark2/src/test/scala/org/apache/spark/carbondata/register/TestRegisterCarbonTable.scala
+++ b/integration/spark2/src/test/scala/org/apache/spark/carbondata/register/TestRegisterCarbonTable.scala
@@ -21,7 +21,7 @@ import java.io.{File, IOException}
 import org.apache.commons.io.FileUtils
 import org.apache.spark.sql.test.util.QueryTest
 import org.apache.spark.sql.{AnalysisException, CarbonEnv, Row}
-import org.scalatest.BeforeAndAfterAll
+import org.scalatest.BeforeAndAfterEach
 
 import org.apache.carbondata.core.constants.CarbonCommonConstants
 import org.apache.carbondata.spark.exception.ProcessMetaDataException
@@ -29,10 +29,12 @@ import org.apache.carbondata.spark.exception.ProcessMetaDataException
 /**
  *
  */
-class TestRegisterCarbonTable extends QueryTest with BeforeAndAfterAll {
+class TestRegisterCarbonTable extends QueryTest with BeforeAndAfterEach {
 
-  override def beforeAll {
+  override def beforeEach {
     sql("drop database if exists carbon cascade")
+    sql("drop database if exists carbon1 cascade")
+    sql("drop database if exists carbon2 cascade")
   }
 
   def restoreData(dblocation: String, tableName: String) = {
@@ -60,7 +62,6 @@ class TestRegisterCarbonTable extends QueryTest with BeforeAndAfterAll {
   }
 
   test("register tables test") {
-    sql("drop database if exists carbon cascade")
     sql(s"create database carbon location '$dblocation'")
     sql("use carbon")
     sql("""create table carbon.carbontable (c1 string,c2 int,c3 string,c5 string) STORED BY 'org.apache.carbondata.format'""")
@@ -76,7 +77,6 @@ class TestRegisterCarbonTable extends QueryTest with BeforeAndAfterAll {
   }
 
   test("register table test") {
-    sql("drop database if exists carbon cascade")
     sql(s"create database carbon location '$dblocation'")
     sql("use carbon")
     sql("""create table carbon.carbontable (c1 string,c2 int,c3 string,c5 string) STORED BY 'org.apache.carbondata.format'""")
@@ -92,7 +92,6 @@ class TestRegisterCarbonTable extends QueryTest with BeforeAndAfterAll {
   }
 
    test("register pre aggregate tables test") {
-    sql("drop database if exists carbon cascade")
     sql(s"create database carbon location '$dblocation'")
     sql("use carbon")
     sql("""create table carbon.carbontable (c1 string,c2 int,c3 string,c5 string) STORED BY 'org.apache.carbondata.format'""")
@@ -115,7 +114,6 @@ class TestRegisterCarbonTable extends QueryTest with BeforeAndAfterAll {
   }
 
   test("register pre aggregate table test") {
-    sql("drop database if exists carbon cascade")
     sql(s"create database carbon location '$dblocation'")
     sql("use carbon")
     sql("""create table carbon.carbontable (c1 string,c2 int,c3 string,c5 string) STORED BY 'org.apache.carbondata.format'""")
@@ -138,7 +136,6 @@ class TestRegisterCarbonTable extends QueryTest with BeforeAndAfterAll {
   }
 
   test("register pre aggregate table should fail if the aggregate table not copied") {
-    sql("drop database if exists carbon cascade")
     sql(s"create database carbon location '$dblocation'")
     sql("use carbon")
     sql("""create table carbon.carbontable (c1 string,c2 int,c3 string,c5 string) STORED BY 'org.apache.carbondata.format'""")
@@ -159,8 +156,6 @@ class TestRegisterCarbonTable extends QueryTest with BeforeAndAfterAll {
   }
 
   test("Update operation on carbon table should pass after registration or refresh") {
-    sql("drop database if exists carbon cascade")
-    sql("drop database if exists carbon1 cascade")
     sql(s"create database carbon1 location '$dblocation'")
     sql("use carbon1")
     sql("drop table if exists carbontable")
@@ -183,7 +178,6 @@ class TestRegisterCarbonTable extends QueryTest with BeforeAndAfterAll {
   }
 
   test("Update operation on carbon table") {
-    sql("drop database if exists carbon1 cascade")
     sql(s"create database carbon1 location '$dblocation'")
     sql("use carbon1")
     sql(
@@ -208,7 +202,6 @@ class TestRegisterCarbonTable extends QueryTest with BeforeAndAfterAll {
   }
 
   test("Delete operation on carbon table") {
-    sql("drop database if exists carbon cascade")
     sql(s"create database carbon location '$dblocation'")
     sql("use carbon")
     sql("""create table carbon.carbontable (c1 string,c2 int,c3 string,c5 string) STORED BY 'org.apache.carbondata.format'""")
@@ -230,7 +223,6 @@ class TestRegisterCarbonTable extends QueryTest with BeforeAndAfterAll {
   }
 
   test("Alter table add column test") {
-    sql("drop database if exists carbon cascade")
     sql(s"create database carbon location '$dblocation'")
     sql("use carbon")
     sql("""create table carbon.carbontable (c1 string,c2 int,c3 string,c5 string) STORED BY 'org.apache.carbondata.format'""")
@@ -252,7 +244,6 @@ class TestRegisterCarbonTable extends QueryTest with BeforeAndAfterAll {
   }
 
   test("Alter table change column datatype test") {
-    sql("drop database if exists carbon cascade")
     sql(s"create database carbon location '$dblocation'")
     sql("use carbon")
     sql("""create table carbon.carbontable (c1 string,c2 int,c3 string,c5 string) STORED BY 'org.apache.carbondata.format'""")
@@ -273,7 +264,6 @@ class TestRegisterCarbonTable extends QueryTest with BeforeAndAfterAll {
   }
 
   test("Alter table drop column test") {
-    sql("drop database if exists carbon cascade")
     sql(s"create database carbon location '$dblocation'")
     sql("use carbon")
     sql("""create table carbon.carbontable (c1 string,c2 int,c3 string,c5 string) STORED BY 'org.apache.carbondata.format'""")
@@ -293,8 +283,10 @@ class TestRegisterCarbonTable extends QueryTest with BeforeAndAfterAll {
     }
   }
 
-  override def afterAll {
+  override def afterEach {
     sql("use default")
     sql("drop database if exists carbon cascade")
+    sql("drop database if exists carbon1 cascade")
+    sql("drop database if exists carbon2 cascade")
   }
 }


[carbondata] 04/41: [CARBONDATA-3107] Optimize error/exception coding for better debugging

Posted by ra...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.5
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit 459ad23576eb33010c801e6461b7dfec3f3434d2
Author: Manhua <ke...@qq.com>
AuthorDate: Tue Oct 30 09:44:59 2018 +0800

    [CARBONDATA-3107] Optimize error/exception coding for better debugging
    
    Some error logs in carbon contain only a single-line conclusion message
    (like "Dataload failed"); looking into the code, we may find a newly
    created exception that drops the original exception, so more work is
    needed to find the root cause.
    
    To better locate the root cause when carbon fails, this PR proposes to
    keep the original throwable for logging when wrapping it in another
    exception, and also to log the stack trace along with the error message.
    
       Changes in this PR follow these rules (`e` is an exception):
    
       | Code Sample | Problem | Suggest Modification |
       | --- | --- | --- |
       | `LOGGER.error(e);` | no stack trace(e is taken as message instead of throwable) | `LOGGER.error(e.getMessage(), e);` |
       | `LOGGER.error(e.getMessage());` | no stack trace | `LOGGER.error(e.getMessage(), e);` |
       | `catch ... throw new Exception("Error occur")` | useless message, no stack trace | `throw new Exception(e)` |
       | `catch ... throw new Exception(e.getMessage())` | no stack trace | `throw new Exception(e)` |
       | `catch ... throw new Exception(e.getMessage(), e)` | no need to call `getMessage()` | `throw new Exception(e)` |
       | `catch ... throw new Exception("Error occur: " + e.getMessage(), e)` | useless message | `throw new Exception(e)` |
       | `catch ... throw new Exception("DataLoad fail: " + e.getMessage())` | no stack trace | `throw new Exception("DataLoad fail: " + e.getMessage(), e)` |
    
       Some exceptions, such as MalformedCarbonCommandException,
       InterruptException, NumberFormatException, InvalidLoadOptionException
       and NoRetryException, do not have a constructor taking a `Throwable`,
       so this PR does not change them.
    
    This closes #2878
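
A condensed, self-contained sketch of the before/after patterns from the table above. It uses java.util.logging purely so the example runs standalone; the only thing that matters is a log overload that accepts both a message and a Throwable, which is what the diff below switches to.

    import java.io.IOException;
    import java.util.logging.Level;
    import java.util.logging.Logger;

    public final class ExceptionLoggingDemo {
      private static final Logger LOGGER = Logger.getLogger(ExceptionLoggingDemo.class.getName());

      static void load(String path) throws IOException {
        try {
          throw new IOException("simulated read failure on " + path);
        } catch (IOException e) {
          // Before: LOGGER.severe(e.getMessage());           -> message only, no stack trace
          // Before: throw new IOException("DataLoad fail");  -> original cause lost entirely
          LOGGER.log(Level.SEVERE, "DataLoad fail: " + e.getMessage(), e); // message plus stack trace
          throw new IOException("DataLoad fail: " + e.getMessage(), e);    // keep the cause chain
        }
      }

      public static void main(String[] args) {
        try {
          load("/tmp/data.csv");
        } catch (IOException e) {
          // The wrapped exception still carries the original IOException as its cause.
          System.out.println("root cause: " + e.getCause().getMessage());
        }
      }
    }
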
---
 .../cache/dictionary/ForwardDictionaryCache.java   |  2 +-
 .../cache/dictionary/ReverseDictionaryCache.java   |  2 +-
 .../core/datamap/DataMapStoreManager.java          |  2 +-
 .../carbondata/core/datamap/DataMapUtil.java       |  2 +-
 .../filesystem/AbstractDFSCarbonFile.java          | 14 ++++++-------
 .../datastore/filesystem/AlluxioCarbonFile.java    |  2 +-
 .../core/datastore/filesystem/HDFSCarbonFile.java  |  2 +-
 .../core/datastore/filesystem/LocalCarbonFile.java |  2 +-
 .../core/datastore/filesystem/S3CarbonFile.java    |  2 +-
 .../datastore/filesystem/ViewFSCarbonFile.java     |  2 +-
 .../core/datastore/impl/FileFactory.java           |  4 ++--
 .../client/NonSecureDictionaryClient.java          |  2 +-
 .../client/NonSecureDictionaryClientHandler.java   |  4 ++--
 .../generator/TableDictionaryGenerator.java        |  2 +-
 .../server/NonSecureDictionaryServerHandler.java   |  2 +-
 .../service/AbstractDictionaryServer.java          |  8 ++++----
 .../core/indexstore/BlockletDataMapIndexStore.java |  4 ++--
 .../core/indexstore/BlockletDetailInfo.java        |  2 +-
 .../timestamp/DateDirectDictionaryGenerator.java   |  4 ++--
 .../carbondata/core/locks/ZookeeperInit.java       |  2 +-
 .../core/metadata/schema/table/CarbonTable.java    |  2 +-
 .../carbondata/core/mutate/CarbonUpdateUtil.java   |  4 ++--
 .../core/reader/CarbonDeleteFilesDataReader.java   | 14 ++++++-------
 .../carbondata/core/scan/filter/FilterUtil.java    | 16 +++++++--------
 .../core/scan/result/BlockletScannedResult.java    |  6 +++---
 .../AbstractDetailQueryResultIterator.java         |  2 +-
 .../core/statusmanager/LoadMetadataDetails.java    |  6 +++---
 .../core/statusmanager/SegmentStatusManager.java   |  6 +++---
 .../carbondata/core/util/CarbonProperties.java     |  2 +-
 .../apache/carbondata/core/util/CarbonUtil.java    |  8 ++++----
 .../apache/carbondata/core/util/DataTypeUtil.java  |  8 ++++----
 .../core/util/ObjectSerializationUtil.java         |  4 ++--
 .../carbondata/core/util/path/HDFSLeaseUtils.java  |  2 +-
 .../datastore/filesystem/HDFSCarbonFileTest.java   |  2 +-
 .../core/load/LoadMetadataDetailsUnitTest.java     |  2 +-
 .../bloom/BloomCoarseGrainDataMapFactory.java      |  2 +-
 .../datamap/lucene/LuceneFineGrainDataMap.java     |  6 +++---
 .../lucene/LuceneFineGrainDataMapFactory.java      |  2 +-
 .../hive/CarbonDictionaryDecodeReadSupport.java    |  2 +-
 .../carbondata/hive/MapredCarbonInputFormat.java   |  2 +-
 .../carbondata/presto/impl/CarbonTableReader.java  | 12 +++++------
 .../client/SecureDictionaryClientHandler.java      |  4 ++--
 .../server/SecureDictionaryServerHandler.java      |  2 +-
 .../vectorreader/VectorizedCarbonRecordReader.java |  4 ++--
 .../processing/datamap/DataMapWriterListener.java  |  2 +-
 .../loading/AbstractDataLoadProcessorStep.java     |  2 +-
 .../loading/converter/impl/RowConverterImpl.java   |  2 +-
 .../loading/model/CarbonLoadModelBuilder.java      |  2 +-
 .../loading/parser/impl/JsonRowParser.java         |  2 +-
 .../sort/impl/ParallelReadMergeSorterImpl.java     |  2 +-
 ...ParallelReadMergeSorterWithColumnRangeImpl.java |  2 +-
 .../UnsafeBatchParallelReadMergeSorterImpl.java    |  4 ++--
 .../impl/UnsafeParallelReadMergeSorterImpl.java    |  2 +-
 ...ParallelReadMergeSorterWithColumnRangeImpl.java |  2 +-
 .../loading/sort/unsafe/UnsafeSortDataRows.java    | 14 ++++++-------
 .../holder/UnsafeSortTempFileChunkHolder.java      | 10 ++++-----
 .../unsafe/merger/UnsafeIntermediateMerger.java    |  2 +-
 .../UnsafeSingleThreadFinalSortFilesMerger.java    |  4 ++--
 .../CarbonRowDataWriterProcessorStepImpl.java      |  8 ++++----
 .../steps/DataWriterBatchProcessorStepImpl.java    |  4 ++--
 .../processing/merger/CarbonCompactionUtil.java    |  8 ++++----
 .../processing/merger/CarbonDataMergerUtil.java    |  4 ++--
 .../merger/CompactionResultSortProcessor.java      | 24 ++++++++++++----------
 .../merger/RowResultMergerProcessor.java           |  2 +-
 .../partition/spliter/RowResultProcessor.java      |  6 +++---
 .../sortdata/SingleThreadFinalSortFilesMerger.java |  4 ++--
 .../processing/sort/sortdata/SortDataRows.java     |  2 +-
 .../sort/sortdata/SortTempFileChunkHolder.java     | 10 ++++-----
 .../store/CarbonFactDataHandlerColumnar.java       | 12 +++++------
 .../processing/util/CarbonLoaderUtil.java          |  6 +++---
 .../apache/carbondata/store/LocalCarbonStore.java  |  4 ++--
 .../java/org/apache/carbondata/tool/CarbonCli.java |  4 ++--
 72 files changed, 169 insertions(+), 167 deletions(-)

diff --git a/core/src/main/java/org/apache/carbondata/core/cache/dictionary/ForwardDictionaryCache.java b/core/src/main/java/org/apache/carbondata/core/cache/dictionary/ForwardDictionaryCache.java
index dad6c8f..8b3d649 100644
--- a/core/src/main/java/org/apache/carbondata/core/cache/dictionary/ForwardDictionaryCache.java
+++ b/core/src/main/java/org/apache/carbondata/core/cache/dictionary/ForwardDictionaryCache.java
@@ -138,7 +138,7 @@ public class ForwardDictionaryCache<K extends
       executorService.shutdown();
       executorService.awaitTermination(2, TimeUnit.HOURS);
     } catch (InterruptedException e) {
-      LOGGER.error("Error loading the dictionary: " + e.getMessage());
+      LOGGER.error("Error loading the dictionary: " + e.getMessage(), e);
     }
     for (int i = 0; i < taskSubmitList.size(); i++) {
       try {
diff --git a/core/src/main/java/org/apache/carbondata/core/cache/dictionary/ReverseDictionaryCache.java b/core/src/main/java/org/apache/carbondata/core/cache/dictionary/ReverseDictionaryCache.java
index f40b611..fc4bbe5 100644
--- a/core/src/main/java/org/apache/carbondata/core/cache/dictionary/ReverseDictionaryCache.java
+++ b/core/src/main/java/org/apache/carbondata/core/cache/dictionary/ReverseDictionaryCache.java
@@ -115,7 +115,7 @@ public class ReverseDictionaryCache<K extends DictionaryColumnUniqueIdentifier,
       executorService.shutdown();
       executorService.awaitTermination(2, TimeUnit.HOURS);
     } catch (InterruptedException e) {
-      LOGGER.error("Error loading the dictionary: " + e.getMessage());
+      LOGGER.error("Error loading the dictionary: " + e.getMessage(), e);
     }
     for (int i = 0; i < taskSubmitList.size(); i++) {
       try {
diff --git a/core/src/main/java/org/apache/carbondata/core/datamap/DataMapStoreManager.java b/core/src/main/java/org/apache/carbondata/core/datamap/DataMapStoreManager.java
index c5cf55d..085d98a 100644
--- a/core/src/main/java/org/apache/carbondata/core/datamap/DataMapStoreManager.java
+++ b/core/src/main/java/org/apache/carbondata/core/datamap/DataMapStoreManager.java
@@ -512,7 +512,7 @@ public final class DataMapStoreManager {
             .buildFromTablePath(identifier.getTableName(), identifier.getDatabaseName(),
                 identifier.getTablePath(), identifier.getCarbonTableIdentifier().getTableId());
       } catch (IOException e) {
-        LOGGER.warn("failed to get carbon table from table Path" + e.getMessage());
+        LOGGER.warn("failed to get carbon table from table Path" + e.getMessage(), e);
         // ignoring exception
       }
     }
diff --git a/core/src/main/java/org/apache/carbondata/core/datamap/DataMapUtil.java b/core/src/main/java/org/apache/carbondata/core/datamap/DataMapUtil.java
index 138bd62..bea1cca 100644
--- a/core/src/main/java/org/apache/carbondata/core/datamap/DataMapUtil.java
+++ b/core/src/main/java/org/apache/carbondata/core/datamap/DataMapUtil.java
@@ -51,7 +51,7 @@ public class DataMapUtil {
     try {
       return Class.forName(className).getDeclaredConstructors()[0].newInstance();
     } catch (Exception e) {
-      LOGGER.error(e);
+      LOGGER.error(e.getMessage(), e);
       return null;
     }
   }
diff --git a/core/src/main/java/org/apache/carbondata/core/datastore/filesystem/AbstractDFSCarbonFile.java b/core/src/main/java/org/apache/carbondata/core/datastore/filesystem/AbstractDFSCarbonFile.java
index d56caac..a90648e 100644
--- a/core/src/main/java/org/apache/carbondata/core/datastore/filesystem/AbstractDFSCarbonFile.java
+++ b/core/src/main/java/org/apache/carbondata/core/datastore/filesystem/AbstractDFSCarbonFile.java
@@ -74,7 +74,7 @@ public abstract class AbstractDFSCarbonFile implements CarbonFile {
       fs = path.getFileSystem(this.hadoopConf);
       fileStatus = fs.getFileStatus(path);
     } catch (IOException e) {
-      LOGGER.debug("Exception occurred:" + e.getMessage());
+      LOGGER.debug("Exception occurred:" + e.getMessage(), e);
     }
   }
 
@@ -89,7 +89,7 @@ public abstract class AbstractDFSCarbonFile implements CarbonFile {
       fs = path.getFileSystem(this.hadoopConf);
       fileStatus = fs.getFileStatus(path);
     } catch (IOException e) {
-      LOGGER.debug("Exception occurred:" + e.getMessage());
+      LOGGER.debug("Exception occurred:" + e.getMessage(), e);
     }
   }
 
@@ -129,7 +129,7 @@ public abstract class AbstractDFSCarbonFile implements CarbonFile {
         return fs.exists(fileStatus.getPath());
       }
     } catch (IOException e) {
-      LOGGER.error("Exception occurred:" + e.getMessage());
+      LOGGER.error("Exception occurred:" + e.getMessage(), e);
     }
     return false;
   }
@@ -154,7 +154,7 @@ public abstract class AbstractDFSCarbonFile implements CarbonFile {
         return fs.rename(fileStatus.getPath(), new Path(changeToName));
       }
     } catch (IOException e) {
-      LOGGER.error("Exception occurred:" + e.getMessage());
+      LOGGER.error("Exception occurred:" + e.getMessage(), e);
       return false;
     }
     return false;
@@ -168,7 +168,7 @@ public abstract class AbstractDFSCarbonFile implements CarbonFile {
         return fs.delete(fileStatus.getPath(), true);
       }
     } catch (IOException e) {
-      LOGGER.error("Exception occurred:" + e.getMessage());
+      LOGGER.error("Exception occurred:" + e.getMessage(), e);
       return false;
     }
     return false;
@@ -238,7 +238,7 @@ public abstract class AbstractDFSCarbonFile implements CarbonFile {
       tempFile.renameForce(fileName);
       fileTruncatedSuccessfully = true;
     } catch (IOException e) {
-      LOGGER.error("Exception occurred while truncating the file " + e.getMessage());
+      LOGGER.error("Exception occurred while truncating the file " + e.getMessage(), e);
     } finally {
       CarbonUtil.closeStreams(dataOutputStream, dataInputStream);
     }
@@ -506,7 +506,7 @@ public abstract class AbstractDFSCarbonFile implements CarbonFile {
         return new CarbonFile[0];
       }
     } catch (IOException e) {
-      LOGGER.error("Exception occured: " + e.getMessage());
+      LOGGER.error("Exception occured: " + e.getMessage(), e);
       return new CarbonFile[0];
     }
     return getFiles(listStatus);
diff --git a/core/src/main/java/org/apache/carbondata/core/datastore/filesystem/AlluxioCarbonFile.java b/core/src/main/java/org/apache/carbondata/core/datastore/filesystem/AlluxioCarbonFile.java
index 216af53..affb469 100644
--- a/core/src/main/java/org/apache/carbondata/core/datastore/filesystem/AlluxioCarbonFile.java
+++ b/core/src/main/java/org/apache/carbondata/core/datastore/filesystem/AlluxioCarbonFile.java
@@ -136,7 +136,7 @@ public class AlluxioCarbonFile extends HDFSCarbonFile {
       }
       return false;
     } catch (IOException e) {
-      LOGGER.error("Exception occured: " + e.getMessage());
+      LOGGER.error("Exception occured: " + e.getMessage(), e);
       return false;
     }
   }
diff --git a/core/src/main/java/org/apache/carbondata/core/datastore/filesystem/HDFSCarbonFile.java b/core/src/main/java/org/apache/carbondata/core/datastore/filesystem/HDFSCarbonFile.java
index 35b0f0f..306d8f6 100644
--- a/core/src/main/java/org/apache/carbondata/core/datastore/filesystem/HDFSCarbonFile.java
+++ b/core/src/main/java/org/apache/carbondata/core/datastore/filesystem/HDFSCarbonFile.java
@@ -111,7 +111,7 @@ public class HDFSCarbonFile extends AbstractDFSCarbonFile {
         return fs.rename(fileStatus.getPath(), new Path(changeToName));
       }
     } catch (IOException e) {
-      LOGGER.error("Exception occured: " + e.getMessage());
+      LOGGER.error("Exception occured: " + e.getMessage(), e);
       return false;
     }
   }
diff --git a/core/src/main/java/org/apache/carbondata/core/datastore/filesystem/LocalCarbonFile.java b/core/src/main/java/org/apache/carbondata/core/datastore/filesystem/LocalCarbonFile.java
index 2cace55..6f55586 100644
--- a/core/src/main/java/org/apache/carbondata/core/datastore/filesystem/LocalCarbonFile.java
+++ b/core/src/main/java/org/apache/carbondata/core/datastore/filesystem/LocalCarbonFile.java
@@ -254,7 +254,7 @@ public class LocalCarbonFile implements CarbonFile {
       tempFile.renameForce(fileName);
       fileTruncatedSuccessfully = true;
     } catch (IOException e) {
-      LOGGER.error("Exception occured while truncating the file " + e.getMessage());
+      LOGGER.error("Exception occured while truncating the file " + e.getMessage(), e);
     } finally {
       CarbonUtil.closeStreams(source, destination);
     }
diff --git a/core/src/main/java/org/apache/carbondata/core/datastore/filesystem/S3CarbonFile.java b/core/src/main/java/org/apache/carbondata/core/datastore/filesystem/S3CarbonFile.java
index ee67097..ffbe2d8 100644
--- a/core/src/main/java/org/apache/carbondata/core/datastore/filesystem/S3CarbonFile.java
+++ b/core/src/main/java/org/apache/carbondata/core/datastore/filesystem/S3CarbonFile.java
@@ -66,7 +66,7 @@ public class S3CarbonFile extends HDFSCarbonFile {
       fs = fileStatus.getPath().getFileSystem(hadoopConf);
       return fs.rename(fileStatus.getPath(), new Path(changeToName));
     } catch (IOException e) {
-      LOGGER.error("Exception occured: " + e.getMessage());
+      LOGGER.error("Exception occured: " + e.getMessage(), e);
       return false;
     }
   }
diff --git a/core/src/main/java/org/apache/carbondata/core/datastore/filesystem/ViewFSCarbonFile.java b/core/src/main/java/org/apache/carbondata/core/datastore/filesystem/ViewFSCarbonFile.java
index 84a5abd..c55f85c 100644
--- a/core/src/main/java/org/apache/carbondata/core/datastore/filesystem/ViewFSCarbonFile.java
+++ b/core/src/main/java/org/apache/carbondata/core/datastore/filesystem/ViewFSCarbonFile.java
@@ -101,7 +101,7 @@ public class ViewFSCarbonFile extends AbstractDFSCarbonFile {
         return false;
       }
     } catch (IOException e) {
-      LOGGER.error("Exception occured" + e.getMessage());
+      LOGGER.error("Exception occured" + e.getMessage(), e);
       return false;
     }
   }
diff --git a/core/src/main/java/org/apache/carbondata/core/datastore/impl/FileFactory.java b/core/src/main/java/org/apache/carbondata/core/datastore/impl/FileFactory.java
index e951f58..7dbbe2a 100644
--- a/core/src/main/java/org/apache/carbondata/core/datastore/impl/FileFactory.java
+++ b/core/src/main/java/org/apache/carbondata/core/datastore/impl/FileFactory.java
@@ -335,7 +335,7 @@ public final class FileFactory {
           CarbonFile carbonFile = FileFactory.getCarbonFile(path, fileType);
           carbonFile.truncate(path, newSize);
         } catch (Exception e) {
-          LOGGER.error("Other exception occurred while truncating the file " + e.getMessage());
+          LOGGER.error("Other exception occurred while truncating the file " + e.getMessage(), e);
         }
         return;
       default:
@@ -505,7 +505,7 @@ public final class FileFactory {
             fs.setPermission(path, permission);
           }
         } catch (IOException e) {
-          LOGGER.error("Exception occurred : " + e.getMessage());
+          LOGGER.error("Exception occurred : " + e.getMessage(), e);
           throw e;
         }
         return;
diff --git a/core/src/main/java/org/apache/carbondata/core/dictionary/client/NonSecureDictionaryClient.java b/core/src/main/java/org/apache/carbondata/core/dictionary/client/NonSecureDictionaryClient.java
index d5c2072..60872c1 100644
--- a/core/src/main/java/org/apache/carbondata/core/dictionary/client/NonSecureDictionaryClient.java
+++ b/core/src/main/java/org/apache/carbondata/core/dictionary/client/NonSecureDictionaryClient.java
@@ -91,7 +91,7 @@ public class NonSecureDictionaryClient implements DictionaryClient {
     try {
       workerGroup.terminationFuture().sync();
     } catch (InterruptedException e) {
-      LOGGER.error(e);
+      LOGGER.error(e.getMessage(), e);
     }
   }
 }
diff --git a/core/src/main/java/org/apache/carbondata/core/dictionary/client/NonSecureDictionaryClientHandler.java b/core/src/main/java/org/apache/carbondata/core/dictionary/client/NonSecureDictionaryClientHandler.java
index 17e9c7c..8f44da6 100644
--- a/core/src/main/java/org/apache/carbondata/core/dictionary/client/NonSecureDictionaryClientHandler.java
+++ b/core/src/main/java/org/apache/carbondata/core/dictionary/client/NonSecureDictionaryClientHandler.java
@@ -61,7 +61,7 @@ public class NonSecureDictionaryClientHandler extends ChannelInboundHandlerAdapt
       data.release();
       responseMsgQueue.add(key);
     } catch (Exception e) {
-      LOGGER.error(e);
+      LOGGER.error(e.getMessage(), e);
       throw e;
     }
   }
@@ -102,7 +102,7 @@ public class NonSecureDictionaryClientHandler extends ChannelInboundHandlerAdapt
       }
       return dictionaryMessage;
     } catch (Exception e) {
-      LOGGER.error(e);
+      LOGGER.error(e.getMessage(), e);
       throw new RuntimeException(e);
     }
   }
diff --git a/core/src/main/java/org/apache/carbondata/core/dictionary/generator/TableDictionaryGenerator.java b/core/src/main/java/org/apache/carbondata/core/dictionary/generator/TableDictionaryGenerator.java
index 003ab5a..461d34a 100644
--- a/core/src/main/java/org/apache/carbondata/core/dictionary/generator/TableDictionaryGenerator.java
+++ b/core/src/main/java/org/apache/carbondata/core/dictionary/generator/TableDictionaryGenerator.java
@@ -89,7 +89,7 @@ public class TableDictionaryGenerator
       executorService.shutdown();
       executorService.awaitTermination(1, TimeUnit.HOURS);
     } catch (InterruptedException e) {
-      LOGGER.error("Error loading the dictionary: " + e.getMessage());
+      LOGGER.error("Error loading the dictionary: " + e.getMessage(), e);
     }
     LOGGER.info("Total time taken to write dictionary file is: " +
             (System.currentTimeMillis() - start));
diff --git a/core/src/main/java/org/apache/carbondata/core/dictionary/server/NonSecureDictionaryServerHandler.java b/core/src/main/java/org/apache/carbondata/core/dictionary/server/NonSecureDictionaryServerHandler.java
index 0f076a4..e261910 100644
--- a/core/src/main/java/org/apache/carbondata/core/dictionary/server/NonSecureDictionaryServerHandler.java
+++ b/core/src/main/java/org/apache/carbondata/core/dictionary/server/NonSecureDictionaryServerHandler.java
@@ -65,7 +65,7 @@ import org.apache.log4j.Logger;
       key.writeData(buffer);
       ctx.writeAndFlush(buffer);
     } catch (Exception e) {
-      LOGGER.error(e);
+      LOGGER.error(e.getMessage(), e);
       throw e;
     }
   }
diff --git a/core/src/main/java/org/apache/carbondata/core/dictionary/service/AbstractDictionaryServer.java b/core/src/main/java/org/apache/carbondata/core/dictionary/service/AbstractDictionaryServer.java
index 5703051..f548646 100644
--- a/core/src/main/java/org/apache/carbondata/core/dictionary/service/AbstractDictionaryServer.java
+++ b/core/src/main/java/org/apache/carbondata/core/dictionary/service/AbstractDictionaryServer.java
@@ -78,11 +78,11 @@ public abstract class AbstractDictionaryServer {
         return address.getHostAddress();
       }
     } catch (UnknownHostException e) {
-      LOGGER.error("do not get local host address:" + e.getMessage());
-      throw new RuntimeException(e.getMessage());
+      LOGGER.error("do not get local host address:" + e.getMessage(), e);
+      throw new RuntimeException(e);
     } catch (SocketException e) {
-      LOGGER.error("do not get net work interface:" + e.getMessage());
-      throw new RuntimeException(e.getMessage());
+      LOGGER.error("do not get net work interface:" + e.getMessage(), e);
+      throw new RuntimeException(e);
     }
   }
 }
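
The hunks above share one pattern: the caught throwable is passed as the last argument to LOGGER.error, so log4j records the full stack trace instead of only the text returned by e.getMessage(). A minimal sketch of the difference, assuming a plain log4j 1.x Logger; the HostResolverExample class below is illustrative only and is not part of the CarbonData source:

    import java.net.InetAddress;
    import java.net.UnknownHostException;

    import org.apache.log4j.Logger;

    public final class HostResolverExample {
      private static final Logger LOGGER = Logger.getLogger(HostResolverExample.class);

      public static String resolveLocalHost() {
        try {
          return InetAddress.getLocalHost().getHostAddress();
        } catch (UnknownHostException e) {
          // Before: only the message text is logged; the stack trace is lost.
          //   LOGGER.error("do not get local host address:" + e.getMessage());
          // After: the throwable is passed separately, so the appender also
          // prints the full stack trace of the UnknownHostException.
          LOGGER.error("do not get local host address:" + e.getMessage(), e);
          throw new RuntimeException(e);
        }
      }
    }

With the two-argument form a single log entry carries both the human-readable context and the complete trace, which is what makes the later debugging easier.
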
diff --git a/core/src/main/java/org/apache/carbondata/core/indexstore/BlockletDataMapIndexStore.java b/core/src/main/java/org/apache/carbondata/core/indexstore/BlockletDataMapIndexStore.java
index 0f64086..a9667a8 100644
--- a/core/src/main/java/org/apache/carbondata/core/indexstore/BlockletDataMapIndexStore.java
+++ b/core/src/main/java/org/apache/carbondata/core/indexstore/BlockletDataMapIndexStore.java
@@ -148,8 +148,8 @@ public class BlockletDataMapIndexStore
         for (DataMap dataMap : dataMaps) {
           dataMap.clear();
         }
-        LOGGER.error("memory exception when loading datamap: " + e.getMessage());
-        throw new RuntimeException(e.getMessage(), e);
+        LOGGER.error("memory exception when loading datamap: " + e.getMessage(), e);
+        throw new RuntimeException(e);
       }
     }
     return blockletDataMapIndexWrapper;
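
A second pattern in this hunk, and in several that follow, is to rethrow with the original exception as the cause rather than with only its message: new RuntimeException(e) instead of new RuntimeException(e.getMessage()), which would drop the cause, or new RuntimeException(e.getMessage(), e), which only duplicates the message. A small self-contained sketch, with a hypothetical loadIndex method standing in for the real datamap-loading code:

    import java.io.IOException;

    public final class CauseChainingExample {

      // Hypothetical stand-in for the real index-loading code.
      static void loadIndex() throws IOException {
        throw new IOException("unable to read index file");
      }

      public static void main(String[] args) {
        try {
          loadIndex();
        } catch (IOException e) {
          // Wrapping only the message would discard the original stack trace:
          //   throw new RuntimeException(e.getMessage());
          // Wrapping the exception keeps it, so the printed trace ends with
          //   Caused by: java.io.IOException: unable to read index file
          throw new RuntimeException(e);
        }
      }
    }

Keeping the cause chain means any handler or log appender further up the stack can still report the root failure, not just the wrapper.
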
diff --git a/core/src/main/java/org/apache/carbondata/core/indexstore/BlockletDetailInfo.java b/core/src/main/java/org/apache/carbondata/core/indexstore/BlockletDetailInfo.java
index 9ce932c..a5aa899 100644
--- a/core/src/main/java/org/apache/carbondata/core/indexstore/BlockletDetailInfo.java
+++ b/core/src/main/java/org/apache/carbondata/core/indexstore/BlockletDetailInfo.java
@@ -129,7 +129,7 @@ public class BlockletDetailInfo implements Serializable, Writable {
         blockletInfo.readFields(inputStream);
       } catch (IOException e) {
         LOGGER.error("Problem in reading blocklet info", e);
-        throw new IOException("Problem in reading blocklet info." + e.getMessage());
+        throw new IOException("Problem in reading blocklet info." + e.getMessage(), e);
       } finally {
         try {
           inputStream.close();
diff --git a/core/src/main/java/org/apache/carbondata/core/keygenerator/directdictionary/timestamp/DateDirectDictionaryGenerator.java b/core/src/main/java/org/apache/carbondata/core/keygenerator/directdictionary/timestamp/DateDirectDictionaryGenerator.java
index 67d70e3..3a134c3 100644
--- a/core/src/main/java/org/apache/carbondata/core/keygenerator/directdictionary/timestamp/DateDirectDictionaryGenerator.java
+++ b/core/src/main/java/org/apache/carbondata/core/keygenerator/directdictionary/timestamp/DateDirectDictionaryGenerator.java
@@ -153,8 +153,8 @@ public class DateDirectDictionaryGenerator implements DirectDictionaryGenerator
       timeValue = Long.parseLong(memberStr) / 1000;
     } catch (NumberFormatException e) {
       if (LOGGER.isDebugEnabled()) {
-        LOGGER.debug(
-            "Cannot convert value to Long type value. Value considered as null." + e.getMessage());
+        LOGGER.debug("Cannot convert value to Long type value. Value considered as null."
+            + e.getMessage(), e);
       }
     }
     if (timeValue == -1) {
diff --git a/core/src/main/java/org/apache/carbondata/core/locks/ZookeeperInit.java b/core/src/main/java/org/apache/carbondata/core/locks/ZookeeperInit.java
index 5e59593..f227cf9 100644
--- a/core/src/main/java/org/apache/carbondata/core/locks/ZookeeperInit.java
+++ b/core/src/main/java/org/apache/carbondata/core/locks/ZookeeperInit.java
@@ -47,7 +47,7 @@ public class ZookeeperInit {
       zk = new ZooKeeper(zooKeeperUrl, sessionTimeOut, new DummyWatcher());
 
     } catch (IOException e) {
-      LOGGER.error(e.getMessage());
+      LOGGER.error(e.getMessage(), e);
     }
 
   }
diff --git a/core/src/main/java/org/apache/carbondata/core/metadata/schema/table/CarbonTable.java b/core/src/main/java/org/apache/carbondata/core/metadata/schema/table/CarbonTable.java
index 8ed781a..3623147 100644
--- a/core/src/main/java/org/apache/carbondata/core/metadata/schema/table/CarbonTable.java
+++ b/core/src/main/java/org/apache/carbondata/core/metadata/schema/table/CarbonTable.java
@@ -1133,7 +1133,7 @@ public class CarbonTable implements Serializable {
     } catch (Exception e) {
       // since method returns true or false and based on that calling function throws exception, no
       // need to throw the catched exception
-      LOGGER.error(e.getMessage());
+      LOGGER.error(e.getMessage(), e);
       return true;
     }
     return true;
diff --git a/core/src/main/java/org/apache/carbondata/core/mutate/CarbonUpdateUtil.java b/core/src/main/java/org/apache/carbondata/core/mutate/CarbonUpdateUtil.java
index 3924c0d..bd8c465 100644
--- a/core/src/main/java/org/apache/carbondata/core/mutate/CarbonUpdateUtil.java
+++ b/core/src/main/java/org/apache/carbondata/core/mutate/CarbonUpdateUtil.java
@@ -691,9 +691,9 @@ public class CarbonUpdateUtil {
         CarbonUtil.deleteFoldersAndFiles(invalidFile);
         return true;
       } catch (IOException e) {
-        LOGGER.error("error in clean up of compacted files." + e.getMessage());
+        LOGGER.error("error in clean up of compacted files." + e.getMessage(), e);
       } catch (InterruptedException e) {
-        LOGGER.error("error in clean up of compacted files." + e.getMessage());
+        LOGGER.error("error in clean up of compacted files." + e.getMessage(), e);
       }
     }
     return false;
diff --git a/core/src/main/java/org/apache/carbondata/core/reader/CarbonDeleteFilesDataReader.java b/core/src/main/java/org/apache/carbondata/core/reader/CarbonDeleteFilesDataReader.java
index ee87a75..f16433c 100644
--- a/core/src/main/java/org/apache/carbondata/core/reader/CarbonDeleteFilesDataReader.java
+++ b/core/src/main/java/org/apache/carbondata/core/reader/CarbonDeleteFilesDataReader.java
@@ -91,7 +91,7 @@ public class CarbonDeleteFilesDataReader {
       executorService.shutdown();
       executorService.awaitTermination(30, TimeUnit.MINUTES);
     } catch (InterruptedException e) {
-      LOGGER.error("Error while reading the delete delta files : " + e.getMessage());
+      LOGGER.error("Error while reading the delete delta files : " + e.getMessage(), e);
     }
 
     Map<Integer, Integer[]> pageIdDeleteRowsMap =
@@ -109,8 +109,8 @@ public class CarbonDeleteFilesDataReader {
         }
 
       } catch (Throwable e) {
-        LOGGER.error(e.getMessage());
-        throw new Exception(e.getMessage());
+        LOGGER.error(e.getMessage(), e);
+        throw new Exception(e);
       }
     }
     return pageIdDeleteRowsMap;
@@ -134,7 +134,7 @@ public class CarbonDeleteFilesDataReader {
       executorService.shutdown();
       executorService.awaitTermination(30, TimeUnit.MINUTES);
     } catch (InterruptedException e) {
-      LOGGER.error("Error while reading the delete delta files : " + e.getMessage());
+      LOGGER.error("Error while reading the delete delta files : " + e.getMessage(), e);
     }
     Map<String, DeleteDeltaVo> pageIdToBlockLetVo = new HashMap<>();
     List<DeleteDeltaBlockletDetails> blockletDetails = null;
@@ -175,7 +175,7 @@ public class CarbonDeleteFilesDataReader {
       executorService.shutdown();
       executorService.awaitTermination(30, TimeUnit.MINUTES);
     } catch (InterruptedException e) {
-      LOGGER.error("Error while reading the delete delta files : " + e.getMessage());
+      LOGGER.error("Error while reading the delete delta files : " + e.getMessage(), e);
     }
 
     // Get a new DeleteDeltaBlockDetails as result set where all the data will me merged
@@ -190,8 +190,8 @@ public class CarbonDeleteFilesDataReader {
           deleteDeltaResultSet.addBlockletDetails(blocklet);
         }
       } catch (Throwable e) {
-        LOGGER.error(e.getMessage());
-        throw new Exception(e.getMessage());
+        LOGGER.error(e.getMessage(), e);
+        throw new Exception(e);
       }
     }
     return deleteDeltaResultSet;
diff --git a/core/src/main/java/org/apache/carbondata/core/scan/filter/FilterUtil.java b/core/src/main/java/org/apache/carbondata/core/scan/filter/FilterUtil.java
index 15b8cba..6cc13e2 100644
--- a/core/src/main/java/org/apache/carbondata/core/scan/filter/FilterUtil.java
+++ b/core/src/main/java/org/apache/carbondata/core/scan/filter/FilterUtil.java
@@ -940,7 +940,7 @@ public final class FilterUtil {
         columnFilterInfo.setFilterList(filterValuesList);
       }
     } catch (FilterIllegalMemberException e) {
-      LOGGER.error(e.getMessage());
+      LOGGER.error(e.getMessage(), e);
     }
     return columnFilterInfo;
   }
@@ -980,7 +980,7 @@ public final class FilterUtil {
         }
       }
     } catch (FilterIllegalMemberException e) {
-      LOGGER.error(e.getMessage());
+      LOGGER.error(e.getMessage(), e);
     }
 
     if (null == defaultValues) {
@@ -1020,7 +1020,7 @@ public final class FilterUtil {
               break;
             }
           } catch (KeyGenException e) {
-            LOGGER.error(e.getMessage());
+            LOGGER.error(e.getMessage(), e);
           }
         }
       }
@@ -1095,7 +1095,7 @@ public final class FilterUtil {
       keys[carbonDimension.getKeyOrdinal()] = surrogate;
       maskedKey = getMaskedKey(rangesForMaskedByte, blockLevelKeyGenerator.generateKey(keys));
     } catch (KeyGenException e) {
-      LOGGER.error(e.getMessage());
+      LOGGER.error(e.getMessage(), e);
     }
     return maskedKey;
   }
@@ -1438,7 +1438,7 @@ public final class FilterUtil {
       indexKey =
           new IndexKey(keyGenerator.generateKey(startOrEndKey), startOrEndKeyForNoDictDimension);
     } catch (KeyGenException e) {
-      LOGGER.error(e.getMessage());
+      LOGGER.error(e.getMessage(), e);
     }
     return indexKey;
   }
@@ -2124,7 +2124,7 @@ public final class FilterUtil {
             dummy[0] = i;
             encodedFilters.add(keyGenerator.generateKey(dummy));
           } catch (KeyGenException e) {
-            LOGGER.error(e);
+            LOGGER.error(e.getMessage(), e);
           }
           break;
         }
@@ -2215,7 +2215,7 @@ public final class FilterUtil {
           encodedFilterValues.add(keyGenerator.generateKey(dummy));
         }
       } catch (KeyGenException e) {
-        LOGGER.error(e);
+        LOGGER.error(e.getMessage(), e);
       }
       return encodedFilterValues.toArray(new byte[encodedFilterValues.size()][]);
     } else {
@@ -2227,7 +2227,7 @@ public final class FilterUtil {
           }
         }
       } catch (KeyGenException e) {
-        LOGGER.error(e);
+        LOGGER.error(e.getMessage(), e);
       }
     }
     return getSortedEncodedFilters(encodedFilterValues);
diff --git a/core/src/main/java/org/apache/carbondata/core/scan/result/BlockletScannedResult.java b/core/src/main/java/org/apache/carbondata/core/scan/result/BlockletScannedResult.java
index 4ec7b38..ad4d2b3 100644
--- a/core/src/main/java/org/apache/carbondata/core/scan/result/BlockletScannedResult.java
+++ b/core/src/main/java/org/apache/carbondata/core/scan/result/BlockletScannedResult.java
@@ -303,7 +303,7 @@ public abstract class BlockletScannedResult {
           reuseableDataOutput.reset();
         } catch (IOException e) {
           isExceptionThrown = true;
-          LOGGER.error(e);
+          LOGGER.error(e.getMessage(), e);
         } finally {
           if (isExceptionThrown) {
             CarbonUtil.closeStreams(reuseableDataOutput);
@@ -574,7 +574,7 @@ public abstract class BlockletScannedResult {
           reUseableDataOutput.reset();
         } catch (IOException e) {
           isExceptionThrown = true;
-          LOGGER.error(e);
+          LOGGER.error(e.getMessage(), e);
         } finally {
           if (isExceptionThrown) {
             CarbonUtil.closeStreams(reUseableDataOutput);
@@ -639,7 +639,7 @@ public abstract class BlockletScannedResult {
         reUsableDataOutput.reset();
       } catch (IOException e) {
         isExceptionThrown = true;
-        LOGGER.error(e);
+        LOGGER.error(e.getMessage(), e);
       } finally {
         if (isExceptionThrown) {
           CarbonUtil.closeStreams(reUsableDataOutput);
diff --git a/core/src/main/java/org/apache/carbondata/core/scan/result/iterator/AbstractDetailQueryResultIterator.java b/core/src/main/java/org/apache/carbondata/core/scan/result/iterator/AbstractDetailQueryResultIterator.java
index 9282d44..f39e549 100644
--- a/core/src/main/java/org/apache/carbondata/core/scan/result/iterator/AbstractDetailQueryResultIterator.java
+++ b/core/src/main/java/org/apache/carbondata/core/scan/result/iterator/AbstractDetailQueryResultIterator.java
@@ -314,7 +314,7 @@ public abstract class AbstractDetailQueryResultIterator<E> extends CarbonIterato
     try {
       fileReader.finish();
     } catch (IOException e) {
-      LOGGER.error(e);
+      LOGGER.error(e.getMessage(), e);
     }
   }
 
diff --git a/core/src/main/java/org/apache/carbondata/core/statusmanager/LoadMetadataDetails.java b/core/src/main/java/org/apache/carbondata/core/statusmanager/LoadMetadataDetails.java
index 7a16379..99dddba 100644
--- a/core/src/main/java/org/apache/carbondata/core/statusmanager/LoadMetadataDetails.java
+++ b/core/src/main/java/org/apache/carbondata/core/statusmanager/LoadMetadataDetails.java
@@ -257,8 +257,8 @@ public class LoadMetadataDetails implements Serializable {
         dateToStr = parser.parse(factTimeStamp);
         return dateToStr.getTime();
       } catch (ParseException e) {
-        LOGGER
-            .error("Cannot convert" + factTimeStamp + " to Time/Long type value" + e.getMessage());
+        LOGGER.error("Cannot convert" + factTimeStamp + " to Time/Long type value"
+            + e.getMessage(), e);
         parser = new SimpleDateFormat(CarbonCommonConstants.CARBON_TIMESTAMP);
         try {
           // if the load is in progress, factTimeStamp will be null, so use current time
@@ -293,7 +293,7 @@ public class LoadMetadataDetails implements Serializable {
         return dateToStr.getTime() * 1000;
       } catch (ParseException e) {
         LOGGER.error("Cannot convert" + loadStartTime +
-            " to Time/Long type value" + e.getMessage());
+            " to Time/Long type value" + e.getMessage(), e);
         return null;
       }
     }
diff --git a/core/src/main/java/org/apache/carbondata/core/statusmanager/SegmentStatusManager.java b/core/src/main/java/org/apache/carbondata/core/statusmanager/SegmentStatusManager.java
index 4a5063f..fddce90 100755
--- a/core/src/main/java/org/apache/carbondata/core/statusmanager/SegmentStatusManager.java
+++ b/core/src/main/java/org/apache/carbondata/core/statusmanager/SegmentStatusManager.java
@@ -176,7 +176,7 @@ public class SegmentStatusManager {
         }
       }
     } catch (IOException e) {
-      LOG.error(e);
+      LOG.error(e.getMessage(), e);
       throw e;
     }
     return new ValidAndInvalidSegmentsInfo(listOfValidSegments, listOfValidUpdatedSegments,
@@ -391,7 +391,7 @@ public class SegmentStatusManager {
         throw new Exception(errorMsg + " Please try after some time.");
       }
     } catch (IOException e) {
-      LOG.error("IOException" + e.getMessage());
+      LOG.error("IOException" + e.getMessage(), e);
       throw e;
     } finally {
       CarbonLockUtil.fileUnlock(carbonTableStatusLock, LockUsage.TABLE_STATUS_LOCK);
@@ -472,7 +472,7 @@ public class SegmentStatusManager {
         throw new Exception(errorMsg + " Please try after some time.");
       }
     } catch (IOException e) {
-      LOG.error("Error message: " + "IOException" + e.getMessage());
+      LOG.error("Error message: " + "IOException" + e.getMessage(), e);
       throw e;
     } finally {
       CarbonLockUtil.fileUnlock(carbonTableStatusLock, LockUsage.TABLE_STATUS_LOCK);
diff --git a/core/src/main/java/org/apache/carbondata/core/util/CarbonProperties.java b/core/src/main/java/org/apache/carbondata/core/util/CarbonProperties.java
index b337e40..ad27045 100644
--- a/core/src/main/java/org/apache/carbondata/core/util/CarbonProperties.java
+++ b/core/src/main/java/org/apache/carbondata/core/util/CarbonProperties.java
@@ -747,7 +747,7 @@ public final class CarbonProperties {
     try {
       initPropertySet();
     } catch (IllegalAccessException e) {
-      LOGGER.error("Illegal access to declared field" + e.getMessage());
+      LOGGER.error("Illegal access to declared field" + e.getMessage(), e);
     }
   }
 
diff --git a/core/src/main/java/org/apache/carbondata/core/util/CarbonUtil.java b/core/src/main/java/org/apache/carbondata/core/util/CarbonUtil.java
index 7147bd6..ffab9c8 100644
--- a/core/src/main/java/org/apache/carbondata/core/util/CarbonUtil.java
+++ b/core/src/main/java/org/apache/carbondata/core/util/CarbonUtil.java
@@ -744,7 +744,7 @@ public final class CarbonUtil {
         created = FileFactory.mkdirs(path, fileType);
       }
     } catch (IOException e) {
-      LOGGER.error(e.getMessage());
+      LOGGER.error(e.getMessage(), e);
     }
     return created;
   }
@@ -765,7 +765,7 @@ public final class CarbonUtil {
         created = true;
       }
     } catch (IOException e) {
-      LOGGER.error(e);
+      LOGGER.error(e.getMessage(), e);
     }
     return created;
   }
@@ -1446,7 +1446,7 @@ public final class CarbonUtil {
       stream.flush();
       thriftByteArray = stream.toByteArray();
     } catch (TException | IOException e) {
-      LOGGER.error("Error while converting to byte array from thrift object: " + e.getMessage());
+      LOGGER.error("Error while converting to byte array from thrift object: " + e.getMessage(), e);
       closeStreams(stream);
     } finally {
       closeStreams(stream);
@@ -1539,7 +1539,7 @@ public final class CarbonUtil {
       objStream = new ObjectInputStream(aos);
       meta = (ValueEncoderMeta) objStream.readObject();
     } catch (ClassNotFoundException e) {
-      LOGGER.error(e);
+      LOGGER.error(e.getMessage(), e);
     } catch (IOException e) {
       CarbonUtil.closeStreams(objStream);
     }
diff --git a/core/src/main/java/org/apache/carbondata/core/util/DataTypeUtil.java b/core/src/main/java/org/apache/carbondata/core/util/DataTypeUtil.java
index 995f80d..303cc80 100644
--- a/core/src/main/java/org/apache/carbondata/core/util/DataTypeUtil.java
+++ b/core/src/main/java/org/apache/carbondata/core/util/DataTypeUtil.java
@@ -356,7 +356,7 @@ public final class DataTypeUtil {
           Date dateToStr = dateformatter.get().parse(data);
           return dateToStr.getTime() * 1000;
         } catch (ParseException e) {
-          LOGGER.error("Cannot convert value to Time/Long type value" + e.getMessage());
+          LOGGER.error("Cannot convert value to Time/Long type value" + e.getMessage(), e);
           return null;
         }
       } else if (actualDataType == DataTypes.TIMESTAMP) {
@@ -367,7 +367,7 @@ public final class DataTypeUtil {
           Date dateToStr = timeStampformatter.get().parse(data);
           return dateToStr.getTime() * 1000;
         } catch (ParseException e) {
-          LOGGER.error("Cannot convert value to Time/Long type value" + e.getMessage());
+          LOGGER.error("Cannot convert value to Time/Long type value" + e.getMessage(), e);
           return null;
         }
       } else if (DataTypes.isDecimal(actualDataType)) {
@@ -675,7 +675,7 @@ public final class DataTypeUtil {
           Date dateToStr = dateformatter.get().parse(data5);
           return dateToStr.getTime() * 1000;
         } catch (ParseException e) {
-          LOGGER.error("Cannot convert value to Time/Long type value" + e.getMessage());
+          LOGGER.error("Cannot convert value to Time/Long type value" + e.getMessage(), e);
           return null;
         }
       } else if (dataType == DataTypes.TIMESTAMP) {
@@ -687,7 +687,7 @@ public final class DataTypeUtil {
           Date dateToStr = timeStampformatter.get().parse(data6);
           return dateToStr.getTime() * 1000;
         } catch (ParseException e) {
-          LOGGER.error("Cannot convert value to Time/Long type value" + e.getMessage());
+          LOGGER.error("Cannot convert value to Time/Long type value" + e.getMessage(), e);
           return null;
         }
       } else if (DataTypes.isDecimal(dataType)) {
diff --git a/core/src/main/java/org/apache/carbondata/core/util/ObjectSerializationUtil.java b/core/src/main/java/org/apache/carbondata/core/util/ObjectSerializationUtil.java
index 9a4c02d..48c6e65 100644
--- a/core/src/main/java/org/apache/carbondata/core/util/ObjectSerializationUtil.java
+++ b/core/src/main/java/org/apache/carbondata/core/util/ObjectSerializationUtil.java
@@ -65,7 +65,7 @@ public class ObjectSerializationUtil {
           baos.close();
         }
       } catch (IOException e) {
-        LOG.error(e);
+        LOG.error(e.getMessage(), e);
       }
     }
 
@@ -110,7 +110,7 @@ public class ObjectSerializationUtil {
           bais.close();
         }
       } catch (IOException e) {
-        LOG.error(e);
+        LOG.error(e.getMessage(), e);
       }
     }
   }
diff --git a/core/src/main/java/org/apache/carbondata/core/util/path/HDFSLeaseUtils.java b/core/src/main/java/org/apache/carbondata/core/util/path/HDFSLeaseUtils.java
index 1a10f46..3058685 100644
--- a/core/src/main/java/org/apache/carbondata/core/util/path/HDFSLeaseUtils.java
+++ b/core/src/main/java/org/apache/carbondata/core/util/path/HDFSLeaseUtils.java
@@ -133,7 +133,7 @@ public class HDFSLeaseUtils {
           LOGGER.error("The given file does not exist at path " + filePath);
           throw e;
         } else {
-          LOGGER.error("Recover lease threw exception : " + e.getMessage());
+          LOGGER.error("Recover lease threw exception : " + e.getMessage(), e);
           ioException = e;
         }
       }
diff --git a/core/src/test/java/org/apache/carbondata/core/datastore/filesystem/HDFSCarbonFileTest.java b/core/src/test/java/org/apache/carbondata/core/datastore/filesystem/HDFSCarbonFileTest.java
index daebd9f..8adefb3 100644
--- a/core/src/test/java/org/apache/carbondata/core/datastore/filesystem/HDFSCarbonFileTest.java
+++ b/core/src/test/java/org/apache/carbondata/core/datastore/filesystem/HDFSCarbonFileTest.java
@@ -90,7 +90,7 @@ public class HDFSCarbonFileTest {
         try {
             fs.delete(pt, true);
         } catch (IOException e) {
-            LOGGER.error("Exception Occured" + e.getMessage());
+            LOGGER.error("Exception Occured" + e.getMessage(), e);
         }
     }
 
diff --git a/core/src/test/java/org/apache/carbondata/core/load/LoadMetadataDetailsUnitTest.java b/core/src/test/java/org/apache/carbondata/core/load/LoadMetadataDetailsUnitTest.java
index 3032016..e1e92f8 100644
--- a/core/src/test/java/org/apache/carbondata/core/load/LoadMetadataDetailsUnitTest.java
+++ b/core/src/test/java/org/apache/carbondata/core/load/LoadMetadataDetailsUnitTest.java
@@ -113,7 +113,7 @@ public class LoadMetadataDetailsUnitTest {
     try {
       return simpleDateFormat.parse(date).getTime() * 1000;
     } catch (ParseException e) {
-      LOGGER.error("Error while parsing " + date + " " + e.getMessage());
+      LOGGER.error("Error while parsing " + date + " " + e.getMessage(), e);
       return null;
     }
   }
diff --git a/datamap/bloom/src/main/java/org/apache/carbondata/datamap/bloom/BloomCoarseGrainDataMapFactory.java b/datamap/bloom/src/main/java/org/apache/carbondata/datamap/bloom/BloomCoarseGrainDataMapFactory.java
index e730635..9785549 100644
--- a/datamap/bloom/src/main/java/org/apache/carbondata/datamap/bloom/BloomCoarseGrainDataMapFactory.java
+++ b/datamap/bloom/src/main/java/org/apache/carbondata/datamap/bloom/BloomCoarseGrainDataMapFactory.java
@@ -123,7 +123,7 @@ public class BloomCoarseGrainDataMapFactory extends DataMapFactory<CoarseGrainDa
       this.cache = CacheProvider.getInstance()
           .createCache(new CacheType("bloom_cache"), BloomDataMapCache.class.getName());
     } catch (Exception e) {
-      LOGGER.error(e);
+      LOGGER.error(e.getMessage(), e);
       throw new MalformedDataMapCommandException(e.getMessage());
     }
   }
diff --git a/datamap/lucene/src/main/java/org/apache/carbondata/datamap/lucene/LuceneFineGrainDataMap.java b/datamap/lucene/src/main/java/org/apache/carbondata/datamap/lucene/LuceneFineGrainDataMap.java
index 048d41a..da1fe5c 100644
--- a/datamap/lucene/src/main/java/org/apache/carbondata/datamap/lucene/LuceneFineGrainDataMap.java
+++ b/datamap/lucene/src/main/java/org/apache/carbondata/datamap/lucene/LuceneFineGrainDataMap.java
@@ -236,7 +236,7 @@ public class LuceneFineGrainDataMap extends FineGrainDataMap {
     } catch (ParseException e) {
       String errorMessage = String.format(
           "failed to filter block with query %s, detail is %s", strQuery, e.getMessage());
-      LOGGER.error(errorMessage);
+      LOGGER.error(errorMessage, e);
       return null;
     }
     // temporary data, delete duplicated data
@@ -262,8 +262,8 @@ public class LuceneFineGrainDataMap extends FineGrainDataMap {
       } catch (IOException e) {
         String errorMessage =
             String.format("failed to search lucene data, detail is %s", e.getMessage());
-        LOGGER.error(errorMessage);
-        throw new IOException(errorMessage);
+        LOGGER.error(errorMessage, e);
+        throw new IOException(errorMessage, e);
       }
 
       ByteBuffer intBuffer = ByteBuffer.allocate(4);
diff --git a/datamap/lucene/src/main/java/org/apache/carbondata/datamap/lucene/LuceneFineGrainDataMapFactory.java b/datamap/lucene/src/main/java/org/apache/carbondata/datamap/lucene/LuceneFineGrainDataMapFactory.java
index 116370d..a3c4063 100644
--- a/datamap/lucene/src/main/java/org/apache/carbondata/datamap/lucene/LuceneFineGrainDataMapFactory.java
+++ b/datamap/lucene/src/main/java/org/apache/carbondata/datamap/lucene/LuceneFineGrainDataMapFactory.java
@@ -57,7 +57,7 @@ public class LuceneFineGrainDataMapFactory extends LuceneDataMapFactoryBase<Fine
           DataMapWriter.getDefaultDataMapPath(tableIdentifier.getTablePath(),
               segment.getSegmentNo(), dataMapName), segment.getConfiguration()));
     } catch (MemoryException e) {
-      LOGGER.error(String.format("failed to get lucene datamap, detail is %s", e.getMessage()));
+      LOGGER.error(String.format("failed to get lucene datamap, detail is %s", e.getMessage()), e);
       return lstDataMap;
     }
     lstDataMap.add(dataMap);
diff --git a/integration/hive/src/main/java/org/apache/carbondata/hive/CarbonDictionaryDecodeReadSupport.java b/integration/hive/src/main/java/org/apache/carbondata/hive/CarbonDictionaryDecodeReadSupport.java
index e95382c..fdccd09 100644
--- a/integration/hive/src/main/java/org/apache/carbondata/hive/CarbonDictionaryDecodeReadSupport.java
+++ b/integration/hive/src/main/java/org/apache/carbondata/hive/CarbonDictionaryDecodeReadSupport.java
@@ -107,7 +107,7 @@ public class CarbonDictionaryDecodeReadSupport<T> implements CarbonReadSupport<T
       try {
         writableArr[i] = createWritableObject(data[i], carbonColumns[i]);
       } catch (IOException e) {
-        throw new RuntimeException(e.getMessage(), e);
+        throw new RuntimeException(e);
       }
     }
 
diff --git a/integration/hive/src/main/java/org/apache/carbondata/hive/MapredCarbonInputFormat.java b/integration/hive/src/main/java/org/apache/carbondata/hive/MapredCarbonInputFormat.java
index 634c116..1022576 100644
--- a/integration/hive/src/main/java/org/apache/carbondata/hive/MapredCarbonInputFormat.java
+++ b/integration/hive/src/main/java/org/apache/carbondata/hive/MapredCarbonInputFormat.java
@@ -133,7 +133,7 @@ public class MapredCarbonInputFormat extends CarbonTableInputFormat<ArrayWritabl
     try {
       queryModel = getQueryModel(jobConf, path);
     } catch (InvalidConfigurationException e) {
-      LOGGER.error("Failed to create record reader: " + e.getMessage());
+      LOGGER.error("Failed to create record reader: " + e.getMessage(), e);
       return null;
     }
     CarbonReadSupport<ArrayWritable> readSupport = new CarbonDictionaryDecodeReadSupport<>();
diff --git a/integration/presto/src/main/java/org/apache/carbondata/presto/impl/CarbonTableReader.java b/integration/presto/src/main/java/org/apache/carbondata/presto/impl/CarbonTableReader.java
index 916e44c..57d8d5e 100755
--- a/integration/presto/src/main/java/org/apache/carbondata/presto/impl/CarbonTableReader.java
+++ b/integration/presto/src/main/java/org/apache/carbondata/presto/impl/CarbonTableReader.java
@@ -262,9 +262,9 @@ public class CarbonTableReader {
       try {
         loadMetadataDetails = SegmentStatusManager.readTableStatusFile(
             CarbonTablePath.getTableStatusFilePath(carbonTable.getTablePath()));
-      } catch (IOException exception) {
-        LOGGER.error(exception.getMessage());
-        throw exception;
+      } catch (IOException e) {
+        LOGGER.error(e.getMessage(), e);
+        throw e;
       }
       filteredPartitions = findRequiredPartitions(constraints, carbonTable, loadMetadataDetails);
     }
@@ -329,9 +329,9 @@ public class CarbonTableReader {
             new SegmentFileStore(carbonTable.getTablePath(), loadMetadataDetail.getSegmentFile());
         partitionSpecs.addAll(segmentFileStore.getPartitionSpecs());
 
-      } catch (IOException exception) {
-        LOGGER.error(exception.getMessage());
-        throw exception;
+      } catch (IOException e) {
+        LOGGER.error(e.getMessage(), e);
+        throw e;
       }
     }
     List<String> partitionValuesFromExpression =
diff --git a/integration/spark-common/src/main/java/org/apache/carbondata/spark/dictionary/client/SecureDictionaryClientHandler.java b/integration/spark-common/src/main/java/org/apache/carbondata/spark/dictionary/client/SecureDictionaryClientHandler.java
index d3f27ed..cee88fa 100644
--- a/integration/spark-common/src/main/java/org/apache/carbondata/spark/dictionary/client/SecureDictionaryClientHandler.java
+++ b/integration/spark-common/src/main/java/org/apache/carbondata/spark/dictionary/client/SecureDictionaryClientHandler.java
@@ -74,7 +74,7 @@ public class SecureDictionaryClientHandler extends RpcHandler {
       data.release();
       return newKey;
     } catch (Exception e) {
-      LOGGER.error(e);
+      LOGGER.error(e.getMessage(), e);
       throw new RuntimeException(e);
     }
   }
@@ -92,7 +92,7 @@ public class SecureDictionaryClientHandler extends RpcHandler {
         LOGGER.error("Failed to add key: " + key + " to queue");
       }
     } catch (Exception e) {
-      LOGGER.error(e);
+      LOGGER.error(e.getMessage(), e);
       throw e;
     }
   }
diff --git a/integration/spark-common/src/main/java/org/apache/carbondata/spark/dictionary/server/SecureDictionaryServerHandler.java b/integration/spark-common/src/main/java/org/apache/carbondata/spark/dictionary/server/SecureDictionaryServerHandler.java
index 9e291a4..89b8f95 100644
--- a/integration/spark-common/src/main/java/org/apache/carbondata/spark/dictionary/server/SecureDictionaryServerHandler.java
+++ b/integration/spark-common/src/main/java/org/apache/carbondata/spark/dictionary/server/SecureDictionaryServerHandler.java
@@ -84,7 +84,7 @@ import org.apache.spark.network.server.StreamManager;
       key.writeData(buff);
       rpcResponseCallback.onSuccess(buff.nioBuffer());
     } catch (Exception e) {
-      LOGGER.error(e);
+      LOGGER.error(e.getMessage(), e);
     }
   }
 
diff --git a/integration/spark-datasource/src/main/scala/org/apache/carbondata/spark/vectorreader/VectorizedCarbonRecordReader.java b/integration/spark-datasource/src/main/scala/org/apache/carbondata/spark/vectorreader/VectorizedCarbonRecordReader.java
index c9a4ba4..34e7c23 100644
--- a/integration/spark-datasource/src/main/scala/org/apache/carbondata/spark/vectorreader/VectorizedCarbonRecordReader.java
+++ b/integration/spark-datasource/src/main/scala/org/apache/carbondata/spark/vectorreader/VectorizedCarbonRecordReader.java
@@ -146,14 +146,14 @@ public class VectorizedCarbonRecordReader extends AbstractRecordReader<Object> {
       iterator = (AbstractDetailQueryResultIterator) queryExecutor.execute(queryModel);
     } catch (QueryExecutionException e) {
       if (ExceptionUtils.indexOfThrowable(e, FileNotFoundException.class) > 0) {
-        LOGGER.error(e);
+        LOGGER.error(e.getMessage(), e);
         throw new InterruptedException(
             "Insert overwrite may be in progress.Please check " + e.getMessage());
       }
       throw new InterruptedException(e.getMessage());
     } catch (Exception e) {
       if (ExceptionUtils.indexOfThrowable(e, FileNotFoundException.class) > 0) {
-        LOGGER.error(e);
+        LOGGER.error(e.getMessage(), e);
         throw new InterruptedException(
             "Insert overwrite may be in progress.Please check " + e.getMessage());
       }
diff --git a/processing/src/main/java/org/apache/carbondata/processing/datamap/DataMapWriterListener.java b/processing/src/main/java/org/apache/carbondata/processing/datamap/DataMapWriterListener.java
index e88c422..be00480 100644
--- a/processing/src/main/java/org/apache/carbondata/processing/datamap/DataMapWriterListener.java
+++ b/processing/src/main/java/org/apache/carbondata/processing/datamap/DataMapWriterListener.java
@@ -103,7 +103,7 @@ public class DataMapWriterListener {
     try {
       writer = factory.createWriter(new Segment(segmentId), taskNo, segmentProperties);
     } catch (IOException e) {
-      LOG.error("Failed to create DataMapWriter: " + e.getMessage());
+      LOG.error("Failed to create DataMapWriter: " + e.getMessage(), e);
       throw new DataMapWriterException(e);
     }
     if (writers != null) {
diff --git a/processing/src/main/java/org/apache/carbondata/processing/loading/AbstractDataLoadProcessorStep.java b/processing/src/main/java/org/apache/carbondata/processing/loading/AbstractDataLoadProcessorStep.java
index c3b587c..5fd507b 100644
--- a/processing/src/main/java/org/apache/carbondata/processing/loading/AbstractDataLoadProcessorStep.java
+++ b/processing/src/main/java/org/apache/carbondata/processing/loading/AbstractDataLoadProcessorStep.java
@@ -82,7 +82,7 @@ public abstract class AbstractDataLoadProcessorStep {
               Thread.sleep(10000);
             } catch (InterruptedException e) {
               //ignore
-              LOGGER.error(e.getMessage());
+              LOGGER.error(e.getMessage(), e);
             }
           }
         }
diff --git a/processing/src/main/java/org/apache/carbondata/processing/loading/converter/impl/RowConverterImpl.java b/processing/src/main/java/org/apache/carbondata/processing/loading/converter/impl/RowConverterImpl.java
index 458b3ab..df50e25 100644
--- a/processing/src/main/java/org/apache/carbondata/processing/loading/converter/impl/RowConverterImpl.java
+++ b/processing/src/main/java/org/apache/carbondata/processing/loading/converter/impl/RowConverterImpl.java
@@ -144,7 +144,7 @@ public class RowConverterImpl implements RowConverter {
         // wait for client initialization finished, or will raise null pointer exception
         Thread.sleep(1000);
       } catch (InterruptedException e) {
-        LOGGER.error(e);
+        LOGGER.error(e.getMessage(), e);
         throw new RuntimeException(e);
       }
 
diff --git a/processing/src/main/java/org/apache/carbondata/processing/loading/model/CarbonLoadModelBuilder.java b/processing/src/main/java/org/apache/carbondata/processing/loading/model/CarbonLoadModelBuilder.java
index eb3c253..7abd573 100644
--- a/processing/src/main/java/org/apache/carbondata/processing/loading/model/CarbonLoadModelBuilder.java
+++ b/processing/src/main/java/org/apache/carbondata/processing/loading/model/CarbonLoadModelBuilder.java
@@ -421,7 +421,7 @@ public class CarbonLoadModelBuilder {
       CompressorFactory.getInstance().getCompressor(columnCompressor);
       carbonLoadModel.setColumnCompressor(columnCompressor);
     } catch (Exception e) {
-      LOGGER.error(e);
+      LOGGER.error(e.getMessage(), e);
       throw new InvalidLoadOptionException("Failed to load the compressor");
     }
   }
diff --git a/processing/src/main/java/org/apache/carbondata/processing/loading/parser/impl/JsonRowParser.java b/processing/src/main/java/org/apache/carbondata/processing/loading/parser/impl/JsonRowParser.java
index 119ae67..bc494e6 100644
--- a/processing/src/main/java/org/apache/carbondata/processing/loading/parser/impl/JsonRowParser.java
+++ b/processing/src/main/java/org/apache/carbondata/processing/loading/parser/impl/JsonRowParser.java
@@ -64,7 +64,7 @@ public class JsonRowParser implements RowParser {
       jsonNodeMapCaseInsensitive.putAll(jsonNodeMap);
       return jsonToCarbonRecord(jsonNodeMapCaseInsensitive, dataFields);
     } catch (IOException e) {
-      throw new IOException("Failed to parse Json String: " + e.getMessage());
+      throw new IOException("Failed to parse Json String: " + e.getMessage(), e);
     }
   }
 
diff --git a/processing/src/main/java/org/apache/carbondata/processing/loading/sort/impl/ParallelReadMergeSorterImpl.java b/processing/src/main/java/org/apache/carbondata/processing/loading/sort/impl/ParallelReadMergeSorterImpl.java
index 55b336e..02d6309 100644
--- a/processing/src/main/java/org/apache/carbondata/processing/loading/sort/impl/ParallelReadMergeSorterImpl.java
+++ b/processing/src/main/java/org/apache/carbondata/processing/loading/sort/impl/ParallelReadMergeSorterImpl.java
@@ -217,7 +217,7 @@ public class ParallelReadMergeSorterImpl extends AbstractMergeSorter {
           }
         }
       } catch (Exception e) {
-        LOGGER.error(e);
+        LOGGER.error(e.getMessage(), e);
         observer.notifyFailed(e);
       }
     }
diff --git a/processing/src/main/java/org/apache/carbondata/processing/loading/sort/impl/ParallelReadMergeSorterWithColumnRangeImpl.java b/processing/src/main/java/org/apache/carbondata/processing/loading/sort/impl/ParallelReadMergeSorterWithColumnRangeImpl.java
index 8b86c0c..e1dddee 100644
--- a/processing/src/main/java/org/apache/carbondata/processing/loading/sort/impl/ParallelReadMergeSorterWithColumnRangeImpl.java
+++ b/processing/src/main/java/org/apache/carbondata/processing/loading/sort/impl/ParallelReadMergeSorterWithColumnRangeImpl.java
@@ -238,7 +238,7 @@ public class ParallelReadMergeSorterWithColumnRangeImpl extends AbstractMergeSor
         }
         LOGGER.info("Rows processed by each range: " + insideCounterList);
       } catch (Exception e) {
-        LOGGER.error(e);
+        LOGGER.error(e.getMessage(), e);
         this.threadStatusObserver.notifyFailed(e);
       }
     }
diff --git a/processing/src/main/java/org/apache/carbondata/processing/loading/sort/impl/UnsafeBatchParallelReadMergeSorterImpl.java b/processing/src/main/java/org/apache/carbondata/processing/loading/sort/impl/UnsafeBatchParallelReadMergeSorterImpl.java
index aa960b6..2d2455b 100644
--- a/processing/src/main/java/org/apache/carbondata/processing/loading/sort/impl/UnsafeBatchParallelReadMergeSorterImpl.java
+++ b/processing/src/main/java/org/apache/carbondata/processing/loading/sort/impl/UnsafeBatchParallelReadMergeSorterImpl.java
@@ -115,7 +115,7 @@ public class UnsafeBatchParallelReadMergeSorterImpl extends AbstractMergeSorter
     try {
       executorService.awaitTermination(2, TimeUnit.DAYS);
     } catch (InterruptedException e) {
-      LOGGER.error(e);
+      LOGGER.error(e.getMessage(), e);
     }
   }
 
@@ -167,7 +167,7 @@ public class UnsafeBatchParallelReadMergeSorterImpl extends AbstractMergeSorter
           }
         }
       } catch (Exception e) {
-        LOGGER.error(e);
+        LOGGER.error(e.getMessage(), e);
         this.threadStatusObserver.notifyFailed(e);
       } finally {
         synchronized (sortDataRows) {
diff --git a/processing/src/main/java/org/apache/carbondata/processing/loading/sort/impl/UnsafeParallelReadMergeSorterImpl.java b/processing/src/main/java/org/apache/carbondata/processing/loading/sort/impl/UnsafeParallelReadMergeSorterImpl.java
index 6e11ca6..8af3ae2 100644
--- a/processing/src/main/java/org/apache/carbondata/processing/loading/sort/impl/UnsafeParallelReadMergeSorterImpl.java
+++ b/processing/src/main/java/org/apache/carbondata/processing/loading/sort/impl/UnsafeParallelReadMergeSorterImpl.java
@@ -208,7 +208,7 @@ public class UnsafeParallelReadMergeSorterImpl extends AbstractMergeSorter {
           }
         }
       } catch (Exception e) {
-        LOGGER.error(e);
+        LOGGER.error(e.getMessage(), e);
         this.threadStatusObserver.notifyFailed(e);
       }
     }
diff --git a/processing/src/main/java/org/apache/carbondata/processing/loading/sort/impl/UnsafeParallelReadMergeSorterWithColumnRangeImpl.java b/processing/src/main/java/org/apache/carbondata/processing/loading/sort/impl/UnsafeParallelReadMergeSorterWithColumnRangeImpl.java
index f9631a5..693cc96 100644
--- a/processing/src/main/java/org/apache/carbondata/processing/loading/sort/impl/UnsafeParallelReadMergeSorterWithColumnRangeImpl.java
+++ b/processing/src/main/java/org/apache/carbondata/processing/loading/sort/impl/UnsafeParallelReadMergeSorterWithColumnRangeImpl.java
@@ -232,7 +232,7 @@ public class UnsafeParallelReadMergeSorterWithColumnRangeImpl extends AbstractMe
         }
         LOGGER.info("Rows processed by each range: " + insideRowCounterList);
       } catch (Exception e) {
-        LOGGER.error(e);
+        LOGGER.error(e.getMessage(), e);
         this.threadStatusObserver.notifyFailed(e);
       }
     }
diff --git a/processing/src/main/java/org/apache/carbondata/processing/loading/sort/unsafe/UnsafeSortDataRows.java b/processing/src/main/java/org/apache/carbondata/processing/loading/sort/unsafe/UnsafeSortDataRows.java
index e8e1c08..87f97be 100644
--- a/processing/src/main/java/org/apache/carbondata/processing/loading/sort/unsafe/UnsafeSortDataRows.java
+++ b/processing/src/main/java/org/apache/carbondata/processing/loading/sort/unsafe/UnsafeSortDataRows.java
@@ -199,8 +199,8 @@ public class UnsafeSortDataRows {
           } catch (Exception ex) {
             // row page has freed in handlePreviousPage(), so other iterator may try to access it.
             rowPage = null;
-            LOGGER.error(
-                "exception occurred while trying to acquire a semaphore lock: " + ex.getMessage());
+            LOGGER.error("exception occurred while trying to acquire a semaphore lock: "
+                + ex.getMessage(), ex);
             throw new CarbonSortKeyAndGroupByException(ex);
           }
         }
@@ -213,7 +213,7 @@ public class UnsafeSortDataRows {
           i--;
         } else {
           LOGGER.error(
-              "exception occurred while trying to acquire a semaphore lock: " + e.getMessage());
+              "exception occurred while trying to acquire a semaphore lock: " + e.getMessage(), e);
           throw new CarbonSortKeyAndGroupByException(e);
         }
       }
@@ -236,8 +236,8 @@ public class UnsafeSortDataRows {
           rowPage = createUnsafeRowPage();
         } catch (Exception ex) {
           rowPage = null;
-          LOGGER.error(
-              "exception occurred while trying to acquire a semaphore lock: " + ex.getMessage());
+          LOGGER.error("exception occurred while trying to acquire a semaphore lock: "
+              + ex.getMessage(), ex);
           throw new CarbonSortKeyAndGroupByException(ex);
         }
       }
@@ -249,7 +249,7 @@ public class UnsafeSortDataRows {
         addRow(row);
       } else {
         LOGGER.error(
-            "exception occurred while trying to acquire a semaphore lock: " + e.getMessage());
+            "exception occurred while trying to acquire a semaphore lock: " + e.getMessage(), e);
         throw new CarbonSortKeyAndGroupByException(e);
       }
     }
@@ -422,7 +422,7 @@ public class UnsafeSortDataRows {
         try {
           threadStatusObserver.notifyFailed(e);
         } catch (CarbonSortKeyAndGroupByException ex) {
-          LOGGER.error(e);
+          LOGGER.error(e.getMessage(), e);
         }
       } finally {
         semaphore.release();
diff --git a/processing/src/main/java/org/apache/carbondata/processing/loading/sort/unsafe/holder/UnsafeSortTempFileChunkHolder.java b/processing/src/main/java/org/apache/carbondata/processing/loading/sort/unsafe/holder/UnsafeSortTempFileChunkHolder.java
index 4a97b20..04cab70 100644
--- a/processing/src/main/java/org/apache/carbondata/processing/loading/sort/unsafe/holder/UnsafeSortTempFileChunkHolder.java
+++ b/processing/src/main/java/org/apache/carbondata/processing/loading/sort/unsafe/holder/UnsafeSortTempFileChunkHolder.java
@@ -145,13 +145,13 @@ public class UnsafeSortTempFileChunkHolder implements SortTempChunkHolder {
         }
       }
     } catch (FileNotFoundException e) {
-      LOGGER.error(e);
+      LOGGER.error(e.getMessage(), e);
       throw new RuntimeException(tempFile + " No Found", e);
     } catch (IOException e) {
-      LOGGER.error(e);
+      LOGGER.error(e.getMessage(), e);
       throw new RuntimeException(tempFile + " No Found", e);
     } catch (Exception e) {
-      LOGGER.error(e);
+      LOGGER.error(e.getMessage(), e);
       throw new RuntimeException(tempFile + " Problem while reading", e);
     }
   }
@@ -193,7 +193,7 @@ public class UnsafeSortTempFileChunkHolder implements SortTempChunkHolder {
         try {
           submit.get();
         } catch (Exception e) {
-          LOGGER.error(e);
+          LOGGER.error(e.getMessage(), e);
         }
         bufferRowCounter = 0;
         currentBuffer = backupBuffer;
@@ -319,7 +319,7 @@ public class UnsafeSortTempFileChunkHolder implements SortTempChunkHolder {
           currentBuffer = prefetchRecordsFromFile(numberOfRecords);
         }
       } catch (Exception e) {
-        LOGGER.error(e);
+        LOGGER.error(e.getMessage(), e);
       }
       return null;
     }
diff --git a/processing/src/main/java/org/apache/carbondata/processing/loading/sort/unsafe/merger/UnsafeIntermediateMerger.java b/processing/src/main/java/org/apache/carbondata/processing/loading/sort/unsafe/merger/UnsafeIntermediateMerger.java
index 1389ff7..ea12263 100644
--- a/processing/src/main/java/org/apache/carbondata/processing/loading/sort/unsafe/merger/UnsafeIntermediateMerger.java
+++ b/processing/src/main/java/org/apache/carbondata/processing/loading/sort/unsafe/merger/UnsafeIntermediateMerger.java
@@ -213,7 +213,7 @@ public class UnsafeIntermediateMerger {
         mergerTask.get(i).get();
       } catch (InterruptedException | ExecutionException e) {
         LOGGER.error(e.getMessage(), e);
-        throw new CarbonSortKeyAndGroupByException(e.getMessage(), e);
+        throw new CarbonSortKeyAndGroupByException(e);
       }
     }
   }
diff --git a/processing/src/main/java/org/apache/carbondata/processing/loading/sort/unsafe/merger/UnsafeSingleThreadFinalSortFilesMerger.java b/processing/src/main/java/org/apache/carbondata/processing/loading/sort/unsafe/merger/UnsafeSingleThreadFinalSortFilesMerger.java
index 7e36389..e7cadec 100644
--- a/processing/src/main/java/org/apache/carbondata/processing/loading/sort/unsafe/merger/UnsafeSingleThreadFinalSortFilesMerger.java
+++ b/processing/src/main/java/org/apache/carbondata/processing/loading/sort/unsafe/merger/UnsafeSingleThreadFinalSortFilesMerger.java
@@ -154,7 +154,7 @@ public class UnsafeSingleThreadFinalSortFilesMerger extends CarbonIterator<Objec
 
       LOGGER.info("Heap Size: " + this.recordHolderHeapLocal.size());
     } catch (Exception e) {
-      LOGGER.error(e);
+      LOGGER.error(e.getMessage(), e);
       throw new CarbonDataWriterException(e);
     }
   }
@@ -238,7 +238,7 @@ public class UnsafeSingleThreadFinalSortFilesMerger extends CarbonIterator<Objec
     try {
       poll.readRow();
     } catch (Exception e) {
-      throw new CarbonDataWriterException(e.getMessage(), e);
+      throw new CarbonDataWriterException(e);
     }
 
     // add to heap
diff --git a/processing/src/main/java/org/apache/carbondata/processing/loading/steps/CarbonRowDataWriterProcessorStepImpl.java b/processing/src/main/java/org/apache/carbondata/processing/loading/steps/CarbonRowDataWriterProcessorStepImpl.java
index 68e8e22..f976abe 100644
--- a/processing/src/main/java/org/apache/carbondata/processing/loading/steps/CarbonRowDataWriterProcessorStepImpl.java
+++ b/processing/src/main/java/org/apache/carbondata/processing/loading/steps/CarbonRowDataWriterProcessorStepImpl.java
@@ -146,7 +146,7 @@ public class CarbonRowDataWriterProcessorStepImpl extends AbstractDataLoadProces
     } catch (CarbonDataWriterException e) {
       LOGGER.error("Failed for table: " + tableName + " in DataWriterProcessorStepImpl", e);
       throw new CarbonDataLoadingException(
-          "Error while initializing data handler : " + e.getMessage());
+          "Error while initializing data handler : " + e.getMessage(), e);
     } catch (Exception e) {
       LOGGER.error("Failed for table: " + tableName + " in DataWriterProcessorStepImpl", e);
       if (e instanceof BadRecordFoundException) {
@@ -234,10 +234,10 @@ public class CarbonRowDataWriterProcessorStepImpl extends AbstractDataLoadProces
         dataHandler.closeHandler();
       } catch (CarbonDataWriterException e) {
         LOGGER.error(e.getMessage(), e);
-        throw new CarbonDataLoadingException(e.getMessage());
+        throw new CarbonDataLoadingException(e);
       } catch (Exception e) {
         LOGGER.error(e.getMessage(), e);
-        throw new CarbonDataLoadingException("There is an unexpected error: " + e.getMessage());
+        throw new CarbonDataLoadingException("There is an unexpected error: " + e.getMessage(), e);
       }
     }
   }
@@ -330,7 +330,7 @@ public class CarbonRowDataWriterProcessorStepImpl extends AbstractDataLoadProces
       try {
         doExecute(this.iterator, iteratorIndex);
       } catch (IOException e) {
-        LOGGER.error(e);
+        LOGGER.error(e.getMessage(), e);
         throw new RuntimeException(e);
       }
     }
diff --git a/processing/src/main/java/org/apache/carbondata/processing/loading/steps/DataWriterBatchProcessorStepImpl.java b/processing/src/main/java/org/apache/carbondata/processing/loading/steps/DataWriterBatchProcessorStepImpl.java
index 05b2424..33948de 100644
--- a/processing/src/main/java/org/apache/carbondata/processing/loading/steps/DataWriterBatchProcessorStepImpl.java
+++ b/processing/src/main/java/org/apache/carbondata/processing/loading/steps/DataWriterBatchProcessorStepImpl.java
@@ -116,7 +116,7 @@ public class DataWriterBatchProcessorStepImpl extends AbstractDataLoadProcessorS
       if (e.getCause() instanceof BadRecordFoundException) {
         throw new BadRecordFoundException(e.getCause().getMessage());
       }
-      throw new CarbonDataLoadingException("There is an unexpected error: " + e.getMessage());
+      throw new CarbonDataLoadingException("There is an unexpected error: " + e.getMessage(), e);
     }
     return null;
   }
@@ -160,7 +160,7 @@ public class DataWriterBatchProcessorStepImpl extends AbstractDataLoadProcessorS
       try {
         dataHandler.closeHandler();
       } catch (Exception e) {
-        LOGGER.error(e);
+        LOGGER.error(e.getMessage(), e);
         throw new CarbonDataLoadingException(
             "There is an unexpected error while closing data handler", e);
       }
diff --git a/processing/src/main/java/org/apache/carbondata/processing/merger/CarbonCompactionUtil.java b/processing/src/main/java/org/apache/carbondata/processing/merger/CarbonCompactionUtil.java
index ffcfe0c..efd2559 100644
--- a/processing/src/main/java/org/apache/carbondata/processing/merger/CarbonCompactionUtil.java
+++ b/processing/src/main/java/org/apache/carbondata/processing/merger/CarbonCompactionUtil.java
@@ -180,7 +180,7 @@ public class CarbonCompactionUtil {
         return true;
       }
     } catch (IOException e) {
-      LOGGER.error("Exception in isFileExist compaction request file " + e.getMessage());
+      LOGGER.error("Exception in isFileExist compaction request file " + e.getMessage(), e);
     }
     return false;
   }
@@ -207,7 +207,7 @@ public class CarbonCompactionUtil {
       }
 
     } catch (IOException e) {
-      LOGGER.error("Exception in determining the compaction request file " + e.getMessage());
+      LOGGER.error("Exception in determining the compaction request file " + e.getMessage(), e);
     }
     return CompactionType.MINOR;
   }
@@ -243,7 +243,7 @@ public class CarbonCompactionUtil {
         LOGGER.info("Compaction request file is not present. file is : " + compactionRequiredFile);
       }
     } catch (IOException e) {
-      LOGGER.error("Exception in deleting the compaction request file " + e.getMessage());
+      LOGGER.error("Exception in deleting the compaction request file " + e.getMessage(), e);
     }
     return false;
   }
@@ -277,7 +277,7 @@ public class CarbonCompactionUtil {
         LOGGER.info("Compaction request file : " + statusFile + " already exist.");
       }
     } catch (IOException e) {
-      LOGGER.error("Exception in creating the compaction request file " + e.getMessage());
+      LOGGER.error("Exception in creating the compaction request file " + e.getMessage(), e);
     }
     return false;
   }
diff --git a/processing/src/main/java/org/apache/carbondata/processing/merger/CarbonDataMergerUtil.java b/processing/src/main/java/org/apache/carbondata/processing/merger/CarbonDataMergerUtil.java
index b380888..01e47be 100644
--- a/processing/src/main/java/org/apache/carbondata/processing/merger/CarbonDataMergerUtil.java
+++ b/processing/src/main/java/org/apache/carbondata/processing/merger/CarbonDataMergerUtil.java
@@ -554,7 +554,7 @@ public final class CarbonDataMergerUtil {
         try {
           segDate2 = sdf.parse(sdf.format(segmentDate));
         } catch (ParseException e) {
-          LOGGER.error("Error while parsing segment start time" + e.getMessage());
+          LOGGER.error("Error while parsing segment start time" + e.getMessage(), e);
         }
 
         if (isTwoDatesPresentInRequiredRange(segDate1, segDate2, numberOfDaysAllowedToMerge)) {
@@ -596,7 +596,7 @@ public final class CarbonDataMergerUtil {
     try {
       segDate1 = sdf.parse(sdf.format(baselineLoadStartTime));
     } catch (ParseException e) {
-      LOGGER.error("Error while parsing segment start time" + e.getMessage());
+      LOGGER.error("Error while parsing segment start time" + e.getMessage(), e);
     }
     loadsOfSameDate.add(segment);
     return segDate1;
diff --git a/processing/src/main/java/org/apache/carbondata/processing/merger/CompactionResultSortProcessor.java b/processing/src/main/java/org/apache/carbondata/processing/merger/CompactionResultSortProcessor.java
index a1d5b43..7b3381c 100644
--- a/processing/src/main/java/org/apache/carbondata/processing/merger/CompactionResultSortProcessor.java
+++ b/processing/src/main/java/org/apache/carbondata/processing/merger/CompactionResultSortProcessor.java
@@ -226,7 +226,7 @@ public class CompactionResultSortProcessor extends AbstractResultProcessor {
         try {
           CarbonUtil.deleteFoldersAndFiles(new File(tempLoc));
         } catch (IOException | InterruptedException e) {
-          LOGGER.error("Problem deleting local folders during compaction: " + e.getMessage());
+          LOGGER.error("Problem deleting local folders during compaction: " + e.getMessage(), e);
         }
       }
     }
@@ -256,8 +256,8 @@ public class CompactionResultSortProcessor extends AbstractResultProcessor {
     try {
       sortDataRows.startSorting();
     } catch (CarbonSortKeyAndGroupByException e) {
-      LOGGER.error(e);
-      throw new Exception("Problem loading data during compaction: " + e.getMessage());
+      LOGGER.error(e.getMessage(), e);
+      throw new Exception("Problem loading data during compaction: " + e.getMessage(), e);
     }
   }
 
@@ -394,10 +394,10 @@ public class CompactionResultSortProcessor extends AbstractResultProcessor {
       }
       dataHandler.finish();
     } catch (CarbonDataWriterException e) {
-      LOGGER.error(e);
+      LOGGER.error(e.getMessage(), e);
       throw new Exception("Problem loading data during compaction.", e);
     } catch (Exception e) {
-      LOGGER.error(e);
+      LOGGER.error(e.getMessage(), e);
       throw new Exception("Problem loading data during compaction.", e);
     } finally {
       if (null != dataHandler) {
@@ -420,8 +420,9 @@ public class CompactionResultSortProcessor extends AbstractResultProcessor {
     try {
       sortDataRows.addRow(row);
     } catch (CarbonSortKeyAndGroupByException e) {
-      LOGGER.error(e);
-      throw new Exception("Row addition for sorting failed during compaction: " + e.getMessage());
+      LOGGER.error(e.getMessage(), e);
+      throw new Exception("Row addition for sorting failed during compaction: "
+          + e.getMessage(), e);
     }
   }
 
@@ -459,9 +460,9 @@ public class CompactionResultSortProcessor extends AbstractResultProcessor {
     try {
       this.sortDataRows.initialize();
     } catch (CarbonSortKeyAndGroupByException e) {
-      LOGGER.error(e);
+      LOGGER.error(e.getMessage(), e);
       throw new Exception(
-          "Error initializing sort data rows object during compaction: " + e.getMessage());
+          "Error initializing sort data rows object during compaction: " + e.getMessage(), e);
     }
   }
 
@@ -517,8 +518,9 @@ public class CompactionResultSortProcessor extends AbstractResultProcessor {
     try {
       dataHandler.initialise();
     } catch (CarbonDataWriterException e) {
-      LOGGER.error(e);
-      throw new Exception("Problem initialising data handler during compaction: " + e.getMessage());
+      LOGGER.error(e.getMessage(), e);
+      throw new Exception("Problem initialising data handler during compaction: "
+          + e.getMessage(), e);
     }
   }
 
diff --git a/processing/src/main/java/org/apache/carbondata/processing/merger/RowResultMergerProcessor.java b/processing/src/main/java/org/apache/carbondata/processing/merger/RowResultMergerProcessor.java
index 5566422..7234c33 100644
--- a/processing/src/main/java/org/apache/carbondata/processing/merger/RowResultMergerProcessor.java
+++ b/processing/src/main/java/org/apache/carbondata/processing/merger/RowResultMergerProcessor.java
@@ -244,7 +244,7 @@ public class RowResultMergerProcessor extends AbstractResultProcessor {
         row1 = o1.fetchConverted();
         row2 = o2.fetchConverted();
       } catch (KeyGenException e) {
-        LOGGER.error(e.getMessage());
+        LOGGER.error(e.getMessage(), e);
       }
       if (null == row1 || null == row2) {
         return 0;
diff --git a/processing/src/main/java/org/apache/carbondata/processing/partition/spliter/RowResultProcessor.java b/processing/src/main/java/org/apache/carbondata/processing/partition/spliter/RowResultProcessor.java
index 977b9d3..d7f23fb 100644
--- a/processing/src/main/java/org/apache/carbondata/processing/partition/spliter/RowResultProcessor.java
+++ b/processing/src/main/java/org/apache/carbondata/processing/partition/spliter/RowResultProcessor.java
@@ -86,8 +86,7 @@ public class RowResultProcessor {
       }
       processStatus = true;
     } catch (CarbonDataWriterException e) {
-      LOGGER.error(e.getMessage(), e);
-      LOGGER.error("Exception in executing RowResultProcessor" + e.getMessage());
+      LOGGER.error("Exception in executing RowResultProcessor" + e.getMessage(), e);
       processStatus = false;
     } finally {
       try {
@@ -95,7 +94,8 @@ public class RowResultProcessor {
           this.dataHandler.closeHandler();
         }
       } catch (Exception e) {
-        LOGGER.error("Exception while closing the handler in RowResultProcessor" + e.getMessage());
+        LOGGER.error("Exception while closing the handler in RowResultProcessor"
+            + e.getMessage(), e);
         processStatus = false;
       }
     }
diff --git a/processing/src/main/java/org/apache/carbondata/processing/sort/sortdata/SingleThreadFinalSortFilesMerger.java b/processing/src/main/java/org/apache/carbondata/processing/sort/sortdata/SingleThreadFinalSortFilesMerger.java
index bd9526f..6f36e96 100644
--- a/processing/src/main/java/org/apache/carbondata/processing/sort/sortdata/SingleThreadFinalSortFilesMerger.java
+++ b/processing/src/main/java/org/apache/carbondata/processing/sort/sortdata/SingleThreadFinalSortFilesMerger.java
@@ -212,7 +212,7 @@ public class SingleThreadFinalSortFilesMerger extends CarbonIterator<Object[]> {
     try {
       executorService.awaitTermination(2, TimeUnit.HOURS);
     } catch (Exception e) {
-      throw new CarbonDataWriterException(e.getMessage(), e);
+      throw new CarbonDataWriterException(e);
     }
     checkFailure();
     LOGGER.info("final merger Heap Size" + this.recordHolderHeapLocal.size());
@@ -290,7 +290,7 @@ public class SingleThreadFinalSortFilesMerger extends CarbonIterator<Object[]> {
       poll.readRow();
     } catch (CarbonSortKeyAndGroupByException e) {
       close();
-      throw new CarbonDataWriterException(e.getMessage(), e);
+      throw new CarbonDataWriterException(e);
     }
 
     // add to heap
diff --git a/processing/src/main/java/org/apache/carbondata/processing/sort/sortdata/SortDataRows.java b/processing/src/main/java/org/apache/carbondata/processing/sort/sortdata/SortDataRows.java
index 996b844..128547d 100644
--- a/processing/src/main/java/org/apache/carbondata/processing/sort/sortdata/SortDataRows.java
+++ b/processing/src/main/java/org/apache/carbondata/processing/sort/sortdata/SortDataRows.java
@@ -174,7 +174,7 @@ public class SortDataRows {
               .execute(new DataSorterAndWriter(recordHolderListLocal));
         } catch (Exception e) {
           LOGGER.error(
-              "exception occurred while trying to acquire a semaphore lock: " + e.getMessage());
+              "exception occurred while trying to acquire a semaphore lock: " + e.getMessage(), e);
           throw new CarbonSortKeyAndGroupByException(e);
         }
         // create the new holder Array
diff --git a/processing/src/main/java/org/apache/carbondata/processing/sort/sortdata/SortTempFileChunkHolder.java b/processing/src/main/java/org/apache/carbondata/processing/sort/sortdata/SortTempFileChunkHolder.java
index 82e6b37..eeea2ec 100644
--- a/processing/src/main/java/org/apache/carbondata/processing/sort/sortdata/SortTempFileChunkHolder.java
+++ b/processing/src/main/java/org/apache/carbondata/processing/sort/sortdata/SortTempFileChunkHolder.java
@@ -157,13 +157,13 @@ public class SortTempFileChunkHolder implements Comparable<SortTempFileChunkHold
         }
       }
     } catch (FileNotFoundException e) {
-      LOGGER.error(e);
+      LOGGER.error(e.getMessage(), e);
       throw new CarbonSortKeyAndGroupByException(tempFile + " No Found", e);
     } catch (IOException e) {
-      LOGGER.error(e);
+      LOGGER.error(e.getMessage(), e);
       throw new CarbonSortKeyAndGroupByException(tempFile + " No Found", e);
     } catch (Exception e) {
-      LOGGER.error(e);
+      LOGGER.error(e.getMessage(), e);
       throw new CarbonSortKeyAndGroupByException(tempFile + " Problem while reading", e);
     }
   }
@@ -204,7 +204,7 @@ public class SortTempFileChunkHolder implements Comparable<SortTempFileChunkHold
         try {
           submit.get();
         } catch (Exception e) {
-          LOGGER.error(e);
+          LOGGER.error(e.getMessage(), e);
         }
         bufferRowCounter = 0;
         currentBuffer = backupBuffer;
@@ -330,7 +330,7 @@ public class SortTempFileChunkHolder implements Comparable<SortTempFileChunkHold
           currentBuffer = prefetchRecordsFromFile(numberOfRecords);
         }
       } catch (Exception e) {
-        LOGGER.error(e);
+        LOGGER.error(e.getMessage(), e);
       }
       return null;
     }
diff --git a/processing/src/main/java/org/apache/carbondata/processing/store/CarbonFactDataHandlerColumnar.java b/processing/src/main/java/org/apache/carbondata/processing/store/CarbonFactDataHandlerColumnar.java
index 96fd544..1270b1f 100644
--- a/processing/src/main/java/org/apache/carbondata/processing/store/CarbonFactDataHandlerColumnar.java
+++ b/processing/src/main/java/org/apache/carbondata/processing/store/CarbonFactDataHandlerColumnar.java
@@ -220,7 +220,7 @@ public class CarbonFactDataHandlerColumnar implements CarbonFactHandler {
         this.entryCount = 0;
       } catch (InterruptedException e) {
         LOGGER.error(e.getMessage(), e);
-        throw new CarbonDataWriterException(e.getMessage(), e);
+        throw new CarbonDataWriterException(e);
       }
     }
   }
@@ -326,7 +326,7 @@ public class CarbonFactDataHandlerColumnar implements CarbonFactHandler {
       processingComplete = true;
     } catch (InterruptedException e) {
       LOGGER.error(e.getMessage(), e);
-      throw new CarbonDataWriterException(e.getMessage(), e);
+      throw new CarbonDataWriterException(e);
     }
   }
 
@@ -344,7 +344,7 @@ public class CarbonFactDataHandlerColumnar implements CarbonFactHandler {
       service.awaitTermination(1, TimeUnit.DAYS);
     } catch (InterruptedException e) {
       LOGGER.error(e.getMessage(), e);
-      throw new CarbonDataWriterException(e.getMessage());
+      throw new CarbonDataWriterException(e);
     }
   }
 
@@ -362,7 +362,7 @@ public class CarbonFactDataHandlerColumnar implements CarbonFactHandler {
         taskList.get(i).get();
       } catch (InterruptedException | ExecutionException e) {
         LOGGER.error(e.getMessage(), e);
-        throw new CarbonDataWriterException(e.getMessage(), e);
+        throw new CarbonDataWriterException(e);
       }
     }
   }
@@ -382,7 +382,7 @@ public class CarbonFactDataHandlerColumnar implements CarbonFactHandler {
         try {
           Thread.sleep(50);
         } catch (InterruptedException e) {
-          throw new CarbonDataWriterException(e.getMessage());
+          throw new CarbonDataWriterException(e);
         }
       }
       consumerExecutorService.shutdownNow();
@@ -641,7 +641,7 @@ public class CarbonFactDataHandlerColumnar implements CarbonFactHandler {
             producerExecutorService.shutdownNow();
             resetBlockletProcessingCount();
             LOGGER.error("Problem while writing the carbon data file", throwable);
-            throw new CarbonDataWriterException(throwable.getMessage());
+            throw new CarbonDataWriterException(throwable);
           }
         } finally {
           semaphore.release();
diff --git a/processing/src/main/java/org/apache/carbondata/processing/util/CarbonLoaderUtil.java b/processing/src/main/java/org/apache/carbondata/processing/util/CarbonLoaderUtil.java
index 0ff3eb6..a396978 100644
--- a/processing/src/main/java/org/apache/carbondata/processing/util/CarbonLoaderUtil.java
+++ b/processing/src/main/java/org/apache/carbondata/processing/util/CarbonLoaderUtil.java
@@ -143,7 +143,7 @@ public final class CarbonLoaderUtil {
         CarbonUtil.deleteFoldersAndFiles(carbonFile);
       }
     } catch (IOException | InterruptedException e) {
-      LOGGER.error("Unable to delete the given path :: " + e.getMessage());
+      LOGGER.error("Unable to delete the given path :: " + e.getMessage(), e);
     }
   }
 
@@ -354,7 +354,7 @@ public final class CarbonLoaderUtil {
           try {
             CarbonUtil.deleteFoldersAndFiles(staleFolder);
           } catch (IOException | InterruptedException e) {
-            LOGGER.error("Failed to delete stale folder: " + e.getMessage());
+            LOGGER.error("Failed to delete stale folder: " + e.getMessage(), e);
           }
         }
         status = true;
@@ -967,7 +967,7 @@ public final class CarbonLoaderUtil {
                     + StringUtils.join(block.getLocations(), ", ")
                     + ")-->" + activeExecutor);
               } catch (IOException e) {
-                LOGGER.error(e);
+                LOGGER.error(e.getMessage(), e);
               }
             }
             remainingBlocks.remove(block);
diff --git a/store/sdk/src/main/java/org/apache/carbondata/store/LocalCarbonStore.java b/store/sdk/src/main/java/org/apache/carbondata/store/LocalCarbonStore.java
index 7bfc1cb..307f64f 100644
--- a/store/sdk/src/main/java/org/apache/carbondata/store/LocalCarbonStore.java
+++ b/store/sdk/src/main/java/org/apache/carbondata/store/LocalCarbonStore.java
@@ -109,7 +109,7 @@ class LocalCarbonStore extends MetaCachedCarbonStore {
         try {
           reader.close();
         } catch (IOException e) {
-          LOGGER.error(e);
+          LOGGER.error(e.getMessage(), e);
         }
       }
     } catch (InterruptedException e) {
@@ -119,7 +119,7 @@ class LocalCarbonStore extends MetaCachedCarbonStore {
         try {
           reader.close();
         } catch (IOException e) {
-          LOGGER.error(e);
+          LOGGER.error(e.getMessage(), e);
         }
       }
     }
diff --git a/tools/cli/src/main/java/org/apache/carbondata/tool/CarbonCli.java b/tools/cli/src/main/java/org/apache/carbondata/tool/CarbonCli.java
index 6fc3128..a122394 100644
--- a/tools/cli/src/main/java/org/apache/carbondata/tool/CarbonCli.java
+++ b/tools/cli/src/main/java/org/apache/carbondata/tool/CarbonCli.java
@@ -123,8 +123,8 @@ public class CarbonCli {
     CommandLine line;
     try {
       line = parser.parse(options, args);
-    } catch (ParseException exp) {
-      throw new RuntimeException("Parsing failed. Reason: " + exp.getMessage());
+    } catch (ParseException ex) {
+      throw new RuntimeException("Parsing failed. Reason: " + ex.getMessage(), ex);
     }
 
     runCli(System.out, options, line);
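
The changes in this commit follow one pattern throughout: when a caught exception is wrapped and
re-thrown, the original exception is passed along as the cause, and the logger receives the
throwable itself rather than only its message, so the original stack trace survives for debugging.
A minimal sketch of that pattern (the class and method names here are illustrative, not code from
the repository):

    // Hedged sketch of the exception-chaining pattern applied above.
    // WrapperException stands in for wrappers such as CarbonDataLoadingException.
    class WrapperException(message: String, cause: Throwable)
      extends RuntimeException(message, cause)

    def writeBatch(doWrite: () => Unit, logger: org.slf4j.Logger): Unit = {
      try {
        doWrite()
      } catch {
        case e: Exception =>
          // Log the message AND the throwable so the full stack trace is recorded,
          // then chain the cause instead of discarding it.
          logger.error("There is an unexpected error: " + e.getMessage, e)
          throw new WrapperException("There is an unexpected error: " + e.getMessage, e)
      }
    }
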


[carbondata] 26/41: [CARBONDATA-3318] Added PreAgg & Bloom Event-Listener for ShowCacheCommand

Posted by ra...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.5
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit c4f32dd3c127011f656e7caea82e5df7af7c1c2a
Author: namanrastogi <na...@gmail.com>
AuthorDate: Sat Mar 9 15:49:00 2019 +0530

    [CARBONDATA-3318] Added PreAgg & Bloom Event-Listener for ShowCacheCommand
    
    Decoupling of Cache Commands
    1. Added PreAgg Event-Listener for ShowCacheCommand
    2. Added Bloom Event-Listener for ShowCacheCommand
    3. Added PreAgg Event-Listener for DropCacheCommand
    4. Added Bloom Event-Listener for DropCacheCommand
    5. Updated doc
    6. Support external table
    6.1 Indicate external tables in the comments / table name
    6.2 Count the index files for external tables
    
    This closes #3146
---
 .../datamap/bloom/BloomCacheKeyValue.java          |   2 +-
 .../bloom/BloomCoarseGrainDataMapFactory.java      |   9 +-
 docs/ddl-of-carbondata.md                          |  13 +
 .../sql/commands/TestCarbonShowCacheCommand.scala  |  51 ++-
 .../{DropCacheEvents.scala => CacheEvents.scala}   |  11 +-
 .../org/apache/carbondata/events/Events.scala      |   9 +-
 .../scala/org/apache/spark/sql/CarbonEnv.scala     |   7 +-
 .../sql/execution/command/cache/CacheUtil.scala    | 108 +++++++
 .../command/cache/CarbonDropCacheCommand.scala     |  59 +---
 .../command/cache/CarbonShowCacheCommand.scala     | 341 ++++++++-------------
 .../command/cache/DropCacheEventListeners.scala    | 121 ++++++++
 .../cache/DropCachePreAggEventListener.scala       |  70 -----
 .../command/cache/ShowCacheEventListeners.scala    | 126 ++++++++
 13 files changed, 581 insertions(+), 346 deletions(-)

diff --git a/datamap/bloom/src/main/java/org/apache/carbondata/datamap/bloom/BloomCacheKeyValue.java b/datamap/bloom/src/main/java/org/apache/carbondata/datamap/bloom/BloomCacheKeyValue.java
index a66ee63..70624eb 100644
--- a/datamap/bloom/src/main/java/org/apache/carbondata/datamap/bloom/BloomCacheKeyValue.java
+++ b/datamap/bloom/src/main/java/org/apache/carbondata/datamap/bloom/BloomCacheKeyValue.java
@@ -35,7 +35,7 @@ public class BloomCacheKeyValue {
     private String shardPath;
     private String indexColumn;
 
-    CacheKey(String shardPath, String indexColumn) {
+    public CacheKey(String shardPath, String indexColumn) {
       this.shardPath = shardPath;
       this.indexColumn = indexColumn;
     }
diff --git a/datamap/bloom/src/main/java/org/apache/carbondata/datamap/bloom/BloomCoarseGrainDataMapFactory.java b/datamap/bloom/src/main/java/org/apache/carbondata/datamap/bloom/BloomCoarseGrainDataMapFactory.java
index 9785549..11b216e 100644
--- a/datamap/bloom/src/main/java/org/apache/carbondata/datamap/bloom/BloomCoarseGrainDataMapFactory.java
+++ b/datamap/bloom/src/main/java/org/apache/carbondata/datamap/bloom/BloomCoarseGrainDataMapFactory.java
@@ -227,7 +227,8 @@ public class BloomCoarseGrainDataMapFactory extends DataMapFactory<CoarseGrainDa
    * returns all shard directories of bloom index files for query
    * if bloom index files are merged we should get only one shard path
    */
-  private Set<String> getAllShardPaths(String tablePath, String segmentId) {
+  public static Set<String> getAllShardPaths(String tablePath, String segmentId,
+      String dataMapName) {
     String dataMapStorePath = CarbonTablePath.getDataMapStorePath(
         tablePath, segmentId, dataMapName);
     CarbonFile[] carbonFiles = FileFactory.getCarbonFile(dataMapStorePath).listFiles();
@@ -257,7 +258,8 @@ public class BloomCoarseGrainDataMapFactory extends DataMapFactory<CoarseGrainDa
     try {
       Set<String> shardPaths = segmentMap.get(segment.getSegmentNo());
       if (shardPaths == null) {
-        shardPaths = getAllShardPaths(getCarbonTable().getTablePath(), segment.getSegmentNo());
+        shardPaths =
+            getAllShardPaths(getCarbonTable().getTablePath(), segment.getSegmentNo(), dataMapName);
         segmentMap.put(segment.getSegmentNo(), shardPaths);
       }
       Set<String> filteredShards = segment.getFilteredIndexShardNames();
@@ -299,7 +301,8 @@ public class BloomCoarseGrainDataMapFactory extends DataMapFactory<CoarseGrainDa
     List<DataMapDistributable> dataMapDistributableList = new ArrayList<>();
     Set<String> shardPaths = segmentMap.get(segment.getSegmentNo());
     if (shardPaths == null) {
-      shardPaths = getAllShardPaths(getCarbonTable().getTablePath(), segment.getSegmentNo());
+      shardPaths =
+          getAllShardPaths(getCarbonTable().getTablePath(), segment.getSegmentNo(), dataMapName);
       segmentMap.put(segment.getSegmentNo(), shardPaths);
     }
     Set<String> filteredShards = segment.getFilteredIndexShardNames();
diff --git a/docs/ddl-of-carbondata.md b/docs/ddl-of-carbondata.md
index e6f209e..07a2670 100644
--- a/docs/ddl-of-carbondata.md
+++ b/docs/ddl-of-carbondata.md
@@ -1119,3 +1119,16 @@ Users can specify which columns to include and exclude for local dictionary gene
   its dictionary files, its datamaps and children tables.
     
   This command is not allowed on child tables.
+
+### Important points
+
+  1. Cache information is updated only after a SELECT query has been executed on the table.
+
+  2. In case of ALTER TABLE, the cache already loaded for the table is invalidated when any
+  subsequent SELECT query is fired.
+
+  3. Dictionary is loaded into the cache only when a dictionary column is queried directly. If
+  no direct query is run on a dictionary column, the dictionary cache is not loaded.
+  For `SELECT * FROM t1`, even though the dictionary is loaded, it is loaded on the executors
+  and not on the driver, and only the final result rows are returned to the driver; hence it
+  leaves no trace in the driver cache reported by `SHOW METACACHE` or `SHOW METACACHE ON TABLE t1`.
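
A hedged usage sketch of the behaviour documented in point 3 above (assuming a SparkSession
`spark` with CarbonData configured and a table t1 whose column dict_col is a hypothetical
global-dictionary column):

    // Sketch only: the driver cache reported by SHOW METACACHE does not grow just because
    // a SELECT * decoded dictionary values on the executors.
    spark.sql("SELECT * FROM t1").show()
    spark.sql("SHOW METACACHE ON TABLE t1").show(false)   // Dictionary row stays at 0 B

    // A direct query/filter on the dictionary column loads the dictionary on the driver too.
    spark.sql("SELECT dict_col FROM t1 WHERE dict_col = 'x'").show()
    spark.sql("SHOW METACACHE ON TABLE t1").show(false)   // Dictionary size is now reported
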
diff --git a/integration/spark-common-test/src/test/scala/org/apache/carbondata/sql/commands/TestCarbonShowCacheCommand.scala b/integration/spark-common-test/src/test/scala/org/apache/carbondata/sql/commands/TestCarbonShowCacheCommand.scala
index e7fd5fa..35ac2e3 100644
--- a/integration/spark-common-test/src/test/scala/org/apache/carbondata/sql/commands/TestCarbonShowCacheCommand.scala
+++ b/integration/spark-common-test/src/test/scala/org/apache/carbondata/sql/commands/TestCarbonShowCacheCommand.scala
@@ -17,10 +17,14 @@
 
 package org.apache.carbondata.sql.commands
 
-import org.apache.spark.sql.Row
+import org.apache.spark.sql.{CarbonEnv, Row}
 import org.apache.spark.sql.test.util.QueryTest
+import org.junit.Assert
 import org.scalatest.BeforeAndAfterAll
 
+import org.apache.carbondata.core.constants.CarbonCommonConstants
+import org.apache.carbondata.core.datastore.impl.FileFactory
+
 class TestCarbonShowCacheCommand extends QueryTest with BeforeAndAfterAll {
   override protected def beforeAll(): Unit = {
     // use new database
@@ -133,6 +137,28 @@ class TestCarbonShowCacheCommand extends QueryTest with BeforeAndAfterAll {
     assert(showCache(0).get(2).toString.equalsIgnoreCase("1/1 index files cached"))
   }
 
+  test("test external table show cache") {
+    sql(s"CREATE TABLE employeeTable(empno int, empname String, designation String, " +
+        s"doj Timestamp, workgroupcategory int, workgroupcategoryname String, deptno int, " +
+        s"deptname String, projectcode int, projectjoindate Timestamp, projectenddate Timestamp," +
+        s"attendance int, utilization int, salary int) stored by 'carbondata'")
+    sql(s"LOAD DATA INPATH '$resourcesPath/data.csv' INTO TABLE employeeTable")
+    val table = CarbonEnv.getCarbonTable(Some("default"), "employeeTable")(sqlContext.sparkSession)
+    val location = FileFactory
+      .getUpdatedFilePath(
+        table.getTablePath + CarbonCommonConstants.FILE_SEPARATOR + "/Fact/Part0/Segment_0")
+    sql(s"CREATE EXTERNAL TABLE extTable stored as carbondata LOCATION '${location}'")
+    sql("select * from extTable").show()
+    val rows = sql("SHOW METACACHE ON TABLE extTable").collect()
+    var isPresent = false
+    rows.foreach(row => {
+      if (row.getString(2).equalsIgnoreCase("1/1 index files cached (external table)")){
+        isPresent = true
+      }
+    })
+    Assert.assertTrue(isPresent)
+  }
+
   override protected def afterAll(): Unit = {
     sql("use default").collect()
     dropTable
@@ -145,42 +171,63 @@ class TestCarbonShowCacheCommand extends QueryTest with BeforeAndAfterAll {
     sql("DROP TABLE IF EXISTS default.cache_4")
     sql("DROP TABLE IF EXISTS default.cache_5")
     sql("DROP TABLE IF EXISTS empTable")
+    sql("DROP TABLE IF EXISTS employeeTable")
+    sql("DROP TABLE IF EXISTS extTable")
   }
 
   test("show cache") {
+
+    // Empty database
     sql("use cache_empty_db").collect()
     val result1 = sql("show metacache").collect()
     assertResult(2)(result1.length)
     assertResult(Row("cache_empty_db", "ALL", "0 B", "0 B", "0 B"))(result1(1))
 
+    // Database with 3 tables but only 2 are in cache
     sql("use cache_db").collect()
     val result2 = sql("show metacache").collect()
     assertResult(4)(result2.length)
 
+    // Make sure PreAgg tables are not in SHOW METACACHE
     sql("use default").collect()
     val result3 = sql("show metacache").collect()
     val dataMapCacheInfo = result3
       .map(row => row.getString(1))
       .filter(table => table.equals("cache_4_cache_4_count"))
-    assertResult(1)(dataMapCacheInfo.length)
+    assertResult(0)(dataMapCacheInfo.length)
   }
 
   test("show metacache on table") {
     sql("use cache_db").collect()
+
+    // Table with Index, Dictionary & Bloom filter
     val result1 = sql("show metacache on table cache_1").collect()
     assertResult(3)(result1.length)
+    assertResult("1/1 index files cached")(result1(0).getString(2))
+    assertResult("bloomfilter")(result1(2).getString(2))
 
+    // Table with Index and Dictionary
     val result2 = sql("show metacache on table cache_db.cache_2").collect()
     assertResult(2)(result2.length)
+    assertResult("2/2 index files cached")(result2(0).getString(2))
+    assertResult("0 B")(result2(1).getString(1))
 
+    // Table not in cache
     checkAnswer(sql("show metacache on table cache_db.cache_3"),
       Seq(Row("Index", "0 B", "0/1 index files cached"), Row("Dictionary", "0 B", "")))
 
+    // Table with Index, Dictionary & PreAgg child table
     val result4 = sql("show metacache on table default.cache_4").collect()
     assertResult(3)(result4.length)
+    assertResult("1/1 index files cached")(result4(0).getString(2))
+    assertResult("0 B")(result4(1).getString(1))
+    assertResult("preaggregate")(result4(2).getString(2))
 
     sql("use default").collect()
+
+    // Table with 5 index files
     val result5 = sql("show metacache on table cache_5").collect()
     assertResult(2)(result5.length)
+    assertResult("5/5 index files cached")(result5(0).getString(2))
   }
 }
diff --git a/integration/spark-common/src/main/scala/org/apache/carbondata/events/DropCacheEvents.scala b/integration/spark-common/src/main/scala/org/apache/carbondata/events/CacheEvents.scala
similarity index 80%
rename from integration/spark-common/src/main/scala/org/apache/carbondata/events/DropCacheEvents.scala
rename to integration/spark-common/src/main/scala/org/apache/carbondata/events/CacheEvents.scala
index 2e8b78e..ec5127f 100644
--- a/integration/spark-common/src/main/scala/org/apache/carbondata/events/DropCacheEvents.scala
+++ b/integration/spark-common/src/main/scala/org/apache/carbondata/events/CacheEvents.scala
@@ -21,8 +21,15 @@ import org.apache.spark.sql.SparkSession
 
 import org.apache.carbondata.core.metadata.schema.table.CarbonTable
 
-case class DropCacheEvent(
+case class DropTableCacheEvent(
     carbonTable: CarbonTable,
     sparkSession: SparkSession,
     internalCall: Boolean)
-  extends Event with DropCacheEventInfo
+  extends Event with DropTableCacheEventInfo
+
+
+case class ShowTableCacheEvent(
+    carbonTable: CarbonTable,
+    sparkSession: SparkSession,
+    internalCall: Boolean)
+  extends Event with ShowTableCacheEventInfo
diff --git a/integration/spark-common/src/main/scala/org/apache/carbondata/events/Events.scala b/integration/spark-common/src/main/scala/org/apache/carbondata/events/Events.scala
index c03d3c6..e6b9213 100644
--- a/integration/spark-common/src/main/scala/org/apache/carbondata/events/Events.scala
+++ b/integration/spark-common/src/main/scala/org/apache/carbondata/events/Events.scala
@@ -63,9 +63,16 @@ trait DropTableEventInfo {
 }
 
 /**
+ * event for show cache
+ */
+trait ShowTableCacheEventInfo {
+  val carbonTable: CarbonTable
+}
+
+/**
  * event for drop cache
  */
-trait DropCacheEventInfo {
+trait DropTableCacheEventInfo {
   val carbonTable: CarbonTable
 }
 
diff --git a/integration/spark2/src/main/scala/org/apache/spark/sql/CarbonEnv.scala b/integration/spark2/src/main/scala/org/apache/spark/sql/CarbonEnv.scala
index 60d896a..7ca9945 100644
--- a/integration/spark2/src/main/scala/org/apache/spark/sql/CarbonEnv.scala
+++ b/integration/spark2/src/main/scala/org/apache/spark/sql/CarbonEnv.scala
@@ -23,7 +23,7 @@ import org.apache.spark.sql.catalyst.TableIdentifier
 import org.apache.spark.sql.catalyst.analysis.NoSuchTableException
 import org.apache.spark.sql.catalyst.catalog.SessionCatalog
 import org.apache.spark.sql.events.{MergeBloomIndexEventListener, MergeIndexEventListener}
-import org.apache.spark.sql.execution.command.cache.DropCachePreAggEventListener
+import org.apache.spark.sql.execution.command.cache._
 import org.apache.spark.sql.execution.command.preaaggregate._
 import org.apache.spark.sql.execution.command.timeseries.TimeSeriesFunction
 import org.apache.spark.sql.hive._
@@ -186,7 +186,10 @@ object CarbonEnv {
       .addListener(classOf[AlterTableCompactionPostEvent], new MergeIndexEventListener)
       .addListener(classOf[AlterTableMergeIndexEvent], new MergeIndexEventListener)
       .addListener(classOf[BuildDataMapPostExecutionEvent], new MergeBloomIndexEventListener)
-      .addListener(classOf[DropCacheEvent], DropCachePreAggEventListener)
+      .addListener(classOf[DropTableCacheEvent], DropCachePreAggEventListener)
+      .addListener(classOf[DropTableCacheEvent], DropCacheBloomEventListener)
+      .addListener(classOf[ShowTableCacheEvent], ShowCachePreAggEventListener)
+      .addListener(classOf[ShowTableCacheEvent], ShowCacheBloomEventListener)
   }
 
   /**
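
The listener registration above is what decouples SHOW/DROP METACACHE from the individual datamap
providers: the command only fires a DropTableCacheEvent or ShowTableCacheEvent, and every listener
registered for that event class contributes its own cache information. A minimal sketch of such a
listener, assuming the onEvent(event, operationContext) callback of the project's
OperationEventListener (the real listeners live in ShowCacheEventListeners.scala and
DropCacheEventListeners.scala, added by this commit; the object name below is made up):

    import org.apache.carbondata.events.{Event, OperationContext, OperationEventListener,
      ShowTableCacheEvent}

    object ShowCacheMyProviderListener extends OperationEventListener {
      override def onEvent(event: Event, operationContext: OperationContext): Unit = {
        event match {
          case showEvent: ShowTableCacheEvent =>
            // A real listener would collect its provider's cache keys for
            // showEvent.carbonTable, measure them, and record the sizes in
            // operationContext so CarbonShowCacheCommand can render extra rows.
            ()
          case _ => // not interested in other events
        }
      }
    }
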
diff --git a/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/cache/CacheUtil.scala b/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/cache/CacheUtil.scala
new file mode 100644
index 0000000..615d8e0
--- /dev/null
+++ b/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/cache/CacheUtil.scala
@@ -0,0 +1,108 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.command.cache
+
+import org.apache.hadoop.mapred.JobConf
+import scala.collection.JavaConverters._
+
+import org.apache.carbondata.core.cache.CacheType
+import org.apache.carbondata.core.constants.CarbonCommonConstants
+import org.apache.carbondata.core.datamap.Segment
+import org.apache.carbondata.core.datastore.impl.FileFactory
+import org.apache.carbondata.core.metadata.schema.table.{CarbonTable, DataMapSchema}
+import org.apache.carbondata.core.readcommitter.LatestFilesReadCommittedScope
+import org.apache.carbondata.datamap.bloom.{BloomCacheKeyValue, BloomCoarseGrainDataMapFactory}
+import org.apache.carbondata.processing.merger.CarbonDataMergerUtil
+
+
+object CacheUtil {
+
+  /**
+   * Given a carbonTable, returns the list of all carbonindex files
+   *
+   * @param carbonTable
+   * @return List of all index files
+   */
+  def getAllIndexFiles(carbonTable: CarbonTable): List[String] = {
+    if (carbonTable.isTransactionalTable) {
+      val absoluteTableIdentifier = carbonTable.getAbsoluteTableIdentifier
+      CarbonDataMergerUtil.getValidSegmentList(absoluteTableIdentifier).asScala.flatMap {
+        segment =>
+          segment.getCommittedIndexFile.keySet().asScala
+      }.toList
+    } else {
+      val tablePath = carbonTable.getTablePath
+      val readCommittedScope = new LatestFilesReadCommittedScope(tablePath,
+        FileFactory.getConfiguration)
+      readCommittedScope.getSegmentList.flatMap {
+        load =>
+          val seg = new Segment(load.getLoadName, null, readCommittedScope)
+          seg.getCommittedIndexFile.keySet().asScala
+      }.toList
+    }
+  }
+
+  /**
+   * Given a carbonTable file, returns a list of all dictionary entries which can be in cache
+   *
+   * @param carbonTable
+   * @return List of all dict entries which can in cache
+   */
+  def getAllDictCacheKeys(carbonTable: CarbonTable): List[String] = {
+    def getDictCacheKey(columnIdentifier: String,
+        cacheType: CacheType[_, _]): String = {
+      columnIdentifier + CarbonCommonConstants.UNDERSCORE + cacheType.getCacheName
+    }
+
+    carbonTable.getAllDimensions.asScala
+      .collect {
+        case dict if dict.isGlobalDictionaryEncoding =>
+          Seq(getDictCacheKey(dict.getColumnId, CacheType.FORWARD_DICTIONARY),
+            getDictCacheKey(dict.getColumnId, CacheType.REVERSE_DICTIONARY))
+      }.flatten.toList
+  }
+
+  def getBloomCacheKeys(carbonTable: CarbonTable, datamap: DataMapSchema): List[String] = {
+    val segments = CarbonDataMergerUtil
+      .getValidSegmentList(carbonTable.getAbsoluteTableIdentifier).asScala
+
+    // Generate shard Path for the datamap
+    val shardPaths = segments.flatMap {
+      segment =>
+        BloomCoarseGrainDataMapFactory.getAllShardPaths(carbonTable.getTablePath,
+          segment.getSegmentNo, datamap.getDataMapName).asScala
+    }
+
+    // get index columns
+    val indexColumns = carbonTable.getIndexedColumns(datamap).asScala.map {
+      entry =>
+        entry.getColName
+    }
+
+    // generate cache key using shard path and index columns on which bloom was created.
+    val datamapKeys = shardPaths.flatMap {
+      shardPath =>
+        indexColumns.map {
+          indexCol =>
+            new BloomCacheKeyValue.CacheKey(shardPath, indexCol).toString
+      }
+    }
+    datamapKeys.toList
+  }
+
+}
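
CacheUtil above only derives cache keys (index files, dictionary cache keys, bloom shard keys);
the commands below decide whether to probe or evict them. A hedged sketch of the probing side,
mirroring the logic used by CarbonShowCacheCommand further down in this commit:

    import org.apache.carbondata.core.cache.CacheProvider
    import org.apache.carbondata.core.metadata.schema.table.CarbonTable
    import org.apache.spark.sql.execution.command.cache.CacheUtil

    // Sketch: report how many of a table's index files are currently in the driver cache
    // and how much memory they occupy.
    def reportIndexCache(carbonTable: CarbonTable): String = {
      val cache = CacheProvider.getInstance().getCarbonCache
      val allIndexFiles = CacheUtil.getAllIndexFiles(carbonTable)
      val cached = allIndexFiles.filter(key => cache.get(key) != null)
      val cachedSize = cached.map(key => cache.get(key).getMemorySize).sum
      s"${cached.size}/${allIndexFiles.size} index files cached ($cachedSize B)"
    }
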
diff --git a/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/cache/CarbonDropCacheCommand.scala b/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/cache/CarbonDropCacheCommand.scala
index e955ed9..a0bb43e 100644
--- a/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/cache/CarbonDropCacheCommand.scala
+++ b/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/cache/CarbonDropCacheCommand.scala
@@ -18,7 +18,6 @@
 package org.apache.spark.sql.execution.command.cache
 
 import scala.collection.JavaConverters._
-import scala.collection.mutable.ListBuffer
 
 import org.apache.spark.sql.{CarbonEnv, Row, SparkSession}
 import org.apache.spark.sql.catalyst.TableIdentifier
@@ -26,12 +25,8 @@ import org.apache.spark.sql.execution.command.MetadataCommand
 
 import org.apache.carbondata.common.logging.LogServiceFactory
 import org.apache.carbondata.core.cache.CacheProvider
-import org.apache.carbondata.core.cache.dictionary.AbstractColumnDictionaryInfo
-import org.apache.carbondata.core.constants.CarbonCommonConstants
-import org.apache.carbondata.core.indexstore.BlockletDataMapIndexWrapper
 import org.apache.carbondata.core.metadata.schema.table.CarbonTable
-import org.apache.carbondata.datamap.bloom.BloomCacheKeyValue
-import org.apache.carbondata.events.{DropCacheEvent, OperationContext, OperationListenerBus}
+import org.apache.carbondata.events.{DropTableCacheEvent, OperationContext, OperationListenerBus}
 
 case class CarbonDropCacheCommand(tableIdentifier: TableIdentifier, internalCall: Boolean = false)
   extends MetadataCommand {
@@ -45,59 +40,27 @@ case class CarbonDropCacheCommand(tableIdentifier: TableIdentifier, internalCall
   }
 
   def clearCache(carbonTable: CarbonTable, sparkSession: SparkSession): Unit = {
-    LOGGER.info("Drop cache request received for table " + carbonTable.getTableName)
+    LOGGER.info("Drop cache request received for table " + carbonTable.getTableUniqueName)
 
-    val dropCacheEvent = DropCacheEvent(
-      carbonTable,
-      sparkSession,
-      internalCall
-    )
+    val dropCacheEvent = DropTableCacheEvent(carbonTable, sparkSession, internalCall)
     val operationContext = new OperationContext
     OperationListenerBus.getInstance.fireEvent(dropCacheEvent, operationContext)
 
     val cache = CacheProvider.getInstance().getCarbonCache
     if (cache != null) {
-      val tablePath = carbonTable.getTablePath + CarbonCommonConstants.FILE_SEPARATOR
 
-      // Dictionary IDs
-      val dictIds = carbonTable.getAllDimensions.asScala.filter(_.isGlobalDictionaryEncoding)
-        .map(_.getColumnId).toArray
+      // Get all Index files for the specified table.
+      val allIndexFiles = CacheUtil.getAllIndexFiles(carbonTable)
 
-      // Remove elements from cache
-      val keysToRemove = ListBuffer[String]()
-      val cacheIterator = cache.getCacheMap.entrySet().iterator()
-      while (cacheIterator.hasNext) {
-        val entry = cacheIterator.next()
-        val cache = entry.getValue
+      // Extract dictionary keys for the table and create cache keys from those
+      val dictKeys: List[String] = CacheUtil.getAllDictCacheKeys(carbonTable)
 
-        if (cache.isInstanceOf[BlockletDataMapIndexWrapper]) {
-          // index
-          val indexPath = entry.getKey.replace(CarbonCommonConstants.WINDOWS_FILE_SEPARATOR,
-            CarbonCommonConstants.FILE_SEPARATOR)
-          if (indexPath.startsWith(tablePath)) {
-            keysToRemove += entry.getKey
-          }
-        } else if (cache.isInstanceOf[BloomCacheKeyValue.CacheValue]) {
-          // bloom datamap
-          val shardPath = entry.getKey.replace(CarbonCommonConstants.WINDOWS_FILE_SEPARATOR,
-            CarbonCommonConstants.FILE_SEPARATOR)
-          if (shardPath.contains(tablePath)) {
-            keysToRemove += entry.getKey
-          }
-        } else if (cache.isInstanceOf[AbstractColumnDictionaryInfo]) {
-          // dictionary
-          val dictId = dictIds.find(id => entry.getKey.startsWith(id))
-          if (dictId.isDefined) {
-            keysToRemove += entry.getKey
-          }
-        }
-      }
+      // Remove elements from cache
+      val keysToRemove = allIndexFiles ++ dictKeys
       cache.removeAll(keysToRemove.asJava)
     }
-
-    LOGGER.info("Drop cache request received for table " + carbonTable.getTableName)
+    LOGGER.info("Drop cache request served for table " + carbonTable.getTableUniqueName)
   }
 
-  override protected def opName: String = "DROP CACHE"
-
+  override protected def opName: String = "DROP METACACHE"
 }
diff --git a/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/cache/CarbonShowCacheCommand.scala b/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/cache/CarbonShowCacheCommand.scala
index 462be83..e19ee48 100644
--- a/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/cache/CarbonShowCacheCommand.scala
+++ b/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/cache/CarbonShowCacheCommand.scala
@@ -20,27 +20,28 @@ package org.apache.spark.sql.execution.command.cache
 import scala.collection.mutable
 import scala.collection.JavaConverters._
 
+import org.apache.hadoop.mapred.JobConf
 import org.apache.spark.sql.{CarbonEnv, Row, SparkSession}
 import org.apache.spark.sql.catalyst.TableIdentifier
 import org.apache.spark.sql.catalyst.expressions.AttributeReference
 import org.apache.spark.sql.execution.command.MetadataCommand
 import org.apache.spark.sql.types.StringType
 
-import org.apache.carbondata.core.cache.CacheProvider
+import org.apache.carbondata.core.cache.{CacheProvider, CacheType}
 import org.apache.carbondata.core.cache.dictionary.AbstractColumnDictionaryInfo
 import org.apache.carbondata.core.constants.CarbonCommonConstants
-import org.apache.carbondata.core.datamap.DataMapStoreManager
+import org.apache.carbondata.core.datamap.Segment
 import org.apache.carbondata.core.indexstore.BlockletDataMapIndexWrapper
-import org.apache.carbondata.core.metadata.AbsoluteTableIdentifier
-import org.apache.carbondata.core.metadata.schema.table.{CarbonTable, DataMapSchema}
+import org.apache.carbondata.core.metadata.schema.table.CarbonTable
+import org.apache.carbondata.core.readcommitter.LatestFilesReadCommittedScope
 import org.apache.carbondata.datamap.bloom.BloomCacheKeyValue
+import org.apache.carbondata.events.{OperationContext, OperationListenerBus, ShowTableCacheEvent}
 import org.apache.carbondata.processing.merger.CarbonDataMergerUtil
 import org.apache.carbondata.spark.util.CommonUtil.bytesToDisplaySize
 
-/**
- * SHOW CACHE
- */
-case class CarbonShowCacheCommand(tableIdentifier: Option[TableIdentifier])
+
+case class CarbonShowCacheCommand(tableIdentifier: Option[TableIdentifier],
+    internalCall: Boolean = false)
   extends MetadataCommand {
 
   override def output: Seq[AttributeReference] = {
@@ -61,241 +62,147 @@ case class CarbonShowCacheCommand(tableIdentifier: Option[TableIdentifier])
 
   override protected def opName: String = "SHOW CACHE"
 
-  def showAllTablesCache(sparkSession: SparkSession): Seq[Row] = {
+  def getAllTablesCache(sparkSession: SparkSession): Seq[Row] = {
     val currentDatabase = sparkSession.sessionState.catalog.getCurrentDatabase
     val cache = CacheProvider.getInstance().getCarbonCache()
     if (cache == null) {
       Seq(
-        Row("ALL", "ALL", bytesToDisplaySize(0L),
-          bytesToDisplaySize(0L), bytesToDisplaySize(0L)),
-        Row(currentDatabase, "ALL", bytesToDisplaySize(0L),
-          bytesToDisplaySize(0L), bytesToDisplaySize(0L)))
+        Row("ALL", "ALL", 0L, 0L, 0L),
+        Row(currentDatabase, "ALL", 0L, 0L, 0L))
     } else {
       val carbonTables = CarbonEnv.getInstance(sparkSession).carbonMetaStore
-        .listAllTables(sparkSession)
-        .filter { table =>
-        table.getDatabaseName.equalsIgnoreCase(currentDatabase)
-      }
-      val tablePaths = carbonTables
-        .map { table =>
-          (table.getTablePath + CarbonCommonConstants.FILE_SEPARATOR,
-            table.getDatabaseName + "." + table.getTableName)
+        .listAllTables(sparkSession).filter {
+        carbonTable =>
+          carbonTable.getDatabaseName.equalsIgnoreCase(currentDatabase) &&
+          !carbonTable.isChildDataMap
       }
 
-      val dictIds = carbonTables
-        .filter(_ != null)
-        .flatMap { table =>
-          table
-            .getAllDimensions
-            .asScala
-            .filter(_.isGlobalDictionaryEncoding)
-            .toArray
-            .map(dim => (dim.getColumnId, table.getDatabaseName + "." + table.getTableName))
-        }
-
-      // all databases
-      var (allIndexSize, allDatamapSize, allDictSize) = (0L, 0L, 0L)
-      // current database
+      // All tables of current database
       var (dbIndexSize, dbDatamapSize, dbDictSize) = (0L, 0L, 0L)
-      val tableMapIndexSize = mutable.HashMap[String, Long]()
-      val tableMapDatamapSize = mutable.HashMap[String, Long]()
-      val tableMapDictSize = mutable.HashMap[String, Long]()
-      val cacheIterator = cache.getCacheMap.entrySet().iterator()
-      while (cacheIterator.hasNext) {
-        val entry = cacheIterator.next()
-        val cache = entry.getValue
-        if (cache.isInstanceOf[BlockletDataMapIndexWrapper]) {
-          // index
-          allIndexSize = allIndexSize + cache.getMemorySize
-          val indexPath = entry.getKey.replace(
-            CarbonCommonConstants.WINDOWS_FILE_SEPARATOR, CarbonCommonConstants.FILE_SEPARATOR)
-          val tablePath = tablePaths.find(path => indexPath.startsWith(path._1))
-          if (tablePath.isDefined) {
-            dbIndexSize = dbIndexSize + cache.getMemorySize
-            val memorySize = tableMapIndexSize.get(tablePath.get._2)
-            if (memorySize.isEmpty) {
-              tableMapIndexSize.put(tablePath.get._2, cache.getMemorySize)
-            } else {
-              tableMapIndexSize.put(tablePath.get._2, memorySize.get + cache.getMemorySize)
-            }
-          }
-        } else if (cache.isInstanceOf[BloomCacheKeyValue.CacheValue]) {
-          // bloom datamap
-          allDatamapSize = allDatamapSize + cache.getMemorySize
-          val shardPath = entry.getKey.replace(CarbonCommonConstants.WINDOWS_FILE_SEPARATOR,
-            CarbonCommonConstants.FILE_SEPARATOR)
-          val tablePath = tablePaths.find(path => shardPath.contains(path._1))
-          if (tablePath.isDefined) {
-            dbDatamapSize = dbDatamapSize + cache.getMemorySize
-            val memorySize = tableMapDatamapSize.get(tablePath.get._2)
-            if (memorySize.isEmpty) {
-              tableMapDatamapSize.put(tablePath.get._2, cache.getMemorySize)
-            } else {
-              tableMapDatamapSize.put(tablePath.get._2, memorySize.get + cache.getMemorySize)
-            }
-          }
-        } else if (cache.isInstanceOf[AbstractColumnDictionaryInfo]) {
-          // dictionary
-          allDictSize = allDictSize + cache.getMemorySize
-          val dictId = dictIds.find(id => entry.getKey.startsWith(id._1))
-          if (dictId.isDefined) {
-            dbDictSize = dbDictSize + cache.getMemorySize
-            val memorySize = tableMapDictSize.get(dictId.get._2)
-            if (memorySize.isEmpty) {
-              tableMapDictSize.put(dictId.get._2, cache.getMemorySize)
-            } else {
-              tableMapDictSize.put(dictId.get._2, memorySize.get + cache.getMemorySize)
-            }
-          }
-        }
-      }
-      if (tableMapIndexSize.isEmpty && tableMapDatamapSize.isEmpty && tableMapDictSize.isEmpty) {
-        Seq(
-          Row("ALL", "ALL", bytesToDisplaySize(allIndexSize),
-            bytesToDisplaySize(allDatamapSize), bytesToDisplaySize(allDictSize)),
-          Row(currentDatabase, "ALL", bytesToDisplaySize(0),
-            bytesToDisplaySize(0), bytesToDisplaySize(0)))
-      } else {
-        val tableList = tableMapIndexSize
-          .map(_._1)
-          .toSeq
-          .union(tableMapDictSize.map(_._1).toSeq)
-          .distinct
-          .sorted
-          .map { uniqueName =>
-            val values = uniqueName.split("\\.")
-            val indexSize = tableMapIndexSize.getOrElse(uniqueName, 0L)
-            val datamapSize = tableMapDatamapSize.getOrElse(uniqueName, 0L)
-            val dictSize = tableMapDictSize.getOrElse(uniqueName, 0L)
-            Row(values(0), values(1), bytesToDisplaySize(indexSize),
-              bytesToDisplaySize(datamapSize), bytesToDisplaySize(dictSize))
+      val tableList: Seq[Row] = carbonTables.map {
+        carbonTable =>
+          val tableResult = getTableCache(sparkSession, carbonTable)
+          var (indexSize, datamapSize) = (tableResult(0).getLong(1), 0L)
+          tableResult.drop(2).foreach {
+            row =>
+              indexSize += row.getLong(1)
+              datamapSize += row.getLong(2)
           }
+          val dictSize = tableResult(1).getLong(1)
 
-        Seq(
-          Row("ALL", "ALL", bytesToDisplaySize(allIndexSize),
-            bytesToDisplaySize(allDatamapSize), bytesToDisplaySize(allDictSize)),
-          Row(currentDatabase, "ALL", bytesToDisplaySize(dbIndexSize),
-            bytesToDisplaySize(dbDatamapSize), bytesToDisplaySize(dbDictSize))
-        ) ++ tableList
-      }
-    }
-  }
-
-  def showTableCache(sparkSession: SparkSession, carbonTable: CarbonTable): Seq[Row] = {
-    val cache = CacheProvider.getInstance().getCarbonCache()
-    if (cache == null) {
-      Seq.empty
-    } else {
-      val tablePath = carbonTable.getTablePath + CarbonCommonConstants.FILE_SEPARATOR
-      var numIndexFilesCached = 0
+          dbIndexSize += indexSize
+          dbDictSize += dictSize
+          dbDatamapSize += datamapSize
 
-      // Path -> Name, Type
-      val datamapName = mutable.Map[String, (String, String)]()
-      // Path -> Size
-      val datamapSize = mutable.Map[String, Long]()
-      // parent table
-      datamapName.put(tablePath, ("", ""))
-      datamapSize.put(tablePath, 0)
-      // children tables
-      for( schema <- carbonTable.getTableInfo.getDataMapSchemaList.asScala ) {
-        val childTableName = carbonTable.getTableName + "_" + schema.getDataMapName
-        val childTable = CarbonEnv
-          .getCarbonTable(Some(carbonTable.getDatabaseName), childTableName)(sparkSession)
-        val path = childTable.getTablePath + CarbonCommonConstants.FILE_SEPARATOR
-        val name = schema.getDataMapName
-        val dmType = schema.getProviderName
-        datamapName.put(path, (name, dmType))
-        datamapSize.put(path, 0)
-      }
-      // index schemas
-      for (schema <- DataMapStoreManager.getInstance().getDataMapSchemasOfTable(carbonTable)
-        .asScala) {
-        val path = tablePath + schema.getDataMapName + CarbonCommonConstants.FILE_SEPARATOR
-        val name = schema.getDataMapName
-        val dmType = schema.getProviderName
-        datamapName.put(path, (name, dmType))
-        datamapSize.put(path, 0)
-      }
-
-      var dictSize = 0L
-
-      // dictionary column ids
-      val dictIds = carbonTable
-        .getAllDimensions
-        .asScala
-        .filter(_.isGlobalDictionaryEncoding)
-        .map(_.getColumnId)
-        .toArray
-
-      val cacheIterator = cache.getCacheMap.entrySet().iterator()
-      while (cacheIterator.hasNext) {
-        val entry = cacheIterator.next()
-        val cache = entry.getValue
-
-        if (cache.isInstanceOf[BlockletDataMapIndexWrapper]) {
-          // index
-          val indexPath = entry.getKey.replace(CarbonCommonConstants.WINDOWS_FILE_SEPARATOR,
-            CarbonCommonConstants.FILE_SEPARATOR)
-          val pathEntry = datamapSize.filter(entry => indexPath.startsWith(entry._1))
-          if(pathEntry.nonEmpty) {
-            val (path, size) = pathEntry.iterator.next()
-            datamapSize.put(path, size + cache.getMemorySize)
-          }
-          if(indexPath.startsWith(tablePath)) {
-            numIndexFilesCached += 1
+          val tableName = if (!carbonTable.isTransactionalTable) {
+            carbonTable.getTableName + " (external table)"
           }
-        } else if (cache.isInstanceOf[BloomCacheKeyValue.CacheValue]) {
-          // bloom datamap
-          val shardPath = entry.getKey.replace(CarbonCommonConstants.WINDOWS_FILE_SEPARATOR,
-            CarbonCommonConstants.FILE_SEPARATOR)
-          val pathEntry = datamapSize.filter(entry => shardPath.contains(entry._1))
-          if(pathEntry.nonEmpty) {
-            val (path, size) = pathEntry.iterator.next()
-            datamapSize.put(path, size + cache.getMemorySize)
+          else {
+            carbonTable.getTableName
           }
-        } else if (cache.isInstanceOf[AbstractColumnDictionaryInfo]) {
-          // dictionary
-          val dictId = dictIds.find(id => entry.getKey.startsWith(id))
-          if (dictId.isDefined) {
-            dictSize = dictSize + cache.getMemorySize
+          (currentDatabase, tableName, indexSize, datamapSize, dictSize)
+      }.collect {
+        case (db, table, indexSize, datamapSize, dictSize) if !((indexSize == 0) &&
+                                                                (datamapSize == 0) &&
+                                                                (dictSize == 0)) =>
+          Row(db, table, indexSize, datamapSize, dictSize)
+      }
+
+      // Scan whole cache and fill the entries for All-Database-All-Tables
+      var (allIndexSize, allDatamapSize, allDictSize) = (0L, 0L, 0L)
+      cache.getCacheMap.asScala.foreach {
+        case (_, cacheable) =>
+          cacheable match {
+            case _: BlockletDataMapIndexWrapper =>
+              allIndexSize += cacheable.getMemorySize
+            case _: BloomCacheKeyValue.CacheValue =>
+              allDatamapSize += cacheable.getMemorySize
+            case _: AbstractColumnDictionaryInfo =>
+              allDictSize += cacheable.getMemorySize
           }
-        }
       }
 
-      // get all index files
-      val absoluteTableIdentifier = AbsoluteTableIdentifier.from(tablePath)
-      val numIndexFilesAll = CarbonDataMergerUtil.getValidSegmentList(absoluteTableIdentifier)
-        .asScala.map {
-          segment =>
-            segment.getCommittedIndexFile
-        }.flatMap {
-        indexFilesMap => indexFilesMap.keySet().toArray
-      }.size
+      Seq(
+        Row("ALL", "ALL", allIndexSize, allDatamapSize, allDictSize),
+        Row(currentDatabase, "ALL", dbIndexSize, dbDatamapSize, dbDictSize)
+      ) ++ tableList
+    }
+  }
 
-      var result = Seq(
-        Row("Index", bytesToDisplaySize(datamapSize.get(tablePath).get),
-          numIndexFilesCached + "/" + numIndexFilesAll + " index files cached"),
-        Row("Dictionary", bytesToDisplaySize(dictSize), "")
-      )
-      for ((path, size) <- datamapSize) {
-        if (path != tablePath) {
-          val (dmName, dmType) = datamapName.get(path).get
-          result = result :+ Row(dmName, bytesToDisplaySize(size), dmType)
-        }
-      }
-      result
+  def getTableCache(sparkSession: SparkSession, carbonTable: CarbonTable): Seq[Row] = {
+    val cache = CacheProvider.getInstance().getCarbonCache
+    val showTableCacheEvent = ShowTableCacheEvent(carbonTable, sparkSession, internalCall)
+    val operationContext = new OperationContext
+    // datamapName -> (datamapProviderName, indexSize, datamapSize)
+    val currentTableSizeMap = scala.collection.mutable.Map[String, (String, String, Long, Long)]()
+    operationContext.setProperty(carbonTable.getTableUniqueName, currentTableSizeMap)
+    OperationListenerBus.getInstance.fireEvent(showTableCacheEvent, operationContext)
+
+    // Get all Index files for the specified table.
+    val allIndexFiles: List[String] = CacheUtil.getAllIndexFiles(carbonTable)
+    val indexFilesInCache: List[String] = allIndexFiles.filter {
+      indexFile =>
+        cache.get(indexFile) != null
+    }
+    val sizeOfIndexFilesInCache: Long = indexFilesInCache.map {
+      indexFile =>
+        cache.get(indexFile).getMemorySize
+    }.sum
+
+    // Extract dictionary keys for the table and create cache keys from those
+    val dictKeys = CacheUtil.getAllDictCacheKeys(carbonTable)
+    val sizeOfDictInCache = dictKeys.collect {
+      case dictKey if cache.get(dictKey) != null =>
+        cache.get(dictKey).getMemorySize
+    }.sum
+
+    // Assemble result for all the datamaps for the table
+    val otherDatamaps = operationContext.getProperty(carbonTable.getTableUniqueName)
+      .asInstanceOf[mutable.Map[String, (String, Long, Long)]]
+    val otherDatamapsResults: Seq[Row] = otherDatamaps.map {
+      case (name, (provider, indexSize, dmSize)) =>
+        Row(name, indexSize, dmSize, provider)
+    }.toSeq
+
+    var comments = indexFilesInCache.size + "/" + allIndexFiles.size + " index files cached"
+    if (!carbonTable.isTransactionalTable) {
+      comments += " (external table)"
     }
+    Seq(
+      Row("Index", sizeOfIndexFilesInCache, comments),
+      Row("Dictionary", sizeOfDictInCache, "")
+    ) ++ otherDatamapsResults
   }
 
   override def processMetadata(sparkSession: SparkSession): Seq[Row] = {
     if (tableIdentifier.isEmpty) {
-      showAllTablesCache(sparkSession)
+      /**
+       * Assemble result for database
+       */
+      val result = getAllTablesCache(sparkSession)
+      result.map {
+        row =>
+          Row(row.get(0), row.get(1), bytesToDisplaySize(row.getLong(2)),
+            bytesToDisplaySize(row.getLong(3)), bytesToDisplaySize(row.getLong(4)))
+      }
     } else {
+      /**
+       * Assemble result for table
+       */
       val carbonTable = CarbonEnv.getCarbonTable(tableIdentifier.get)(sparkSession)
-      if (carbonTable.isChildDataMap) {
-        throw new UnsupportedOperationException("Operation not allowed on child table.")
+      if (CacheProvider.getInstance().getCarbonCache == null) {
+        return Seq.empty
+      }
+      val rawResult = getTableCache(sparkSession, carbonTable)
+      val result = rawResult.slice(0, 2) ++
+                   rawResult.drop(2).map {
+                     row =>
+                       Row(row.get(0), row.getLong(1) + row.getLong(2), row.get(3))
+                   }
+      result.map {
+        row =>
+          Row(row.get(0), bytesToDisplaySize(row.getLong(1)), row.get(2))
       }
-      showTableCache(sparkSession, carbonTable)
     }
   }
 }
diff --git a/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/cache/DropCacheEventListeners.scala b/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/cache/DropCacheEventListeners.scala
new file mode 100644
index 0000000..6c8bb54
--- /dev/null
+++ b/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/cache/DropCacheEventListeners.scala
@@ -0,0 +1,121 @@
+/*
+* Licensed to the Apache Software Foundation (ASF) under one or more
+* contributor license agreements.  See the NOTICE file distributed with
+* this work for additional information regarding copyright ownership.
+* The ASF licenses this file to You under the Apache License, Version 2.0
+* (the "License"); you may not use this file except in compliance with
+* the License.  You may obtain a copy of the License at
+*
+*    http://www.apache.org/licenses/LICENSE-2.0
+*
+* Unless required by applicable law or agreed to in writing, software
+* distributed under the License is distributed on an "AS IS" BASIS,
+* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+* See the License for the specific language governing permissions and
+* limitations under the License.
+*/
+
+package org.apache.spark.sql.execution.command.cache
+
+import scala.collection.JavaConverters._
+import scala.collection.mutable
+
+import org.apache.spark.internal.Logging
+import org.apache.spark.sql.CarbonEnv
+import org.apache.spark.sql.catalyst.TableIdentifier
+import org.apache.spark.sql.execution.command.cache.DropCachePreAggEventListener.LOGGER
+
+import org.apache.carbondata.common.logging.LogServiceFactory
+import org.apache.carbondata.core.cache.CacheProvider
+import org.apache.carbondata.core.datamap.DataMapStoreManager
+import org.apache.carbondata.core.metadata.schema.datamap.DataMapClassProvider
+import org.apache.carbondata.events.{DropTableCacheEvent, Event, OperationContext,
+  OperationEventListener}
+import org.apache.carbondata.processing.merger.CarbonDataMergerUtil
+
+object DropCachePreAggEventListener extends OperationEventListener {
+
+  val LOGGER = LogServiceFactory.getLogService(this.getClass.getCanonicalName)
+
+  /**
+   * Called on a specified event occurrence
+   *
+   * @param event
+   * @param operationContext
+   */
+  override protected def onEvent(event: Event, operationContext: OperationContext): Unit = {
+    event match {
+      case dropCacheEvent: DropTableCacheEvent =>
+        val carbonTable = dropCacheEvent.carbonTable
+        val sparkSession = dropCacheEvent.sparkSession
+        val internalCall = dropCacheEvent.internalCall
+        if (carbonTable.isChildDataMap && !internalCall) {
+          throw new UnsupportedOperationException("Operation not allowed on child table.")
+        }
+
+        if (carbonTable.hasDataMapSchema) {
+          val childrenSchemas = carbonTable.getTableInfo.getDataMapSchemaList.asScala
+            .filter(_.getRelationIdentifier != null)
+          for (childSchema <- childrenSchemas) {
+            val childTable =
+              CarbonEnv.getCarbonTable(
+                TableIdentifier(childSchema.getRelationIdentifier.getTableName,
+                  Some(childSchema.getRelationIdentifier.getDatabaseName)))(sparkSession)
+            try {
+              val dropCacheCommandForChildTable =
+                CarbonDropCacheCommand(
+                  TableIdentifier(childTable.getTableName, Some(childTable.getDatabaseName)),
+                  internalCall = true)
+              dropCacheCommandForChildTable.processMetadata(sparkSession)
+            }
+            catch {
+              case e: Exception =>
+                LOGGER.warn(
+                  s"Clean cache for PreAgg table ${ childTable.getTableName } failed.", e)
+            }
+          }
+        }
+    }
+  }
+}
+
+
+object DropCacheBloomEventListener extends OperationEventListener {
+
+  val LOGGER = LogServiceFactory.getLogService(this.getClass.getCanonicalName)
+
+  /**
+   * Called on a specified event occurrence
+   *
+   * @param event
+   * @param operationContext
+   */
+  override protected def onEvent(event: Event, operationContext: OperationContext): Unit = {
+    event match {
+      case dropCacheEvent: DropTableCacheEvent =>
+        val carbonTable = dropCacheEvent.carbonTable
+        val cache = CacheProvider.getInstance().getCarbonCache
+        val datamaps = DataMapStoreManager.getInstance().getDataMapSchemasOfTable(carbonTable)
+          .asScala.toList
+        val segments = CarbonDataMergerUtil
+          .getValidSegmentList(carbonTable.getAbsoluteTableIdentifier).asScala.toList
+
+        datamaps.foreach {
+          case datamap if datamap.getProviderName
+            .equalsIgnoreCase(DataMapClassProvider.BLOOMFILTER.getShortName) =>
+            try {
+              // Get datamap keys
+              val datamapKeys = CacheUtil.getBloomCacheKeys(carbonTable, datamap)
+
+              // remove datamap keys from cache
+              cache.removeAll(datamapKeys.asJava)
+            } catch {
+              case e: Exception =>
+                LOGGER.warn(
+                  s"Clean cache for Bloom datamap ${datamap.getDataMapName} failed.", e)
+            }
+          case _ =>
+        }
+    }
+  }
+}
diff --git a/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/cache/DropCachePreAggEventListener.scala b/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/cache/DropCachePreAggEventListener.scala
deleted file mode 100644
index 3d03c60..0000000
--- a/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/cache/DropCachePreAggEventListener.scala
+++ /dev/null
@@ -1,70 +0,0 @@
-/*
-* Licensed to the Apache Software Foundation (ASF) under one or more
-* contributor license agreements.  See the NOTICE file distributed with
-* this work for additional information regarding copyright ownership.
-* The ASF licenses this file to You under the Apache License, Version 2.0
-* (the "License"); you may not use this file except in compliance with
-* the License.  You may obtain a copy of the License at
-*
-*    http://www.apache.org/licenses/LICENSE-2.0
-*
-* Unless required by applicable law or agreed to in writing, software
-* distributed under the License is distributed on an "AS IS" BASIS,
-* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-* See the License for the specific language governing permissions and
-* limitations under the License.
-*/
-
-package org.apache.spark.sql.execution.command.cache
-
-import scala.collection.JavaConverters._
-
-import org.apache.spark.internal.Logging
-import org.apache.spark.sql.CarbonEnv
-import org.apache.spark.sql.catalyst.TableIdentifier
-
-import org.apache.carbondata.common.logging.LogServiceFactory
-import org.apache.carbondata.events.{DropCacheEvent, Event, OperationContext,
-  OperationEventListener}
-
-object DropCachePreAggEventListener extends OperationEventListener {
-
-  val LOGGER = LogServiceFactory.getLogService(this.getClass.getCanonicalName)
-
-  /**
-   * Called on a specified event occurrence
-   *
-   * @param event
-   * @param operationContext
-   */
-  override protected def onEvent(event: Event,
-      operationContext: OperationContext): Unit = {
-
-    event match {
-      case dropCacheEvent: DropCacheEvent =>
-        val carbonTable = dropCacheEvent.carbonTable
-        val sparkSession = dropCacheEvent.sparkSession
-        val internalCall = dropCacheEvent.internalCall
-        if (carbonTable.isChildDataMap && !internalCall) {
-          throw new UnsupportedOperationException("Operation not allowed on child table.")
-        }
-
-        if (carbonTable.hasDataMapSchema) {
-          val childrenSchemas = carbonTable.getTableInfo.getDataMapSchemaList.asScala
-            .filter(_.getRelationIdentifier != null)
-          for (childSchema <- childrenSchemas) {
-            val childTable =
-              CarbonEnv.getCarbonTable(
-                TableIdentifier(childSchema.getRelationIdentifier.getTableName,
-                  Some(childSchema.getRelationIdentifier.getDatabaseName)))(sparkSession)
-            val dropCacheCommandForChildTable =
-              CarbonDropCacheCommand(
-                TableIdentifier(childTable.getTableName, Some(childTable.getDatabaseName)),
-                internalCall = true)
-            dropCacheCommandForChildTable.processMetadata(sparkSession)
-          }
-        }
-    }
-
-  }
-}
diff --git a/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/cache/ShowCacheEventListeners.scala b/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/cache/ShowCacheEventListeners.scala
new file mode 100644
index 0000000..70f63a4
--- /dev/null
+++ b/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/cache/ShowCacheEventListeners.scala
@@ -0,0 +1,126 @@
+/*
+* Licensed to the Apache Software Foundation (ASF) under one or more
+* contributor license agreements.  See the NOTICE file distributed with
+* this work for additional information regarding copyright ownership.
+* The ASF licenses this file to You under the Apache License, Version 2.0
+* (the "License"); you may not use this file except in compliance with
+* the License.  You may obtain a copy of the License at
+*
+*    http://www.apache.org/licenses/LICENSE-2.0
+*
+* Unless required by applicable law or agreed to in writing, software
+* distributed under the License is distributed on an "AS IS" BASIS,
+* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+* See the License for the specific language governing permissions and
+* limitations under the License.
+*/
+
+package org.apache.spark.sql.execution.command.cache
+
+import java.util
+import java.util.{HashSet, Set}
+
+import scala.collection.JavaConverters._
+import scala.collection.mutable
+
+import org.apache.spark.sql.catalyst.TableIdentifier
+import org.apache.spark.sql.CarbonEnv
+
+import org.apache.carbondata.common.logging.LogServiceFactory
+import org.apache.carbondata.core.cache.CacheProvider
+import org.apache.carbondata.core.datamap.DataMapStoreManager
+import org.apache.carbondata.core.indexstore.BlockletDataMapIndexWrapper
+import org.apache.carbondata.core.metadata.schema.datamap.DataMapClassProvider
+import org.apache.carbondata.core.util.path.CarbonTablePath
+import org.apache.carbondata.datamap.bloom.{BloomCacheKeyValue, BloomCoarseGrainDataMapFactory}
+import org.apache.carbondata.events._
+import org.apache.carbondata.processing.merger.CarbonDataMergerUtil
+
+object ShowCachePreAggEventListener extends OperationEventListener {
+
+  val LOGGER = LogServiceFactory.getLogService(this.getClass.getCanonicalName)
+
+  /**
+   * Called on a specified event occurrence
+   *
+   * @param event
+   * @param operationContext
+   */
+  override protected def onEvent(event: Event, operationContext: OperationContext): Unit = {
+    event match {
+      case showTableCacheEvent: ShowTableCacheEvent =>
+        val carbonTable = showTableCacheEvent.carbonTable
+        val sparkSession = showTableCacheEvent.sparkSession
+        val internalCall = showTableCacheEvent.internalCall
+        if (carbonTable.isChildDataMap && !internalCall) {
+          throw new UnsupportedOperationException("Operation not allowed on child table.")
+        }
+
+        val currentTableSizeMap = operationContext.getProperty(carbonTable.getTableUniqueName)
+          .asInstanceOf[mutable.Map[String, (String, Long, Long)]]
+
+        if (carbonTable.hasDataMapSchema) {
+          val childrenSchemas = carbonTable.getTableInfo.getDataMapSchemaList.asScala
+            .filter(_.getRelationIdentifier != null)
+          for (childSchema <- childrenSchemas) {
+            val datamapName = childSchema.getDataMapName
+            val datamapProvider = childSchema.getProviderName
+            val childCarbonTable = CarbonEnv.getCarbonTable(
+              TableIdentifier(childSchema.getRelationIdentifier.getTableName,
+                Some(carbonTable.getDatabaseName)))(sparkSession)
+
+            val resultForChild = CarbonShowCacheCommand(None, true)
+              .getTableCache(sparkSession, childCarbonTable)
+            val datamapSize = resultForChild.head.getLong(1)
+            currentTableSizeMap.put(datamapName, (datamapProvider, datamapSize, 0L))
+          }
+        }
+    }
+  }
+}
+
+
+object ShowCacheBloomEventListener extends OperationEventListener {
+
+  val LOGGER = LogServiceFactory.getLogService(this.getClass.getCanonicalName)
+
+  /**
+   * Called on a specified event occurrence
+   *
+   * @param event
+   * @param operationContext
+   */
+  override protected def onEvent(event: Event, operationContext: OperationContext): Unit = {
+    event match {
+      case showTableCacheEvent: ShowTableCacheEvent =>
+        val carbonTable = showTableCacheEvent.carbonTable
+        val cache = CacheProvider.getInstance().getCarbonCache
+        val currentTableSizeMap = operationContext.getProperty(carbonTable.getTableUniqueName)
+          .asInstanceOf[mutable.Map[String, (String, Long, Long)]]
+
+        // Extract all datamaps for the table
+        val datamaps = DataMapStoreManager.getInstance().getDataMapSchemasOfTable(carbonTable)
+          .asScala
+
+        datamaps.foreach {
+          case datamap if datamap.getProviderName
+            .equalsIgnoreCase(DataMapClassProvider.BLOOMFILTER.getShortName) =>
+
+            // Get datamap keys
+            val datamapKeys = CacheUtil.getBloomCacheKeys(carbonTable, datamap)
+
+            // calculate the memory size if key exists in cache
+            val datamapSize = datamapKeys.collect {
+              case key if cache.get(key) != null =>
+                cache.get(key).getMemorySize
+            }.sum
+
+            // put the datamap size into main table's map.
+            currentTableSizeMap
+              .put(datamap.getDataMapName, (datamap.getProviderName, 0L, datamapSize))
+
+          case _ =>
+        }
+    }
+  }
+}


[carbondata] 27/41: [CARBONDATA-3293] Prune datamaps improvement for count(*)

Posted by ra...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.5
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit ef8001ed1031cd49283d0da4eb8cbb706a7145e7
Author: dhatchayani <dh...@gmail.com>
AuthorDate: Fri Mar 15 12:37:27 2019 +0530

    [CARBONDATA-3293] Prune datamaps improvement for count(*)
    
    Problem:
    (1) count(*) is currently pruned in the same way as a select * query: Blocklet and ExtendedBlocklet objects are built from the DataMapRow even though they are not needed, which is time consuming.
    (2) The update/delete status is checked on every query.
    
    Solution:
    (1) The blocklet row count is already available in the DataMapRow, so it is enough to read that count. This improves count(*) query performance.
    (2) Check the update/delete status only when the table has actually been updated or deleted.
    
    This closes #3148
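    
    A minimal sketch of the idea (hypothetical names such as RowCountProvider and
    countStar are used only for illustration; the actual change extends
    DataMap/TableDataMap and CarbonTableInputFormat as in the diff below):
    
        import java.util.List;
        import java.util.Map;
        
        public final class CountStarSketch {
        
          /** Hypothetical view of a datamap that can report row counts. */
          interface RowCountProvider {
            // total row count kept in the datamap summary row
            long totalRowCount();
            // "segmentId,blockPath" -> row count of that block
            Map<String, Long> rowCountPerBlock();
          }
        
          /**
           * If the table has no update/delete delta, the stored totals are enough.
           * Otherwise sum the per-block counts (the real patch additionally skips
           * blocks invalidated by update/delete before summing).
           */
          static long countStar(List<RowCountProvider> dataMaps, boolean hasUpdateDeleteDelta) {
            long total = 0L;
            for (RowCountProvider dataMap : dataMaps) {
              if (!hasUpdateDeleteDelta) {
                total += dataMap.totalRowCount();
              } else {
                for (long rows : dataMap.rowCountPerBlock().values()) {
                  total += rows;
                }
              }
            }
            return total;
          }
        }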
---
 .../constants/CarbonCommonConstantsInternal.java   |  2 +
 .../carbondata/core/datamap/TableDataMap.java      | 44 +++++++++++++++
 .../carbondata/core/datamap/dev/DataMap.java       | 15 +++++
 .../datamap/dev/cgdatamap/CoarseGrainDataMap.java  | 14 +++++
 .../datamap/dev/fgdatamap/FineGrainDataMap.java    | 13 +++++
 .../indexstore/blockletindex/BlockDataMap.java     | 57 ++++++++++++++++++-
 .../indexstore/blockletindex/BlockletDataMap.java  |  3 +-
 .../blockletindex/BlockletDataMapRowIndexes.java   | 14 +++--
 .../core/indexstore/schema/SchemaGenerator.java    |  2 +
 .../carbondata/core/mutate/CarbonUpdateUtil.java   | 10 +++-
 .../hadoop/api/CarbonTableInputFormat.java         | 64 ++++++++++++----------
 ...ryWithColumnMetCacheAndCacheLevelProperty.scala |  5 +-
 .../org/apache/spark/sql/CarbonCountStar.scala     |  2 +-
 .../command/mutation/DeleteExecution.scala         |  2 +-
 14 files changed, 202 insertions(+), 45 deletions(-)

diff --git a/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstantsInternal.java b/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstantsInternal.java
index 398e03a..cfcbe44 100644
--- a/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstantsInternal.java
+++ b/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstantsInternal.java
@@ -24,4 +24,6 @@ public interface CarbonCommonConstantsInternal {
 
   String QUERY_ON_PRE_AGG_STREAMING = "carbon.query.on.preagg.streaming.";
 
+  String ROW_COUNT = "rowCount";
+
 }
diff --git a/core/src/main/java/org/apache/carbondata/core/datamap/TableDataMap.java b/core/src/main/java/org/apache/carbondata/core/datamap/TableDataMap.java
index 0d46fd8..15b0e8b 100644
--- a/core/src/main/java/org/apache/carbondata/core/datamap/TableDataMap.java
+++ b/core/src/main/java/org/apache/carbondata/core/datamap/TableDataMap.java
@@ -18,6 +18,7 @@ package org.apache.carbondata.core.datamap;
 
 import java.io.IOException;
 import java.util.ArrayList;
+import java.util.HashMap;
 import java.util.List;
 import java.util.Map;
 import java.util.concurrent.Callable;
@@ -34,6 +35,7 @@ import org.apache.carbondata.core.constants.CarbonCommonConstants;
 import org.apache.carbondata.core.datamap.dev.BlockletSerializer;
 import org.apache.carbondata.core.datamap.dev.DataMap;
 import org.apache.carbondata.core.datamap.dev.DataMapFactory;
+import org.apache.carbondata.core.datamap.dev.cgdatamap.CoarseGrainDataMap;
 import org.apache.carbondata.core.datamap.dev.fgdatamap.FineGrainBlocklet;
 import org.apache.carbondata.core.datastore.block.SegmentProperties;
 import org.apache.carbondata.core.datastore.impl.FileFactory;
@@ -499,4 +501,46 @@ public final class TableDataMap extends OperationEventListener {
     }
     return prunedSegments;
   }
+
+  /**
+   * Prune the datamap of the given segments and return the Map of blocklet path and row count
+   *
+   * @param segments
+   * @param partitions
+   * @return
+   * @throws IOException
+   */
+  public Map<String, Long> getBlockRowCount(List<Segment> segments,
+      final List<PartitionSpec> partitions, TableDataMap defaultDataMap)
+      throws IOException {
+    Map<String, Long> blockletToRowCountMap = new HashMap<>();
+    for (Segment segment : segments) {
+      List<CoarseGrainDataMap> dataMaps = defaultDataMap.getDataMapFactory().getDataMaps(segment);
+      for (CoarseGrainDataMap dataMap : dataMaps) {
+        dataMap.getRowCountForEachBlock(segment, partitions, blockletToRowCountMap);
+      }
+    }
+    return blockletToRowCountMap;
+  }
+
+  /**
+   * Prune the datamap of the given segments and return the total row count of all blocklets
+   *
+   * @param segments
+   * @param partitions
+   * @return
+   * @throws IOException
+   */
+  public long getRowCount(List<Segment> segments, final List<PartitionSpec> partitions,
+      TableDataMap defaultDataMap) throws IOException {
+    long totalRowCount = 0L;
+    for (Segment segment : segments) {
+      List<CoarseGrainDataMap> dataMaps = defaultDataMap.getDataMapFactory().getDataMaps(segment);
+      for (CoarseGrainDataMap dataMap : dataMaps) {
+        totalRowCount += dataMap.getRowCount(segment, partitions);
+      }
+    }
+    return totalRowCount;
+  }
+
 }
diff --git a/core/src/main/java/org/apache/carbondata/core/datamap/dev/DataMap.java b/core/src/main/java/org/apache/carbondata/core/datamap/dev/DataMap.java
index c52cc41..adc74b9 100644
--- a/core/src/main/java/org/apache/carbondata/core/datamap/dev/DataMap.java
+++ b/core/src/main/java/org/apache/carbondata/core/datamap/dev/DataMap.java
@@ -18,8 +18,10 @@ package org.apache.carbondata.core.datamap.dev;
 
 import java.io.IOException;
 import java.util.List;
+import java.util.Map;
 
 import org.apache.carbondata.common.annotations.InterfaceAudience;
+import org.apache.carbondata.core.datamap.Segment;
 import org.apache.carbondata.core.datastore.block.SegmentProperties;
 import org.apache.carbondata.core.indexstore.Blocklet;
 import org.apache.carbondata.core.indexstore.PartitionSpec;
@@ -54,6 +56,19 @@ public interface DataMap<T extends Blocklet> {
   List<T> prune(Expression filter, SegmentProperties segmentProperties,
       List<PartitionSpec> partitions, CarbonTable carbonTable) throws IOException;
 
+  /**
+   * Prune the data map to find the total row count of the given segment.
+   * Returns the sum of the row counts of all blocks in the segment.
+   */
+  long getRowCount(Segment segment, List<PartitionSpec> partitions) throws IOException;
+
+  /**
+   * Prune the data maps for finding the row count for each block. It returns a Map of
+   * blocklet path and the row count
+   */
+  Map<String, Long> getRowCountForEachBlock(Segment segment, List<PartitionSpec> partitions,
+      Map<String, Long> blockletToRowCountMap) throws IOException;
+
   // TODO Move this method to Abstract class
   /**
    * Validate whether the current segment needs to be fetching the required data
diff --git a/core/src/main/java/org/apache/carbondata/core/datamap/dev/cgdatamap/CoarseGrainDataMap.java b/core/src/main/java/org/apache/carbondata/core/datamap/dev/cgdatamap/CoarseGrainDataMap.java
index b4af9d9..3aba163 100644
--- a/core/src/main/java/org/apache/carbondata/core/datamap/dev/cgdatamap/CoarseGrainDataMap.java
+++ b/core/src/main/java/org/apache/carbondata/core/datamap/dev/cgdatamap/CoarseGrainDataMap.java
@@ -18,9 +18,11 @@ package org.apache.carbondata.core.datamap.dev.cgdatamap;
 
 import java.io.IOException;
 import java.util.List;
+import java.util.Map;
 
 import org.apache.carbondata.common.annotations.InterfaceAudience;
 import org.apache.carbondata.common.annotations.InterfaceStability;
+import org.apache.carbondata.core.datamap.Segment;
 import org.apache.carbondata.core.datamap.dev.DataMap;
 import org.apache.carbondata.core.datastore.block.SegmentProperties;
 import org.apache.carbondata.core.indexstore.Blocklet;
@@ -41,6 +43,18 @@ public abstract class CoarseGrainDataMap implements DataMap<Blocklet> {
     throw new UnsupportedOperationException("Filter expression not supported");
   }
 
+  @Override
+  public long getRowCount(Segment segment, List<PartitionSpec> partitions) throws IOException {
+    throw new UnsupportedOperationException("Operation not supported");
+  }
+
+  @Override
+  public Map<String, Long> getRowCountForEachBlock(Segment segment, List<PartitionSpec> partitions,
+      Map<String, Long> blockletToRowCountMap) throws IOException {
+    throw new UnsupportedOperationException("Operation not supported");
+  }
+
+
   @Override public int getNumberOfEntries() {
     // keep default, one record in one datamap
     return 1;
diff --git a/core/src/main/java/org/apache/carbondata/core/datamap/dev/fgdatamap/FineGrainDataMap.java b/core/src/main/java/org/apache/carbondata/core/datamap/dev/fgdatamap/FineGrainDataMap.java
index 03b2bfb..3a47df1 100644
--- a/core/src/main/java/org/apache/carbondata/core/datamap/dev/fgdatamap/FineGrainDataMap.java
+++ b/core/src/main/java/org/apache/carbondata/core/datamap/dev/fgdatamap/FineGrainDataMap.java
@@ -18,9 +18,11 @@ package org.apache.carbondata.core.datamap.dev.fgdatamap;
 
 import java.io.IOException;
 import java.util.List;
+import java.util.Map;
 
 import org.apache.carbondata.common.annotations.InterfaceAudience;
 import org.apache.carbondata.common.annotations.InterfaceStability;
+import org.apache.carbondata.core.datamap.Segment;
 import org.apache.carbondata.core.datamap.dev.DataMap;
 import org.apache.carbondata.core.datastore.block.SegmentProperties;
 import org.apache.carbondata.core.indexstore.PartitionSpec;
@@ -40,6 +42,17 @@ public abstract class FineGrainDataMap implements DataMap<FineGrainBlocklet> {
     throw new UnsupportedOperationException("Filter expression not supported");
   }
 
+  @Override
+  public long getRowCount(Segment segment, List<PartitionSpec> partitions) throws IOException {
+    throw new UnsupportedOperationException("Operation not supported");
+  }
+
+  @Override
+  public Map<String, Long> getRowCountForEachBlock(Segment segment, List<PartitionSpec> partitions,
+      Map<String, Long> blockletToRowCountMap) throws IOException {
+    throw new UnsupportedOperationException("Operation not supported");
+  }
+
   @Override public int getNumberOfEntries() {
     // keep default, one record in one datamap
     return 1;
diff --git a/core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockDataMap.java b/core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockDataMap.java
index a7818c2..8ebd50d 100644
--- a/core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockDataMap.java
+++ b/core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockDataMap.java
@@ -23,10 +23,13 @@ import java.nio.ByteBuffer;
 import java.util.ArrayList;
 import java.util.Arrays;
 import java.util.BitSet;
+import java.util.HashMap;
 import java.util.List;
+import java.util.Map;
 
 import org.apache.carbondata.common.logging.LogServiceFactory;
 import org.apache.carbondata.core.constants.CarbonCommonConstants;
+import org.apache.carbondata.core.datamap.Segment;
 import org.apache.carbondata.core.datamap.dev.DataMapModel;
 import org.apache.carbondata.core.datamap.dev.cgdatamap.CoarseGrainDataMap;
 import org.apache.carbondata.core.datastore.block.SegmentProperties;
@@ -217,6 +220,7 @@ public class BlockDataMap extends CoarseGrainDataMap
     CarbonRowSchema[] schema = getFileFooterEntrySchema();
     boolean[] minMaxFlag = new boolean[segmentProperties.getColumnsValueSize().length];
     Arrays.fill(minMaxFlag, true);
+    long totalRowCount = 0;
     for (DataFileFooter fileFooter : indexInfo) {
       TableBlockInfo blockInfo = fileFooter.getBlockInfo().getTableBlockInfo();
       BlockMetaInfo blockMetaInfo =
@@ -241,11 +245,14 @@ public class BlockDataMap extends CoarseGrainDataMap
         summaryRow = loadToUnsafeBlock(schema, taskSummarySchema, fileFooter, segmentProperties,
             getMinMaxCacheColumns(), blockInfo.getFilePath(), summaryRow,
             blockMetaInfo, updatedMinValues, updatedMaxValues, minMaxFlag);
+        totalRowCount += fileFooter.getNumberOfRows();
       }
     }
     List<Short> blockletCountList = new ArrayList<>();
     blockletCountList.add((short) 0);
     byte[] blockletCount = convertRowCountFromShortToByteArray(blockletCountList);
+    // set the total row count
+    summaryRow.setLong(totalRowCount, TASK_ROW_COUNT);
     summaryRow.setByteArray(blockletCount, taskSummarySchema.length - 1);
     setMinMaxFlagForTaskSummary(summaryRow, taskSummarySchema, segmentProperties, minMaxFlag);
     return summaryRow;
@@ -289,6 +296,7 @@ public class BlockDataMap extends CoarseGrainDataMap
     // min max flag for task summary
     boolean[] taskSummaryMinMaxFlag = new boolean[segmentProperties.getColumnsValueSize().length];
     Arrays.fill(taskSummaryMinMaxFlag, true);
+    long totalRowCount = 0;
     for (DataFileFooter fileFooter : indexInfo) {
       TableBlockInfo blockInfo = fileFooter.getBlockInfo().getTableBlockInfo();
       BlockMetaInfo blockMetaInfo =
@@ -331,6 +339,7 @@ public class BlockDataMap extends CoarseGrainDataMap
               summaryRow,
               blockletDataMapInfo.getBlockMetaInfoMap().get(previousBlockInfo.getFilePath()),
               blockMinValues, blockMaxValues, minMaxFlag);
+          totalRowCount += previousDataFileFooter.getNumberOfRows();
           minMaxFlag = new boolean[segmentProperties.getColumnsValueSize().length];
           Arrays.fill(minMaxFlag, true);
           // flag to check whether last file footer entry is different from previous entry.
@@ -361,9 +370,12 @@ public class BlockDataMap extends CoarseGrainDataMap
               blockletDataMapInfo.getBlockMetaInfoMap()
                   .get(previousDataFileFooter.getBlockInfo().getTableBlockInfo().getFilePath()),
               blockMinValues, blockMaxValues, minMaxFlag);
+      totalRowCount += previousDataFileFooter.getNumberOfRows();
       blockletCountInEachBlock.add(totalBlockletsInOneBlock);
     }
     byte[] blockletCount = convertRowCountFromShortToByteArray(blockletCountInEachBlock);
+    // set the total row count
+    summaryRow.setLong(totalRowCount, TASK_ROW_COUNT);
     // blocklet count index is the last index
     summaryRow.setByteArray(blockletCount, taskSummarySchema.length - 1);
     setMinMaxFlagForTaskSummary(summaryRow, taskSummarySchema, segmentProperties,
@@ -409,7 +421,7 @@ public class BlockDataMap extends CoarseGrainDataMap
     }
     DataMapRow row = new DataMapRowImpl(schema);
     int ordinal = 0;
-    int taskMinMaxOrdinal = 0;
+    int taskMinMaxOrdinal = 1;
     // get min max values for columns to be cached
     byte[][] minValuesForColumnsToBeCached = BlockletDataMapUtil
         .getMinMaxForColumnsToBeCached(segmentProperties, minMaxCacheColumns, minValues);
@@ -648,6 +660,49 @@ public class BlockDataMap extends CoarseGrainDataMap
     return sum;
   }
 
+  @Override
+  public long getRowCount(Segment segment, List<PartitionSpec> partitions) {
+    long totalRowCount =
+        taskSummaryDMStore.getDataMapRow(getTaskSummarySchema(), 0).getLong(TASK_ROW_COUNT);
+    if (totalRowCount == 0) {
+      Map<String, Long> blockletToRowCountMap = new HashMap<>();
+      getRowCountForEachBlock(segment, partitions, blockletToRowCountMap);
+      for (long blockletRowCount : blockletToRowCountMap.values()) {
+        totalRowCount += blockletRowCount;
+      }
+    } else {
+      if (taskSummaryDMStore.getRowCount() == 0) {
+        return 0L;
+      }
+    }
+    return totalRowCount;
+  }
+
+  public Map<String, Long> getRowCountForEachBlock(Segment segment, List<PartitionSpec> partitions,
+      Map<String, Long> blockletToRowCountMap) {
+    if (memoryDMStore.getRowCount() == 0) {
+      return new HashMap<>();
+    }
+    // if it has partitioned datamap but there is no partitioned information stored, it means
+    // partitions are dropped so return empty list.
+    if (partitions != null) {
+      if (!validatePartitionInfo(partitions)) {
+        return new HashMap<>();
+      }
+    }
+    CarbonRowSchema[] schema = getFileFooterEntrySchema();
+    int numEntries = memoryDMStore.getRowCount();
+    for (int i = 0; i < numEntries; i++) {
+      DataMapRow dataMapRow = memoryDMStore.getDataMapRow(schema, i);
+      String fileName = new String(dataMapRow.getByteArray(FILE_PATH_INDEX),
+          CarbonCommonConstants.DEFAULT_CHARSET_CLASS) + CarbonTablePath.getCarbonDataExtension();
+      int rowCount = dataMapRow.getInt(ROW_COUNT_INDEX);
+      // prepend segment number with the blocklet file path
+      // prepend the segment number to the blocklet file path
+    }
+    return blockletToRowCountMap;
+  }
+
   private List<Blocklet> prune(FilterResolverIntf filterExp) {
     if (memoryDMStore.getRowCount() == 0) {
       return new ArrayList<>();
diff --git a/core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletDataMap.java b/core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletDataMap.java
index 191056d..7939a17 100644
--- a/core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletDataMap.java
+++ b/core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletDataMap.java
@@ -146,6 +146,7 @@ public class BlockletDataMap extends BlockDataMap implements Serializable {
         relativeBlockletId += fileFooter.getBlockletList().size();
       }
     }
+    summaryRow.setLong(0L, TASK_ROW_COUNT);
     setMinMaxFlagForTaskSummary(summaryRow, taskSummarySchema, segmentProperties,
         summaryRowMinMaxFlag);
     return summaryRow;
@@ -163,7 +164,7 @@ public class BlockletDataMap extends BlockDataMap implements Serializable {
     for (int index = 0; index < blockletList.size(); index++) {
       DataMapRow row = new DataMapRowImpl(schema);
       int ordinal = 0;
-      int taskMinMaxOrdinal = 0;
+      int taskMinMaxOrdinal = 1;
       BlockletInfo blockletInfo = blockletList.get(index);
       blockletInfo.setSorted(fileFooter.isSorted());
       BlockletMinMaxIndex minMaxIndex = blockletInfo.getBlockletIndex().getMinMaxIndex();
diff --git a/core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletDataMapRowIndexes.java b/core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletDataMapRowIndexes.java
index 085fb7d..dcaecd2 100644
--- a/core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletDataMapRowIndexes.java
+++ b/core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletDataMapRowIndexes.java
@@ -50,15 +50,17 @@ public interface BlockletDataMapRowIndexes {
   int BLOCKLET_ID_INDEX = 12;
 
   // Summary dataMap row indexes
-  int TASK_MIN_VALUES_INDEX = 0;
+  int TASK_ROW_COUNT = 0;
 
-  int TASK_MAX_VALUES_INDEX = 1;
+  int TASK_MIN_VALUES_INDEX = 1;
 
-  int SUMMARY_INDEX_FILE_NAME = 2;
+  int TASK_MAX_VALUES_INDEX = 2;
 
-  int SUMMARY_SEGMENTID = 3;
+  int SUMMARY_INDEX_FILE_NAME = 3;
 
-  int TASK_MIN_MAX_FLAG = 4;
+  int SUMMARY_SEGMENTID = 4;
 
-  int SUMMARY_INDEX_PATH = 5;
+  int TASK_MIN_MAX_FLAG = 5;
+
+  int SUMMARY_INDEX_PATH = 6;
 }
diff --git a/core/src/main/java/org/apache/carbondata/core/indexstore/schema/SchemaGenerator.java b/core/src/main/java/org/apache/carbondata/core/indexstore/schema/SchemaGenerator.java
index 7a2e13a..52b9fb3 100644
--- a/core/src/main/java/org/apache/carbondata/core/indexstore/schema/SchemaGenerator.java
+++ b/core/src/main/java/org/apache/carbondata/core/indexstore/schema/SchemaGenerator.java
@@ -113,6 +113,8 @@ public class SchemaGenerator {
       List<CarbonColumn> minMaxCacheColumns,
       boolean storeBlockletCount, boolean filePathToBeStored) throws MemoryException {
     List<CarbonRowSchema> taskMinMaxSchemas = new ArrayList<>();
+    // for number of rows.
+    taskMinMaxSchemas.add(new CarbonRowSchema.FixedCarbonRowSchema(DataTypes.LONG));
     // get MinMax Schema
     getMinMaxSchema(segmentProperties, taskMinMaxSchemas, minMaxCacheColumns);
     // for storing file name
diff --git a/core/src/main/java/org/apache/carbondata/core/mutate/CarbonUpdateUtil.java b/core/src/main/java/org/apache/carbondata/core/mutate/CarbonUpdateUtil.java
index bd8c465..a632f03 100644
--- a/core/src/main/java/org/apache/carbondata/core/mutate/CarbonUpdateUtil.java
+++ b/core/src/main/java/org/apache/carbondata/core/mutate/CarbonUpdateUtil.java
@@ -28,6 +28,7 @@ import java.util.Set;
 
 import org.apache.carbondata.common.logging.LogServiceFactory;
 import org.apache.carbondata.core.constants.CarbonCommonConstants;
+import org.apache.carbondata.core.constants.CarbonCommonConstantsInternal;
 import org.apache.carbondata.core.datamap.Segment;
 import org.apache.carbondata.core.datastore.filesystem.CarbonFile;
 import org.apache.carbondata.core.datastore.filesystem.CarbonFileFilter;
@@ -747,9 +748,12 @@ public class CarbonUpdateUtil {
   /**
    * Return row count of input block
    */
-  public static long getRowCount(
-      BlockMappingVO blockMappingVO,
-      CarbonTable carbonTable) {
+  public static long getRowCount(BlockMappingVO blockMappingVO, CarbonTable carbonTable) {
+    if (blockMappingVO.getBlockRowCountMapping().size() == 1
+        && blockMappingVO.getBlockRowCountMapping().get(CarbonCommonConstantsInternal.ROW_COUNT)
+        != null) {
+      return blockMappingVO.getBlockRowCountMapping().get(CarbonCommonConstantsInternal.ROW_COUNT);
+    }
     SegmentUpdateStatusManager updateStatusManager =
         new SegmentUpdateStatusManager(carbonTable);
     long rowCount = 0;
diff --git a/hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableInputFormat.java b/hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableInputFormat.java
index 281143b..4ba8b8c 100644
--- a/hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableInputFormat.java
+++ b/hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableInputFormat.java
@@ -28,11 +28,11 @@ import java.util.List;
 import java.util.Map;
 
 import org.apache.carbondata.common.logging.LogServiceFactory;
+import org.apache.carbondata.core.constants.CarbonCommonConstantsInternal;
 import org.apache.carbondata.core.datamap.DataMapStoreManager;
 import org.apache.carbondata.core.datamap.Segment;
 import org.apache.carbondata.core.datamap.TableDataMap;
 import org.apache.carbondata.core.datastore.impl.FileFactory;
-import org.apache.carbondata.core.indexstore.ExtendedBlocklet;
 import org.apache.carbondata.core.indexstore.PartitionSpec;
 import org.apache.carbondata.core.metadata.AbsoluteTableIdentifier;
 import org.apache.carbondata.core.metadata.schema.PartitionInfo;
@@ -58,7 +58,6 @@ import org.apache.carbondata.core.statusmanager.SegmentUpdateStatusManager;
 import org.apache.carbondata.core.stream.StreamFile;
 import org.apache.carbondata.core.stream.StreamPruner;
 import org.apache.carbondata.core.util.CarbonUtil;
-import org.apache.carbondata.core.util.path.CarbonTablePath;
 import org.apache.carbondata.hadoop.CarbonInputSplit;
 
 import org.apache.hadoop.conf.Configuration;
@@ -576,7 +575,7 @@ public class CarbonTableInputFormat<T> extends CarbonInputFormat<T> {
    * Get the row count of the Block and mapping of segment and Block count.
    */
   public BlockMappingVO getBlockRowCount(Job job, CarbonTable table,
-      List<PartitionSpec> partitions) throws IOException {
+      List<PartitionSpec> partitions, boolean isUpdateFlow) throws IOException {
     // Normal query flow goes to CarbonInputFormat#getPrunedBlocklets and initialize the
     // pruning info for table we queried. But here count star query without filter uses a different
     // query plan, and no pruning info is initialized. When it calls default data map to
@@ -586,7 +585,7 @@ public class CarbonTableInputFormat<T> extends CarbonInputFormat<T> {
     ExplainCollector.remove();
 
     AbsoluteTableIdentifier identifier = table.getAbsoluteTableIdentifier();
-    TableDataMap blockletMap = DataMapStoreManager.getInstance().getDefaultDataMap(table);
+    TableDataMap defaultDataMap = DataMapStoreManager.getInstance().getDefaultDataMap(table);
 
     ReadCommittedScope readCommittedScope = getReadCommitted(job, identifier);
     LoadMetadataDetails[] loadMetadataDetails = readCommittedScope.getSegmentList();
@@ -602,6 +601,7 @@ public class CarbonTableInputFormat<T> extends CarbonInputFormat<T> {
     // TODO: currently only batch segment is supported, add support for streaming table
     List<Segment> filteredSegment =
         getFilteredSegment(job, allSegments.getValidSegments(), false, readCommittedScope);
+    boolean isIUDTable = (updateStatusManager.getUpdateStatusDetails().length != 0);
     /* In the select * flow, getSplits() method was clearing the segmentMap if,
     segment needs refreshing. same thing need for select count(*) flow also.
     For NonTransactional table, one of the reason for a segment refresh is below scenario.
@@ -624,36 +624,40 @@ public class CarbonTableInputFormat<T> extends CarbonInputFormat<T> {
           .clearInvalidSegments(getOrCreateCarbonTable(job.getConfiguration()),
               toBeCleanedSegments);
     }
-    List<ExtendedBlocklet> blocklets =
-        blockletMap.prune(filteredSegment, (FilterResolverIntf) null, partitions);
-    for (ExtendedBlocklet blocklet : blocklets) {
-      String blockName = blocklet.getPath();
-      blockName = CarbonTablePath.getCarbonDataFileName(blockName);
-      blockName = blockName + CarbonTablePath.getCarbonDataExtension();
-
-      long rowCount = blocklet.getDetailInfo().getRowCount();
-
-      String segmentId = Segment.toSegment(blocklet.getSegmentId()).getSegmentNo();
-      String key = CarbonUpdateUtil.getSegmentBlockNameKey(segmentId, blockName);
-
-      // if block is invalid then don't add the count
-      SegmentUpdateDetails details = updateStatusManager.getDetailsForABlock(key);
-
-      if (null == details || !CarbonUpdateUtil.isBlockInvalid(details.getSegmentStatus())) {
-        Long blockCount = blockRowCountMapping.get(key);
-        if (blockCount == null) {
-          blockCount = 0L;
-          Long count = segmentAndBlockCountMapping.get(segmentId);
-          if (count == null) {
-            count = 0L;
+    if (isIUDTable || isUpdateFlow) {
+      Map<String, Long> blockletToRowCountMap =
+          defaultDataMap.getBlockRowCount(filteredSegment, partitions, defaultDataMap);
+      // key is (segmentId + "," + blockletPath) and value is the row count of that blocklet
+      for (Map.Entry<String, Long> eachBlocklet : blockletToRowCountMap.entrySet()) {
+        String[] segmentIdAndPath = eachBlocklet.getKey().split(",", 2);
+        String segmentId = segmentIdAndPath[0];
+        String blockName = segmentIdAndPath[1];
+
+        long rowCount = eachBlocklet.getValue();
+
+        String key = CarbonUpdateUtil.getSegmentBlockNameKey(segmentId, blockName);
+
+        // if block is invalid then don't add the count
+        SegmentUpdateDetails details = updateStatusManager.getDetailsForABlock(key);
+
+        if (null == details || !CarbonUpdateUtil.isBlockInvalid(details.getSegmentStatus())) {
+          Long blockCount = blockRowCountMapping.get(key);
+          if (blockCount == null) {
+            blockCount = 0L;
+            Long count = segmentAndBlockCountMapping.get(segmentId);
+            if (count == null) {
+              count = 0L;
+            }
+            segmentAndBlockCountMapping.put(segmentId, count + 1);
           }
-          segmentAndBlockCountMapping.put(segmentId, count + 1);
+          blockCount += rowCount;
+          blockRowCountMapping.put(key, blockCount);
         }
-        blockCount += rowCount;
-        blockRowCountMapping.put(key, blockCount);
       }
+    } else {
+      long totalRowCount = defaultDataMap.getRowCount(filteredSegment, partitions, defaultDataMap);
+      blockRowCountMapping.put(CarbonCommonConstantsInternal.ROW_COUNT, totalRowCount);
     }
-
     return new BlockMappingVO(blockRowCountMapping, segmentAndBlockCountMapping);
   }
 
diff --git a/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/allqueries/TestQueryWithColumnMetCacheAndCacheLevelProperty.scala b/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/allqueries/TestQueryWithColumnMetCacheAndCacheLevelProperty.scala
index 7c9a9fc..001964a 100644
--- a/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/allqueries/TestQueryWithColumnMetCacheAndCacheLevelProperty.scala
+++ b/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/allqueries/TestQueryWithColumnMetCacheAndCacheLevelProperty.scala
@@ -31,7 +31,7 @@ import org.apache.carbondata.core.constants.CarbonCommonConstants
 import org.apache.carbondata.core.datamap.dev.DataMap
 import org.apache.carbondata.core.datamap.{DataMapChooser, DataMapStoreManager, Segment, TableDataMap}
 import org.apache.carbondata.core.datastore.block.SegmentPropertiesAndSchemaHolder
-import org.apache.carbondata.core.indexstore.blockletindex.{BlockDataMap, BlockletDataMap}
+import org.apache.carbondata.core.indexstore.blockletindex.{BlockDataMap, BlockletDataMap, BlockletDataMapRowIndexes}
 import org.apache.carbondata.core.indexstore.schema.CarbonRowSchema
 import org.apache.carbondata.core.indexstore.Blocklet
 import org.apache.carbondata.core.metadata.datatype.DataTypes
@@ -93,7 +93,8 @@ class TestQueryWithColumnMetCacheAndCacheLevelProperty extends QueryTest with Be
     val index = dataMaps(0).asInstanceOf[BlockDataMap].getSegmentPropertiesIndex
     val summarySchema = SegmentPropertiesAndSchemaHolder.getInstance()
       .getSegmentPropertiesWrapper(index).getTaskSummarySchemaForBlock(storeBlockletCount, false)
-    val minSchemas = summarySchema(0).asInstanceOf[CarbonRowSchema.StructCarbonRowSchema]
+    val minSchemas = summarySchema(BlockletDataMapRowIndexes.TASK_MIN_VALUES_INDEX)
+      .asInstanceOf[CarbonRowSchema.StructCarbonRowSchema]
       .getChildSchemas
     minSchemas.length == expectedLength
   }
diff --git a/integration/spark2/src/main/scala/org/apache/spark/sql/CarbonCountStar.scala b/integration/spark2/src/main/scala/org/apache/spark/sql/CarbonCountStar.scala
index 297cb54..cfceea4 100644
--- a/integration/spark2/src/main/scala/org/apache/spark/sql/CarbonCountStar.scala
+++ b/integration/spark2/src/main/scala/org/apache/spark/sql/CarbonCountStar.scala
@@ -64,7 +64,7 @@ case class CarbonCountStar(
           sparkSession,
           TableIdentifier(
             carbonTable.getTableName,
-            Some(carbonTable.getDatabaseName))).map(_.asJava).orNull),
+            Some(carbonTable.getDatabaseName))).map(_.asJava).orNull, false),
       carbonTable)
     val valueRaw =
       attributesRaw.head.dataType match {
diff --git a/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/mutation/DeleteExecution.scala b/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/mutation/DeleteExecution.scala
index a88a02b..7337496 100644
--- a/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/mutation/DeleteExecution.scala
+++ b/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/mutation/DeleteExecution.scala
@@ -104,7 +104,7 @@ object DeleteExecution {
         CarbonFilters.getPartitions(
           Seq.empty,
           sparkSession,
-          TableIdentifier(tableName, databaseNameOp)).map(_.asJava).orNull)
+          TableIdentifier(tableName, databaseNameOp)).map(_.asJava).orNull, true)
     val segmentUpdateStatusMngr = new SegmentUpdateStatusManager(carbonTable)
     CarbonUpdateUtil
       .createBlockDetailsMap(blockMappingVO, segmentUpdateStatusMngr)


[carbondata] 17/41: [DOC] Update the doc of "Show DataMap"

Posted by ra...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.5
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit ded8885f16c0fbf4a5b723c221045a7fd9e4f68b
Author: qiuchenjian <80...@qq.com>
AuthorDate: Wed Jan 30 09:23:21 2019 +0800

    [DOC] Update the doc of "Show DataMap"
    
    Update the documentation for "Show DataMap"
    
    This closes #3117
---
 docs/datamap/datamap-management.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/docs/datamap/datamap-management.md b/docs/datamap/datamap-management.md
index 0dc4718..087c70a 100644
--- a/docs/datamap/datamap-management.md
+++ b/docs/datamap/datamap-management.md
@@ -141,6 +141,7 @@ There is a SHOW DATAMAPS command, when this is issued, system will read all data
 - DataMapName
 - DataMapProviderName like mv, preaggreagte, timeseries, etc
 - Associated Table
+- DataMap Properties
 
 ### Compaction on DataMap
 


[carbondata] 28/41: [CARBONDATA-3321] Improved Single/Concurrent query Performance

Posted by ra...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.5
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit fbea5c64899365bc1e0bef87abe939884c9a7ed8
Author: kumarvishal09 <ku...@gmail.com>
AuthorDate: Tue Mar 19 19:22:30 2019 +0530

    [CARBONDATA-3321] Improved Single/Concurrent query Performance
    
    What changes were proposed in this pull request?
    Problem
    Single and concurrent queries are slow when the number of segments is large, for the following root causes:
    
    The memory footprint is high, which increases GC pressure and reduces query performance
    Unsafe data map rows are converted to safe data map rows during pruning
    Multi-threaded pruning is not supported for non-filter queries
    Retrieval from the unsafe data map row is slow
    Solution
    
    Reduce the memory footprint during the query
    The number of objects created during query execution was high, which increased GC and hurt query performance.
    Reduced the memory footprint of temporary objects:
    a) Added lazy decoding of the data map row
    b) Removed convertToSafe, which was used to convert UnsafeDataMapRow to DataMapRowImpl for
    faster retrieval. Changed the unsafe data map row format for faster retrieval instead
    c) Reduced unnecessary String object creation
    
    Added multi-threaded pruning for non-filter queries
    When the number of segments/blocks is large, pruning is slow for non-filter queries. Since multi-threaded
    pruning is already supported for filter queries, the same is now done for non-filter queries
    
    Changed the UnsafeDmStore storage format for faster retrieval.
    Earlier only sequential access was allowed on UnsafeDataMapRow, so converting an
    UnsafeDataMapRow to a blocklet was slow and hurt query performance.
    Changed the format of UnsafeDataMapRow so that random access is possible for faster retrieval
    
    How was this patch tested?
    Tested on a 17-node cluster with 5K/10K segments
    
    This closes #3154
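    
    A minimal sketch of the multi-threaded pruning idea (SegmentPruner and
    pruneInParallel are hypothetical names used only for illustration; the actual
    patch extends the existing pruning code in TableDataMap as in the diff below):
    
        import java.util.ArrayList;
        import java.util.List;
        import java.util.concurrent.ExecutorService;
        import java.util.concurrent.Executors;
        import java.util.concurrent.Future;
        
        public final class ParallelPruneSketch {
        
          /** Hypothetical stand-in for pruning one segment and returning its block paths. */
          interface SegmentPruner {
            List<String> prune(String segmentNo) throws Exception;
          }
        
          static List<String> pruneInParallel(List<String> segments, SegmentPruner pruner,
              int numThreads) throws Exception {
            ExecutorService pool = Executors.newFixedThreadPool(numThreads);
            try {
              // submit one pruning task per segment
              List<Future<List<String>>> futures = new ArrayList<>();
              for (String segment : segments) {
                futures.add(pool.submit(() -> pruner.prune(segment)));
              }
              // collect the pruned block paths, propagating any pruning failure
              List<String> pruned = new ArrayList<>();
              for (Future<List<String>> future : futures) {
                pruned.addAll(future.get());
              }
              return pruned;
            } finally {
              pool.shutdown();
            }
          }
        }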
---
 .../core/datamap/DataMapStoreManager.java          |  11 +-
 .../core/datamap/DistributableDataMapFormat.java   |   1 +
 .../apache/carbondata/core/datamap/Segment.java    |  24 +-
 .../carbondata/core/datamap/TableDataMap.java      |  58 ++--
 .../core/datastore/impl/FileFactory.java           |  43 ++-
 .../core/indexstore/ExtendedBlocklet.java          |  97 ++++--
 .../core/indexstore/SegmentPropertiesFetcher.java  |   3 +
 .../TableBlockIndexUniqueIdentifier.java           |   5 +-
 .../core/indexstore/UnsafeMemoryDMStore.java       | 161 +++++++---
 .../indexstore/blockletindex/BlockDataMap.java     |  88 +++---
 .../indexstore/blockletindex/BlockletDataMap.java  |  25 +-
 .../blockletindex/BlockletDataMapFactory.java      |   9 +-
 .../carbondata/core/indexstore/row/DataMapRow.java |  12 +-
 .../core/indexstore/row/UnsafeDataMapRow.java      | 217 ++-----------
 .../core/indexstore/schema/CarbonRowSchema.java    |   8 +
 .../core/indexstore/schema/SchemaGenerator.java    |  70 ++++
 .../carbondata/core/scan/model/QueryModel.java     |  30 --
 .../apache/carbondata/hadoop/CarbonInputSplit.java | 352 +++++++++++++++------
 .../hadoop/internal/ObjectArrayWritable.java       |   0
 .../carbondata/hadoop/internal/index/Block.java    |   0
 .../carbondata/hadoop/CarbonMultiBlockSplit.java   |  17 +-
 .../carbondata/hadoop/CarbonRecordReader.java      |   4 +-
 .../hadoop/api/CarbonFileInputFormat.java          |   6 +-
 .../carbondata/hadoop/api/CarbonInputFormat.java   |  55 +---
 .../hadoop/api/CarbonTableInputFormat.java         |  51 +--
 .../hadoop/util/CarbonVectorizedRecordReader.java  |   2 +-
 .../presto/impl/CarbonLocalInputSplit.java         |   4 +-
 .../org/apache/carbondata/spark/util/Util.java     |   3 +-
 .../carbondata/spark/rdd/CarbonMergerRDD.scala     |   4 +-
 .../carbondata/spark/rdd/CarbonScanRDD.scala       |  11 +-
 .../datasources/SparkCarbonFileFormat.scala        |   6 +-
 .../BloomCoarseGrainDataMapFunctionSuite.scala     |   2 +-
 32 files changed, 755 insertions(+), 624 deletions(-)

diff --git a/core/src/main/java/org/apache/carbondata/core/datamap/DataMapStoreManager.java b/core/src/main/java/org/apache/carbondata/core/datamap/DataMapStoreManager.java
index 085d98a..524d8b0 100644
--- a/core/src/main/java/org/apache/carbondata/core/datamap/DataMapStoreManager.java
+++ b/core/src/main/java/org/apache/carbondata/core/datamap/DataMapStoreManager.java
@@ -473,6 +473,15 @@ public final class DataMapStoreManager {
    * @param identifier Table identifier
    */
   public void clearDataMaps(AbsoluteTableIdentifier identifier) {
+    clearDataMaps(identifier, true);
+  }
+
+  /**
+   * Clear the datamap/datamaps of a table from memory
+   *
+   * @param identifier Table identifier
+   */
+  public void clearDataMaps(AbsoluteTableIdentifier identifier, boolean launchJob) {
     CarbonTable carbonTable = getCarbonTable(identifier);
     String tableUniqueName = identifier.getCarbonTableIdentifier().getTableUniqueName();
     List<TableDataMap> tableIndices = allDataMaps.get(tableUniqueName);
@@ -483,7 +492,7 @@ public final class DataMapStoreManager {
         tableIndices = allDataMaps.get(tableUniqueName);
       }
     }
-    if (null != carbonTable && tableIndices != null) {
+    if (null != carbonTable && tableIndices != null && launchJob) {
       try {
         DataMapUtil.executeDataMapJobForClearingDataMaps(carbonTable);
       } catch (IOException e) {
diff --git a/core/src/main/java/org/apache/carbondata/core/datamap/DistributableDataMapFormat.java b/core/src/main/java/org/apache/carbondata/core/datamap/DistributableDataMapFormat.java
index 007541d..4c23008 100644
--- a/core/src/main/java/org/apache/carbondata/core/datamap/DistributableDataMapFormat.java
+++ b/core/src/main/java/org/apache/carbondata/core/datamap/DistributableDataMapFormat.java
@@ -110,6 +110,7 @@ public class DistributableDataMapFormat extends FileInputFormat<Void, ExtendedBl
                 distributable.getDistributable(),
                 dataMapExprWrapper.getFilterResolverIntf(distributable.getUniqueId()), partitions);
         for (ExtendedBlocklet blocklet : blocklets) {
+          blocklet.getDetailInfo();
           blocklet.setDataMapUniqueId(distributable.getUniqueId());
         }
         blockletIterator = blocklets.iterator();
diff --git a/core/src/main/java/org/apache/carbondata/core/datamap/Segment.java b/core/src/main/java/org/apache/carbondata/core/datamap/Segment.java
index 85445eb..4797b53 100644
--- a/core/src/main/java/org/apache/carbondata/core/datamap/Segment.java
+++ b/core/src/main/java/org/apache/carbondata/core/datamap/Segment.java
@@ -62,6 +62,8 @@ public class Segment implements Serializable {
    */
   private LoadMetadataDetails loadMetadataDetails;
 
+  private String segmentString;
+
   public Segment(String segmentNo) {
     this.segmentNo = segmentNo;
   }
@@ -69,6 +71,7 @@ public class Segment implements Serializable {
   public Segment(String segmentNo, ReadCommittedScope readCommittedScope) {
     this.segmentNo = segmentNo;
     this.readCommittedScope = readCommittedScope;
+    segmentString = segmentNo;
   }
 
   /**
@@ -82,6 +85,11 @@ public class Segment implements Serializable {
     this.segmentNo = segmentNo;
     this.segmentFileName = segmentFileName;
     this.readCommittedScope = null;
+    if (segmentFileName != null) {
+      segmentString = segmentNo + "#" + segmentFileName;
+    } else {
+      segmentString = segmentNo;
+    }
   }
 
   /**
@@ -94,6 +102,11 @@ public class Segment implements Serializable {
     this.segmentNo = segmentNo;
     this.segmentFileName = segmentFileName;
     this.readCommittedScope = readCommittedScope;
+    if (segmentFileName != null) {
+      segmentString = segmentNo + "#" + segmentFileName;
+    } else {
+      segmentString = segmentNo;
+    }
   }
 
   /**
@@ -107,6 +120,11 @@ public class Segment implements Serializable {
     this.segmentFileName = segmentFileName;
     this.readCommittedScope = readCommittedScope;
     this.loadMetadataDetails = loadMetadataDetails;
+    if (segmentFileName != null) {
+      segmentString = segmentNo + "#" + segmentFileName;
+    } else {
+      segmentString = segmentNo;
+    }
   }
 
   /**
@@ -233,11 +251,7 @@ public class Segment implements Serializable {
   }
 
   @Override public String toString() {
-    if (segmentFileName != null) {
-      return segmentNo + "#" + segmentFileName;
-    } else {
-      return segmentNo;
-    }
+    return segmentString;
   }
 
   public LoadMetadataDetails getLoadMetadataDetails() {
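Segment.toString() is called for every blocklet during pruning, so the patch precomputes the string once in the constructors instead of concatenating on each call. A minimal sketch of the pattern, with illustrative names:

public final class SegmentIdSketch {
  private final String segmentNo;
  private final String segmentFileName;   // may be null
  private final String cachedString;      // built once, reused on every toString()

  public SegmentIdSketch(String segmentNo, String segmentFileName) {
    this.segmentNo = segmentNo;
    this.segmentFileName = segmentFileName;
    this.cachedString =
        segmentFileName != null ? segmentNo + "#" + segmentFileName : segmentNo;
  }

  @Override public String toString() {
    return cachedString;
  }

  public static void main(String[] args) {
    System.out.println(new SegmentIdSketch("3", "3_1555.segment")); // prints 3#3_1555.segment
  }
}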
diff --git a/core/src/main/java/org/apache/carbondata/core/datamap/TableDataMap.java b/core/src/main/java/org/apache/carbondata/core/datamap/TableDataMap.java
index 15b0e8b..f9020bd 100644
--- a/core/src/main/java/org/apache/carbondata/core/datamap/TableDataMap.java
+++ b/core/src/main/java/org/apache/carbondata/core/datamap/TableDataMap.java
@@ -127,7 +127,7 @@ public final class TableDataMap extends OperationEventListener {
       }
       blocklets.addAll(addSegmentId(
           blockletDetailsFetcher.getExtendedBlocklets(pruneBlocklets, segment),
-          segment.toString()));
+          segment));
     }
     return blocklets;
   }
@@ -148,15 +148,11 @@ public final class TableDataMap extends OperationEventListener {
     final List<ExtendedBlocklet> blocklets = new ArrayList<>();
     final Map<Segment, List<DataMap>> dataMaps = dataMapFactory.getDataMaps(segments);
     // for non-filter queries
-    if (filterExp == null) {
-      // if filter is not passed, then return all the blocklets.
-      return pruneWithoutFilter(segments, partitions, blocklets);
-    }
     // for filter queries
     int totalFiles = 0;
     int datamapsCount = 0;
     for (Segment segment : segments) {
-      for (DataMap dataMap : dataMaps.get(segment)) {
+      for (DataMap dataMap: dataMaps.get(segment)) {
         totalFiles += dataMap.getNumberOfEntries();
         datamapsCount++;
       }
@@ -168,11 +164,16 @@ public final class TableDataMap extends OperationEventListener {
       // As 0.1 million files block pruning can take only 1 second.
       // Doing multi-thread for smaller values is not recommended as
       // driver should have minimum threads opened to support multiple concurrent queries.
+      if (filterExp == null) {
+        // if filter is not passed, then return all the blocklets.
+        return pruneWithoutFilter(segments, partitions, blocklets);
+      }
       return pruneWithFilter(segments, filterExp, partitions, blocklets, dataMaps);
     }
     // handle by multi-thread
-    return pruneWithFilterMultiThread(segments, filterExp, partitions, blocklets, dataMaps,
-        totalFiles);
+    List<ExtendedBlocklet> extendedBlocklets =
+        pruneMultiThread(segments, filterExp, partitions, blocklets, dataMaps, totalFiles);
+    return extendedBlocklets;
   }
 
   private List<ExtendedBlocklet> pruneWithoutFilter(List<Segment> segments,
@@ -181,7 +182,7 @@ public final class TableDataMap extends OperationEventListener {
       List<Blocklet> allBlocklets = blockletDetailsFetcher.getAllBlocklets(segment, partitions);
       blocklets.addAll(
           addSegmentId(blockletDetailsFetcher.getExtendedBlocklets(allBlocklets, segment),
-              segment.toString()));
+              segment));
     }
     return blocklets;
   }
@@ -197,12 +198,12 @@ public final class TableDataMap extends OperationEventListener {
       }
       blocklets.addAll(
           addSegmentId(blockletDetailsFetcher.getExtendedBlocklets(pruneBlocklets, segment),
-              segment.toString()));
+              segment));
     }
     return blocklets;
   }
 
-  private List<ExtendedBlocklet> pruneWithFilterMultiThread(List<Segment> segments,
+  private List<ExtendedBlocklet> pruneMultiThread(List<Segment> segments,
       final FilterResolverIntf filterExp, final List<PartitionSpec> partitions,
       List<ExtendedBlocklet> blocklets, final Map<Segment, List<DataMap>> dataMaps,
       int totalFiles) {
@@ -279,7 +280,8 @@ public final class TableDataMap extends OperationEventListener {
       throw new RuntimeException(" not all the files processed ");
     }
     List<Future<Void>> results = new ArrayList<>(numOfThreadsForPruning);
-    final Map<Segment, List<Blocklet>> prunedBlockletMap = new ConcurrentHashMap<>(segments.size());
+    final Map<Segment, List<ExtendedBlocklet>> prunedBlockletMap =
+        new ConcurrentHashMap<>(segments.size());
     final ExecutorService executorService = Executors.newFixedThreadPool(numOfThreadsForPruning);
     final String threadName = Thread.currentThread().getName();
     for (int i = 0; i < numOfThreadsForPruning; i++) {
@@ -288,16 +290,22 @@ public final class TableDataMap extends OperationEventListener {
         @Override public Void call() throws IOException {
           Thread.currentThread().setName(threadName);
           for (SegmentDataMapGroup segmentDataMapGroup : segmentDataMapGroups) {
-            List<Blocklet> pruneBlocklets = new ArrayList<>();
+            List<ExtendedBlocklet> pruneBlocklets = new ArrayList<>();
             List<DataMap> dataMapList = dataMaps.get(segmentDataMapGroup.getSegment());
+            SegmentProperties segmentProperties =
+                segmentPropertiesFetcher.getSegmentPropertiesFromDataMap(dataMapList.get(0));
+            Segment segment = segmentDataMapGroup.getSegment();
             for (int i = segmentDataMapGroup.getFromIndex();
                  i <= segmentDataMapGroup.getToIndex(); i++) {
-              pruneBlocklets.addAll(dataMapList.get(i).prune(filterExp,
-                  segmentPropertiesFetcher.getSegmentProperties(segmentDataMapGroup.getSegment()),
-                  partitions));
+              List<Blocklet> dmPruneBlocklets  = dataMapList.get(i).prune(filterExp,
+                  segmentProperties,
+                  partitions);
+              pruneBlocklets.addAll(addSegmentId(blockletDetailsFetcher
+                      .getExtendedBlocklets(dmPruneBlocklets, segment),
+                  segment));
             }
             synchronized (prunedBlockletMap) {
-              List<Blocklet> pruneBlockletsExisting =
+              List<ExtendedBlocklet> pruneBlockletsExisting =
                   prunedBlockletMap.get(segmentDataMapGroup.getSegment());
               if (pruneBlockletsExisting != null) {
                 pruneBlockletsExisting.addAll(pruneBlocklets);
@@ -324,14 +332,8 @@ public final class TableDataMap extends OperationEventListener {
         throw new RuntimeException(e);
       }
     }
-    for (Map.Entry<Segment, List<Blocklet>> entry : prunedBlockletMap.entrySet()) {
-      try {
-        blocklets.addAll(addSegmentId(
-            blockletDetailsFetcher.getExtendedBlocklets(entry.getValue(), entry.getKey()),
-            entry.getKey().toString()));
-      } catch (IOException e) {
-        throw new RuntimeException(e);
-      }
+    for (Map.Entry<Segment, List<ExtendedBlocklet>> entry : prunedBlockletMap.entrySet()) {
+      blocklets.addAll(entry.getValue());
     }
     return blocklets;
   }
@@ -353,9 +355,9 @@ public final class TableDataMap extends OperationEventListener {
   }
 
   private List<ExtendedBlocklet> addSegmentId(List<ExtendedBlocklet> pruneBlocklets,
-      String segmentId) {
+      Segment segment) {
     for (ExtendedBlocklet blocklet : pruneBlocklets) {
-      blocklet.setSegmentId(segmentId);
+      blocklet.setSegment(segment);
     }
     return pruneBlocklets;
   }
@@ -425,7 +427,7 @@ public final class TableDataMap extends OperationEventListener {
         detailedBlocklet.setDataMapWriterPath(blockletwritePath);
         serializer.serializeBlocklet((FineGrainBlocklet) blocklet, blockletwritePath);
       }
-      detailedBlocklet.setSegmentId(distributable.getSegment().toString());
+      detailedBlocklet.setSegment(distributable.getSegment());
       detailedBlocklets.add(detailedBlocklet);
     }
     return detailedBlocklets;
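The reworked pruneMultiThread fans the datamaps out over a fixed thread pool, each worker builds its pruned blocklets directly, and the per-segment results are merged into a concurrent map. A minimal sketch of that fan-out-and-merge shape, using placeholder string types instead of the DataMap interfaces:

import java.util.*;
import java.util.concurrent.*;

public class ParallelPruneSketch {
  public static Map<String, List<String>> prune(Map<String, List<String>> filesPerSegment,
      int threads) throws Exception {
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    Map<String, List<String>> pruned = new ConcurrentHashMap<>();
    List<Future<?>> futures = new ArrayList<>();
    for (Map.Entry<String, List<String>> entry : filesPerSegment.entrySet()) {
      futures.add(pool.submit(() -> {
        // stand-in for DataMap.prune(): keep every other file of the segment
        List<String> kept = new ArrayList<>();
        for (int i = 0; i < entry.getValue().size(); i += 2) {
          kept.add(entry.getValue().get(i));
        }
        // merge this worker's result into the shared per-segment map
        pruned.merge(entry.getKey(), kept, (a, b) -> { a.addAll(b); return a; });
      }));
    }
    for (Future<?> f : futures) {
      f.get();            // propagate any worker failure
    }
    pool.shutdown();
    return pruned;
  }

  public static void main(String[] args) throws Exception {
    Map<String, List<String>> input = new HashMap<>();
    input.put("0", Arrays.asList("f0", "f1", "f2", "f3"));
    input.put("1", Arrays.asList("f4", "f5"));
    System.out.println(prune(input, 2));   // {0=[f0, f2], 1=[f4]} (order may vary)
  }
}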
diff --git a/core/src/main/java/org/apache/carbondata/core/datastore/impl/FileFactory.java b/core/src/main/java/org/apache/carbondata/core/datastore/impl/FileFactory.java
index 7dbbe2a..a27023f 100644
--- a/core/src/main/java/org/apache/carbondata/core/datastore/impl/FileFactory.java
+++ b/core/src/main/java/org/apache/carbondata/core/datastore/impl/FileFactory.java
@@ -81,19 +81,46 @@ public final class FileFactory {
   }
 
   public static FileType getFileType(String path) {
-    String lowerPath = path.toLowerCase();
-    if (lowerPath.startsWith(CarbonCommonConstants.HDFSURL_PREFIX)) {
+    FileType fileType = getFileTypeWithActualPath(path);
+    if (fileType != null) {
+      return fileType;
+    }
+    fileType = getFileTypeWithLowerCase(path);
+    if (fileType != null) {
+      return fileType;
+    }
+    return FileType.LOCAL;
+  }
+
+  private static FileType getFileTypeWithLowerCase(String path) {
+    String lowerCase = path.toLowerCase();
+    if (lowerCase.startsWith(CarbonCommonConstants.HDFSURL_PREFIX)) {
       return FileType.HDFS;
-    } else if (lowerPath.startsWith(CarbonCommonConstants.ALLUXIOURL_PREFIX)) {
+    } else if (lowerCase.startsWith(CarbonCommonConstants.ALLUXIOURL_PREFIX)) {
       return FileType.ALLUXIO;
-    } else if (lowerPath.startsWith(CarbonCommonConstants.VIEWFSURL_PREFIX)) {
+    } else if (lowerCase.startsWith(CarbonCommonConstants.VIEWFSURL_PREFIX)) {
       return FileType.VIEWFS;
-    } else if (lowerPath.startsWith(CarbonCommonConstants.S3N_PREFIX) ||
-        lowerPath.startsWith(CarbonCommonConstants.S3A_PREFIX) ||
-        lowerPath.startsWith(CarbonCommonConstants.S3_PREFIX)) {
+    } else if (lowerCase.startsWith(CarbonCommonConstants.S3N_PREFIX) || lowerCase
+        .startsWith(CarbonCommonConstants.S3A_PREFIX) || lowerCase
+        .startsWith(CarbonCommonConstants.S3_PREFIX)) {
       return FileType.S3;
     }
-    return FileType.LOCAL;
+    return null;
+  }
+
+  private static FileType getFileTypeWithActualPath(String path) {
+    if (path.startsWith(CarbonCommonConstants.HDFSURL_PREFIX)) {
+      return FileType.HDFS;
+    } else if (path.startsWith(CarbonCommonConstants.ALLUXIOURL_PREFIX)) {
+      return FileType.ALLUXIO;
+    } else if (path.startsWith(CarbonCommonConstants.VIEWFSURL_PREFIX)) {
+      return FileType.VIEWFS;
+    } else if (path.startsWith(CarbonCommonConstants.S3N_PREFIX) || path
+        .startsWith(CarbonCommonConstants.S3A_PREFIX) || path
+        .startsWith(CarbonCommonConstants.S3_PREFIX)) {
+      return FileType.S3;
+    }
+    return null;
   }
 
   public static CarbonFile getCarbonFile(String path) {
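getFileType now matches the path with its original case first, so the common already-lower-case path never pays for toLowerCase(). A minimal sketch of the two-pass prefix match, assuming hard-coded scheme prefixes in place of CarbonCommonConstants:

public class SchemeSketch {
  enum FileType { HDFS, S3, LOCAL }

  static FileType fromPath(String path) {
    FileType type = match(path);               // fast path: no string allocation
    if (type == null) {
      type = match(path.toLowerCase());        // slow path: mixed-case schemes
    }
    return type != null ? type : FileType.LOCAL;
  }

  private static FileType match(String path) {
    if (path.startsWith("hdfs://")) {
      return FileType.HDFS;
    } else if (path.startsWith("s3://") || path.startsWith("s3a://") || path.startsWith("s3n://")) {
      return FileType.S3;
    }
    return null;
  }

  public static void main(String[] args) {
    System.out.println(fromPath("HDFS://cluster/warehouse/t1")); // HDFS
    System.out.println(fromPath("/tmp/local/file"));             // LOCAL
  }
}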
diff --git a/core/src/main/java/org/apache/carbondata/core/indexstore/ExtendedBlocklet.java b/core/src/main/java/org/apache/carbondata/core/indexstore/ExtendedBlocklet.java
index 22dff8e..8c4ea06 100644
--- a/core/src/main/java/org/apache/carbondata/core/indexstore/ExtendedBlocklet.java
+++ b/core/src/main/java/org/apache/carbondata/core/indexstore/ExtendedBlocklet.java
@@ -16,58 +16,67 @@
  */
 package org.apache.carbondata.core.indexstore;
 
+import java.io.IOException;
+import java.util.List;
+
+import org.apache.carbondata.core.datamap.Segment;
+import org.apache.carbondata.core.indexstore.row.DataMapRow;
+import org.apache.carbondata.core.metadata.ColumnarFormatVersion;
+import org.apache.carbondata.core.metadata.schema.table.column.ColumnSchema;
+import org.apache.carbondata.hadoop.CarbonInputSplit;
+
 /**
  * Detailed blocklet information
  */
 public class ExtendedBlocklet extends Blocklet {
 
-  private String segmentId;
-
-  private BlockletDetailInfo detailInfo;
-
-  private long length;
-
-  private String[] location;
-
-  private String dataMapWriterPath;
-
   private String dataMapUniqueId;
 
-  public ExtendedBlocklet(String filePath, String blockletId) {
-    super(filePath, blockletId);
-  }
+  private CarbonInputSplit inputSplit;
 
   public ExtendedBlocklet(String filePath, String blockletId,
-      boolean compareBlockletIdForObjectMatching) {
+      boolean compareBlockletIdForObjectMatching, ColumnarFormatVersion version) {
     super(filePath, blockletId, compareBlockletIdForObjectMatching);
+    try {
+      this.inputSplit = CarbonInputSplit.from(null, blockletId, filePath, 0, 0, version, null);
+    } catch (IOException e) {
+      throw new RuntimeException(e);
+    }
   }
 
-  public BlockletDetailInfo getDetailInfo() {
-    return detailInfo;
+  public ExtendedBlocklet(String filePath, String blockletId, ColumnarFormatVersion version) {
+    this(filePath, blockletId, true, version);
   }
 
-  public void setDetailInfo(BlockletDetailInfo detailInfo) {
-    this.detailInfo = detailInfo;
+  public BlockletDetailInfo getDetailInfo() {
+    return this.inputSplit.getDetailInfo();
   }
 
-  public void setLocation(String[] location) {
-    this.location = location;
+  public void setDataMapRow(DataMapRow dataMapRow) {
+    this.inputSplit.setDataMapRow(dataMapRow);
   }
 
   public String[] getLocations() {
-    return location;
+    try {
+      return this.inputSplit.getLocations();
+    } catch (IOException e) {
+      throw new RuntimeException(e);
+    }
   }
 
   public long getLength() {
-    return length;
+    return this.inputSplit.getLength();
   }
 
   public String getSegmentId() {
-    return segmentId;
+    return this.inputSplit.getSegmentId();
   }
 
-  public void setSegmentId(String segmentId) {
-    this.segmentId = segmentId;
+  public Segment getSegment() {
+    return this.inputSplit.getSegment();
+  }
+  public void setSegment(Segment segment) {
+    this.inputSplit.setSegment(segment);
   }
 
   public String getPath() {
@@ -75,11 +84,11 @@ public class ExtendedBlocklet extends Blocklet {
   }
 
   public String getDataMapWriterPath() {
-    return dataMapWriterPath;
+    return this.inputSplit.getDataMapWritePath();
   }
 
   public void setDataMapWriterPath(String dataMapWriterPath) {
-    this.dataMapWriterPath = dataMapWriterPath;
+    this.inputSplit.setDataMapWritePath(dataMapWriterPath);
   }
 
   public String getDataMapUniqueId() {
@@ -98,13 +107,41 @@ public class ExtendedBlocklet extends Blocklet {
     }
 
     ExtendedBlocklet that = (ExtendedBlocklet) o;
-
-    return segmentId != null ? segmentId.equals(that.segmentId) : that.segmentId == null;
+    return inputSplit.getSegmentId() != null ?
+        inputSplit.getSegmentId().equals(that.inputSplit.getSegmentId()) :
+        that.inputSplit.getSegmentId() == null;
   }
 
   @Override public int hashCode() {
     int result = super.hashCode();
-    result = 31 * result + (segmentId != null ? segmentId.hashCode() : 0);
+    result = 31 * result + (inputSplit.getSegmentId() != null ?
+        inputSplit.getSegmentId().hashCode() :
+        0);
     return result;
   }
+
+  public CarbonInputSplit getInputSplit() {
+    return inputSplit;
+  }
+
+  public void setColumnCardinality(int[] cardinality) {
+    inputSplit.setColumnCardinality(cardinality);
+  }
+
+  public void setLegacyStore(boolean isLegacyStore) {
+    inputSplit.setLegacyStore(isLegacyStore);
+  }
+
+  public void setUseMinMaxForPruning(boolean useMinMaxForPruning) {
+    this.inputSplit.setUseMinMaxForPruning(useMinMaxForPruning);
+  }
+
+  public void setIsBlockCache(boolean isBlockCache) {
+    this.inputSplit.setIsBlockCache(isBlockCache);
+  }
+
+  public void setColumnSchema(List<ColumnSchema> columnSchema) {
+    this.inputSplit.setColumnSchema(columnSchema);
+  }
+
 }
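ExtendedBlocklet now keeps a single backing CarbonInputSplit and delegates its getters to it instead of duplicating every field. A minimal sketch of that delegation shape, with hypothetical stand-in classes:

public class BlockletSketch {
  /** Stand-in for the backing split: owns the detail fields. */
  static class SplitSketch {
    private final String segmentId;
    private final long length;
    SplitSketch(String segmentId, long length) {
      this.segmentId = segmentId;
      this.length = length;
    }
    String getSegmentId() { return segmentId; }
    long getLength() { return length; }
  }

  private final SplitSketch split;    // single source of truth

  BlockletSketch(SplitSketch split) {
    this.split = split;
  }

  // delegating getters: no duplicated state to keep in sync
  String getSegmentId() { return split.getSegmentId(); }
  long getLength() { return split.getLength(); }

  public static void main(String[] args) {
    BlockletSketch b = new BlockletSketch(new SplitSketch("2", 1024L));
    System.out.println(b.getSegmentId() + " " + b.getLength());   // 2 1024
  }
}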
diff --git a/core/src/main/java/org/apache/carbondata/core/indexstore/SegmentPropertiesFetcher.java b/core/src/main/java/org/apache/carbondata/core/indexstore/SegmentPropertiesFetcher.java
index b7fb98c..03f8a1d 100644
--- a/core/src/main/java/org/apache/carbondata/core/indexstore/SegmentPropertiesFetcher.java
+++ b/core/src/main/java/org/apache/carbondata/core/indexstore/SegmentPropertiesFetcher.java
@@ -20,6 +20,7 @@ package org.apache.carbondata.core.indexstore;
 import java.io.IOException;
 
 import org.apache.carbondata.core.datamap.Segment;
+import org.apache.carbondata.core.datamap.dev.DataMap;
 import org.apache.carbondata.core.datastore.block.SegmentProperties;
 
 /**
@@ -35,4 +36,6 @@ public interface SegmentPropertiesFetcher {
    */
   SegmentProperties getSegmentProperties(Segment segment)
       throws IOException;
+
+  SegmentProperties getSegmentPropertiesFromDataMap(DataMap coarseGrainDataMap) throws IOException;
 }
diff --git a/core/src/main/java/org/apache/carbondata/core/indexstore/TableBlockIndexUniqueIdentifier.java b/core/src/main/java/org/apache/carbondata/core/indexstore/TableBlockIndexUniqueIdentifier.java
index 3226ceb..9f6a76e 100644
--- a/core/src/main/java/org/apache/carbondata/core/indexstore/TableBlockIndexUniqueIdentifier.java
+++ b/core/src/main/java/org/apache/carbondata/core/indexstore/TableBlockIndexUniqueIdentifier.java
@@ -37,12 +37,15 @@ public class TableBlockIndexUniqueIdentifier implements Serializable {
 
   private String segmentId;
 
+  private String uniqueName;
+
   public TableBlockIndexUniqueIdentifier(String indexFilePath, String indexFileName,
       String mergeIndexFileName, String segmentId) {
     this.indexFilePath = indexFilePath;
     this.indexFileName = indexFileName;
     this.mergeIndexFileName = mergeIndexFileName;
     this.segmentId = segmentId;
+    this.uniqueName = indexFilePath + CarbonCommonConstants.FILE_SEPARATOR + indexFileName;
   }
 
   /**
@@ -51,7 +54,7 @@ public class TableBlockIndexUniqueIdentifier implements Serializable {
    * @return
    */
   public String getUniqueTableSegmentIdentifier() {
-    return indexFilePath + CarbonCommonConstants.FILE_SEPARATOR + indexFileName;
+    return this.uniqueName;
   }
 
   public String getIndexFilePath() {
diff --git a/core/src/main/java/org/apache/carbondata/core/indexstore/UnsafeMemoryDMStore.java b/core/src/main/java/org/apache/carbondata/core/indexstore/UnsafeMemoryDMStore.java
index 0db1b0a..8185c25 100644
--- a/core/src/main/java/org/apache/carbondata/core/indexstore/UnsafeMemoryDMStore.java
+++ b/core/src/main/java/org/apache/carbondata/core/indexstore/UnsafeMemoryDMStore.java
@@ -16,6 +16,7 @@
  */
 package org.apache.carbondata.core.indexstore;
 
+import org.apache.carbondata.core.constants.CarbonCommonConstants;
 import org.apache.carbondata.core.indexstore.row.DataMapRow;
 import org.apache.carbondata.core.indexstore.row.UnsafeDataMapRow;
 import org.apache.carbondata.core.indexstore.schema.CarbonRowSchema;
@@ -52,7 +53,7 @@ public class UnsafeMemoryDMStore extends AbstractMemoryDMStore {
     this.allocatedSize = capacity;
     this.memoryBlock =
         UnsafeMemoryManager.allocateMemoryWithRetry(MemoryType.ONHEAP, taskId, allocatedSize);
-    this.pointers = new int[1000];
+    this.pointers = new int[100];
   }
 
   /**
@@ -66,7 +67,7 @@ public class UnsafeMemoryDMStore extends AbstractMemoryDMStore {
       increaseMemory(runningLength + rowSize);
     }
     if (this.pointers.length <= rowCount + 1) {
-      int[] newPointer = new int[pointers.length + 1000];
+      int[] newPointer = new int[pointers.length + 100];
       System.arraycopy(pointers, 0, newPointer, 0, pointers.length);
       this.pointers = newPointer;
     }
@@ -84,9 +85,33 @@ public class UnsafeMemoryDMStore extends AbstractMemoryDMStore {
 
   /**
    * Add the index row to unsafe.
+   * Below format is used to store data in memory block
+   * WRITE:
+   * <FD><FD><FD><VO><VO><VO><LO><VD><VD><VD>
+   * FD: Fixed Column data
+   * VO: Variable column data offset
+   * VD: Variable column data
+   * LO: Last Offset
+   *
+   * READ:
+   * FD: Read directly based on byte position added in CarbonRowSchema
+   *
+   * VD: Read based on below logic
+   * if not last variable column schema
+   * X = read actual variable column offset based on byte position added in CarbonRowSchema
+   * Y = read next variable column offset (next 4 bytes)
+   * get the length
+   * len = (Y - X)
+   * read data from offset X of size len
+   *
+   * if last variable column
+   * X = read actual variable column offset based on byte position added in CarbonRowSchema
+   * Y = read last offset (next 4 bytes)
+   * get the length
+   * len = (Y - X)
+   * read data from offset X of size len
    *
    * @param indexRow
-   * @return
    */
   public void addIndexRow(CarbonRowSchema[] schema, DataMapRow indexRow) throws MemoryException {
     // First calculate the required memory to keep the row in unsafe
@@ -94,88 +119,122 @@ public class UnsafeMemoryDMStore extends AbstractMemoryDMStore {
     // Check whether allocated memory is sufficient or not.
     ensureSize(rowSize);
     int pointer = runningLength;
-
+    int bytePosition = 0;
     for (int i = 0; i < schema.length; i++) {
-      addToUnsafe(schema[i], indexRow, i);
+      switch (schema[i].getSchemaType()) {
+        case STRUCT:
+          CarbonRowSchema[] childSchemas =
+              ((CarbonRowSchema.StructCarbonRowSchema) schema[i]).getChildSchemas();
+          for (int j = 0; j < childSchemas.length; j++) {
+            if (childSchemas[j].getBytePosition() > bytePosition) {
+              bytePosition = childSchemas[j].getBytePosition();
+            }
+          }
+          break;
+        default:
+          if (schema[i].getBytePosition() > bytePosition) {
+            bytePosition = schema[i].getBytePosition();
+          }
+      }
     }
+    // byte position of the last offset
+    bytePosition += CarbonCommonConstants.INT_SIZE_IN_BYTE;
+    // start byte position of variable length data
+    int varColPosition = bytePosition + CarbonCommonConstants.INT_SIZE_IN_BYTE;
+    // current position refers to the current byte position in the memory block
+    int currentPosition;
+    for (int i = 0; i < schema.length; i++) {
+      switch (schema[i].getSchemaType()) {
+        case STRUCT:
+          CarbonRowSchema[] childSchemas =
+              ((CarbonRowSchema.StructCarbonRowSchema) schema[i]).getChildSchemas();
+          DataMapRow row = indexRow.getRow(i);
+          for (int j = 0; j < childSchemas.length; j++) {
+            currentPosition = addToUnsafe(childSchemas[j], row, j, pointer, varColPosition);
+            if (currentPosition > 0) {
+              varColPosition = currentPosition;
+            }
+          }
+          break;
+        default:
+          currentPosition = addToUnsafe(schema[i], indexRow, i, pointer, varColPosition);
+          if (currentPosition > 0) {
+            varColPosition = currentPosition;
+          }
+          break;
+      }
+    }
+    // writing the last offset
+    getUnsafe()
+        .putInt(memoryBlock.getBaseObject(), memoryBlock.getBaseOffset() + pointer + bytePosition,
+            varColPosition);
+    // after adding the last offset, increment the length by 4 bytes as the last position
+    // is written as an INT
+    runningLength += CarbonCommonConstants.INT_SIZE_IN_BYTE;
     pointers[rowCount++] = pointer;
   }
 
-  private void addToUnsafe(CarbonRowSchema schema, DataMapRow row, int index) {
+  private int addToUnsafe(CarbonRowSchema schema, DataMapRow row, int index, int startOffset,
+      int varPosition) {
     switch (schema.getSchemaType()) {
       case FIXED:
         DataType dataType = schema.getDataType();
         if (dataType == DataTypes.BYTE) {
-          getUnsafe()
-              .putByte(memoryBlock.getBaseObject(), memoryBlock.getBaseOffset() + runningLength,
-                  row.getByte(index));
+          getUnsafe().putByte(memoryBlock.getBaseObject(),
+              memoryBlock.getBaseOffset() + startOffset + schema.getBytePosition(),
+              row.getByte(index));
           runningLength += row.getSizeInBytes(index);
         } else if (dataType == DataTypes.BOOLEAN) {
-          getUnsafe()
-              .putBoolean(memoryBlock.getBaseObject(), memoryBlock.getBaseOffset() + runningLength,
-                  row.getBoolean(index));
+          getUnsafe().putBoolean(memoryBlock.getBaseObject(),
+              memoryBlock.getBaseOffset() + startOffset + schema.getBytePosition(),
+              row.getBoolean(index));
           runningLength += row.getSizeInBytes(index);
         } else if (dataType == DataTypes.SHORT) {
-          getUnsafe()
-              .putShort(memoryBlock.getBaseObject(), memoryBlock.getBaseOffset() + runningLength,
-                  row.getShort(index));
+          getUnsafe().putShort(memoryBlock.getBaseObject(),
+              memoryBlock.getBaseOffset() + startOffset + schema.getBytePosition(),
+              row.getShort(index));
           runningLength += row.getSizeInBytes(index);
         } else if (dataType == DataTypes.INT) {
-          getUnsafe()
-              .putInt(memoryBlock.getBaseObject(), memoryBlock.getBaseOffset() + runningLength,
-                  row.getInt(index));
+          getUnsafe().putInt(memoryBlock.getBaseObject(),
+              memoryBlock.getBaseOffset() + startOffset + schema.getBytePosition(),
+              row.getInt(index));
           runningLength += row.getSizeInBytes(index);
         } else if (dataType == DataTypes.LONG) {
-          getUnsafe()
-              .putLong(memoryBlock.getBaseObject(), memoryBlock.getBaseOffset() + runningLength,
-                  row.getLong(index));
+          getUnsafe().putLong(memoryBlock.getBaseObject(),
+              memoryBlock.getBaseOffset() + startOffset + schema.getBytePosition(),
+              row.getLong(index));
           runningLength += row.getSizeInBytes(index);
         } else if (dataType == DataTypes.FLOAT) {
-          getUnsafe()
-              .putFloat(memoryBlock.getBaseObject(), memoryBlock.getBaseOffset() + runningLength,
-                  row.getFloat(index));
+          getUnsafe().putFloat(memoryBlock.getBaseObject(),
+              memoryBlock.getBaseOffset() + startOffset + schema.getBytePosition(),
+              row.getFloat(index));
           runningLength += row.getSizeInBytes(index);
         } else if (dataType == DataTypes.DOUBLE) {
-          getUnsafe()
-              .putDouble(memoryBlock.getBaseObject(), memoryBlock.getBaseOffset() + runningLength,
-                  row.getDouble(index));
+          getUnsafe().putDouble(memoryBlock.getBaseObject(),
+              memoryBlock.getBaseOffset() + startOffset + schema.getBytePosition(),
+              row.getDouble(index));
           runningLength += row.getSizeInBytes(index);
         } else if (dataType == DataTypes.BYTE_ARRAY) {
           byte[] data = row.getByteArray(index);
           getUnsafe().copyMemory(data, BYTE_ARRAY_OFFSET, memoryBlock.getBaseObject(),
-              memoryBlock.getBaseOffset() + runningLength, data.length);
+              memoryBlock.getBaseOffset() + startOffset + schema.getBytePosition(), data.length);
           runningLength += row.getSizeInBytes(index);
         } else {
           throw new UnsupportedOperationException(
               "unsupported data type for unsafe storage: " + schema.getDataType());
         }
-        break;
+        return 0;
       case VARIABLE_SHORT:
-        byte[] data = row.getByteArray(index);
-        getUnsafe().putShort(memoryBlock.getBaseObject(),
-            memoryBlock.getBaseOffset() + runningLength, (short) data.length);
-        runningLength += 2;
-        getUnsafe().copyMemory(data, BYTE_ARRAY_OFFSET, memoryBlock.getBaseObject(),
-            memoryBlock.getBaseOffset() + runningLength, data.length);
-        runningLength += data.length;
-        break;
       case VARIABLE_INT:
-        byte[] data2 = row.getByteArray(index);
+        byte[] data = row.getByteArray(index);
         getUnsafe().putInt(memoryBlock.getBaseObject(),
-            memoryBlock.getBaseOffset() + runningLength, data2.length);
+            memoryBlock.getBaseOffset() + startOffset + schema.getBytePosition(), varPosition);
         runningLength += 4;
-        getUnsafe().copyMemory(data2, BYTE_ARRAY_OFFSET, memoryBlock.getBaseObject(),
-            memoryBlock.getBaseOffset() + runningLength, data2.length);
-        runningLength += data2.length;
-        break;
-      case STRUCT:
-        CarbonRowSchema[] childSchemas =
-            ((CarbonRowSchema.StructCarbonRowSchema) schema).getChildSchemas();
-        DataMapRow struct = row.getRow(index);
-        for (int i = 0; i < childSchemas.length; i++) {
-          addToUnsafe(childSchemas[i], struct, i);
-        }
-        break;
+        getUnsafe().copyMemory(data, BYTE_ARRAY_OFFSET, memoryBlock.getBaseObject(),
+            memoryBlock.getBaseOffset() + startOffset + varPosition, data.length);
+        runningLength += data.length;
+        varPosition += data.length;
+        return varPosition;
       default:
         throw new UnsupportedOperationException(
             "unsupported data type for unsafe storage: " + schema.getDataType());
diff --git a/core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockDataMap.java b/core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockDataMap.java
index 8ebd50d..4b32688 100644
--- a/core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockDataMap.java
+++ b/core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockDataMap.java
@@ -39,7 +39,6 @@ import org.apache.carbondata.core.datastore.impl.FileFactory;
 import org.apache.carbondata.core.indexstore.AbstractMemoryDMStore;
 import org.apache.carbondata.core.indexstore.BlockMetaInfo;
 import org.apache.carbondata.core.indexstore.Blocklet;
-import org.apache.carbondata.core.indexstore.BlockletDetailInfo;
 import org.apache.carbondata.core.indexstore.ExtendedBlocklet;
 import org.apache.carbondata.core.indexstore.PartitionSpec;
 import org.apache.carbondata.core.indexstore.SafeMemoryDMStore;
@@ -48,6 +47,7 @@ import org.apache.carbondata.core.indexstore.row.DataMapRow;
 import org.apache.carbondata.core.indexstore.row.DataMapRowImpl;
 import org.apache.carbondata.core.indexstore.schema.CarbonRowSchema;
 import org.apache.carbondata.core.memory.MemoryException;
+import org.apache.carbondata.core.metadata.ColumnarFormatVersion;
 import org.apache.carbondata.core.metadata.blocklet.DataFileFooter;
 import org.apache.carbondata.core.metadata.blocklet.index.BlockletIndex;
 import org.apache.carbondata.core.metadata.blocklet.index.BlockletMinMaxIndex;
@@ -131,7 +131,7 @@ public class BlockDataMap extends CoarseGrainDataMap
       filePath = path.getParent().toString().getBytes(CarbonCommonConstants.DEFAULT_CHARSET);
       isFilePathStored = true;
     }
-    byte[] fileName = path.getName().toString().getBytes(CarbonCommonConstants.DEFAULT_CHARSET);
+    byte[] fileName = path.getName().getBytes(CarbonCommonConstants.DEFAULT_CHARSET);
     byte[] segmentId =
         blockletDataMapInfo.getSegmentId().getBytes(CarbonCommonConstants.DEFAULT_CHARSET);
     if (!indexInfo.isEmpty()) {
@@ -711,13 +711,16 @@ public class BlockDataMap extends CoarseGrainDataMap
     CarbonRowSchema[] schema = getFileFooterEntrySchema();
     String filePath = getFilePath();
     int numEntries = memoryDMStore.getRowCount();
-    int totalBlocklets = getTotalBlocklets();
+    int totalBlocklets = 0;
+    if (ExplainCollector.enabled()) {
+      totalBlocklets = getTotalBlocklets();
+    }
     int hitBlocklets = 0;
     if (filterExp == null) {
       for (int i = 0; i < numEntries; i++) {
-        DataMapRow safeRow = memoryDMStore.getDataMapRow(schema, i).convertToSafeRow();
-        blocklets.add(createBlocklet(safeRow, getFileNameWithFilePath(safeRow, filePath),
-            getBlockletId(safeRow), false));
+        DataMapRow dataMapRow = memoryDMStore.getDataMapRow(schema, i);
+        blocklets.add(createBlocklet(dataMapRow, getFileNameWithFilePath(dataMapRow, filePath),
+            getBlockletId(dataMapRow), false));
       }
       hitBlocklets = totalBlocklets;
     } else {
@@ -730,28 +733,31 @@ public class BlockDataMap extends CoarseGrainDataMap
       boolean useMinMaxForPruning = useMinMaxForExecutorPruning(filterExp);
       // min and max for executor pruning
       while (entryIndex < numEntries) {
-        DataMapRow safeRow = memoryDMStore.getDataMapRow(schema, entryIndex).convertToSafeRow();
-        boolean[] minMaxFlag = getMinMaxFlag(safeRow, BLOCK_MIN_MAX_FLAG);
-        String fileName = getFileNameWithFilePath(safeRow, filePath);
-        short blockletId = getBlockletId(safeRow);
+        DataMapRow row = memoryDMStore.getDataMapRow(schema, entryIndex);
+        boolean[] minMaxFlag = getMinMaxFlag(row, BLOCK_MIN_MAX_FLAG);
+        String fileName = getFileNameWithFilePath(row, filePath);
+        short blockletId = getBlockletId(row);
         boolean isValid =
-            addBlockBasedOnMinMaxValue(filterExecuter, getMinMaxValue(safeRow, MAX_VALUES_INDEX),
-                getMinMaxValue(safeRow, MIN_VALUES_INDEX), minMaxFlag, fileName, blockletId);
+            addBlockBasedOnMinMaxValue(filterExecuter, getMinMaxValue(row, MAX_VALUES_INDEX),
+                getMinMaxValue(row, MIN_VALUES_INDEX), minMaxFlag, fileName, blockletId);
         if (isValid) {
-          blocklets.add(createBlocklet(safeRow, fileName, blockletId, useMinMaxForPruning));
-          hitBlocklets += getBlockletNumOfEntry(entryIndex);
+          blocklets.add(createBlocklet(row, fileName, blockletId, useMinMaxForPruning));
+          if (ExplainCollector.enabled()) {
+            hitBlocklets += getBlockletNumOfEntry(entryIndex);
+          }
         }
         entryIndex++;
       }
     }
-
-    if (isLegacyStore) {
-      ExplainCollector.setShowPruningInfo(false);
-    } else {
-      ExplainCollector.setShowPruningInfo(true);
-      ExplainCollector.addTotalBlocklets(totalBlocklets);
-      ExplainCollector.addTotalBlocks(getTotalBlocks());
-      ExplainCollector.addDefaultDataMapPruningHit(hitBlocklets);
+    if (ExplainCollector.enabled()) {
+      if (isLegacyStore) {
+        ExplainCollector.setShowPruningInfo(false);
+      } else {
+        ExplainCollector.setShowPruningInfo(true);
+        ExplainCollector.addTotalBlocklets(totalBlocklets);
+        ExplainCollector.addTotalBlocks(getTotalBlocks());
+        ExplainCollector.addDefaultDataMapPruningHit(hitBlocklets);
+      }
     }
     return blocklets;
   }
@@ -907,10 +913,10 @@ public class BlockDataMap extends CoarseGrainDataMap
         rowIndex++;
       }
     }
-    DataMapRow safeRow =
-        memoryDMStore.getDataMapRow(getFileFooterEntrySchema(), rowIndex).convertToSafeRow();
+    DataMapRow row =
+        memoryDMStore.getDataMapRow(getFileFooterEntrySchema(), rowIndex);
     String filePath = getFilePath();
-    return createBlocklet(safeRow, getFileNameWithFilePath(safeRow, filePath), relativeBlockletId,
+    return createBlocklet(row, getFileNameWithFilePath(row, filePath), relativeBlockletId,
         false);
   }
 
@@ -961,34 +967,16 @@ public class BlockDataMap extends CoarseGrainDataMap
 
   protected ExtendedBlocklet createBlocklet(DataMapRow row, String fileName, short blockletId,
       boolean useMinMaxForPruning) {
-    ExtendedBlocklet blocklet = new ExtendedBlocklet(fileName, blockletId + "", false);
-    BlockletDetailInfo detailInfo = getBlockletDetailInfo(row, blockletId, blocklet);
-    detailInfo.setBlockletInfoBinary(new byte[0]);
-    blocklet.setDetailInfo(detailInfo);
+    short versionNumber = row.getShort(VERSION_INDEX);
+    ExtendedBlocklet blocklet = new ExtendedBlocklet(fileName, blockletId + "", false,
+        ColumnarFormatVersion.valueOf(versionNumber));
+    blocklet.setDataMapRow(row);
+    blocklet.setColumnCardinality(getColumnCardinality());
+    blocklet.setLegacyStore(isLegacyStore);
+    blocklet.setUseMinMaxForPruning(useMinMaxForPruning);
     return blocklet;
   }
 
-  protected BlockletDetailInfo getBlockletDetailInfo(DataMapRow row, short blockletId,
-      ExtendedBlocklet blocklet) {
-    BlockletDetailInfo detailInfo = new BlockletDetailInfo();
-    detailInfo.setRowCount(row.getInt(ROW_COUNT_INDEX));
-    detailInfo.setVersionNumber(row.getShort(VERSION_INDEX));
-    detailInfo.setBlockletId(blockletId);
-    detailInfo.setDimLens(getColumnCardinality());
-    detailInfo.setSchemaUpdatedTimeStamp(row.getLong(SCHEMA_UPADATED_TIME_INDEX));
-    try {
-      blocklet.setLocation(
-          new String(row.getByteArray(LOCATIONS), CarbonCommonConstants.DEFAULT_CHARSET)
-              .split(","));
-    } catch (IOException e) {
-      throw new RuntimeException(e);
-    }
-    detailInfo.setBlockFooterOffset(row.getLong(BLOCK_FOOTER_OFFSET));
-    detailInfo.setBlockSize(row.getLong(BLOCK_LENGTH));
-    detailInfo.setLegacyStore(isLegacyStore);
-    return detailInfo;
-  }
-
   private String[] getFileDetails() {
     try {
       String[] fileDetails = new String[3];
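The pruning statistics above are now gathered only when ExplainCollector.enabled() is true, so normal queries skip the per-entry bookkeeping entirely. A minimal sketch of that cheap-guard pattern, with a hypothetical collector:

public class ExplainStatsSketch {
  private static volatile boolean enabled = false;   // flipped on for EXPLAIN queries only
  private static int totalBlocklets = 0;
  private static int hitBlocklets = 0;

  static boolean enabled() {
    return enabled;
  }

  static void setEnabled(boolean value) {
    enabled = value;
  }

  static void record(int total, int hit) {
    if (!enabled()) {
      return;            // normal query path pays only for this boolean check
    }
    totalBlocklets += total;
    hitBlocklets += hit;
  }

  public static void main(String[] args) {
    record(100, 10);                    // ignored: explain not requested
    setEnabled(true);
    record(100, 10);                    // recorded
    System.out.println(hitBlocklets + "/" + totalBlocklets);  // 10/100
  }
}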
diff --git a/core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletDataMap.java b/core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletDataMap.java
index 7939a17..23d39ce 100644
--- a/core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletDataMap.java
+++ b/core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletDataMap.java
@@ -30,12 +30,12 @@ import org.apache.carbondata.core.datastore.block.SegmentProperties;
 import org.apache.carbondata.core.datastore.block.SegmentPropertiesAndSchemaHolder;
 import org.apache.carbondata.core.datastore.block.TableBlockInfo;
 import org.apache.carbondata.core.indexstore.BlockMetaInfo;
-import org.apache.carbondata.core.indexstore.BlockletDetailInfo;
 import org.apache.carbondata.core.indexstore.ExtendedBlocklet;
 import org.apache.carbondata.core.indexstore.row.DataMapRow;
 import org.apache.carbondata.core.indexstore.row.DataMapRowImpl;
 import org.apache.carbondata.core.indexstore.schema.CarbonRowSchema;
 import org.apache.carbondata.core.memory.MemoryException;
+import org.apache.carbondata.core.metadata.ColumnarFormatVersion;
 import org.apache.carbondata.core.metadata.blocklet.BlockletInfo;
 import org.apache.carbondata.core.metadata.blocklet.DataFileFooter;
 import org.apache.carbondata.core.metadata.blocklet.index.BlockletMinMaxIndex;
@@ -232,11 +232,10 @@ public class BlockletDataMap extends BlockDataMap implements Serializable {
       return super.getDetailedBlocklet(blockletId);
     }
     int absoluteBlockletId = Integer.parseInt(blockletId);
-    DataMapRow safeRow = memoryDMStore.getDataMapRow(getFileFooterEntrySchema(), absoluteBlockletId)
-        .convertToSafeRow();
-    short relativeBlockletId = safeRow.getShort(BLOCKLET_ID_INDEX);
+    DataMapRow row = memoryDMStore.getDataMapRow(getFileFooterEntrySchema(), absoluteBlockletId);
+    short relativeBlockletId = row.getShort(BLOCKLET_ID_INDEX);
     String filePath = getFilePath();
-    return createBlocklet(safeRow, getFileNameWithFilePath(safeRow, filePath), relativeBlockletId,
+    return createBlocklet(row, getFileNameWithFilePath(row, filePath), relativeBlockletId,
         false);
   }
 
@@ -262,13 +261,15 @@ public class BlockletDataMap extends BlockDataMap implements Serializable {
     if (isLegacyStore) {
       return super.createBlocklet(row, fileName, blockletId, useMinMaxForPruning);
     }
-    ExtendedBlocklet blocklet = new ExtendedBlocklet(fileName, blockletId + "");
-    BlockletDetailInfo detailInfo = getBlockletDetailInfo(row, blockletId, blocklet);
-    detailInfo.setColumnSchemas(getColumnSchema());
-    detailInfo.setBlockletInfoBinary(row.getByteArray(BLOCKLET_INFO_INDEX));
-    detailInfo.setPagesCount(row.getShort(BLOCKLET_PAGE_COUNT_INDEX));
-    detailInfo.setUseMinMaxForPruning(useMinMaxForPruning);
-    blocklet.setDetailInfo(detailInfo);
+    short versionNumber = row.getShort(VERSION_INDEX);
+    ExtendedBlocklet blocklet = new ExtendedBlocklet(fileName, blockletId + "",
+        ColumnarFormatVersion.valueOf(versionNumber));
+    blocklet.setColumnSchema(getColumnSchema());
+    blocklet.setUseMinMaxForPruning(useMinMaxForPruning);
+    blocklet.setIsBlockCache(false);
+    blocklet.setColumnCardinality(getColumnCardinality());
+    blocklet.setLegacyStore(isLegacyStore);
+    blocklet.setDataMapRow(row);
     return blocklet;
   }
 
diff --git a/core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletDataMapFactory.java b/core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletDataMapFactory.java
index 5892f78..2ef7b88 100644
--- a/core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletDataMapFactory.java
+++ b/core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletDataMapFactory.java
@@ -192,7 +192,7 @@ public class BlockletDataMapFactory extends CoarseGrainDataMapFactory
   @Override
   public List<ExtendedBlocklet> getExtendedBlocklets(List<Blocklet> blocklets, Segment segment)
       throws IOException {
-    List<ExtendedBlocklet> detailedBlocklets = new ArrayList<>();
+    List<ExtendedBlocklet> detailedBlocklets = new ArrayList<>(blocklets.size() + 1);
     // If it is already detailed blocklet then type cast and return same
     if (blocklets.size() > 0 && blocklets.get(0) instanceof ExtendedBlocklet) {
       for (Blocklet blocklet : blocklets) {
@@ -379,6 +379,13 @@ public class BlockletDataMapFactory extends CoarseGrainDataMapFactory
     return dataMap.getSegmentProperties();
   }
 
+  @Override public SegmentProperties getSegmentPropertiesFromDataMap(DataMap coarseGrainDataMap)
+      throws IOException {
+    assert (coarseGrainDataMap instanceof BlockDataMap);
+    BlockDataMap dataMap = (BlockDataMap) coarseGrainDataMap;
+    return dataMap.getSegmentProperties();
+  }
+
   @Override public List<Blocklet> getAllBlocklets(Segment segment, List<PartitionSpec> partitions)
       throws IOException {
     List<Blocklet> blocklets = new ArrayList<>();
diff --git a/core/src/main/java/org/apache/carbondata/core/indexstore/row/DataMapRow.java b/core/src/main/java/org/apache/carbondata/core/indexstore/row/DataMapRow.java
index c0ea0a0..18adc06 100644
--- a/core/src/main/java/org/apache/carbondata/core/indexstore/row/DataMapRow.java
+++ b/core/src/main/java/org/apache/carbondata/core/indexstore/row/DataMapRow.java
@@ -78,6 +78,8 @@ public abstract class DataMapRow implements Serializable {
     for (int i = 0; i < schemas.length; i++) {
       len += getSizeInBytes(i);
     }
+    // for last offset in unsafe data map row
+    len += 4;
     return len;
   }
 
@@ -86,7 +88,6 @@ public abstract class DataMapRow implements Serializable {
       case FIXED:
         return schemas[ordinal].getLength();
       case VARIABLE_SHORT:
-        return getLengthInBytes(ordinal) + 2;
       case VARIABLE_INT:
         return getLengthInBytes(ordinal) + 4;
       case STRUCT:
@@ -105,15 +106,6 @@ public abstract class DataMapRow implements Serializable {
     return schemas.length;
   }
 
-  /**
-   * default implementation
-   *
-   * @return
-   */
-  public DataMapRow convertToSafeRow() {
-    return this;
-  }
-
   public void setSchemas(CarbonRowSchema[] schemas) {
     if (null == this.schemas) {
       this.schemas = schemas;
diff --git a/core/src/main/java/org/apache/carbondata/core/indexstore/row/UnsafeDataMapRow.java b/core/src/main/java/org/apache/carbondata/core/indexstore/row/UnsafeDataMapRow.java
index 70f0e0d..5f6c4dc 100644
--- a/core/src/main/java/org/apache/carbondata/core/indexstore/row/UnsafeDataMapRow.java
+++ b/core/src/main/java/org/apache/carbondata/core/indexstore/row/UnsafeDataMapRow.java
@@ -19,8 +19,6 @@ package org.apache.carbondata.core.indexstore.row;
 
 import org.apache.carbondata.core.indexstore.schema.CarbonRowSchema;
 import org.apache.carbondata.core.memory.MemoryBlock;
-import org.apache.carbondata.core.metadata.datatype.DataType;
-import org.apache.carbondata.core.metadata.datatype.DataTypes;
 
 import static org.apache.carbondata.core.memory.CarbonUnsafe.BYTE_ARRAY_OFFSET;
 import static org.apache.carbondata.core.memory.CarbonUnsafe.getUnsafe;
@@ -47,38 +45,39 @@ public class UnsafeDataMapRow extends DataMapRow {
 
   @Override public byte[] getByteArray(int ordinal) {
     int length;
-    int position = getPosition(ordinal);
+    int currentOffset;
     switch (schemas[ordinal].getSchemaType()) {
       case VARIABLE_SHORT:
-        length = getUnsafe().getShort(block.getBaseObject(),
-            block.getBaseOffset() + pointer + position);
-        position += 2;
-        break;
       case VARIABLE_INT:
-        length = getUnsafe().getInt(block.getBaseObject(),
-            block.getBaseOffset() + pointer + position);
-        position += 4;
+        final int schemaOrdinal = schemas[ordinal].getBytePosition();
+        currentOffset = getUnsafe().getInt(block.getBaseObject(),
+            block.getBaseOffset() + pointer + schemaOrdinal);
+        int nextOffset = getUnsafe().getInt(block.getBaseObject(),
+            block.getBaseOffset() + pointer + schemaOrdinal + 4);
+        length = nextOffset - currentOffset;
         break;
       default:
+        currentOffset = schemas[ordinal].getBytePosition();
         length = schemas[ordinal].getLength();
     }
     byte[] data = new byte[length];
-    getUnsafe().copyMemory(block.getBaseObject(), block.getBaseOffset() + pointer + position, data,
-        BYTE_ARRAY_OFFSET, data.length);
+    getUnsafe()
+        .copyMemory(block.getBaseObject(), block.getBaseOffset() + pointer + currentOffset, data,
+            BYTE_ARRAY_OFFSET, data.length);
     return data;
   }
 
   @Override public int getLengthInBytes(int ordinal) {
     int length;
-    int position = getPosition(ordinal);
+    int schemaOrdinal = schemas[ordinal].getBytePosition();
     switch (schemas[ordinal].getSchemaType()) {
       case VARIABLE_SHORT:
-        length = getUnsafe().getShort(block.getBaseObject(),
-            block.getBaseOffset() + pointer + position);
-        break;
       case VARIABLE_INT:
-        length = getUnsafe().getInt(block.getBaseObject(),
-            block.getBaseOffset() + pointer + position);
+        int currentOffset = getUnsafe().getInt(block.getBaseObject(),
+            block.getBaseOffset() + pointer + schemaOrdinal);
+        int nextOffset = getUnsafe().getInt(block.getBaseObject(),
+            block.getBaseOffset() + pointer + schemaOrdinal + 4);
+        length = nextOffset - currentOffset;
         break;
       default:
         length = schemas[ordinal].getLength();
@@ -91,31 +90,14 @@ public class UnsafeDataMapRow extends DataMapRow {
   }
 
   @Override public boolean getBoolean(int ordinal) {
-    return getUnsafe()
-        .getBoolean(block.getBaseObject(), block.getBaseOffset() + pointer + getPosition(ordinal));
-  }
-
-  private int getLengthInBytes(int ordinal, int position) {
-    int length;
-    switch (schemas[ordinal].getSchemaType()) {
-      case VARIABLE_SHORT:
-        length = getUnsafe().getShort(block.getBaseObject(),
-            block.getBaseOffset() + pointer + position);
-        break;
-      case VARIABLE_INT:
-        length = getUnsafe().getInt(block.getBaseObject(),
-            block.getBaseOffset() + pointer + position);
-        break;
-      default:
-        length = schemas[ordinal].getLength();
-    }
-    return length;
+    return getUnsafe().getBoolean(block.getBaseObject(),
+        block.getBaseOffset() + pointer + schemas[ordinal].getBytePosition());
   }
 
   @Override public DataMapRow getRow(int ordinal) {
     CarbonRowSchema[] childSchemas =
         ((CarbonRowSchema.StructCarbonRowSchema) schemas[ordinal]).getChildSchemas();
-    return new UnsafeDataMapRow(childSchemas, block, pointer + getPosition(ordinal));
+    return new UnsafeDataMapRow(childSchemas, block, pointer);
   }
 
   @Override public void setByteArray(byte[] byteArray, int ordinal) {
@@ -123,8 +105,8 @@ public class UnsafeDataMapRow extends DataMapRow {
   }
 
   @Override public int getInt(int ordinal) {
-    return getUnsafe()
-        .getInt(block.getBaseObject(), block.getBaseOffset() + pointer + getPosition(ordinal));
+    return getUnsafe().getInt(block.getBaseObject(),
+        block.getBaseOffset() + pointer + schemas[ordinal].getBytePosition());
   }
 
   @Override public void setInt(int value, int ordinal) {
@@ -136,8 +118,8 @@ public class UnsafeDataMapRow extends DataMapRow {
   }
 
   @Override public byte getByte(int ordinal) {
-    return getUnsafe()
-        .getByte(block.getBaseObject(), block.getBaseOffset() + pointer + getPosition(ordinal));
+    return getUnsafe().getByte(block.getBaseObject(),
+        block.getBaseOffset() + pointer + schemas[ordinal].getBytePosition());
   }
 
   @Override public void setShort(short value, int ordinal) {
@@ -145,8 +127,8 @@ public class UnsafeDataMapRow extends DataMapRow {
   }
 
   @Override public short getShort(int ordinal) {
-    return getUnsafe()
-        .getShort(block.getBaseObject(), block.getBaseOffset() + pointer + getPosition(ordinal));
+    return getUnsafe().getShort(block.getBaseObject(),
+        block.getBaseOffset() + pointer + schemas[ordinal].getBytePosition());
   }
 
   @Override public void setLong(long value, int ordinal) {
@@ -154,8 +136,8 @@ public class UnsafeDataMapRow extends DataMapRow {
   }
 
   @Override public long getLong(int ordinal) {
-    return getUnsafe()
-        .getLong(block.getBaseObject(), block.getBaseOffset() + pointer + getPosition(ordinal));
+    return getUnsafe().getLong(block.getBaseObject(),
+        block.getBaseOffset() + pointer + schemas[ordinal].getBytePosition());
   }
 
   @Override public void setFloat(float value, int ordinal) {
@@ -163,8 +145,8 @@ public class UnsafeDataMapRow extends DataMapRow {
   }
 
   @Override public float getFloat(int ordinal) {
-    return getUnsafe()
-        .getFloat(block.getBaseObject(), block.getBaseOffset() + pointer + getPosition(ordinal));
+    return getUnsafe().getFloat(block.getBaseObject(),
+        block.getBaseOffset() + pointer + schemas[ordinal].getBytePosition());
   }
 
   @Override public void setDouble(double value, int ordinal) {
@@ -172,146 +154,11 @@ public class UnsafeDataMapRow extends DataMapRow {
   }
 
   @Override public double getDouble(int ordinal) {
-    return getUnsafe()
-        .getDouble(block.getBaseObject(), block.getBaseOffset() + pointer + getPosition(ordinal));
+    return getUnsafe().getDouble(block.getBaseObject(),
+        block.getBaseOffset() + pointer + schemas[ordinal].getBytePosition());
   }
 
   @Override public void setRow(DataMapRow row, int ordinal) {
     throw new UnsupportedOperationException("Not supported to set on unsafe row");
   }
-
-  /**
-   * Convert unsafe to safe row.
-   *
-   * @return
-   */
-  public DataMapRow convertToSafeRow() {
-    DataMapRowImpl row = new DataMapRowImpl(schemas);
-    int runningLength = 0;
-    for (int i = 0; i < schemas.length; i++) {
-      CarbonRowSchema schema = schemas[i];
-      switch (schema.getSchemaType()) {
-        case FIXED:
-          DataType dataType = schema.getDataType();
-          if (dataType == DataTypes.BYTE) {
-            row.setByte(
-                getUnsafe().getByte(
-                    block.getBaseObject(),
-                    block.getBaseOffset() + pointer + runningLength),
-                i);
-            runningLength += schema.getLength();
-          } else if (dataType == DataTypes.BOOLEAN) {
-            row.setBoolean(
-                getUnsafe().getBoolean(
-                    block.getBaseObject(),
-                    block.getBaseOffset() + pointer + runningLength),
-                i);
-            runningLength += schema.getLength();
-          } else if (dataType == DataTypes.SHORT) {
-            row.setShort(
-                getUnsafe().getShort(
-                    block.getBaseObject(),
-                    block.getBaseOffset() + pointer + runningLength),
-                i);
-            runningLength += schema.getLength();
-          } else if (dataType == DataTypes.INT) {
-            row.setInt(
-                getUnsafe().getInt(
-                    block.getBaseObject(),
-                    block.getBaseOffset() + pointer + runningLength),
-                i);
-            runningLength += schema.getLength();
-          } else if (dataType == DataTypes.LONG) {
-            row.setLong(
-                getUnsafe().getLong(
-                    block.getBaseObject(),
-                    block.getBaseOffset() + pointer + runningLength),
-                i);
-            runningLength += schema.getLength();
-          } else if (dataType == DataTypes.FLOAT) {
-            row.setFloat(
-                getUnsafe().getFloat(block.getBaseObject(),
-                    block.getBaseOffset() + pointer + runningLength),
-                i);
-            runningLength += schema.getLength();
-          } else if (dataType == DataTypes.DOUBLE) {
-            row.setDouble(
-                getUnsafe().getDouble(block.getBaseObject(),
-                    block.getBaseOffset() + pointer + runningLength),
-                i);
-            runningLength += schema.getLength();
-          } else if (dataType == DataTypes.BYTE_ARRAY) {
-            byte[] data = new byte[schema.getLength()];
-            getUnsafe().copyMemory(
-                block.getBaseObject(),
-                block.getBaseOffset() + pointer + runningLength,
-                data,
-                BYTE_ARRAY_OFFSET,
-                data.length);
-            row.setByteArray(data, i);
-            runningLength += data.length;
-          } else {
-            throw new UnsupportedOperationException(
-                "unsupported data type for unsafe storage: " + schema.getDataType());
-          }
-          break;
-        case VARIABLE_SHORT:
-          int length = getUnsafe()
-              .getShort(block.getBaseObject(), block.getBaseOffset() + pointer + runningLength);
-          runningLength += 2;
-          byte[] data = new byte[length];
-          getUnsafe().copyMemory(block.getBaseObject(),
-              block.getBaseOffset() + pointer + runningLength,
-              data, BYTE_ARRAY_OFFSET, data.length);
-          runningLength += data.length;
-          row.setByteArray(data, i);
-          break;
-        case VARIABLE_INT:
-          int length2 = getUnsafe()
-              .getInt(block.getBaseObject(), block.getBaseOffset() + pointer + runningLength);
-          runningLength += 4;
-          byte[] data2 = new byte[length2];
-          getUnsafe().copyMemory(block.getBaseObject(),
-              block.getBaseOffset() + pointer + runningLength,
-              data2, BYTE_ARRAY_OFFSET, data2.length);
-          runningLength += data2.length;
-          row.setByteArray(data2, i);
-          break;
-        case STRUCT:
-          DataMapRow structRow = ((UnsafeDataMapRow) getRow(i)).convertToSafeRow();
-          row.setRow(structRow, i);
-          runningLength += structRow.getTotalSizeInBytes();
-          break;
-        default:
-          throw new UnsupportedOperationException(
-              "unsupported data type for unsafe storage: " + schema.getDataType());
-      }
-    }
-    row.setTotalLengthInBytes(runningLength);
-
-    return row;
-  }
-
-  private int getSizeInBytes(int ordinal, int position) {
-    switch (schemas[ordinal].getSchemaType()) {
-      case FIXED:
-        return schemas[ordinal].getLength();
-      case VARIABLE_SHORT:
-        return getLengthInBytes(ordinal, position) + 2;
-      case VARIABLE_INT:
-        return getLengthInBytes(ordinal, position) + 4;
-      case STRUCT:
-        return getRow(ordinal).getTotalSizeInBytes();
-      default:
-        throw new UnsupportedOperationException("wrong type");
-    }
-  }
-
-  private int getPosition(int ordinal) {
-    int position = 0;
-    for (int i = 0; i < ordinal; i++) {
-      position += getSizeInBytes(i, position);
-    }
-    return position;
-  }
 }
diff --git a/core/src/main/java/org/apache/carbondata/core/indexstore/schema/CarbonRowSchema.java b/core/src/main/java/org/apache/carbondata/core/indexstore/schema/CarbonRowSchema.java
index 7f47c00..30a7a9c 100644
--- a/core/src/main/java/org/apache/carbondata/core/indexstore/schema/CarbonRowSchema.java
+++ b/core/src/main/java/org/apache/carbondata/core/indexstore/schema/CarbonRowSchema.java
@@ -28,6 +28,7 @@ public abstract class CarbonRowSchema implements Serializable {
   private static final long serialVersionUID = -8061282029097686495L;
 
   protected DataType dataType;
+  private int bytePosition = -1;
 
   public CarbonRowSchema(DataType dataType) {
     this.dataType = dataType;
@@ -55,6 +56,13 @@ public abstract class CarbonRowSchema implements Serializable {
     return dataType.getSizeInBytes();
   }
 
+  public void setBytePosition(int bytePosition) {
+    this.bytePosition = bytePosition;
+  }
+
+  public int getBytePosition() {
+    return this.bytePosition;
+  }
   /**
    * schema type
    * @return
diff --git a/core/src/main/java/org/apache/carbondata/core/indexstore/schema/SchemaGenerator.java b/core/src/main/java/org/apache/carbondata/core/indexstore/schema/SchemaGenerator.java
index 52b9fb3..41c382b 100644
--- a/core/src/main/java/org/apache/carbondata/core/indexstore/schema/SchemaGenerator.java
+++ b/core/src/main/java/org/apache/carbondata/core/indexstore/schema/SchemaGenerator.java
@@ -20,6 +20,7 @@ package org.apache.carbondata.core.indexstore.schema;
 import java.util.ArrayList;
 import java.util.List;
 
+import org.apache.carbondata.core.constants.CarbonCommonConstants;
 import org.apache.carbondata.core.datastore.block.SegmentProperties;
 import org.apache.carbondata.core.memory.MemoryException;
 import org.apache.carbondata.core.metadata.datatype.DataTypes;
@@ -60,10 +61,77 @@ public class SchemaGenerator {
     // written in the metadata or not.
     addMinMaxFlagSchema(segmentProperties, indexSchemas, minMaxCacheColumns);
     CarbonRowSchema[] schema = indexSchemas.toArray(new CarbonRowSchema[indexSchemas.size()]);
+    updateBytePosition(schema);
     return schema;
   }
 
   /**
+   * Method to update the byte position which will be used in case of unsafe dm store
+   * @see org/apache/carbondata/core/indexstore/UnsafeMemoryDMStore.java:87
+   *
+   * @param schema
+   */
+  private static void updateBytePosition(CarbonRowSchema[] schema) {
+    int currentSize;
+    int bytePosition = 0;
+    // First assign byte position to all the fixed length schema
+    for (int i = 0; i < schema.length; i++) {
+      switch (schema[i].getSchemaType()) {
+        case STRUCT:
+          CarbonRowSchema[] childSchemas =
+              ((CarbonRowSchema.StructCarbonRowSchema) schema[i]).getChildSchemas();
+          for (int j = 0; j < childSchemas.length; j++) {
+            currentSize = getSchemaSize(childSchemas[j]);
+            if (currentSize != -1) {
+              childSchemas[j].setBytePosition(bytePosition);
+              bytePosition += currentSize;
+            }
+          }
+          break;
+        default:
+          currentSize = getSchemaSize(schema[i]);
+          if (currentSize != -1) {
+            schema[i].setBytePosition(bytePosition);
+            bytePosition += currentSize;
+          }
+          break;
+      }
+    }
+    // adding byte position for storing offset in case of variable length columns
+    for (int i = 0; i < schema.length; i++) {
+      switch (schema[i].getSchemaType()) {
+        case STRUCT:
+          CarbonRowSchema[] childSchemas =
+              ((CarbonRowSchema.StructCarbonRowSchema) schema[i]).getChildSchemas();
+          for (int j = 0; j < childSchemas.length; j++) {
+            if (childSchemas[j].getBytePosition() == -1) {
+              childSchemas[j].setBytePosition(bytePosition);
+              bytePosition += CarbonCommonConstants.INT_SIZE_IN_BYTE;
+            }
+          }
+          break;
+        default:
+          if (schema[i].getBytePosition() == -1) {
+            schema[i].setBytePosition(bytePosition);
+            bytePosition += CarbonCommonConstants.INT_SIZE_IN_BYTE;
+          }
+          break;
+      }
+    }
+  }
+  private static int getSchemaSize(CarbonRowSchema schema) {
+    switch (schema.getSchemaType()) {
+      case FIXED:
+        return schema.getLength();
+      case VARIABLE_SHORT:
+      case VARIABLE_INT:
+        return -1;
+      default:
+        throw new UnsupportedOperationException("Invalid Type");
+    }
+  }
+
+  /**
    * Method for creating blocklet Schema. Each blocklet row will share the same schema
    *
    * @param segmentProperties
@@ -98,6 +166,7 @@ public class SchemaGenerator {
     // for relative blocklet id i.e. blocklet id that belongs to a particular part file
     indexSchemas.add(new CarbonRowSchema.FixedCarbonRowSchema(DataTypes.SHORT));
     CarbonRowSchema[] schema = indexSchemas.toArray(new CarbonRowSchema[indexSchemas.size()]);
+    updateBytePosition(schema);
     return schema;
   }
 
@@ -140,6 +209,7 @@ public class SchemaGenerator {
     }
     CarbonRowSchema[] schema =
         taskMinMaxSchemas.toArray(new CarbonRowSchema[taskMinMaxSchemas.size()]);
+    updateBytePosition(schema);
     return schema;
   }
 
diff --git a/core/src/main/java/org/apache/carbondata/core/scan/model/QueryModel.java b/core/src/main/java/org/apache/carbondata/core/scan/model/QueryModel.java
index d6017f5..267527f 100644
--- a/core/src/main/java/org/apache/carbondata/core/scan/model/QueryModel.java
+++ b/core/src/main/java/org/apache/carbondata/core/scan/model/QueryModel.java
@@ -18,19 +18,16 @@
 package org.apache.carbondata.core.scan.model;
 
 import java.util.ArrayList;
-import java.util.HashMap;
 import java.util.List;
 import java.util.Map;
 
 import org.apache.carbondata.core.cache.dictionary.Dictionary;
-import org.apache.carbondata.core.constants.CarbonCommonConstants;
 import org.apache.carbondata.core.datastore.block.TableBlockInfo;
 import org.apache.carbondata.core.metadata.AbsoluteTableIdentifier;
 import org.apache.carbondata.core.metadata.schema.table.CarbonTable;
 import org.apache.carbondata.core.metadata.schema.table.column.CarbonColumn;
 import org.apache.carbondata.core.metadata.schema.table.column.CarbonDimension;
 import org.apache.carbondata.core.metadata.schema.table.column.CarbonMeasure;
-import org.apache.carbondata.core.mutate.UpdateVO;
 import org.apache.carbondata.core.scan.expression.ColumnExpression;
 import org.apache.carbondata.core.scan.expression.Expression;
 import org.apache.carbondata.core.scan.expression.UnknownExpression;
@@ -92,15 +89,6 @@ public class QueryModel {
 
   private DataTypeConverter converter;
 
-  /**
-   * Invalid table blocks, which need to be removed from
-   * memory, invalid blocks can be segment which are deleted
-   * or compacted
-   */
-  private List<String> invalidSegmentIds;
-  private Map<String, UpdateVO> invalidSegmentBlockIdMap =
-      new HashMap<>(CarbonCommonConstants.DEFAULT_COLLECTION_SIZE);
-
   private boolean[] isFilterDimensions;
   private boolean[] isFilterMeasures;
 
@@ -135,7 +123,6 @@ public class QueryModel {
 
   private QueryModel(CarbonTable carbonTable) {
     tableBlockInfos = new ArrayList<TableBlockInfo>();
-    invalidSegmentIds = new ArrayList<>();
     this.table = carbonTable;
     this.queryId = String.valueOf(System.nanoTime());
   }
@@ -350,14 +337,6 @@ public class QueryModel {
     this.statisticsRecorder = statisticsRecorder;
   }
 
-  public List<String> getInvalidSegmentIds() {
-    return invalidSegmentIds;
-  }
-
-  public void setInvalidSegmentIds(List<String> invalidSegmentIds) {
-    this.invalidSegmentIds = invalidSegmentIds;
-  }
-
   public boolean isVectorReader() {
     return vectorReader;
   }
@@ -365,15 +344,6 @@ public class QueryModel {
   public void setVectorReader(boolean vectorReader) {
     this.vectorReader = vectorReader;
   }
-  public void setInvalidBlockForSegmentId(List<UpdateVO> invalidSegmentTimestampList) {
-    for (UpdateVO anUpdateVO : invalidSegmentTimestampList) {
-      this.invalidSegmentBlockIdMap.put(anUpdateVO.getSegmentId(), anUpdateVO);
-    }
-  }
-
-  public Map<String,UpdateVO>  getInvalidBlockVOForSegmentId() {
-    return  invalidSegmentBlockIdMap;
-  }
 
   public DataTypeConverter getConverter() {
     return converter;
diff --git a/hadoop/src/main/java/org/apache/carbondata/hadoop/CarbonInputSplit.java b/core/src/main/java/org/apache/carbondata/hadoop/CarbonInputSplit.java
similarity index 57%
rename from hadoop/src/main/java/org/apache/carbondata/hadoop/CarbonInputSplit.java
rename to core/src/main/java/org/apache/carbondata/hadoop/CarbonInputSplit.java
index bcf703c..bb1742c 100644
--- a/hadoop/src/main/java/org/apache/carbondata/hadoop/CarbonInputSplit.java
+++ b/core/src/main/java/org/apache/carbondata/hadoop/CarbonInputSplit.java
@@ -33,11 +33,13 @@ import org.apache.carbondata.core.datamap.Segment;
 import org.apache.carbondata.core.datastore.block.BlockletInfos;
 import org.apache.carbondata.core.datastore.block.Distributable;
 import org.apache.carbondata.core.datastore.block.TableBlockInfo;
-import org.apache.carbondata.core.indexstore.Blocklet;
 import org.apache.carbondata.core.indexstore.BlockletDetailInfo;
+import org.apache.carbondata.core.indexstore.blockletindex.BlockletDataMapRowIndexes;
+import org.apache.carbondata.core.indexstore.row.DataMapRow;
 import org.apache.carbondata.core.metadata.ColumnarFormatVersion;
-import org.apache.carbondata.core.mutate.UpdateVO;
+import org.apache.carbondata.core.metadata.schema.table.column.ColumnSchema;
 import org.apache.carbondata.core.statusmanager.FileFormat;
+import org.apache.carbondata.core.util.BlockletDataMapUtil;
 import org.apache.carbondata.core.util.ByteUtil;
 import org.apache.carbondata.core.util.CarbonProperties;
 import org.apache.carbondata.core.util.path.CarbonTablePath;
@@ -45,6 +47,7 @@ import org.apache.carbondata.hadoop.internal.index.Block;
 
 import org.apache.hadoop.fs.Path;
 import org.apache.hadoop.io.Writable;
+import org.apache.hadoop.mapred.SplitLocationInfo;
 import org.apache.hadoop.mapreduce.lib.input.FileSplit;
 
 /**
@@ -61,10 +64,6 @@ public class CarbonInputSplit extends FileSplit
   private String bucketId;
 
   private String blockletId;
-  /*
-   * Invalid segments that need to be removed in task side index
-   */
-  private List<String> invalidSegments;
 
   /*
    * Number of BlockLets in a block
@@ -74,14 +73,6 @@ public class CarbonInputSplit extends FileSplit
   private ColumnarFormatVersion version;
 
   /**
-   * map of blocklocation and storage id
-   */
-  private Map<String, String> blockStorageIdMap =
-      new HashMap<>(CarbonCommonConstants.DEFAULT_COLLECTION_SIZE);
-
-  private List<UpdateVO> invalidTimestampsList;
-
-  /**
    * list of delete delta files for split
    */
   private String[] deleteDeltaFiles;
@@ -97,90 +88,115 @@ public class CarbonInputSplit extends FileSplit
    */
   private Set<Integer> validBlockletIds;
 
+  private transient DataMapRow dataMapRow;
+
+  private transient int[] columnCardinality;
+
+  private transient boolean isLegacyStore;
+
+  private transient List<ColumnSchema> columnSchema;
+
+  private transient boolean useMinMaxForPruning;
+
+  private boolean isBlockCache = true;
+
+  private String filePath;
+
+  private long start;
+
+  private long length;
+
+  private String[] location;
+
+  private transient SplitLocationInfo[] hostInfos;
+
+  private transient Path path;
+
+  private transient String blockPath;
+
   public CarbonInputSplit() {
     segment = null;
     taskId = "0";
     bucketId = "0";
     blockletId = "0";
     numberOfBlocklets = 0;
-    invalidSegments = new ArrayList<>();
     version = CarbonProperties.getInstance().getFormatVersion();
   }
 
-  private CarbonInputSplit(String segmentId, String blockletId, Path path, long start, long length,
-      String[] locations, ColumnarFormatVersion version, String[] deleteDeltaFiles,
+  private CarbonInputSplit(String segmentId, String blockletId, String filePath, long start,
+      long length, ColumnarFormatVersion version, String[] deleteDeltaFiles,
       String dataMapWritePath) {
-    super(path, start, length, locations);
-    this.segment = Segment.toSegment(segmentId);
-    String taskNo = CarbonTablePath.DataFileUtil.getTaskNo(path.getName());
+    this.filePath = filePath;
+    this.start = start;
+    this.length = length;
+    if (null != segmentId) {
+      this.segment = Segment.toSegment(segmentId);
+    }
+    String taskNo = CarbonTablePath.DataFileUtil.getTaskNo(this.filePath);
     if (taskNo.contains("_")) {
       taskNo = taskNo.split("_")[0];
     }
     this.taskId = taskNo;
-    this.bucketId = CarbonTablePath.DataFileUtil.getBucketNo(path.getName());
+    this.bucketId = CarbonTablePath.DataFileUtil.getBucketNo(this.filePath);
     this.blockletId = blockletId;
-    this.invalidSegments = new ArrayList<>();
     this.version = version;
     this.deleteDeltaFiles = deleteDeltaFiles;
     this.dataMapWritePath = dataMapWritePath;
   }
 
-  public CarbonInputSplit(String segmentId, String blockletId, Path path, long start, long length,
-      String[] locations, int numberOfBlocklets, ColumnarFormatVersion version,
+  public CarbonInputSplit(String segmentId, String blockletId, String filePath, long start,
+      long length, String[] locations, int numberOfBlocklets, ColumnarFormatVersion version,
       String[] deleteDeltaFiles) {
-    this(segmentId, blockletId, path, start, length, locations, version, deleteDeltaFiles, null);
+    this(segmentId, blockletId, filePath, start, length, version, deleteDeltaFiles, null);
+    this.location = locations;
     this.numberOfBlocklets = numberOfBlocklets;
   }
-
-  public CarbonInputSplit(String segmentId, Path path, long start, long length, String[] locations,
-      FileFormat fileFormat) {
-    super(path, start, length, locations);
+  public CarbonInputSplit(String segmentId, String filePath, long start, long length,
+      String[] locations, FileFormat fileFormat) {
+    this.filePath = filePath;
+    this.start = start;
+    this.length = length;
+    this.location = locations;
     this.segment = Segment.toSegment(segmentId);
     this.fileFormat = fileFormat;
     taskId = "0";
     bucketId = "0";
     blockletId = "0";
     numberOfBlocklets = 0;
-    invalidSegments = new ArrayList<>();
     version = CarbonProperties.getInstance().getFormatVersion();
   }
 
-  public CarbonInputSplit(String segmentId, Path path, long start, long length, String[] locations,
-      String[] inMemoryHosts, FileFormat fileFormat) {
-    super(path, start, length, locations, inMemoryHosts);
+  public CarbonInputSplit(String segmentId, String filePath, long start, long length,
+      String[] locations, String[] inMemoryHosts, FileFormat fileFormat) {
+    this.filePath = filePath;
+    this.start = start;
+    this.length = length;
+    this.location = locations;
+    this.hostInfos = new SplitLocationInfo[inMemoryHosts.length];
+    for (int i = 0; i < inMemoryHosts.length; i++) {
+      // because N will be tiny, scanning is probably faster than a HashSet
+      boolean inMemory = false;
+      for (String inMemoryHost : inMemoryHosts) {
+        if (inMemoryHost.equals(inMemoryHosts[i])) {
+          inMemory = true;
+          break;
+        }
+      }
+      hostInfos[i] = new SplitLocationInfo(inMemoryHosts[i], inMemory);
+    }
     this.segment = Segment.toSegment(segmentId);
     this.fileFormat = fileFormat;
     taskId = "0";
     bucketId = "0";
     blockletId = "0";
     numberOfBlocklets = 0;
-    invalidSegments = new ArrayList<>();
     version = CarbonProperties.getInstance().getFormatVersion();
   }
 
-  /**
-   * Constructor to initialize the CarbonInputSplit with blockStorageIdMap
-   * @param segmentId
-   * @param path
-   * @param start
-   * @param length
-   * @param locations
-   * @param numberOfBlocklets
-   * @param version
-   * @param blockStorageIdMap
-   */
-  public CarbonInputSplit(String segmentId, String blockletId, Path path, long start, long length,
-      String[] locations, int numberOfBlocklets, ColumnarFormatVersion version,
-      Map<String, String> blockStorageIdMap, String[] deleteDeltaFiles) {
-    this(segmentId, blockletId, path, start, length, locations, numberOfBlocklets, version,
-        deleteDeltaFiles);
-    this.blockStorageIdMap = blockStorageIdMap;
-  }
-
-  public static CarbonInputSplit from(String segmentId, String blockletId, FileSplit split,
-      ColumnarFormatVersion version, String dataMapWritePath) throws IOException {
-    return new CarbonInputSplit(segmentId, blockletId, split.getPath(), split.getStart(),
-        split.getLength(), split.getLocations(), version, null, dataMapWritePath);
+  public static CarbonInputSplit from(String segmentId, String blockletId, String path, long start,
+      long length, ColumnarFormatVersion version, String dataMapWritePath) throws IOException {
+    return new CarbonInputSplit(segmentId, blockletId, path, start, length, version, null,
+        dataMapWritePath);
   }
 
   public static List<TableBlockInfo> createBlocks(List<CarbonInputSplit> splitList) {
@@ -190,7 +206,7 @@ public class CarbonInputSplit extends FileSplit
           new BlockletInfos(split.getNumberOfBlocklets(), 0, split.getNumberOfBlocklets());
       try {
         TableBlockInfo blockInfo =
-            new TableBlockInfo(split.getPath().toString(), split.blockletId, split.getStart(),
+            new TableBlockInfo(split.getFilePath(), split.blockletId, split.getStart(),
                 split.getSegment().toString(), split.getLocations(), split.getLength(),
                 blockletInfos, split.getVersion(), split.getDeleteDeltaFiles());
         blockInfo.setDetailInfo(split.getDetailInfo());
@@ -211,7 +227,7 @@ public class CarbonInputSplit extends FileSplit
         new BlockletInfos(inputSplit.getNumberOfBlocklets(), 0, inputSplit.getNumberOfBlocklets());
     try {
       TableBlockInfo blockInfo =
-          new TableBlockInfo(inputSplit.getPath().toString(), inputSplit.blockletId,
+          new TableBlockInfo(inputSplit.getFilePath(), inputSplit.blockletId,
               inputSplit.getStart(), inputSplit.getSegment().toString(), inputSplit.getLocations(),
               inputSplit.getLength(), blockletInfos, inputSplit.getVersion(),
               inputSplit.getDeleteDeltaFiles());
@@ -237,16 +253,13 @@ public class CarbonInputSplit extends FileSplit
 
 
   @Override public void readFields(DataInput in) throws IOException {
-    super.readFields(in);
+    this.filePath = in.readUTF();
+    this.start = in.readLong();
+    this.length = in.readLong();
     this.segment = Segment.toSegment(in.readUTF());
     this.version = ColumnarFormatVersion.valueOf(in.readShort());
     this.bucketId = in.readUTF();
     this.blockletId = in.readUTF();
-    int numInvalidSegment = in.readInt();
-    invalidSegments = new ArrayList<>(numInvalidSegment);
-    for (int i = 0; i < numInvalidSegment; i++) {
-      invalidSegments.add(in.readUTF());
-    }
     int numberOfDeleteDeltaFiles = in.readInt();
     deleteDeltaFiles = new String[numberOfDeleteDeltaFiles];
     for (int i = 0; i < numberOfDeleteDeltaFiles; i++) {
@@ -269,24 +282,24 @@ public class CarbonInputSplit extends FileSplit
   }
 
   @Override public void write(DataOutput out) throws IOException {
-    super.write(out);
+    out.writeUTF(filePath);
+    out.writeLong(start);
+    out.writeLong(length);
     out.writeUTF(segment.toString());
     out.writeShort(version.number());
     out.writeUTF(bucketId);
     out.writeUTF(blockletId);
-    out.writeInt(invalidSegments.size());
-    for (String invalidSegment : invalidSegments) {
-      out.writeUTF(invalidSegment);
-    }
     out.writeInt(null != deleteDeltaFiles ? deleteDeltaFiles.length : 0);
     if (null != deleteDeltaFiles) {
       for (int i = 0; i < deleteDeltaFiles.length; i++) {
         out.writeUTF(deleteDeltaFiles[i]);
       }
     }
-    out.writeBoolean(detailInfo != null);
+    out.writeBoolean(detailInfo != null || dataMapRow != null);
     if (detailInfo != null) {
       detailInfo.write(out);
+    } else if (dataMapRow != null) {
+      writeBlockletDetailsInfo(out);
     }
     out.writeBoolean(dataMapWritePath != null);
     if (dataMapWritePath != null) {
@@ -298,26 +311,6 @@ public class CarbonInputSplit extends FileSplit
     }
   }
 
-  public List<String> getInvalidSegments() {
-    return invalidSegments;
-  }
-
-  public void setInvalidSegments(List<Segment> invalidSegments) {
-    List<String> invalidSegmentIds = new ArrayList<>();
-    for (Segment segment: invalidSegments) {
-      invalidSegmentIds.add(segment.getSegmentNo());
-    }
-    this.invalidSegments = invalidSegmentIds;
-  }
-
-  public void setInvalidTimestampRange(List<UpdateVO> invalidTimestamps) {
-    invalidTimestampsList = invalidTimestamps;
-  }
-
-  public List<UpdateVO> getInvalidTimestampRange() {
-    return invalidTimestampsList;
-  }
-
   /**
    * returns the number of blocklets
    *
@@ -351,7 +344,7 @@ public class CarbonInputSplit extends FileSplit
     // converr seg ID to double.
 
     double seg1 = Double.parseDouble(segment.getSegmentNo());
-    double seg2 = Double.parseDouble(other.getSegmentId());
+    double seg2 = Double.parseDouble(other.segment.getSegmentNo());
     if (seg1 - seg2 < 0) {
       return -1;
     }
@@ -363,8 +356,8 @@ public class CarbonInputSplit extends FileSplit
     // if both the task id of the file is same then we need to compare the
     // offset of
     // the file
-    String filePath1 = this.getPath().getName();
-    String filePath2 = other.getPath().getName();
+    String filePath1 = this.getFilePath();
+    String filePath2 = other.getFilePath();
     if (CarbonTablePath.isCarbonDataFile(filePath1)) {
       byte[] firstTaskId = CarbonTablePath.DataFileUtil.getTaskNo(filePath1)
           .getBytes(Charset.forName(CarbonCommonConstants.DEFAULT_CHARSET));
@@ -410,13 +403,15 @@ public class CarbonInputSplit extends FileSplit
     int result = taskId.hashCode();
     result = 31 * result + segment.hashCode();
     result = 31 * result + bucketId.hashCode();
-    result = 31 * result + invalidSegments.hashCode();
     result = 31 * result + numberOfBlocklets;
     return result;
   }
 
   @Override public String getBlockPath() {
-    return getPath().getName();
+    if (null == blockPath) {
+      blockPath = getPath().getName();
+    }
+    return blockPath;
   }
 
   @Override public List<Long> getMatchedBlocklets() {
@@ -429,10 +424,11 @@ public class CarbonInputSplit extends FileSplit
 
   /**
    * returns map of blocklocation and storage id
+   *
    * @return
    */
   public Map<String, String> getBlockStorageIdMap() {
-    return blockStorageIdMap;
+    return new HashMap<>();
   }
 
   public String[] getDeleteDeltaFiles() {
@@ -443,10 +439,6 @@ public class CarbonInputSplit extends FileSplit
     this.deleteDeltaFiles = deleteDeltaFiles;
   }
 
-  public BlockletDetailInfo getDetailInfo() {
-    return detailInfo;
-  }
-
   public void setDetailInfo(BlockletDetailInfo detailInfo) {
     this.detailInfo = detailInfo;
   }
@@ -459,10 +451,6 @@ public class CarbonInputSplit extends FileSplit
     this.fileFormat = fileFormat;
   }
 
-  public Blocklet makeBlocklet() {
-    return new Blocklet(getPath().getName(), blockletId);
-  }
-
   public Set<Integer> getValidBlockletIds() {
     if (null == validBlockletIds) {
       validBlockletIds = new HashSet<>();
@@ -474,4 +462,158 @@ public class CarbonInputSplit extends FileSplit
     this.validBlockletIds = validBlockletIds;
   }
 
+  public void setDataMapWritePath(String dataMapWritePath) {
+    this.dataMapWritePath = dataMapWritePath;
+  }
+
+  public void setSegment(Segment segment) {
+    this.segment = segment;
+  }
+
+  public String getDataMapWritePath() {
+    return dataMapWritePath;
+  }
+
+  public void setDataMapRow(DataMapRow dataMapRow) {
+    this.dataMapRow = dataMapRow;
+  }
+
+  public void setColumnCardinality(int[] columnCardinality) {
+    this.columnCardinality = columnCardinality;
+  }
+
+  public void setLegacyStore(boolean legacyStore) {
+    isLegacyStore = legacyStore;
+  }
+
+  public void setColumnSchema(List<ColumnSchema> columnSchema) {
+    this.columnSchema = columnSchema;
+  }
+
+  public void setUseMinMaxForPruning(boolean useMinMaxForPruning) {
+    this.useMinMaxForPruning = useMinMaxForPruning;
+  }
+
+  public void setIsBlockCache(boolean isBlockCache) {
+    this.isBlockCache = isBlockCache;
+  }
+
+  private void writeBlockletDetailsInfo(DataOutput out) throws IOException {
+    out.writeInt(this.dataMapRow.getInt(BlockletDataMapRowIndexes.ROW_COUNT_INDEX));
+    if (this.isBlockCache) {
+      out.writeShort(0);
+    } else {
+      out.writeShort(this.dataMapRow.getShort(BlockletDataMapRowIndexes.BLOCKLET_PAGE_COUNT_INDEX));
+    }
+    out.writeShort(this.dataMapRow.getShort(BlockletDataMapRowIndexes.VERSION_INDEX));
+    out.writeShort(Short.parseShort(this.blockletId));
+    out.writeShort(this.columnCardinality.length);
+    for (int i = 0; i < this.columnCardinality.length; i++) {
+      out.writeInt(this.columnCardinality[i]);
+    }
+    out.writeLong(this.dataMapRow.getLong(BlockletDataMapRowIndexes.SCHEMA_UPADATED_TIME_INDEX));
+    out.writeBoolean(false);
+    out.writeLong(this.dataMapRow.getLong(BlockletDataMapRowIndexes.BLOCK_FOOTER_OFFSET));
+    // write the column schema binary with its length when available; otherwise write -1
+    // so that at the time of reading it can be distinguished whether schema is written or not
+    if (null != this.columnSchema) {
+      byte[] columnSchemaBinary = BlockletDataMapUtil.convertSchemaToBinary(this.columnSchema);
+      out.writeInt(columnSchemaBinary.length);
+      out.write(columnSchemaBinary);
+    } else {
+      // write -1 if columnSchemaBinary is null so that at the time of reading it can distinguish
+      // whether schema is written or not
+      out.writeInt(-1);
+    }
+    if (this.isBlockCache) {
+      out.writeInt(0);
+      out.write(new byte[0]);
+    } else {
+      byte[] blockletInfoBinary =
+          this.dataMapRow.getByteArray(BlockletDataMapRowIndexes.BLOCKLET_INFO_INDEX);
+      out.writeInt(blockletInfoBinary.length);
+      out.write(blockletInfoBinary);
+    }
+    out.writeLong(this.dataMapRow.getLong(BlockletDataMapRowIndexes.BLOCK_LENGTH));
+    out.writeBoolean(this.isLegacyStore);
+    out.writeBoolean(this.useMinMaxForPruning);
+  }
+
+  public BlockletDetailInfo getDetailInfo() {
+    if (null != dataMapRow && detailInfo == null) {
+      detailInfo = new BlockletDetailInfo();
+      detailInfo
+          .setRowCount(this.dataMapRow.getInt(BlockletDataMapRowIndexes.ROW_COUNT_INDEX));
+      detailInfo
+          .setVersionNumber(this.dataMapRow.getShort(BlockletDataMapRowIndexes.VERSION_INDEX));
+      detailInfo.setBlockletId(Short.parseShort(this.blockletId));
+      detailInfo.setDimLens(this.columnCardinality);
+      detailInfo.setSchemaUpdatedTimeStamp(
+          this.dataMapRow.getLong(BlockletDataMapRowIndexes.SCHEMA_UPADATED_TIME_INDEX));
+      detailInfo.setBlockFooterOffset(
+          this.dataMapRow.getLong(BlockletDataMapRowIndexes.BLOCK_FOOTER_OFFSET));
+      detailInfo
+          .setBlockSize(this.dataMapRow.getLong(BlockletDataMapRowIndexes.BLOCK_LENGTH));
+      detailInfo.setLegacyStore(isLegacyStore);
+      detailInfo.setUseMinMaxForPruning(useMinMaxForPruning);
+      if (!this.isBlockCache) {
+        detailInfo.setColumnSchemas(this.columnSchema);
+        detailInfo.setPagesCount(
+            this.dataMapRow.getShort(BlockletDataMapRowIndexes.BLOCKLET_PAGE_COUNT_INDEX));
+        detailInfo.setBlockletInfoBinary(
+            this.dataMapRow.getByteArray(BlockletDataMapRowIndexes.BLOCKLET_INFO_INDEX));
+      } else {
+        detailInfo.setBlockletInfoBinary(new byte[0]);
+      }
+      if (location == null) {
+        try {
+          location = new String(dataMapRow.getByteArray(BlockletDataMapRowIndexes.LOCATIONS),
+              CarbonCommonConstants.DEFAULT_CHARSET).split(",");
+        } catch (IOException e) {
+          throw new RuntimeException(e);
+        }
+      }
+      dataMapRow = null;
+    }
+    return detailInfo;
+  }
+
+  @Override
+  public SplitLocationInfo[] getLocationInfo() throws IOException {
+    return hostInfos;
+  }
+
+  /**
+   * The file containing this split's data.
+   */
+  public Path getPath() {
+    if (path == null) {
+      path = new Path(filePath);
+      return path;
+    }
+    return path;
+  }
+
+  public String getFilePath() {
+    return this.filePath;
+  }
+
+  /** The position of the first byte in the file to process. */
+  public long getStart() { return start; }
+
+  @Override
+  public long getLength() { return length; }
+
+  @Override
+  public String toString() { return filePath + ":" + start + "+" + length; }
+
+  @Override public String[] getLocations() throws IOException {
+    if (this.location == null && dataMapRow == null) {
+      return new String[] {};
+    } else if (dataMapRow != null) {
+      location = new String(dataMapRow.getByteArray(BlockletDataMapRowIndexes.LOCATIONS),
+          CarbonCommonConstants.DEFAULT_CHARSET).split(",");
+    }
+    return this.location;
+  }
 }
diff --git a/hadoop/src/main/java/org/apache/carbondata/hadoop/internal/ObjectArrayWritable.java b/core/src/main/java/org/apache/carbondata/hadoop/internal/ObjectArrayWritable.java
similarity index 100%
rename from hadoop/src/main/java/org/apache/carbondata/hadoop/internal/ObjectArrayWritable.java
rename to core/src/main/java/org/apache/carbondata/hadoop/internal/ObjectArrayWritable.java
diff --git a/hadoop/src/main/java/org/apache/carbondata/hadoop/internal/index/Block.java b/core/src/main/java/org/apache/carbondata/hadoop/internal/index/Block.java
similarity index 100%
rename from hadoop/src/main/java/org/apache/carbondata/hadoop/internal/index/Block.java
rename to core/src/main/java/org/apache/carbondata/hadoop/internal/index/Block.java
diff --git a/hadoop/src/main/java/org/apache/carbondata/hadoop/CarbonMultiBlockSplit.java b/hadoop/src/main/java/org/apache/carbondata/hadoop/CarbonMultiBlockSplit.java
index 0b991cb..4c99c4f 100644
--- a/hadoop/src/main/java/org/apache/carbondata/hadoop/CarbonMultiBlockSplit.java
+++ b/hadoop/src/main/java/org/apache/carbondata/hadoop/CarbonMultiBlockSplit.java
@@ -100,7 +100,7 @@ public class CarbonMultiBlockSplit extends InputSplit implements Serializable, W
 
   public void calculateLength() {
     long total = 0;
-    if (splitList.size() > 0 && splitList.get(0).getDetailInfo() != null) {
+    if (splitList.size() > 1 && splitList.get(0).getDetailInfo() != null) {
       Map<String, Long> blockSizes = new HashMap<>();
       for (CarbonInputSplit split : splitList) {
         blockSizes.put(split.getBlockPath(), split.getDetailInfo().getBlockSize());
@@ -116,11 +116,21 @@ public class CarbonMultiBlockSplit extends InputSplit implements Serializable, W
     length = total;
   }
 
-  @Override
-  public String[] getLocations() {
+  @Override public String[] getLocations() {
+    getLocationIfNull();
     return locations;
   }
 
+  private void getLocationIfNull() {
+    try {
+      if (locations == null && splitList.size() == 1) {
+        this.locations = this.splitList.get(0).getLocations();
+      }
+    } catch (IOException e) {
+      throw new RuntimeException(e);
+    }
+  }
+
   @Override
   public void write(DataOutput out) throws IOException {
     // write number of splits and then write all splits
@@ -128,6 +138,7 @@ public class CarbonMultiBlockSplit extends InputSplit implements Serializable, W
     for (CarbonInputSplit split: splitList) {
       split.write(out);
     }
+    getLocationIfNull();
     out.writeInt(locations.length);
     for (int i = 0; i < locations.length; i++) {
       out.writeUTF(locations[i]);
diff --git a/hadoop/src/main/java/org/apache/carbondata/hadoop/CarbonRecordReader.java b/hadoop/src/main/java/org/apache/carbondata/hadoop/CarbonRecordReader.java
index 1dfead3..1a529e3 100644
--- a/hadoop/src/main/java/org/apache/carbondata/hadoop/CarbonRecordReader.java
+++ b/hadoop/src/main/java/org/apache/carbondata/hadoop/CarbonRecordReader.java
@@ -81,7 +81,7 @@ public class CarbonRecordReader<T> extends AbstractRecordReader<T> {
     List<CarbonInputSplit> splitList;
     if (inputSplit instanceof CarbonInputSplit) {
       splitList = new ArrayList<>(1);
-      String splitPath = ((CarbonInputSplit) inputSplit).getPath().toString();
+      String splitPath = ((CarbonInputSplit) inputSplit).getFilePath();
       // BlockFooterOffSet will be null in case of CarbonVectorizedReader as this has to be set
       // where multiple threads are able to read small set of files to calculate footer instead
       // of the main thread setting this for all the files.
@@ -162,7 +162,7 @@ public class CarbonRecordReader<T> extends AbstractRecordReader<T> {
     if (!skipClearDataMapAtClose) {
       // Clear the datamap cache
       DataMapStoreManager.getInstance().clearDataMaps(
-          queryModel.getTable().getAbsoluteTableIdentifier());
+          queryModel.getTable().getAbsoluteTableIdentifier(), false);
     }
     // close read support
     readSupport.close();
diff --git a/hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonFileInputFormat.java b/hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonFileInputFormat.java
index 7c08dd9..d81b02c 100644
--- a/hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonFileInputFormat.java
+++ b/hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonFileInputFormat.java
@@ -48,7 +48,6 @@ import org.apache.carbondata.core.util.path.CarbonTablePath;
 import org.apache.carbondata.hadoop.CarbonInputSplit;
 
 import org.apache.hadoop.conf.Configuration;
-import org.apache.hadoop.fs.Path;
 import org.apache.hadoop.mapreduce.InputSplit;
 import org.apache.hadoop.mapreduce.JobContext;
 
@@ -167,7 +166,7 @@ public class CarbonFileInputFormat<T> extends CarbonInputFormat<T> implements Se
           // Segment id is set to null because SDK does not write carbondata files with respect
           // to segments. So no specific name is present for this load.
           CarbonInputSplit split =
-              new CarbonInputSplit("null", new Path(carbonFile.getAbsolutePath()), 0,
+              new CarbonInputSplit("null", carbonFile.getAbsolutePath(), 0,
                   carbonFile.getLength(), carbonFile.getLocations(), FileFormat.COLUMNAR_V3);
           split.setVersion(ColumnarFormatVersion.V3);
           BlockletDetailInfo info = new BlockletDetailInfo();
@@ -179,7 +178,8 @@ public class CarbonFileInputFormat<T> extends CarbonInputFormat<T> implements Se
         }
         Collections.sort(splits, new Comparator<InputSplit>() {
           @Override public int compare(InputSplit o1, InputSplit o2) {
-            return ((CarbonInputSplit) o1).getPath().compareTo(((CarbonInputSplit) o2).getPath());
+            return ((CarbonInputSplit) o1).getFilePath()
+                .compareTo(((CarbonInputSplit) o2).getFilePath());
           }
         });
       }
diff --git a/hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonInputFormat.java b/hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonInputFormat.java
index 26144e2..aba0ab7 100644
--- a/hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonInputFormat.java
+++ b/hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonInputFormat.java
@@ -21,7 +21,14 @@ import java.io.ByteArrayInputStream;
 import java.io.DataInputStream;
 import java.io.IOException;
 import java.lang.reflect.Constructor;
-import java.util.*;
+import java.util.ArrayList;
+import java.util.BitSet;
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.List;
+import java.util.Map;
+import java.util.Objects;
+import java.util.Set;
 
 import org.apache.carbondata.common.logging.LogServiceFactory;
 import org.apache.carbondata.core.constants.CarbonCommonConstants;
@@ -38,13 +45,11 @@ import org.apache.carbondata.core.exception.InvalidConfigurationException;
 import org.apache.carbondata.core.indexstore.ExtendedBlocklet;
 import org.apache.carbondata.core.indexstore.PartitionSpec;
 import org.apache.carbondata.core.metadata.AbsoluteTableIdentifier;
-import org.apache.carbondata.core.metadata.ColumnarFormatVersion;
 import org.apache.carbondata.core.metadata.schema.PartitionInfo;
 import org.apache.carbondata.core.metadata.schema.partition.PartitionType;
 import org.apache.carbondata.core.metadata.schema.table.CarbonTable;
 import org.apache.carbondata.core.metadata.schema.table.TableInfo;
 import org.apache.carbondata.core.metadata.schema.table.column.ColumnSchema;
-import org.apache.carbondata.core.mutate.UpdateVO;
 import org.apache.carbondata.core.profiler.ExplainCollector;
 import org.apache.carbondata.core.readcommitter.ReadCommittedScope;
 import org.apache.carbondata.core.scan.expression.Expression;
@@ -80,7 +85,6 @@ import org.apache.hadoop.mapreduce.JobContext;
 import org.apache.hadoop.mapreduce.RecordReader;
 import org.apache.hadoop.mapreduce.TaskAttemptContext;
 import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
-import org.apache.hadoop.mapreduce.lib.input.FileSplit;
 import org.apache.hadoop.mapreduce.security.TokenCache;
 import org.apache.log4j.Logger;
 
@@ -408,7 +412,6 @@ m filterExpression
         new Path[] { new Path(carbonTable.getTablePath()) }, job.getConfiguration());
     List<ExtendedBlocklet> prunedBlocklets =
         getPrunedBlocklets(job, carbonTable, expression, segmentIds);
-
     List<CarbonInputSplit> resultFilteredBlocks = new ArrayList<>();
     int partitionIndex = 0;
     List<Integer> partitionIdList = new ArrayList<>();
@@ -416,13 +419,13 @@ m filterExpression
       partitionIdList = partitionInfo.getPartitionIds();
     }
     for (ExtendedBlocklet blocklet : prunedBlocklets) {
-      long partitionId = CarbonTablePath.DataFileUtil
-          .getTaskIdFromTaskNo(CarbonTablePath.DataFileUtil.getTaskNo(blocklet.getPath()));
 
       // OldPartitionIdList is only used in alter table partition command because it change
       // partition info first and then read data.
       // For other normal query should use newest partitionIdList
       if (partitionInfo != null && partitionInfo.getPartitionType() != PartitionType.NATIVE_HIVE) {
+        long partitionId = CarbonTablePath.DataFileUtil
+            .getTaskIdFromTaskNo(CarbonTablePath.DataFileUtil.getTaskNo(blocklet.getPath()));
         if (oldPartitionIdList != null) {
           partitionIndex = oldPartitionIdList.indexOf((int) partitionId);
         } else {
@@ -436,10 +439,7 @@ m filterExpression
         // for partition table, the task id of carbaondata file name is the partition id.
         // if this partition is not required, here will skip it.
         if (matchedPartitions == null || matchedPartitions.get(partitionIndex)) {
-          CarbonInputSplit inputSplit = convertToCarbonInputSplit(blocklet);
-          if (inputSplit != null) {
-            resultFilteredBlocks.add(inputSplit);
-          }
+          resultFilteredBlocks.add(blocklet.getInputSplit());
         }
       }
     }
@@ -493,7 +493,9 @@ m filterExpression
       prunedBlocklets = defaultDataMap.prune(segmentIds, expression, partitionsToPrune);
     }
 
-    ExplainCollector.setDefaultDataMapPruningBlockHit(getBlockCount(prunedBlocklets));
+    if (ExplainCollector.enabled()) {
+      ExplainCollector.setDefaultDataMapPruningBlockHit(getBlockCount(prunedBlocklets));
+    }
 
     if (prunedBlocklets.size() == 0) {
       return prunedBlocklets;
@@ -577,7 +579,7 @@ m filterExpression
       segment.getFilteredIndexShardNames().clear();
       // Check the segment exist in any of the pruned blocklets.
       for (ExtendedBlocklet blocklet : prunedBlocklets) {
-        if (blocklet.getSegmentId().equals(segment.toString())) {
+        if (blocklet.getSegment().toString().equals(segment.toString())) {
           found = true;
           // Set the pruned index file to the segment for further pruning.
           String shardName = CarbonTablePath.getShardName(blocklet.getFilePath());
@@ -593,17 +595,6 @@ m filterExpression
     segments.removeAll(toBeRemovedSegments);
   }
 
-  private CarbonInputSplit convertToCarbonInputSplit(ExtendedBlocklet blocklet) throws IOException {
-    CarbonInputSplit split = CarbonInputSplit
-        .from(blocklet.getSegmentId(), blocklet.getBlockletId(),
-            new FileSplit(new Path(blocklet.getPath()), 0, blocklet.getLength(),
-                blocklet.getLocations()),
-            ColumnarFormatVersion.valueOf((short) blocklet.getDetailInfo().getVersionNumber()),
-            blocklet.getDataMapWriterPath());
-    split.setDetailInfo(blocklet.getDetailInfo());
-    return split;
-  }
-
   @Override public RecordReader<Void, T> createRecordReader(InputSplit inputSplit,
       TaskAttemptContext taskAttemptContext) throws IOException, InterruptedException {
     Configuration configuration = taskAttemptContext.getConfiguration();
@@ -639,20 +630,6 @@ m filterExpression
         .filterExpression(filterExpression)
         .dataConverter(getDataTypeConverter(configuration))
         .build();
-
-    // update the file level index store if there are invalid segment
-    if (inputSplit instanceof CarbonMultiBlockSplit) {
-      CarbonMultiBlockSplit split = (CarbonMultiBlockSplit) inputSplit;
-      List<String> invalidSegments = split.getAllSplits().get(0).getInvalidSegments();
-      if (invalidSegments.size() > 0) {
-        queryModel.setInvalidSegmentIds(invalidSegments);
-      }
-      List<UpdateVO> invalidTimestampRangeList =
-          split.getAllSplits().get(0).getInvalidTimestampRange();
-      if ((null != invalidTimestampRangeList) && (invalidTimestampRangeList.size() > 0)) {
-        queryModel.setInvalidBlockForSegmentId(invalidTimestampRangeList);
-      }
-    }
     return queryModel;
   }
 
@@ -672,7 +649,7 @@ m filterExpression
       for (CarbonInputSplit carbonInputSplit : splits) {
         Set<Integer> validBlockletIds = carbonInputSplit.getValidBlockletIds();
         if (null != validBlockletIds && !validBlockletIds.isEmpty()) {
-          String uniqueBlockPath = carbonInputSplit.getPath().toString();
+          String uniqueBlockPath = carbonInputSplit.getFilePath();
           String shortBlockPath = CarbonTablePath
               .getShortBlockId(uniqueBlockPath.substring(uniqueBlockPath.lastIndexOf("/Part") + 1));
           blockIdToBlockletIdMapping.put(shortBlockPath, validBlockletIds);
diff --git a/hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableInputFormat.java b/hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableInputFormat.java
index 4ba8b8c..a7ca290 100644
--- a/hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableInputFormat.java
+++ b/hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableInputFormat.java
@@ -141,7 +141,6 @@ public class CarbonTableInputFormat<T> extends CarbonInputFormat<T> {
     SegmentUpdateStatusManager updateStatusManager =
         new SegmentUpdateStatusManager(carbonTable, loadMetadataDetails);
     List<Segment> invalidSegments = new ArrayList<>();
-    List<UpdateVO> invalidTimestampsList = new ArrayList<>();
     List<Segment> streamSegments = null;
     // get all valid segments and set them into the configuration
     SegmentStatusManager segmentStatusManager = new SegmentStatusManager(identifier,
@@ -179,7 +178,6 @@ public class CarbonTableInputFormat<T> extends CarbonInputFormat<T> {
       }
       // remove entry in the segment index if there are invalid segments
       invalidSegments.addAll(segments.getInvalidSegments());
-      invalidTimestampsList.addAll(updateStatusManager.getInvalidTimestampRange());
       if (invalidSegments.size() > 0) {
         DataMapStoreManager.getInstance()
             .clearInvalidSegments(getOrCreateCarbonTable(job.getConfiguration()), invalidSegments);
@@ -219,15 +217,6 @@ public class CarbonTableInputFormat<T> extends CarbonInputFormat<T> {
     List<InputSplit> splits =
         getSplits(job, filter, filteredSegmentToAccess, matchedPartitions, partitionInfo,
             null, updateStatusManager);
-    // pass the invalid segment to task side in order to remove index entry in task side
-    if (invalidSegments.size() > 0) {
-      for (InputSplit split : splits) {
-        ((org.apache.carbondata.hadoop.CarbonInputSplit) split).setInvalidSegments(invalidSegments);
-        ((org.apache.carbondata.hadoop.CarbonInputSplit) split)
-            .setInvalidTimestampRange(invalidTimestampsList);
-      }
-    }
-
     // add all splits of streaming
     List<InputSplit> splitsOfStreaming = getSplitsOfStreaming(job, streamSegments, carbonTable);
     if (!splitsOfStreaming.isEmpty()) {
@@ -320,13 +309,6 @@ public class CarbonTableInputFormat<T> extends CarbonInputFormat<T> {
           }
         }
       }
-      if (filteredSegmentToAccess.size() != segmentToAccessSet.size() && !validationRequired) {
-        for (Segment segment : segmentToAccessSet) {
-          if (!filteredSegmentToAccess.contains(segment)) {
-            filteredSegmentToAccess.add(segment);
-          }
-        }
-      }
       if (!filteredSegmentToAccess.containsAll(segmentToAccessSet)) {
         List<Segment> filteredSegmentToAccessTemp = new ArrayList<>(filteredSegmentToAccess);
         filteredSegmentToAccessTemp.removeAll(segmentToAccessSet);
@@ -383,16 +365,15 @@ public class CarbonTableInputFormat<T> extends CarbonInputFormat<T> {
           // there is 10% slop to avoid to generate very small split in the end
           while (((double) bytesRemaining) / splitSize > 1.1) {
             int blkIndex = getBlockIndex(blkLocations, length - bytesRemaining);
-            splits.add(
-                makeSplit(streamFile.getSegmentNo(), path, length - bytesRemaining,
-                    splitSize, blkLocations[blkIndex].getHosts(),
+            splits.add(makeSplit(streamFile.getSegmentNo(), streamFile.getFilePath(),
+                length - bytesRemaining, splitSize, blkLocations[blkIndex].getHosts(),
                     blkLocations[blkIndex].getCachedHosts(), FileFormat.ROW_V1));
             bytesRemaining -= splitSize;
           }
           if (bytesRemaining != 0) {
             int blkIndex = getBlockIndex(blkLocations, length - bytesRemaining);
-            splits.add(makeSplit(streamFile.getSegmentNo(), path, length - bytesRemaining,
-                bytesRemaining, blkLocations[blkIndex].getHosts(),
+            splits.add(makeSplit(streamFile.getSegmentNo(), streamFile.getFilePath(),
+                length - bytesRemaining, bytesRemaining, blkLocations[blkIndex].getHosts(),
                 blkLocations[blkIndex].getCachedHosts(), FileFormat.ROW_V1));
           }
         }
@@ -401,15 +382,10 @@ public class CarbonTableInputFormat<T> extends CarbonInputFormat<T> {
     return splits;
   }
 
-  protected FileSplit makeSplit(String segmentId, Path file, long start, long length,
-      String[] hosts, FileFormat fileFormat) {
-    return new CarbonInputSplit(segmentId, file, start, length, hosts, fileFormat);
-  }
-
-
-  protected FileSplit makeSplit(String segmentId, Path file, long start, long length,
+  protected FileSplit makeSplit(String segmentId, String filePath, long start, long length,
       String[] hosts, String[] inMemoryHosts, FileFormat fileFormat) {
-    return new CarbonInputSplit(segmentId, file, start, length, hosts, inMemoryHosts, fileFormat);
+    return new CarbonInputSplit(segmentId, filePath, start, length, hosts, inMemoryHosts,
+        fileFormat);
   }
 
   /**
@@ -421,10 +397,6 @@ public class CarbonTableInputFormat<T> extends CarbonInputFormat<T> {
    */
   public List<InputSplit> getSplitsOfOneSegment(JobContext job, String targetSegment,
       List<Integer> oldPartitionIdList, PartitionInfo partitionInfo) {
-    List<Segment> invalidSegments = new ArrayList<>();
-    List<UpdateVO> invalidTimestampsList = new ArrayList<>();
-
-
     try {
       carbonTable = getOrCreateCarbonTable(job.getConfiguration());
       ReadCommittedScope readCommittedScope =
@@ -464,13 +436,6 @@ public class CarbonTableInputFormat<T> extends CarbonInputFormat<T> {
       // do block filtering and get split
       List<InputSplit> splits = getSplits(job, filter, segmentList, matchedPartitions,
           partitionInfo, oldPartitionIdList, new SegmentUpdateStatusManager(carbonTable));
-      // pass the invalid segment to task side in order to remove index entry in task side
-      if (invalidSegments.size() > 0) {
-        for (InputSplit split : splits) {
-          ((CarbonInputSplit) split).setInvalidSegments(invalidSegments);
-          ((CarbonInputSplit) split).setInvalidTimestampRange(invalidTimestampsList);
-        }
-      }
       return splits;
     } catch (IOException e) {
       throw new RuntimeException("Can't get splits of the target segment ", e);
@@ -541,7 +506,7 @@ public class CarbonTableInputFormat<T> extends CarbonInputFormat<T> {
         // In case IUD is not performed in this table avoid searching for
         // invalidated blocks.
         if (CarbonUtil
-            .isInvalidTableBlock(inputSplit.getSegmentId(), inputSplit.getPath().toString(),
+            .isInvalidTableBlock(inputSplit.getSegmentId(), inputSplit.getFilePath(),
                 invalidBlockVOForSegmentId, updateStatusManager)) {
           continue;
         }
diff --git a/hadoop/src/main/java/org/apache/carbondata/hadoop/util/CarbonVectorizedRecordReader.java b/hadoop/src/main/java/org/apache/carbondata/hadoop/util/CarbonVectorizedRecordReader.java
index e18a4d4..1c11275 100644
--- a/hadoop/src/main/java/org/apache/carbondata/hadoop/util/CarbonVectorizedRecordReader.java
+++ b/hadoop/src/main/java/org/apache/carbondata/hadoop/util/CarbonVectorizedRecordReader.java
@@ -81,7 +81,7 @@ public class CarbonVectorizedRecordReader extends AbstractRecordReader<Object> {
     List<CarbonInputSplit> splitList;
     if (inputSplit instanceof CarbonInputSplit) {
       // Read the footer offset and set.
-      String splitPath = ((CarbonInputSplit) inputSplit).getPath().toString();
+      String splitPath = ((CarbonInputSplit) inputSplit).getFilePath();
       if (((CarbonInputSplit) inputSplit).getDetailInfo().getBlockFooterOffset() == 0L) {
         FileReader reader = FileFactory.getFileHolder(FileFactory.getFileType(splitPath),
             taskAttemptContext.getConfiguration());
diff --git a/integration/presto/src/main/java/org/apache/carbondata/presto/impl/CarbonLocalInputSplit.java b/integration/presto/src/main/java/org/apache/carbondata/presto/impl/CarbonLocalInputSplit.java
index f4f50a5..f68234c 100755
--- a/integration/presto/src/main/java/org/apache/carbondata/presto/impl/CarbonLocalInputSplit.java
+++ b/integration/presto/src/main/java/org/apache/carbondata/presto/impl/CarbonLocalInputSplit.java
@@ -29,8 +29,6 @@ import com.fasterxml.jackson.annotation.JsonCreator;
 import com.fasterxml.jackson.annotation.JsonProperty;
 import com.google.gson.Gson;
 
-import org.apache.hadoop.fs.Path;
-
 /**
  * CarbonLocalInputSplit represents a block, it contains a set of blocklet.
  */
@@ -136,7 +134,7 @@ public class CarbonLocalInputSplit {
 
   public static CarbonInputSplit convertSplit(CarbonLocalInputSplit carbonLocalInputSplit) {
     CarbonInputSplit inputSplit = new CarbonInputSplit(carbonLocalInputSplit.getSegmentId(),
-        carbonLocalInputSplit.getBlockletId(), new Path(carbonLocalInputSplit.getPath()),
+        carbonLocalInputSplit.getBlockletId(), carbonLocalInputSplit.getPath(),
         carbonLocalInputSplit.getStart(), carbonLocalInputSplit.getLength(),
         carbonLocalInputSplit.getLocations()
             .toArray(new String[carbonLocalInputSplit.getLocations().size()]),
diff --git a/integration/spark-common/src/main/java/org/apache/carbondata/spark/util/Util.java b/integration/spark-common/src/main/java/org/apache/carbondata/spark/util/Util.java
index d1193f5..20a2d39 100644
--- a/integration/spark-common/src/main/java/org/apache/carbondata/spark/util/Util.java
+++ b/integration/spark-common/src/main/java/org/apache/carbondata/spark/util/Util.java
@@ -52,7 +52,8 @@ public class Util {
    */
   public static boolean isBlockWithoutBlockletInfoExists(List<CarbonInputSplit> splitList) {
     for (CarbonInputSplit inputSplit : splitList) {
-      if (null == inputSplit.getDetailInfo().getBlockletInfo()) {
+      if (null == inputSplit.getDetailInfo() || null == inputSplit.getDetailInfo()
+          .getBlockletInfo()) {
         return true;
       }
     }
diff --git a/integration/spark-common/src/main/scala/org/apache/carbondata/spark/rdd/CarbonMergerRDD.scala b/integration/spark-common/src/main/scala/org/apache/carbondata/spark/rdd/CarbonMergerRDD.scala
index 96d288f..0e44f6d 100644
--- a/integration/spark-common/src/main/scala/org/apache/carbondata/spark/rdd/CarbonMergerRDD.scala
+++ b/integration/spark-common/src/main/scala/org/apache/carbondata/spark/rdd/CarbonMergerRDD.scala
@@ -330,11 +330,11 @@ class CarbonMergerRDD[K, V](
           .filter { split => FileFormat.COLUMNAR_V3.equals(split.getFileFormat) }.toList.asJava
       }
       carbonInputSplits ++:= splits.asScala.map(_.asInstanceOf[CarbonInputSplit]).filter{ entry =>
-        val blockInfo = new TableBlockInfo(entry.getPath.toString,
+        val blockInfo = new TableBlockInfo(entry.getFilePath,
           entry.getStart, entry.getSegmentId,
           entry.getLocations, entry.getLength, entry.getVersion,
           updateStatusManager.getDeleteDeltaFilePath(
-            entry.getPath.toString,
+            entry.getFilePath,
             Segment.toSegment(entry.getSegmentId).getSegmentNo)
         )
         (!updated || (updated && (!CarbonUtil
diff --git a/integration/spark-common/src/main/scala/org/apache/carbondata/spark/rdd/CarbonScanRDD.scala b/integration/spark-common/src/main/scala/org/apache/carbondata/spark/rdd/CarbonScanRDD.scala
index 0ab6a3a..9e66139 100644
--- a/integration/spark-common/src/main/scala/org/apache/carbondata/spark/rdd/CarbonScanRDD.scala
+++ b/integration/spark-common/src/main/scala/org/apache/carbondata/spark/rdd/CarbonScanRDD.scala
@@ -344,13 +344,12 @@ class CarbonScanRDD[T: ClassTag](
           closePartition()
         } else {
           // Use block distribution
-          splits.asScala.map(_.asInstanceOf[CarbonInputSplit]).groupBy { f =>
-            f.getSegmentId.concat(f.getBlockPath)
-          }.values.zipWithIndex.foreach { splitWithIndex =>
+          splits.asScala.map(_.asInstanceOf[CarbonInputSplit]).zipWithIndex.foreach {
+            splitWithIndex =>
             val multiBlockSplit =
               new CarbonMultiBlockSplit(
-                splitWithIndex._1.asJava,
-                splitWithIndex._1.flatMap(f => f.getLocations).distinct.toArray)
+                Seq(splitWithIndex._1).asJava,
+                null)
             val partition = new CarbonSparkPartition(id, splitWithIndex._2, multiBlockSplit)
             result.add(partition)
           }
@@ -704,7 +703,7 @@ class CarbonScanRDD[T: ClassTag](
             }.asInstanceOf[java.util.List[CarbonInputSplit]]
             // for each split and given block path set all the valid blocklet ids
             splitList.asScala.map { split =>
-              val uniqueBlockPath = split.getPath.toString
+              val uniqueBlockPath = split.getFilePath
               val shortBlockPath = CarbonTablePath
                 .getShortBlockId(uniqueBlockPath
                   .substring(uniqueBlockPath.lastIndexOf("/Part") + 1))
diff --git a/integration/spark-datasource/src/main/scala/org/apache/spark/sql/carbondata/execution/datasources/SparkCarbonFileFormat.scala b/integration/spark-datasource/src/main/scala/org/apache/spark/sql/carbondata/execution/datasources/SparkCarbonFileFormat.scala
index f725de3..6819a4c 100644
--- a/integration/spark-datasource/src/main/scala/org/apache/spark/sql/carbondata/execution/datasources/SparkCarbonFileFormat.scala
+++ b/integration/spark-datasource/src/main/scala/org/apache/spark/sql/carbondata/execution/datasources/SparkCarbonFileFormat.scala
@@ -383,7 +383,7 @@ class SparkCarbonFileFormat extends FileFormat
 
       if (file.filePath.endsWith(CarbonTablePath.CARBON_DATA_EXT)) {
         val split = new CarbonInputSplit("null",
-          new Path(new URI(file.filePath)),
+          new Path(new URI(file.filePath)).toString,
           file.start,
           file.length,
           file.locations,
@@ -394,10 +394,10 @@ class SparkCarbonFileFormat extends FileFormat
         split.setDetailInfo(info)
         info.setBlockSize(file.length)
         // Read the footer offset and set.
-        val reader = FileFactory.getFileHolder(FileFactory.getFileType(split.getPath.toString),
+        val reader = FileFactory.getFileHolder(FileFactory.getFileType(split.getFilePath),
           broadcastedHadoopConf.value.value)
         val buffer = reader
-          .readByteBuffer(FileFactory.getUpdatedFilePath(split.getPath.toString),
+          .readByteBuffer(FileFactory.getUpdatedFilePath(split.getFilePath),
             file.length - 8,
             8)
         info.setBlockFooterOffset(buffer.getLong)
diff --git a/integration/spark2/src/test/scala/org/apache/carbondata/datamap/bloom/BloomCoarseGrainDataMapFunctionSuite.scala b/integration/spark2/src/test/scala/org/apache/carbondata/datamap/bloom/BloomCoarseGrainDataMapFunctionSuite.scala
index f6e5eab..4010ef6 100644
--- a/integration/spark2/src/test/scala/org/apache/carbondata/datamap/bloom/BloomCoarseGrainDataMapFunctionSuite.scala
+++ b/integration/spark2/src/test/scala/org/apache/carbondata/datamap/bloom/BloomCoarseGrainDataMapFunctionSuite.scala
@@ -7,7 +7,7 @@ import org.apache.commons.io.FileUtils
 import org.apache.spark.sql.{CarbonEnv, SaveMode}
 import org.apache.spark.sql.test.Spark2TestQueryExecutor
 import org.apache.spark.sql.test.util.QueryTest
-import org.scalatest.{BeforeAndAfterAll, BeforeAndAfterEach}
+import org.scalatest.{BeforeAndAfterAll, BeforeAndAfterEach, Ignore}
 
 import org.apache.carbondata.core.constants.{CarbonCommonConstants, CarbonV3DataFormatConstants}
 import org.apache.carbondata.core.datamap.status.DataMapStatusManager


[carbondata] 25/41: [CARBONDATA-3302] [Spark-Integration] code cleaning related to CarbonCreateTable command

Posted by ra...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.5
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit 15f13ad3d7b0b1fedcb460f1bdb8f48eddd27003
Author: s71955 <su...@gmail.com>
AuthorDate: Sun Feb 24 21:45:16 2019 +0530

    [CARBONDATA-3302] [Spark-Integration] code cleaning related to CarbonCreateTable command
    
    What changes were proposed in this pull request?
    Removed the duplicated null check on the streaming table property. The condition was also
    reordered: it previously validated whether the path belongs to the s3 file system before
    checking whether the streaming property is set, but the streaming check should come first,
    since the whole condition only needs to be evaluated for stream tables.
    
    This closes #3134
---
 .../spark/sql/execution/command/table/CarbonCreateTableCommand.scala   | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/table/CarbonCreateTableCommand.scala b/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/table/CarbonCreateTableCommand.scala
index 12eb420..1e17ffe 100644
--- a/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/table/CarbonCreateTableCommand.scala
+++ b/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/table/CarbonCreateTableCommand.scala
@@ -78,8 +78,7 @@ case class CarbonCreateTableCommand(
         path
       }
       val streaming = tableInfo.getFactTable.getTableProperties.get("streaming")
-      if (path.startsWith("s3") && streaming != null && streaming != null &&
-          streaming.equalsIgnoreCase("true")) {
+      if (streaming != null && streaming.equalsIgnoreCase("true") && path.startsWith("s3")) {
         throw new UnsupportedOperationException("streaming is not supported with s3 store")
       }
       tableInfo.setTablePath(tablePath)


[carbondata] 22/41: [CARBONDATA-3317] Fix NPE when execute 'show segments' command for stream table

Posted by ra...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.5
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit 6975346f5857732b9530b33407c7f9f702f363c0
Author: Zhang Zhichao <44...@qq.com>
AuthorDate: Sat Mar 16 23:49:24 2019 +0800

    [CARBONDATA-3317] Fix NPE when execute 'show segments' command for stream table
    
    When a Spark streaming app starts creating a new stream segment, the carbonindex file is not created until data has been written successfully. If the 'show segments' command is executed at that point, it throws an NPE.
    
    This closes #3149
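    
    As a rough illustration of the scenario (table name and schema are made up for the example,
    following the sql(...) helper style of the test suites in this patch set), with this fix an
    in-progress stream segment that has no index file yet is reported with -1 sizes instead of
    failing with an NPE:
    
        // hypothetical streaming table, for illustration only
        sql("CREATE TABLE stream_src(id INT, name STRING) STORED BY 'carbondata' " +
          "TBLPROPERTIES('streaming'='true')")
        // while a streaming job is still writing its first stream segment, no carbonindex
        // file exists yet in the segment directory
        sql("SHOW SEGMENTS FOR TABLE stream_src").show(false)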
---
 .../src/main/scala/org/apache/carbondata/api/CarbonStore.scala | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/integration/spark-common/src/main/scala/org/apache/carbondata/api/CarbonStore.scala b/integration/spark-common/src/main/scala/org/apache/carbondata/api/CarbonStore.scala
index 11db430..f5e429e 100644
--- a/integration/spark-common/src/main/scala/org/apache/carbondata/api/CarbonStore.scala
+++ b/integration/spark-common/src/main/scala/org/apache/carbondata/api/CarbonStore.scala
@@ -103,8 +103,14 @@ object CarbonStore {
             // since it is continuously inserting data
             val segmentDir = CarbonTablePath.getSegmentPath(tablePath, load.getLoadName)
             val indexPath = CarbonTablePath.getCarbonStreamIndexFilePath(segmentDir)
-            val indices = StreamSegment.readIndexFile(indexPath, FileFactory.getFileType(indexPath))
-            (indices.asScala.map(_.getFile_size).sum, FileFactory.getCarbonFile(indexPath).getSize)
+            val indexFile = FileFactory.getCarbonFile(indexPath)
+            if (indexFile.exists()) {
+              val indices =
+                StreamSegment.readIndexFile(indexPath, FileFactory.getFileType(indexPath))
+              (indices.asScala.map(_.getFile_size).sum, indexFile.getSize)
+            } else {
+              (-1L, -1L)
+            }
           } else {
             // for batch segment, we can get the data size from table status file directly
             (if (load.getDataSize == null) -1L else load.getDataSize.toLong,


[carbondata] 10/41: [CARBONDATA-3305] Added DDL to drop cache for a table

Posted by ra...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.5
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit 4eefe52147462c1b672951d063e2be22278ebf40
Author: namanrastogi <na...@gmail.com>
AuthorDate: Wed Feb 27 19:45:18 2019 +0530

    [CARBONDATA-3305] Added DDL to drop cache for a table
    
    Added CarbonDropCacheCommand to drop all cache entries for a particular table.
    
    usage: DROP METACACHE ON TABLE tableName
    Dropping the cache for a child table is not supported; the table has to be a parent table.
    Running the above command clears all cache entries belonging to the table:
    its index entries, its datamap entries, and its forward and reverse dictionaries.
    
    This closes #3138
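    
    A minimal spark-shell style sketch of the new DDL (database and table names are illustrative
    only, following the sql(...) helper style of the test suite added below):
    
        // warm the driver cache, then inspect it and drop it
        sql("SELECT * FROM cache_db.t1").collect()
        sql("SHOW METACACHE ON TABLE cache_db.t1").show(false)
        sql("DROP METACACHE ON TABLE cache_db.t1")
        // running DROP METACACHE on a child table (e.g. a preaggregate datamap table)
        // is rejected with "Operation not allowed on child table."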
---
 .../carbondata/core/cache/CarbonLRUCache.java      |  23 ++-
 docs/ddl-of-carbondata.md                          |  13 +-
 .../sql/commands/TestCarbonDropCacheCommand.scala  | 200 +++++++++++++++++++++
 .../sql/commands/TestCarbonShowCacheCommand.scala  |   2 +-
 .../apache/carbondata/events/DropCacheEvents.scala |  28 +++
 .../org/apache/carbondata/events/Events.scala      |   7 +
 .../scala/org/apache/spark/sql/CarbonEnv.scala     |   2 +
 .../command/cache/CarbonDropCacheCommand.scala     | 103 +++++++++++
 .../command/cache/CarbonShowCacheCommand.scala     |  56 +++---
 .../cache/DropCachePreAggEventListener.scala       |  70 ++++++++
 .../spark/sql/parser/CarbonSpark2SqlParser.scala   |  10 +-
 11 files changed, 473 insertions(+), 41 deletions(-)

diff --git a/core/src/main/java/org/apache/carbondata/core/cache/CarbonLRUCache.java b/core/src/main/java/org/apache/carbondata/core/cache/CarbonLRUCache.java
index 74ff8a0..0c75173 100644
--- a/core/src/main/java/org/apache/carbondata/core/cache/CarbonLRUCache.java
+++ b/core/src/main/java/org/apache/carbondata/core/cache/CarbonLRUCache.java
@@ -62,10 +62,13 @@ public final class CarbonLRUCache {
    */
   public CarbonLRUCache(String propertyName, String defaultPropertyName) {
     try {
-      lruCacheMemorySize = Integer
-          .parseInt(CarbonProperties.getInstance().getProperty(propertyName, defaultPropertyName));
+      lruCacheMemorySize = Long
+          .parseLong(CarbonProperties.getInstance().getProperty(propertyName, defaultPropertyName));
     } catch (NumberFormatException e) {
-      lruCacheMemorySize = Integer.parseInt(defaultPropertyName);
+      LOGGER.error(CarbonCommonConstants.CARBON_MAX_DRIVER_LRU_CACHE_SIZE
+          + " is not in a valid format. Falling back to default value: "
+          + CarbonCommonConstants.CARBON_MAX_LRU_CACHE_SIZE_DEFAULT);
+      lruCacheMemorySize = Long.parseLong(defaultPropertyName);
     }
     initCache();
     if (lruCacheMemorySize > 0) {
@@ -149,6 +152,17 @@ public final class CarbonLRUCache {
   }
 
   /**
+   * @param keys
+   */
+  public void removeAll(List<String> keys) {
+    synchronized (lruCacheMap) {
+      for (String key : keys) {
+        removeKey(key);
+      }
+    }
+  }
+
+  /**
    * This method will remove the key from lru cache
    *
    * @param key
@@ -302,6 +316,9 @@ public final class CarbonLRUCache {
    */
   public void clear() {
     synchronized (lruCacheMap) {
+      for (Cacheable cachebleObj : lruCacheMap.values()) {
+        cachebleObj.invalidate();
+      }
       lruCacheMap.clear();
     }
   }
diff --git a/docs/ddl-of-carbondata.md b/docs/ddl-of-carbondata.md
index 3476475..e6f209e 100644
--- a/docs/ddl-of-carbondata.md
+++ b/docs/ddl-of-carbondata.md
@@ -1095,7 +1095,7 @@ Users can specify which columns to include and exclude for local dictionary gene
   about current cache used status in memory through the following command:
 
   ```sql
-  SHOW METADATA
+  SHOW METACACHE
   ``` 
   
   This shows the overall memory consumed in the cache by categories - index files, dictionary and 
@@ -1103,10 +1103,19 @@ Users can specify which columns to include and exclude for local dictionary gene
   database.
   
   ```sql
-  SHOW METADATA ON TABLE tableName
+  SHOW METACACHE ON TABLE tableName
   ```
   
   This shows detailed information on cache usage by the table `tableName` and its carbonindex files, 
   its dictionary files, its datamaps and children tables.
   
   This command is not allowed on child tables.
+
+  ```sql
+    DROP METACACHE ON TABLE tableName
+   ```
+    
+  This clears any entry in cache by the table `tableName`, its carbonindex files, 
+  its dictionary files, its datamaps and children tables.
+    
+  This command is not allowed on child tables.
diff --git a/integration/spark-common-test/src/test/scala/org/apache/carbondata/sql/commands/TestCarbonDropCacheCommand.scala b/integration/spark-common-test/src/test/scala/org/apache/carbondata/sql/commands/TestCarbonDropCacheCommand.scala
new file mode 100644
index 0000000..982ec76
--- /dev/null
+++ b/integration/spark-common-test/src/test/scala/org/apache/carbondata/sql/commands/TestCarbonDropCacheCommand.scala
@@ -0,0 +1,200 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.sql.commands
+
+import java.util
+
+import scala.collection.JavaConverters._
+
+import org.apache.spark.sql.CarbonEnv
+import org.apache.spark.sql.catalyst.TableIdentifier
+import org.apache.spark.sql.catalyst.analysis.NoSuchTableException
+import org.apache.spark.sql.test.util.QueryTest
+import org.scalatest.BeforeAndAfterAll
+import org.apache.carbondata.core.cache.CacheProvider
+import org.apache.carbondata.core.constants.CarbonCommonConstants
+
+class TestCarbonDropCacheCommand extends QueryTest with BeforeAndAfterAll {
+
+  val dbName = "cache_db"
+
+  override protected def beforeAll(): Unit = {
+    sql(s"DROP DATABASE IF EXISTS $dbName CASCADE")
+    sql(s"CREATE DATABASE $dbName")
+    sql(s"USE $dbName")
+  }
+
+  override protected def afterAll(): Unit = {
+    sql(s"use default")
+    sql(s"DROP DATABASE $dbName CASCADE")
+  }
+
+
+  test("Test dictionary") {
+    val tableName = "t1"
+
+    sql(s"CREATE TABLE $tableName(empno int, empname String, designation String, " +
+        s"doj Timestamp, workgroupcategory int, workgroupcategoryname String, deptno int, " +
+        s"deptname String, projectcode int, projectjoindate Timestamp, projectenddate Timestamp," +
+        s"attendance int,utilization int, salary int) stored by 'carbondata' " +
+        s"TBLPROPERTIES('DICTIONARY_INCLUDE'='designation, workgroupcategoryname')")
+    sql(s"LOAD DATA INPATH '$resourcesPath/data.csv' INTO TABLE $tableName")
+    sql(s"LOAD DATA INPATH '$resourcesPath/data.csv' INTO TABLE $tableName")
+    sql(s"SELECT * FROM $tableName").collect()
+
+    val droppedCacheKeys = clone(CacheProvider.getInstance().getCarbonCache.getCacheMap.keySet())
+
+    sql(s"DROP METACACHE ON TABLE $tableName")
+
+    val cacheAfterDrop = clone(CacheProvider.getInstance().getCarbonCache.getCacheMap.keySet())
+    droppedCacheKeys.removeAll(cacheAfterDrop)
+
+    val tableIdentifier = new TableIdentifier(tableName, Some(dbName))
+    val carbonTable = CarbonEnv.getCarbonTable(tableIdentifier)(sqlContext.sparkSession)
+    val tablePath = carbonTable.getTablePath + CarbonCommonConstants.FILE_SEPARATOR
+    val dictIds = carbonTable.getAllDimensions.asScala.filter(_.isGlobalDictionaryEncoding)
+      .map(_.getColumnId).toArray
+
+    // Check if table index entries are dropped
+    assert(droppedCacheKeys.asScala.exists(key => key.startsWith(tablePath)))
+
+    // check if cache does not have any more table index entries
+    assert(!cacheAfterDrop.asScala.exists(key => key.startsWith(tablePath)))
+
+    // check if table dictionary entries are dropped
+    for (dictId <- dictIds) {
+      assert(droppedCacheKeys.asScala.exists(key => key.contains(dictId)))
+    }
+
+    // check if cache does not have any more table dictionary entries
+    for (dictId <- dictIds) {
+      assert(!cacheAfterDrop.asScala.exists(key => key.contains(dictId)))
+    }
+  }
+
+
+  test("Test preaggregate datamap") {
+    val tableName = "t2"
+
+    sql(s"CREATE TABLE $tableName(empno int, empname String, designation String, " +
+        s"doj Timestamp, workgroupcategory int, workgroupcategoryname String, deptno int, " +
+        s"deptname String, projectcode int, projectjoindate Timestamp, projectenddate Timestamp," +
+        s"attendance int, utilization int, salary int) stored by 'carbondata'")
+    sql(s"CREATE DATAMAP dpagg ON TABLE $tableName USING 'preaggregate' AS " +
+        s"SELECT AVG(salary), workgroupcategoryname from $tableName GROUP BY workgroupcategoryname")
+    sql(s"LOAD DATA INPATH '$resourcesPath/data.csv' INTO TABLE $tableName")
+    sql(s"SELECT * FROM $tableName").collect()
+    sql(s"SELECT AVG(salary), workgroupcategoryname from $tableName " +
+        s"GROUP BY workgroupcategoryname").collect()
+    val droppedCacheKeys = clone(CacheProvider.getInstance().getCarbonCache.getCacheMap.keySet())
+
+    sql(s"DROP METACACHE ON TABLE $tableName")
+
+    val cacheAfterDrop = clone(CacheProvider.getInstance().getCarbonCache.getCacheMap.keySet())
+    droppedCacheKeys.removeAll(cacheAfterDrop)
+
+    val tableIdentifier = new TableIdentifier(tableName, Some(dbName))
+    val carbonTable = CarbonEnv.getCarbonTable(tableIdentifier)(sqlContext.sparkSession)
+    val dbPath = CarbonEnv
+      .getDatabaseLocation(tableIdentifier.database.get, sqlContext.sparkSession)
+    val tablePath = carbonTable.getTablePath
+    val preaggPath = dbPath + CarbonCommonConstants.FILE_SEPARATOR + carbonTable.getTableName +
+                     "_" + carbonTable.getTableInfo.getDataMapSchemaList.get(0).getDataMapName +
+                     CarbonCommonConstants.FILE_SEPARATOR
+
+    // Check if table index entries are dropped
+    assert(droppedCacheKeys.asScala.exists(key => key.startsWith(tablePath)))
+
+    // check if cache does not have any more table index entries
+    assert(!cacheAfterDrop.asScala.exists(key => key.startsWith(tablePath)))
+
+    // Check if preaggregate index entries are dropped
+    assert(droppedCacheKeys.asScala.exists(key => key.startsWith(preaggPath)))
+
+    // check if cache does not have any more preaggregate index entries
+    assert(!cacheAfterDrop.asScala.exists(key => key.startsWith(preaggPath)))
+  }
+
+
+  test("Test bloom filter") {
+    val tableName = "t3"
+
+    sql(s"CREATE TABLE $tableName(empno int, empname String, designation String, " +
+        s"doj Timestamp, workgroupcategory int, workgroupcategoryname String, deptno int, " +
+        s"deptname String, projectcode int, projectjoindate Timestamp, projectenddate Timestamp," +
+        s"attendance int, utilization int, salary int) stored by 'carbondata'")
+    sql(s"CREATE DATAMAP dblom ON TABLE $tableName USING 'bloomfilter' " +
+        "DMPROPERTIES('INDEX_COLUMNS'='deptno')")
+    sql(s"LOAD DATA INPATH '$resourcesPath/data.csv' INTO TABLE $tableName")
+    sql(s"SELECT * FROM $tableName").collect()
+    sql(s"SELECT * FROM $tableName WHERE deptno=10").collect()
+
+    val droppedCacheKeys = clone(CacheProvider.getInstance().getCarbonCache.getCacheMap.keySet())
+
+    sql(s"DROP METACACHE ON TABLE $tableName")
+
+    val cacheAfterDrop = clone(CacheProvider.getInstance().getCarbonCache.getCacheMap.keySet())
+    droppedCacheKeys.removeAll(cacheAfterDrop)
+
+    val tableIdentifier = new TableIdentifier(tableName, Some(dbName))
+    val carbonTable = CarbonEnv.getCarbonTable(tableIdentifier)(sqlContext.sparkSession)
+    val tablePath = carbonTable.getTablePath
+    val bloomPath = tablePath + CarbonCommonConstants.FILE_SEPARATOR + "dblom" +
+                    CarbonCommonConstants.FILE_SEPARATOR
+
+    // Check if table index entries are dropped
+    assert(droppedCacheKeys.asScala.exists(key => key.startsWith(tablePath)))
+
+    // check if cache does not have any more table index entries
+    assert(!cacheAfterDrop.asScala.exists(key => key.startsWith(tablePath)))
+
+    // Check if bloom entries are dropped
+    assert(droppedCacheKeys.asScala.exists(key => key.contains(bloomPath)))
+
+    // check if cache does not have any more bloom entries
+    assert(!cacheAfterDrop.asScala.exists(key => key.contains(bloomPath)))
+  }
+
+
+  test("Test preaggregate datamap fail") {
+    val tableName = "t4"
+
+    sql(s"CREATE TABLE $tableName(empno int, empname String, designation String, " +
+        s"doj Timestamp, workgroupcategory int, workgroupcategoryname String, deptno int, " +
+        s"deptname String, projectcode int, projectjoindate Timestamp, projectenddate Timestamp," +
+        s"attendance int, utilization int, salary int) stored by 'carbondata'")
+    sql(s"CREATE DATAMAP dpagg ON TABLE $tableName USING 'preaggregate' AS " +
+        s"SELECT AVG(salary), workgroupcategoryname from $tableName GROUP BY workgroupcategoryname")
+    sql(s"LOAD DATA INPATH '$resourcesPath/data.csv' INTO TABLE $tableName")
+    sql(s"SELECT * FROM $tableName").collect()
+    sql(s"SELECT AVG(salary), workgroupcategoryname from $tableName " +
+        s"GROUP BY workgroupcategoryname").collect()
+
+    val fail_message = intercept[UnsupportedOperationException] {
+      sql(s"DROP METACACHE ON TABLE ${tableName}_dpagg")
+    }.getMessage
+    assert(fail_message.contains("Operation not allowed on child table."))
+  }
+
+
+  def clone(oldSet: util.Set[String]): util.HashSet[String] = {
+    val newSet = new util.HashSet[String]
+    newSet.addAll(oldSet)
+    newSet
+  }
+}
diff --git a/integration/spark-common-test/src/test/scala/org/apache/carbondata/sql/commands/TestCarbonShowCacheCommand.scala b/integration/spark-common-test/src/test/scala/org/apache/carbondata/sql/commands/TestCarbonShowCacheCommand.scala
index 0e1cd00..e999fc7 100644
--- a/integration/spark-common-test/src/test/scala/org/apache/carbondata/sql/commands/TestCarbonShowCacheCommand.scala
+++ b/integration/spark-common-test/src/test/scala/org/apache/carbondata/sql/commands/TestCarbonShowCacheCommand.scala
@@ -128,7 +128,7 @@ class TestCarbonShowCacheCommand extends QueryTest with BeforeAndAfterAll {
     sql("use cache_empty_db").collect()
     val result1 = sql("show metacache").collect()
     assertResult(2)(result1.length)
-    assertResult(Row("cache_empty_db", "ALL", "0", "0", "0"))(result1(1))
+    assertResult(Row("cache_empty_db", "ALL", "0 bytes", "0 bytes", "0 bytes"))(result1(1))
 
     sql("use cache_db").collect()
     val result2 = sql("show metacache").collect()
diff --git a/integration/spark-common/src/main/scala/org/apache/carbondata/events/DropCacheEvents.scala b/integration/spark-common/src/main/scala/org/apache/carbondata/events/DropCacheEvents.scala
new file mode 100644
index 0000000..2e8b78e
--- /dev/null
+++ b/integration/spark-common/src/main/scala/org/apache/carbondata/events/DropCacheEvents.scala
@@ -0,0 +1,28 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.events
+
+import org.apache.spark.sql.SparkSession
+
+import org.apache.carbondata.core.metadata.schema.table.CarbonTable
+
+case class DropCacheEvent(
+    carbonTable: CarbonTable,
+    sparkSession: SparkSession,
+    internalCall: Boolean)
+  extends Event with DropCacheEventInfo
diff --git a/integration/spark-common/src/main/scala/org/apache/carbondata/events/Events.scala b/integration/spark-common/src/main/scala/org/apache/carbondata/events/Events.scala
index 1830a35..c03d3c6 100644
--- a/integration/spark-common/src/main/scala/org/apache/carbondata/events/Events.scala
+++ b/integration/spark-common/src/main/scala/org/apache/carbondata/events/Events.scala
@@ -63,6 +63,13 @@ trait DropTableEventInfo {
 }
 
 /**
+ * event for drop cache
+ */
+trait DropCacheEventInfo {
+  val carbonTable: CarbonTable
+}
+
+/**
  * event for alter_table_drop_column
  */
 trait AlterTableDropColumnEventInfo {
diff --git a/integration/spark2/src/main/scala/org/apache/spark/sql/CarbonEnv.scala b/integration/spark2/src/main/scala/org/apache/spark/sql/CarbonEnv.scala
index a7677d7..60d896a 100644
--- a/integration/spark2/src/main/scala/org/apache/spark/sql/CarbonEnv.scala
+++ b/integration/spark2/src/main/scala/org/apache/spark/sql/CarbonEnv.scala
@@ -23,6 +23,7 @@ import org.apache.spark.sql.catalyst.TableIdentifier
 import org.apache.spark.sql.catalyst.analysis.NoSuchTableException
 import org.apache.spark.sql.catalyst.catalog.SessionCatalog
 import org.apache.spark.sql.events.{MergeBloomIndexEventListener, MergeIndexEventListener}
+import org.apache.spark.sql.execution.command.cache.DropCachePreAggEventListener
 import org.apache.spark.sql.execution.command.preaaggregate._
 import org.apache.spark.sql.execution.command.timeseries.TimeSeriesFunction
 import org.apache.spark.sql.hive._
@@ -185,6 +186,7 @@ object CarbonEnv {
       .addListener(classOf[AlterTableCompactionPostEvent], new MergeIndexEventListener)
       .addListener(classOf[AlterTableMergeIndexEvent], new MergeIndexEventListener)
       .addListener(classOf[BuildDataMapPostExecutionEvent], new MergeBloomIndexEventListener)
+      .addListener(classOf[DropCacheEvent], DropCachePreAggEventListener)
   }
 
   /**
diff --git a/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/cache/CarbonDropCacheCommand.scala b/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/cache/CarbonDropCacheCommand.scala
new file mode 100644
index 0000000..e955ed9
--- /dev/null
+++ b/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/cache/CarbonDropCacheCommand.scala
@@ -0,0 +1,103 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.command.cache
+
+import scala.collection.JavaConverters._
+import scala.collection.mutable.ListBuffer
+
+import org.apache.spark.sql.{CarbonEnv, Row, SparkSession}
+import org.apache.spark.sql.catalyst.TableIdentifier
+import org.apache.spark.sql.execution.command.MetadataCommand
+
+import org.apache.carbondata.common.logging.LogServiceFactory
+import org.apache.carbondata.core.cache.CacheProvider
+import org.apache.carbondata.core.cache.dictionary.AbstractColumnDictionaryInfo
+import org.apache.carbondata.core.constants.CarbonCommonConstants
+import org.apache.carbondata.core.indexstore.BlockletDataMapIndexWrapper
+import org.apache.carbondata.core.metadata.schema.table.CarbonTable
+import org.apache.carbondata.datamap.bloom.BloomCacheKeyValue
+import org.apache.carbondata.events.{DropCacheEvent, OperationContext, OperationListenerBus}
+
+case class CarbonDropCacheCommand(tableIdentifier: TableIdentifier, internalCall: Boolean = false)
+  extends MetadataCommand {
+
+  val LOGGER = LogServiceFactory.getLogService(this.getClass.getCanonicalName)
+
+  override def processMetadata(sparkSession: SparkSession): Seq[Row] = {
+    val carbonTable = CarbonEnv.getCarbonTable(tableIdentifier)(sparkSession)
+    clearCache(carbonTable, sparkSession)
+    Seq.empty
+  }
+
+  def clearCache(carbonTable: CarbonTable, sparkSession: SparkSession): Unit = {
+    LOGGER.info("Drop cache request received for table " + carbonTable.getTableName)
+
+    val dropCacheEvent = DropCacheEvent(
+      carbonTable,
+      sparkSession,
+      internalCall
+    )
+    val operationContext = new OperationContext
+    OperationListenerBus.getInstance.fireEvent(dropCacheEvent, operationContext)
+
+    val cache = CacheProvider.getInstance().getCarbonCache
+    if (cache != null) {
+      val tablePath = carbonTable.getTablePath + CarbonCommonConstants.FILE_SEPARATOR
+
+      // Dictionary IDs
+      val dictIds = carbonTable.getAllDimensions.asScala.filter(_.isGlobalDictionaryEncoding)
+        .map(_.getColumnId).toArray
+
+      // Remove elements from cache
+      val keysToRemove = ListBuffer[String]()
+      val cacheIterator = cache.getCacheMap.entrySet().iterator()
+      while (cacheIterator.hasNext) {
+        val entry = cacheIterator.next()
+        val cache = entry.getValue
+
+        if (cache.isInstanceOf[BlockletDataMapIndexWrapper]) {
+          // index
+          val indexPath = entry.getKey.replace(CarbonCommonConstants.WINDOWS_FILE_SEPARATOR,
+            CarbonCommonConstants.FILE_SEPARATOR)
+          if (indexPath.startsWith(tablePath)) {
+            keysToRemove += entry.getKey
+          }
+        } else if (cache.isInstanceOf[BloomCacheKeyValue.CacheValue]) {
+          // bloom datamap
+          val shardPath = entry.getKey.replace(CarbonCommonConstants.WINDOWS_FILE_SEPARATOR,
+            CarbonCommonConstants.FILE_SEPARATOR)
+          if (shardPath.contains(tablePath)) {
+            keysToRemove += entry.getKey
+          }
+        } else if (cache.isInstanceOf[AbstractColumnDictionaryInfo]) {
+          // dictionary
+          val dictId = dictIds.find(id => entry.getKey.startsWith(id))
+          if (dictId.isDefined) {
+            keysToRemove += entry.getKey
+          }
+        }
+      }
+      cache.removeAll(keysToRemove.asJava)
+    }
+
+    LOGGER.info("Drop cache request received for table " + carbonTable.getTableName)
+  }
+
+  override protected def opName: String = "DROP CACHE"
+
+}
diff --git a/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/cache/CarbonShowCacheCommand.scala b/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/cache/CarbonShowCacheCommand.scala
index e937c32..e5f89d8 100644
--- a/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/cache/CarbonShowCacheCommand.scala
+++ b/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/cache/CarbonShowCacheCommand.scala
@@ -66,29 +66,24 @@ case class CarbonShowCacheCommand(tableIdentifier: Option[TableIdentifier])
     val currentDatabase = sparkSession.sessionState.catalog.getCurrentDatabase
     val cache = CacheProvider.getInstance().getCarbonCache()
     if (cache == null) {
-      Seq(Row("ALL", "ALL", 0L, 0L, 0L),
-        Row(currentDatabase, "ALL", 0L, 0L, 0L))
+      Seq(
+        Row("ALL", "ALL", byteCountToDisplaySize(0L),
+          byteCountToDisplaySize(0L), byteCountToDisplaySize(0L)),
+        Row(currentDatabase, "ALL", byteCountToDisplaySize(0L),
+          byteCountToDisplaySize(0L), byteCountToDisplaySize(0L)))
     } else {
-      val tableIdents = sparkSession.sessionState.catalog.listTables(currentDatabase).toArray
-      val dbLocation = CarbonEnv.getDatabaseLocation(currentDatabase, sparkSession)
-      val tempLocation = dbLocation.replace(
-        CarbonCommonConstants.WINDOWS_FILE_SEPARATOR, CarbonCommonConstants.FILE_SEPARATOR)
-      val tablePaths = tableIdents.map { tableIdent =>
-        (tempLocation + CarbonCommonConstants.FILE_SEPARATOR +
-         tableIdent.table + CarbonCommonConstants.FILE_SEPARATOR,
-          CarbonEnv.getDatabaseName(tableIdent.database)(sparkSession) + "." + tableIdent.table)
+      val carbonTables = CarbonEnv.getInstance(sparkSession).carbonMetaStore
+        .listAllTables(sparkSession)
+        .filter { table =>
+        table.getDatabaseName.equalsIgnoreCase(currentDatabase)
+      }
+      val tablePaths = carbonTables
+        .map { table =>
+          (table.getTablePath + CarbonCommonConstants.FILE_SEPARATOR,
+            table.getDatabaseName + "." + table.getTableName)
       }
 
-      val dictIds = tableIdents
-        .map { tableIdent =>
-          var table: CarbonTable = null
-          try {
-            table = CarbonEnv.getCarbonTable(tableIdent)(sparkSession)
-          } catch {
-            case _ =>
-          }
-          table
-        }
+      val dictIds = carbonTables
         .filter(_ != null)
         .flatMap { table =>
           table
@@ -159,7 +154,8 @@ case class CarbonShowCacheCommand(tableIdentifier: Option[TableIdentifier])
         Seq(
           Row("ALL", "ALL", byteCountToDisplaySize(allIndexSize),
             byteCountToDisplaySize(allDatamapSize), byteCountToDisplaySize(allDictSize)),
-          Row(currentDatabase, "ALL", "0", "0", "0"))
+          Row(currentDatabase, "ALL", byteCountToDisplaySize(0),
+            byteCountToDisplaySize(0), byteCountToDisplaySize(0)))
       } else {
         val tableList = tableMapIndexSize
           .map(_._1)
@@ -187,17 +183,11 @@ case class CarbonShowCacheCommand(tableIdentifier: Option[TableIdentifier])
   }
 
   def showTableCache(sparkSession: SparkSession, carbonTable: CarbonTable): Seq[Row] = {
-    val tableName = carbonTable.getTableName
-    val databaseName = carbonTable.getDatabaseName
     val cache = CacheProvider.getInstance().getCarbonCache()
     if (cache == null) {
       Seq.empty
     } else {
-      val dbLocation = CarbonEnv
-        .getDatabaseLocation(databaseName, sparkSession)
-        .replace(CarbonCommonConstants.WINDOWS_FILE_SEPARATOR, CarbonCommonConstants.FILE_SEPARATOR)
-      val tablePath = dbLocation + CarbonCommonConstants.FILE_SEPARATOR +
-                      tableName + CarbonCommonConstants.FILE_SEPARATOR
+      val tablePath = carbonTable.getTablePath + CarbonCommonConstants.FILE_SEPARATOR
       var numIndexFilesCached = 0
 
       // Path -> Name, Type
@@ -209,8 +199,10 @@ case class CarbonShowCacheCommand(tableIdentifier: Option[TableIdentifier])
       datamapSize.put(tablePath, 0)
       // children tables
       for( schema <- carbonTable.getTableInfo.getDataMapSchemaList.asScala ) {
-        val path = dbLocation + CarbonCommonConstants.FILE_SEPARATOR + tableName + "_" +
-                   schema.getDataMapName + CarbonCommonConstants.FILE_SEPARATOR
+        val childTableName = carbonTable.getTableName + "_" + schema.getDataMapName
+        val childTable = CarbonEnv
+          .getCarbonTable(Some(carbonTable.getDatabaseName), childTableName)(sparkSession)
+        val path = childTable.getTablePath + CarbonCommonConstants.FILE_SEPARATOR
         val name = schema.getDataMapName
         val dmType = schema.getProviderName
         datamapName.put(path, (name, dmType))
@@ -219,9 +211,7 @@ case class CarbonShowCacheCommand(tableIdentifier: Option[TableIdentifier])
       // index schemas
       for (schema <- DataMapStoreManager.getInstance().getDataMapSchemasOfTable(carbonTable)
         .asScala) {
-        val path = dbLocation + CarbonCommonConstants.FILE_SEPARATOR + tableName +
-                   CarbonCommonConstants.FILE_SEPARATOR + schema.getDataMapName +
-                   CarbonCommonConstants.FILE_SEPARATOR
+        val path = tablePath + schema.getDataMapName + CarbonCommonConstants.FILE_SEPARATOR
         val name = schema.getDataMapName
         val dmType = schema.getProviderName
         datamapName.put(path, (name, dmType))
diff --git a/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/cache/DropCachePreAggEventListener.scala b/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/cache/DropCachePreAggEventListener.scala
new file mode 100644
index 0000000..3d03c60
--- /dev/null
+++ b/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/cache/DropCachePreAggEventListener.scala
@@ -0,0 +1,70 @@
+/*
+* Licensed to the Apache Software Foundation (ASF) under one or more
+* contributor license agreements.  See the NOTICE file distributed with
+* this work for additional information regarding copyright ownership.
+* The ASF licenses this file to You under the Apache License, Version 2.0
+* (the "License"); you may not use this file except in compliance with
+* the License.  You may obtain a copy of the License at
+*
+*    http://www.apache.org/licenses/LICENSE-2.0
+*
+* Unless required by applicable law or agreed to in writing, software
+* distributed under the License is distributed on an "AS IS" BASIS,
+* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+* See the License for the specific language governing permissions and
+* limitations under the License.
+*/
+
+package org.apache.spark.sql.execution.command.cache
+
+import scala.collection.JavaConverters._
+
+import org.apache.spark.internal.Logging
+import org.apache.spark.sql.CarbonEnv
+import org.apache.spark.sql.catalyst.TableIdentifier
+
+import org.apache.carbondata.common.logging.LogServiceFactory
+import org.apache.carbondata.events.{DropCacheEvent, Event, OperationContext,
+  OperationEventListener}
+
+object DropCachePreAggEventListener extends OperationEventListener {
+
+  val LOGGER = LogServiceFactory.getLogService(this.getClass.getCanonicalName)
+
+  /**
+   * Called on a specified event occurrence
+   *
+   * @param event
+   * @param operationContext
+   */
+  override protected def onEvent(event: Event,
+      operationContext: OperationContext): Unit = {
+
+    event match {
+      case dropCacheEvent: DropCacheEvent =>
+        val carbonTable = dropCacheEvent.carbonTable
+        val sparkSession = dropCacheEvent.sparkSession
+        val internalCall = dropCacheEvent.internalCall
+        if (carbonTable.isChildDataMap && !internalCall) {
+          throw new UnsupportedOperationException("Operation not allowed on child table.")
+        }
+
+        if (carbonTable.hasDataMapSchema) {
+          val childrenSchemas = carbonTable.getTableInfo.getDataMapSchemaList.asScala
+            .filter(_.getRelationIdentifier != null)
+          for (childSchema <- childrenSchemas) {
+            val childTable =
+              CarbonEnv.getCarbonTable(
+                TableIdentifier(childSchema.getRelationIdentifier.getTableName,
+                  Some(childSchema.getRelationIdentifier.getDatabaseName)))(sparkSession)
+            val dropCacheCommandForChildTable =
+              CarbonDropCacheCommand(
+                TableIdentifier(childTable.getTableName, Some(childTable.getDatabaseName)),
+                internalCall = true)
+            dropCacheCommandForChildTable.processMetadata(sparkSession)
+          }
+        }
+    }
+
+  }
+}
diff --git a/integration/spark2/src/main/scala/org/apache/spark/sql/parser/CarbonSpark2SqlParser.scala b/integration/spark2/src/main/scala/org/apache/spark/sql/parser/CarbonSpark2SqlParser.scala
index a2923b8..5f5cc12 100644
--- a/integration/spark2/src/main/scala/org/apache/spark/sql/parser/CarbonSpark2SqlParser.scala
+++ b/integration/spark2/src/main/scala/org/apache/spark/sql/parser/CarbonSpark2SqlParser.scala
@@ -33,7 +33,7 @@ import org.apache.spark.sql.execution.command.table.CarbonCreateTableCommand
 import org.apache.spark.sql.types.StructField
 import org.apache.spark.sql.CarbonExpressions.CarbonUnresolvedRelation
 import org.apache.spark.sql.catalyst.analysis.UnresolvedRelation
-import org.apache.spark.sql.execution.command.cache.CarbonShowCacheCommand
+import org.apache.spark.sql.execution.command.cache.{CarbonDropCacheCommand, CarbonShowCacheCommand}
 import org.apache.spark.sql.execution.command.stream.{CarbonCreateStreamCommand, CarbonDropStreamCommand, CarbonShowStreamsCommand}
 import org.apache.spark.sql.util.CarbonException
 import org.apache.spark.util.CarbonReflectionUtils
@@ -95,7 +95,7 @@ class CarbonSpark2SqlParser extends CarbonDDLSqlParser {
     createStream | dropStream | showStreams
 
   protected lazy val cacheManagement: Parser[LogicalPlan] =
-    showCache
+    showCache | dropCache
 
   protected lazy val alterAddPartition: Parser[LogicalPlan] =
     ALTER ~> TABLE ~> (ident <~ ".").? ~ ident ~ (ADD ~> PARTITION ~>
@@ -503,6 +503,12 @@ class CarbonSpark2SqlParser extends CarbonDDLSqlParser {
         CarbonShowCacheCommand(table)
     }
 
+  protected lazy val dropCache: Parser[LogicalPlan] =
+    DROP ~> METACACHE ~> ontable <~ opt(";") ^^ {
+      case table =>
+        CarbonDropCacheCommand(table)
+    }
+
   protected lazy val cli: Parser[LogicalPlan] =
     (CARBONCLI ~> FOR ~> TABLE) ~> (ident <~ ".").? ~ ident ~
     (OPTIONS ~> "(" ~> commandOptions <~ ")").? <~


[carbondata] 16/41: [CARBONDATA-3300] Fixed ClassNotFoundException when using UDF in spark-shell

Posted by ra...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.5
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit bfdff7ff65a831c5aa5baa5ed9c7ead990bafa40
Author: kunal642 <ku...@gmail.com>
AuthorDate: Fri Feb 22 12:03:42 2019 +0530

    [CARBONDATA-3300] Fixed ClassNotFoundException when using UDF in spark-shell
    
    Analysis:
    When a spark-shell is run, a Scala interpreter session is started, which is the main thread for that shell. This session uses TranslatingClassLoader, so the UDF (referred to in the stacktrace) that is defined gets loaded by TranslatingClassLoader.
    
    When deserialization happens, an ObjectInputStream is created and the application tries to read the object. The ObjectInputStream uses a native method call (sun.misc.VM.latestUserDefinedLoader()) to determine the ClassLoader that will be used to load the class. This native method returns URLClassLoader, which is the parent of the TranslatingClassLoader where the class was loaded.
    Because of this, a ClassNotFoundException is thrown.
    
    Class Loader Hierarchy
    
    ExtClassLoader(head) -> AppClassLoader -> URLClassLoader -> TranslatingClassLoader
    
    This looks like a bug in the Java ObjectInputStream implementation, as suggested by the following post:
    https://stackoverflow.com/questions/1771679/difference-between-threads-context-class-loader-and-normal-classloader
    
    Operation     Thread     Thread ClassLoader     ClassLoader
    Register      Main       Translating            Translating
    Serialize     Main       Translating            Translating
    Deserialize   Thread-1   Translating            URLClassLoader
    
    Solution:
    Use ClassLoaderObjectInputStream to specify the class loader that should be used to load the class.
    
    This closes #3132
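    
    The pattern applied in the three files below boils down to the following sketch (illustrative
    Scala, not the exact CarbonData code):
    
        import java.io.ByteArrayInputStream
        import org.apache.commons.io.input.ClassLoaderObjectInputStream
    
        // deserialize with the thread context class loader (TranslatingClassLoader in
        // spark-shell) instead of the loader chosen by ObjectInputStream's native call
        def deserialize(bytes: Array[Byte]): AnyRef = {
          val ois = new ClassLoaderObjectInputStream(
            Thread.currentThread().getContextClassLoader, new ByteArrayInputStream(bytes))
          try ois.readObject() finally ois.close()
        }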
---
 .../org/apache/carbondata/core/metadata/blocklet/BlockletInfo.java    | 4 +++-
 core/src/main/java/org/apache/carbondata/core/util/CarbonUtil.java    | 4 +++-
 .../java/org/apache/carbondata/core/util/ObjectSerializationUtil.java | 3 ++-
 3 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/core/src/main/java/org/apache/carbondata/core/metadata/blocklet/BlockletInfo.java b/core/src/main/java/org/apache/carbondata/core/metadata/blocklet/BlockletInfo.java
index 88706b1..104ef1a 100644
--- a/core/src/main/java/org/apache/carbondata/core/metadata/blocklet/BlockletInfo.java
+++ b/core/src/main/java/org/apache/carbondata/core/metadata/blocklet/BlockletInfo.java
@@ -31,6 +31,7 @@ import java.util.List;
 import org.apache.carbondata.core.metadata.blocklet.datachunk.DataChunk;
 import org.apache.carbondata.core.metadata.blocklet.index.BlockletIndex;
 
+import org.apache.commons.io.input.ClassLoaderObjectInputStream;
 import org.apache.hadoop.io.Writable;
 
 /**
@@ -261,7 +262,8 @@ public class BlockletInfo implements Serializable, Writable {
 
   private DataChunk deserializeDataChunk(byte[] bytes) throws IOException {
     ByteArrayInputStream stream = new ByteArrayInputStream(bytes);
-    ObjectInputStream inputStream = new ObjectInputStream(stream);
+    ObjectInputStream inputStream =
+        new ClassLoaderObjectInputStream(Thread.currentThread().getContextClassLoader(), stream);
     DataChunk dataChunk = null;
     try {
       dataChunk = (DataChunk) inputStream.readObject();
diff --git a/core/src/main/java/org/apache/carbondata/core/util/CarbonUtil.java b/core/src/main/java/org/apache/carbondata/core/util/CarbonUtil.java
index ffab9c8..d9f69e3 100644
--- a/core/src/main/java/org/apache/carbondata/core/util/CarbonUtil.java
+++ b/core/src/main/java/org/apache/carbondata/core/util/CarbonUtil.java
@@ -100,6 +100,7 @@ import com.google.gson.Gson;
 import com.google.gson.GsonBuilder;
 import org.apache.commons.codec.binary.Base64;
 import org.apache.commons.io.FileUtils;
+import org.apache.commons.io.input.ClassLoaderObjectInputStream;
 import org.apache.commons.lang.ArrayUtils;
 import org.apache.commons.lang.StringUtils;
 import org.apache.commons.lang3.StringEscapeUtils;
@@ -1536,7 +1537,8 @@ public final class CarbonUtil {
     ValueEncoderMeta meta = null;
     try {
       aos = new ByteArrayInputStream(encoderMeta);
-      objStream = new ObjectInputStream(aos);
+      objStream =
+          new ClassLoaderObjectInputStream(Thread.currentThread().getContextClassLoader(), aos);
       meta = (ValueEncoderMeta) objStream.readObject();
     } catch (ClassNotFoundException e) {
       LOGGER.error(e.getMessage(), e);
diff --git a/core/src/main/java/org/apache/carbondata/core/util/ObjectSerializationUtil.java b/core/src/main/java/org/apache/carbondata/core/util/ObjectSerializationUtil.java
index 48c6e65..169a3da 100644
--- a/core/src/main/java/org/apache/carbondata/core/util/ObjectSerializationUtil.java
+++ b/core/src/main/java/org/apache/carbondata/core/util/ObjectSerializationUtil.java
@@ -26,6 +26,7 @@ import java.util.zip.GZIPOutputStream;
 
 import org.apache.carbondata.common.logging.LogServiceFactory;
 
+import org.apache.commons.io.input.ClassLoaderObjectInputStream;
 import org.apache.log4j.Logger;
 
 /**
@@ -94,7 +95,7 @@ public class ObjectSerializationUtil {
     try {
       bais = new ByteArrayInputStream(bytes);
       gis = new GZIPInputStream(bais);
-      ois = new ObjectInputStream(gis);
+      ois = new ClassLoaderObjectInputStream(Thread.currentThread().getContextClassLoader(), gis);
       return ois.readObject();
     } catch (ClassNotFoundException e) {
       throw new IOException("Could not read object", e);


[carbondata] 03/41: [CARBONDATA-3278] Remove duplicate code to get filter string of date/timestamp

Posted by ra...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.5
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit 9e6b544e419c9715e807e91837c0b46b877fd017
Author: Manhua <ke...@qq.com>
AuthorDate: Mon Jan 28 15:55:45 2019 +0800

    [CARBONDATA-3278] Remove duplicate code to get filter string of date/timestamp
    
    Remove the duplicated code that builds the filter string for date/timestamp values and use
    the method `ExpressionResult.getString()` instead.
    
    This closes #3109
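    
    A hedged sketch of the replacement path (illustrative Scala; the literal value and data type
    are arbitrary examples, the actual change is in BloomCoarseGrainDataMap.java below):
    
        import org.apache.carbondata.core.metadata.datatype.DataTypes
        import org.apache.carbondata.core.scan.expression.LiteralExpression
    
        // the DATE/TIMESTAMP literal is carried as a long; getString() applies the same
        // formatting that the removed helper used to duplicate
        val le = new LiteralExpression(1548633600000000L, DataTypes.TIMESTAMP)
        val strFilterValue: String = le.getExpressionResult.getString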
---
 .../datamap/bloom/BloomCoarseGrainDataMap.java     | 41 +++-------------------
 1 file changed, 5 insertions(+), 36 deletions(-)

diff --git a/datamap/bloom/src/main/java/org/apache/carbondata/datamap/bloom/BloomCoarseGrainDataMap.java b/datamap/bloom/src/main/java/org/apache/carbondata/datamap/bloom/BloomCoarseGrainDataMap.java
index 4459fc5..fea48c3 100644
--- a/datamap/bloom/src/main/java/org/apache/carbondata/datamap/bloom/BloomCoarseGrainDataMap.java
+++ b/datamap/bloom/src/main/java/org/apache/carbondata/datamap/bloom/BloomCoarseGrainDataMap.java
@@ -19,17 +19,13 @@ package org.apache.carbondata.datamap.bloom;
 
 import java.io.IOException;
 import java.io.UnsupportedEncodingException;
-import java.text.DateFormat;
-import java.text.SimpleDateFormat;
 import java.util.ArrayList;
 import java.util.Arrays;
-import java.util.Date;
 import java.util.HashMap;
 import java.util.HashSet;
 import java.util.List;
 import java.util.Map;
 import java.util.Set;
-import java.util.TimeZone;
 import java.util.concurrent.ConcurrentHashMap;
 
 import org.apache.carbondata.common.annotations.InterfaceAudience;
@@ -57,6 +53,7 @@ import org.apache.carbondata.core.scan.expression.LiteralExpression;
 import org.apache.carbondata.core.scan.expression.conditional.EqualToExpression;
 import org.apache.carbondata.core.scan.expression.conditional.InExpression;
 import org.apache.carbondata.core.scan.expression.conditional.ListExpression;
+import org.apache.carbondata.core.scan.expression.exception.FilterIllegalMemberException;
 import org.apache.carbondata.core.scan.expression.logical.AndExpression;
 import org.apache.carbondata.core.scan.filter.resolver.FilterResolverIntf;
 import org.apache.carbondata.core.util.CarbonProperties;
@@ -303,35 +300,6 @@ public class BloomCoarseGrainDataMap extends CoarseGrainDataMap {
     return queryModels;
   }
 
-  /**
-   * Here preprocessed NULL and date/timestamp data type.
-   *
-   * Note that if the datatype is date/timestamp, the expressionValue is long type.
-   */
-  private Object getLiteralExpValue(LiteralExpression le) {
-    Object expressionValue = le.getLiteralExpValue();
-    Object literalValue;
-
-    if (null == expressionValue) {
-      literalValue = null;
-    } else if (le.getLiteralExpDataType() == DataTypes.DATE) {
-      DateFormat format = new SimpleDateFormat(CarbonCommonConstants.CARBON_DATE_DEFAULT_FORMAT);
-      // the below settings are set statically according to DateDirectDirectionaryGenerator
-      format.setLenient(false);
-      format.setTimeZone(TimeZone.getTimeZone("GMT"));
-      literalValue = format.format(new Date((long) expressionValue / 1000));
-    } else if (le.getLiteralExpDataType() == DataTypes.TIMESTAMP) {
-      DateFormat format =
-          new SimpleDateFormat(CarbonCommonConstants.CARBON_TIMESTAMP_DEFAULT_FORMAT);
-      // the below settings are set statically according to TimeStampDirectDirectionaryGenerator
-      format.setLenient(false);
-      literalValue = format.format(new Date((long) expressionValue / 1000));
-    } else {
-      literalValue = expressionValue;
-    }
-    return literalValue;
-  }
-
 
   private BloomQueryModel buildQueryModelForEqual(ColumnExpression ce,
       LiteralExpression le) throws DictionaryGenerationException, UnsupportedEncodingException {
@@ -358,11 +326,12 @@ public class BloomCoarseGrainDataMap extends CoarseGrainDataMap {
 
   private byte[] getInternalFilterValue(CarbonColumn carbonColumn, LiteralExpression le) throws
       DictionaryGenerationException, UnsupportedEncodingException {
-    Object filterLiteralValue = getLiteralExpValue(le);
     // convert the filter value to string and apply converters on it to get carbon internal value
     String strFilterValue = null;
-    if (null != filterLiteralValue) {
-      strFilterValue = String.valueOf(filterLiteralValue);
+    try {
+      strFilterValue = le.getExpressionResult().getString();
+    } catch (FilterIllegalMemberException e) {
+      throw new RuntimeException("Error while resolving filter expression", e);
     }
 
     Object convertedValue = this.name2Converters.get(carbonColumn.getColName()).convert(


[carbondata] 29/41: [DOC] Fix the spell mistake of enable.unsafe.in.query.processing

Posted by ra...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.5
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit 976e48a843a37531b06b62c2e79e1ab465d316af
Author: qiuchenjian <80...@qq.com>
AuthorDate: Sat Mar 23 09:28:32 2019 +0800

    [DOC] Fix the spell mistake of enable.unsafe.in.query.processing
    
    Fix the spelling of enable.unsafe.in.query.processing in docs/usecases.md.
    
    This closes #3160
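    
    For context, the two properties corrected in the table are usually enabled together; a hedged
    sketch of setting them programmatically (they can equally be set in carbon.properties):
    
        import org.apache.carbondata.core.util.CarbonProperties
    
        // offheap access goes through java unsafe, so the unsafe flag must also be true
        CarbonProperties.getInstance().addProperty("enable.unsafe.in.query.processing", "true")
        CarbonProperties.getInstance().addProperty("use.offheap.in.query.processing", "true")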
---
 docs/usecases.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/usecases.md b/docs/usecases.md
index 0cfcf85..8ff4975 100644
--- a/docs/usecases.md
+++ b/docs/usecases.md
@@ -148,8 +148,8 @@ Use all columns are no-dictionary as the cardinality is high.
 | Compaction | carbon.number.of.cores.while.compacting | 12                      | Higher number of cores can improve the compaction speed.Data size is huge.Compaction need to use more threads to speed up the process |
 | Compaction | carbon.enable.auto.load.merge           | FALSE                   | Doing auto minor compaction is costly process as data size is huge.Perform manual compaction when the cluster is less loaded |
 | Query | carbon.enable.vector.reader             | true                    | To fetch results faster, supporting spark vector processing will speed up the query |
-| Query | enable.unsafe.in.query.procressing      | true                    | Data that needs to be scanned in huge which in turn generates more short lived Java objects. This cause pressure of GC.using unsafe and offheap will reduce the GC overhead |
-| Query | use.offheap.in.query.processing         | true                    | Data that needs to be scanned in huge which in turn generates more short lived Java objects. This cause pressure of GC.using unsafe and offheap will reduce the GC overhead.offheap can be accessed through java unsafe.hence enable.unsafe.in.query.procressing needs to be true |
+| Query | enable.unsafe.in.query.processing      | true                    | Data that needs to be scanned in huge which in turn generates more short lived Java objects. This cause pressure of GC.using unsafe and offheap will reduce the GC overhead |
+| Query | use.offheap.in.query.processing         | true                    | Data that needs to be scanned in huge which in turn generates more short lived Java objects. This cause pressure of GC.using unsafe and offheap will reduce the GC overhead.offheap can be accessed through java unsafe.hence enable.unsafe.in.query.processing needs to be true |
 | Query | enable.unsafe.columnpage                | TRUE                    | Keep the column pages in offheap memory so that the memory overhead due to java object is less and also reduces GC pressure. |
 | Query | carbon.unsafe.working.memory.in.mb      | 10240                   | Amount of memory to use for offheap operations, you can increase this memory based on the data size |
 


[carbondata] 07/41: [CARBONDATA-3276] Compacting table that do not exist should modify the message of MalformedCarbonCommandException

Posted by ra...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.5
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit f6e1c2e6a7c926d1a73865e2d0a3d267a70a204c
Author: qiuchenjian <80...@qq.com>
AuthorDate: Mon Jan 28 10:26:46 2019 +0800

    [CARBONDATA-3276] Compacting table that do not exist should modify the message of MalformedCarbonCommandException
    
    This closes #3106
---
 .../cluster/sdv/generated/AlterTableTestCase.scala           | 10 +++++++++-
 .../apache/spark/sql/execution/strategy/DDLStrategy.scala    | 12 +++++++++---
 2 files changed, 18 insertions(+), 4 deletions(-)

diff --git a/integration/spark-common-cluster-test/src/test/scala/org/apache/carbondata/cluster/sdv/generated/AlterTableTestCase.scala b/integration/spark-common-cluster-test/src/test/scala/org/apache/carbondata/cluster/sdv/generated/AlterTableTestCase.scala
index 2cf1794..d15f70b 100644
--- a/integration/spark-common-cluster-test/src/test/scala/org/apache/carbondata/cluster/sdv/generated/AlterTableTestCase.scala
+++ b/integration/spark-common-cluster-test/src/test/scala/org/apache/carbondata/cluster/sdv/generated/AlterTableTestCase.scala
@@ -23,11 +23,11 @@ import org.apache.spark.sql.Row
 import org.apache.spark.sql.common.util._
 import org.apache.spark.util.SparkUtil
 import org.scalatest.BeforeAndAfterAll
-
 import org.apache.carbondata.common.constants.LoggerAction
 import org.apache.carbondata.common.exceptions.sql.MalformedCarbonCommandException
 import org.apache.carbondata.core.constants.CarbonCommonConstants
 import org.apache.carbondata.core.util.CarbonProperties
+import org.apache.spark.sql.catalyst.analysis.NoSuchTableException
 
 /**
  * Test Class for AlterTableTestCase to verify all scenerios
@@ -895,6 +895,14 @@ class AlterTableTestCase extends QueryTest with BeforeAndAfterAll {
      sql(s"""drop table if exists test1""").collect
   }
 
+  test("Compaction_001_13", Include) {
+    sql("drop table if exists no_table")
+    var ex = intercept[MalformedCarbonCommandException] {
+      sql("alter table no_table compact 'major'")
+    }
+    assertResult("Table or view 'no_table' not found in database 'default' or not carbon fileformat")(ex.getMessage)
+  }
+
 
   //Check bad record locaion isnot changed when table name is altered
   test("BadRecords_001_01", Include) {
diff --git a/integration/spark2/src/main/scala/org/apache/spark/sql/execution/strategy/DDLStrategy.scala b/integration/spark2/src/main/scala/org/apache/spark/sql/execution/strategy/DDLStrategy.scala
index 40a8fd5..7d449b5 100644
--- a/integration/spark2/src/main/scala/org/apache/spark/sql/execution/strategy/DDLStrategy.scala
+++ b/integration/spark2/src/main/scala/org/apache/spark/sql/execution/strategy/DDLStrategy.scala
@@ -36,7 +36,7 @@ import org.apache.spark.sql.types.StructField
 import org.apache.spark.util.{CarbonReflectionUtils, FileUtils, SparkUtil}
 
 import org.apache.carbondata.common.exceptions.sql.MalformedCarbonCommandException
-import org.apache.carbondata.common.logging.{LogService, LogServiceFactory}
+import org.apache.carbondata.common.logging.LogServiceFactory
 import org.apache.carbondata.core.metadata.schema.table.CarbonTable
 import org.apache.carbondata.core.util.{CarbonProperties, DataTypeUtil, ThreadLocalSessionInfo}
 import org.apache.carbondata.spark.util.Util
@@ -125,7 +125,9 @@ class DDLStrategy(sparkSession: SparkSession) extends SparkStrategy {
             ExecutedCommandExec(alterTable) :: Nil
         } else {
           throw new MalformedCarbonCommandException(
-            "Operation not allowed : " + altertablemodel.alterSql)
+            String.format("Table or view '%s' not found in database '%s' or not carbon fileformat",
+            altertablemodel.tableName,
+            altertablemodel.dbName.getOrElse("default")))
         }
       case colRenameDataTypeChange@CarbonAlterTableColRenameDataTypeChangeCommand(
       alterTableColRenameAndDataTypeChangeModel, _) =>
@@ -146,7 +148,11 @@ class DDLStrategy(sparkSession: SparkSession) extends SparkStrategy {
             ExecutedCommandExec(colRenameDataTypeChange) :: Nil
           }
         } else {
-          throw new MalformedCarbonCommandException("Unsupported alter operation on hive table")
+          throw new MalformedCarbonCommandException(
+            String.format("Table or view '%s' not found in database '%s' or not carbon fileformat",
+              alterTableColRenameAndDataTypeChangeModel.tableName,
+              alterTableColRenameAndDataTypeChangeModel.
+                databaseName.getOrElse("default")))
         }
       case addColumn@CarbonAlterTableAddColumnCommand(alterTableAddColumnsModel) =>
         val isCarbonTable = CarbonEnv.getInstance(sparkSession).carbonMetaStore


[carbondata] 21/41: [CARBONDATA-3314] Fix for Index Cache Size in SHOW METACACHE DDL

Posted by ra...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.5
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit 3f6a8534e30a356af2fcd4883f7ed27e5f8c6d79
Author: shivamasn <sh...@gmail.com>
AuthorDate: Tue Mar 12 14:40:13 2019 +0530

    [CARBONDATA-3314] Fix for Index Cache Size in SHOW METACACHE DDL
    
    Problem:
    The index cache size printed by the SHOW METACACHE ON TABLE DDL is not accurate.

    Solution:
    Added a utility function in CommonUtil which converts the byte count to a
    display size and shows the cache size accurately, up to 2 decimal places.
    
    This closes #3143
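
A hedged illustration of the intended behaviour of the new helper (values are derived from the rounding rules in the patch below, not copied from the PR's tests):

    import org.apache.carbondata.spark.util.CommonUtil

    // Expected conversions, rounded to two decimal places:
    CommonUtil.bytesToDisplaySize(0L)                       // "0 B"
    CommonUtil.bytesToDisplaySize(1536L)                    // "1.5 KB"
    CommonUtil.bytesToDisplaySize(3L * 1024 * 1024)         // "3.0 MB"
    CommonUtil.bytesToDisplaySize(5L * 1024 * 1024 * 1024)  // "5.0 GB"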
---
 .../sql/commands/TestCarbonShowCacheCommand.scala  |  4 +--
 .../apache/carbondata/spark/util/CommonUtil.scala  | 39 +++++++++++++++++++-
 .../command/cache/CarbonShowCacheCommand.scala     | 41 +++++++++++-----------
 3 files changed, 60 insertions(+), 24 deletions(-)

diff --git a/integration/spark-common-test/src/test/scala/org/apache/carbondata/sql/commands/TestCarbonShowCacheCommand.scala b/integration/spark-common-test/src/test/scala/org/apache/carbondata/sql/commands/TestCarbonShowCacheCommand.scala
index 69c5f7e..e7fd5fa 100644
--- a/integration/spark-common-test/src/test/scala/org/apache/carbondata/sql/commands/TestCarbonShowCacheCommand.scala
+++ b/integration/spark-common-test/src/test/scala/org/apache/carbondata/sql/commands/TestCarbonShowCacheCommand.scala
@@ -151,7 +151,7 @@ class TestCarbonShowCacheCommand extends QueryTest with BeforeAndAfterAll {
     sql("use cache_empty_db").collect()
     val result1 = sql("show metacache").collect()
     assertResult(2)(result1.length)
-    assertResult(Row("cache_empty_db", "ALL", "0 bytes", "0 bytes", "0 bytes"))(result1(1))
+    assertResult(Row("cache_empty_db", "ALL", "0 B", "0 B", "0 B"))(result1(1))
 
     sql("use cache_db").collect()
     val result2 = sql("show metacache").collect()
@@ -174,7 +174,7 @@ class TestCarbonShowCacheCommand extends QueryTest with BeforeAndAfterAll {
     assertResult(2)(result2.length)
 
     checkAnswer(sql("show metacache on table cache_db.cache_3"),
-      Seq(Row("Index", "0 bytes", "0/1 index files cached"), Row("Dictionary", "0 bytes", "")))
+      Seq(Row("Index", "0 B", "0/1 index files cached"), Row("Dictionary", "0 B", "")))
 
     val result4 = sql("show metacache on table default.cache_4").collect()
     assertResult(3)(result4.length)
diff --git a/integration/spark-common/src/main/scala/org/apache/carbondata/spark/util/CommonUtil.scala b/integration/spark-common/src/main/scala/org/apache/carbondata/spark/util/CommonUtil.scala
index 34813ca..7887d87 100644
--- a/integration/spark-common/src/main/scala/org/apache/carbondata/spark/util/CommonUtil.scala
+++ b/integration/spark-common/src/main/scala/org/apache/carbondata/spark/util/CommonUtil.scala
@@ -19,13 +19,14 @@ package org.apache.carbondata.spark.util
 
 
 import java.io.File
+import java.math.BigDecimal
 import java.text.SimpleDateFormat
 import java.util
 import java.util.regex.{Matcher, Pattern}
 
 import scala.collection.JavaConverters._
 import scala.collection.mutable.Map
-import scala.util.Random
+import scala.math.BigDecimal.RoundingMode
 
 import org.apache.hadoop.conf.Configuration
 import org.apache.hadoop.mapreduce.lib.input.FileInputFormat
@@ -62,6 +63,19 @@ object CommonUtil {
   val FIXED_DECIMAL = """decimal\(\s*(\d+)\s*,\s*(\-?\d+)\s*\)""".r
   val FIXED_DECIMALTYPE = """decimaltype\(\s*(\d+)\s*,\s*(\-?\d+)\s*\)""".r
 
+  val ONE_KB: Long = 1024L
+  val ONE_KB_BI: BigDecimal = BigDecimal.valueOf(ONE_KB)
+  val ONE_MB: Long = ONE_KB * ONE_KB
+  val ONE_MB_BI: BigDecimal = BigDecimal.valueOf(ONE_MB)
+  val ONE_GB: Long = ONE_KB * ONE_MB
+  val ONE_GB_BI: BigDecimal = BigDecimal.valueOf(ONE_GB)
+  val ONE_TB: Long = ONE_KB * ONE_GB
+  val ONE_TB_BI: BigDecimal = BigDecimal.valueOf(ONE_TB)
+  val ONE_PB: Long = ONE_KB * ONE_TB
+  val ONE_PB_BI: BigDecimal = BigDecimal.valueOf(ONE_PB)
+  val ONE_EB: Long = ONE_KB * ONE_PB
+  val ONE_EB_BI: BigDecimal = BigDecimal.valueOf(ONE_EB)
+
   def getColumnProperties(column: String,
       tableProperties: Map[String, String]): Option[util.List[ColumnProperty]] = {
     val fieldProps = new util.ArrayList[ColumnProperty]()
@@ -862,4 +876,27 @@ object CommonUtil {
       }
     }
   }
+
+  def bytesToDisplaySize(size: Long): String = bytesToDisplaySize(BigDecimal.valueOf(size))
+
+  // This method converts the bytes count to display size upto 2 decimal places
+  def bytesToDisplaySize(size: BigDecimal): String = {
+    var displaySize: String = null
+    if (size.divideToIntegralValue(ONE_EB_BI).compareTo(BigDecimal.ZERO) > 0) {
+      displaySize = size.divide(ONE_EB_BI).setScale(2, RoundingMode.HALF_DOWN).doubleValue() + " EB"
+    } else if (size.divideToIntegralValue(ONE_PB_BI).compareTo(BigDecimal.ZERO) > 0) {
+      displaySize = size.divide(ONE_PB_BI).setScale(2, RoundingMode.HALF_DOWN).doubleValue() + " PB"
+    } else if (size.divideToIntegralValue(ONE_TB_BI).compareTo(BigDecimal.ZERO) > 0) {
+      displaySize = size.divide(ONE_TB_BI).setScale(2, RoundingMode.HALF_DOWN).doubleValue() + " TB"
+    } else if (size.divideToIntegralValue(ONE_GB_BI).compareTo(BigDecimal.ZERO) > 0) {
+      displaySize = size.divide(ONE_GB_BI).setScale(2, RoundingMode.HALF_DOWN).doubleValue() + " GB"
+    } else if (size.divideToIntegralValue(ONE_MB_BI).compareTo(BigDecimal.ZERO) > 0) {
+      displaySize = size.divide(ONE_MB_BI).setScale(2, RoundingMode.HALF_DOWN).doubleValue() + " MB"
+    } else if (size.divideToIntegralValue(ONE_KB_BI).compareTo(BigDecimal.ZERO) > 0) {
+      displaySize = size.divide(ONE_KB_BI).setScale(2, RoundingMode.HALF_DOWN).doubleValue() + " KB"
+    } else {
+      displaySize = size + " B"
+    }
+    displaySize
+  }
 }
diff --git a/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/cache/CarbonShowCacheCommand.scala b/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/cache/CarbonShowCacheCommand.scala
index e5f89d8..462be83 100644
--- a/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/cache/CarbonShowCacheCommand.scala
+++ b/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/cache/CarbonShowCacheCommand.scala
@@ -20,13 +20,11 @@ package org.apache.spark.sql.execution.command.cache
 import scala.collection.mutable
 import scala.collection.JavaConverters._
 
-import org.apache.commons.io.FileUtils.byteCountToDisplaySize
 import org.apache.spark.sql.{CarbonEnv, Row, SparkSession}
 import org.apache.spark.sql.catalyst.TableIdentifier
-import org.apache.spark.sql.catalyst.analysis.NoSuchTableException
-import org.apache.spark.sql.catalyst.expressions.{Attribute, AttributeReference}
+import org.apache.spark.sql.catalyst.expressions.AttributeReference
 import org.apache.spark.sql.execution.command.MetadataCommand
-import org.apache.spark.sql.types.{LongType, StringType}
+import org.apache.spark.sql.types.StringType
 
 import org.apache.carbondata.core.cache.CacheProvider
 import org.apache.carbondata.core.cache.dictionary.AbstractColumnDictionaryInfo
@@ -37,6 +35,7 @@ import org.apache.carbondata.core.metadata.AbsoluteTableIdentifier
 import org.apache.carbondata.core.metadata.schema.table.{CarbonTable, DataMapSchema}
 import org.apache.carbondata.datamap.bloom.BloomCacheKeyValue
 import org.apache.carbondata.processing.merger.CarbonDataMergerUtil
+import org.apache.carbondata.spark.util.CommonUtil.bytesToDisplaySize
 
 /**
  * SHOW CACHE
@@ -67,10 +66,10 @@ case class CarbonShowCacheCommand(tableIdentifier: Option[TableIdentifier])
     val cache = CacheProvider.getInstance().getCarbonCache()
     if (cache == null) {
       Seq(
-        Row("ALL", "ALL", byteCountToDisplaySize(0L),
-          byteCountToDisplaySize(0L), byteCountToDisplaySize(0L)),
-        Row(currentDatabase, "ALL", byteCountToDisplaySize(0L),
-          byteCountToDisplaySize(0L), byteCountToDisplaySize(0L)))
+        Row("ALL", "ALL", bytesToDisplaySize(0L),
+          bytesToDisplaySize(0L), bytesToDisplaySize(0L)),
+        Row(currentDatabase, "ALL", bytesToDisplaySize(0L),
+          bytesToDisplaySize(0L), bytesToDisplaySize(0L)))
     } else {
       val carbonTables = CarbonEnv.getInstance(sparkSession).carbonMetaStore
         .listAllTables(sparkSession)
@@ -152,10 +151,10 @@ case class CarbonShowCacheCommand(tableIdentifier: Option[TableIdentifier])
       }
       if (tableMapIndexSize.isEmpty && tableMapDatamapSize.isEmpty && tableMapDictSize.isEmpty) {
         Seq(
-          Row("ALL", "ALL", byteCountToDisplaySize(allIndexSize),
-            byteCountToDisplaySize(allDatamapSize), byteCountToDisplaySize(allDictSize)),
-          Row(currentDatabase, "ALL", byteCountToDisplaySize(0),
-            byteCountToDisplaySize(0), byteCountToDisplaySize(0)))
+          Row("ALL", "ALL", bytesToDisplaySize(allIndexSize),
+            bytesToDisplaySize(allDatamapSize), bytesToDisplaySize(allDictSize)),
+          Row(currentDatabase, "ALL", bytesToDisplaySize(0),
+            bytesToDisplaySize(0), bytesToDisplaySize(0)))
       } else {
         val tableList = tableMapIndexSize
           .map(_._1)
@@ -168,15 +167,15 @@ case class CarbonShowCacheCommand(tableIdentifier: Option[TableIdentifier])
             val indexSize = tableMapIndexSize.getOrElse(uniqueName, 0L)
             val datamapSize = tableMapDatamapSize.getOrElse(uniqueName, 0L)
             val dictSize = tableMapDictSize.getOrElse(uniqueName, 0L)
-            Row(values(0), values(1), byteCountToDisplaySize(indexSize),
-              byteCountToDisplaySize(datamapSize), byteCountToDisplaySize(dictSize))
+            Row(values(0), values(1), bytesToDisplaySize(indexSize),
+              bytesToDisplaySize(datamapSize), bytesToDisplaySize(dictSize))
           }
 
         Seq(
-          Row("ALL", "ALL", byteCountToDisplaySize(allIndexSize),
-            byteCountToDisplaySize(allDatamapSize), byteCountToDisplaySize(allDictSize)),
-          Row(currentDatabase, "ALL", byteCountToDisplaySize(dbIndexSize),
-            byteCountToDisplaySize(dbDatamapSize), byteCountToDisplaySize(dbDictSize))
+          Row("ALL", "ALL", bytesToDisplaySize(allIndexSize),
+            bytesToDisplaySize(allDatamapSize), bytesToDisplaySize(allDictSize)),
+          Row(currentDatabase, "ALL", bytesToDisplaySize(dbIndexSize),
+            bytesToDisplaySize(dbDatamapSize), bytesToDisplaySize(dbDictSize))
         ) ++ tableList
       }
     }
@@ -274,14 +273,14 @@ case class CarbonShowCacheCommand(tableIdentifier: Option[TableIdentifier])
       }.size
 
       var result = Seq(
-        Row("Index", byteCountToDisplaySize(datamapSize.get(tablePath).get),
+        Row("Index", bytesToDisplaySize(datamapSize.get(tablePath).get),
           numIndexFilesCached + "/" + numIndexFilesAll + " index files cached"),
-        Row("Dictionary", byteCountToDisplaySize(dictSize), "")
+        Row("Dictionary", bytesToDisplaySize(dictSize), "")
       )
       for ((path, size) <- datamapSize) {
         if (path != tablePath) {
           val (dmName, dmType) = datamapName.get(path).get
-          result = result :+ Row(dmName, byteCountToDisplaySize(size), dmType)
+          result = result :+ Row(dmName, bytesToDisplaySize(size), dmType)
         }
       }
       result


[carbondata] 30/41: [CARBONDATA-3322] [CARBONDATA-3323] Added check for invalid tables in ShowCacheCommand & Standard output on ShowCacheCommand on table

Posted by ra...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.5
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit 0f6ab067a6e4871295afb0f3112d60a6a4e4507d
Author: namanrastogi <na...@gmail.com>
AuthorDate: Thu Mar 21 20:33:55 2019 +0530

    [CARBONDATA-3322] [CARBONDATA-3323] Added check for invalid tables in ShowCacheCommand & Standard output on ShowCacheCommand on table
    
    Problem 1:
    After we alter the table name from t1 to t2, SHOW METACACHE ON TABLE works for both the old table name t1 and the new table name t2.
    Fix:
    Added a check that the table still exists.

    Problem 2:
    When SHOW METACACHE ON TABLE is executed and carbonLRUCache is null, the output is an empty sequence, which is not standard.
    Fix:
    Return the standard output even when carbonLRUCache is not initialised (null), with the index and dictionary sizes reported as 0.
    
    This closes #3157
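
A hedged sketch, in the style of the project's QueryTest suites, of the behaviour this patch aims for (table names and assertions are illustrative, not copied from the PR):

    sql("ALTER TABLE cache_db.t1 RENAME TO t2")

    // the old name should now be rejected instead of being served from stale metadata
    intercept[Exception] {
      sql("SHOW METACACHE ON TABLE cache_db.t1").collect()
    }

    // even when the LRU cache has not been initialised yet, the table-level command
    // returns the standard Index and Dictionary rows instead of an empty result
    val rows = sql("SHOW METACACHE ON TABLE cache_db.t2").collect()
    assert(rows.length >= 2)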
---
 .../command/cache/CarbonShowCacheCommand.scala     | 31 +++++++++++++++++-----
 1 file changed, 24 insertions(+), 7 deletions(-)

diff --git a/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/cache/CarbonShowCacheCommand.scala b/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/cache/CarbonShowCacheCommand.scala
index e19ee48..8461bf3 100644
--- a/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/cache/CarbonShowCacheCommand.scala
+++ b/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/cache/CarbonShowCacheCommand.scala
@@ -23,8 +23,9 @@ import scala.collection.JavaConverters._
 import org.apache.hadoop.mapred.JobConf
 import org.apache.spark.sql.{CarbonEnv, Row, SparkSession}
 import org.apache.spark.sql.catalyst.TableIdentifier
+import org.apache.spark.sql.catalyst.analysis.NoSuchTableException
 import org.apache.spark.sql.catalyst.expressions.AttributeReference
-import org.apache.spark.sql.execution.command.MetadataCommand
+import org.apache.spark.sql.execution.command.{Checker, MetadataCommand}
 import org.apache.spark.sql.types.StringType
 
 import org.apache.carbondata.core.cache.{CacheProvider, CacheType}
@@ -64,7 +65,7 @@ case class CarbonShowCacheCommand(tableIdentifier: Option[TableIdentifier],
 
   def getAllTablesCache(sparkSession: SparkSession): Seq[Row] = {
     val currentDatabase = sparkSession.sessionState.catalog.getCurrentDatabase
-    val cache = CacheProvider.getInstance().getCarbonCache()
+    val cache = CacheProvider.getInstance().getCarbonCache
     if (cache == null) {
       Seq(
         Row("ALL", "ALL", 0L, 0L, 0L),
@@ -74,6 +75,7 @@ case class CarbonShowCacheCommand(tableIdentifier: Option[TableIdentifier],
         .listAllTables(sparkSession).filter {
         carbonTable =>
           carbonTable.getDatabaseName.equalsIgnoreCase(currentDatabase) &&
+          isValidTable(carbonTable, sparkSession) &&
           !carbonTable.isChildDataMap
       }
 
@@ -131,6 +133,18 @@ case class CarbonShowCacheCommand(tableIdentifier: Option[TableIdentifier],
 
   def getTableCache(sparkSession: SparkSession, carbonTable: CarbonTable): Seq[Row] = {
     val cache = CacheProvider.getInstance().getCarbonCache
+    val allIndexFiles: List[String] = CacheUtil.getAllIndexFiles(carbonTable)
+    if (cache == null) {
+      var comments = 0 + "/" + allIndexFiles.size + " index files cached"
+      if (!carbonTable.isTransactionalTable) {
+        comments += " (external table)"
+      }
+      return Seq(
+        Row("Index", 0L, comments),
+        Row("Dictionary", 0L, "")
+      )
+    }
+
     val showTableCacheEvent = ShowTableCacheEvent(carbonTable, sparkSession, internalCall)
     val operationContext = new OperationContext
     // datamapName -> (datamapProviderName, indexSize, datamapSize)
@@ -138,8 +152,7 @@ case class CarbonShowCacheCommand(tableIdentifier: Option[TableIdentifier],
     operationContext.setProperty(carbonTable.getTableUniqueName, currentTableSizeMap)
     OperationListenerBus.getInstance.fireEvent(showTableCacheEvent, operationContext)
 
-    // Get all Index files for the specified table.
-    val allIndexFiles: List[String] = CacheUtil.getAllIndexFiles(carbonTable)
+    // Get all Index files for the specified table in cache
     val indexFilesInCache: List[String] = allIndexFiles.filter {
       indexFile =>
         cache.get(indexFile) != null
@@ -190,9 +203,8 @@ case class CarbonShowCacheCommand(tableIdentifier: Option[TableIdentifier],
        * Assemble result for table
        */
       val carbonTable = CarbonEnv.getCarbonTable(tableIdentifier.get)(sparkSession)
-      if (CacheProvider.getInstance().getCarbonCache == null) {
-        return Seq.empty
-      }
+      Checker
+        .validateTableExists(tableIdentifier.get.database, tableIdentifier.get.table, sparkSession)
       val rawResult = getTableCache(sparkSession, carbonTable)
       val result = rawResult.slice(0, 2) ++
                    rawResult.drop(2).map {
@@ -205,4 +217,9 @@ case class CarbonShowCacheCommand(tableIdentifier: Option[TableIdentifier],
       }
     }
   }
+
+  def isValidTable(carbonTable: CarbonTable, sparkSession: SparkSession): Boolean = {
+    CarbonEnv.getInstance(sparkSession).carbonMetaStore.tableExists(carbonTable.getTableName,
+      Some(carbonTable.getDatabaseName))(sparkSession)
+  }
 }


[carbondata] 37/41: [CARBONDATA-3333] Fixed No Sort store size issue and compatibility issue when a column added via alter in 1.1 is loaded in 1.5

Posted by ra...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.5
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit d52ef3220b24d5320df9bc5b874c7960239bf463
Author: kumarvishal09 <ku...@gmail.com>
AuthorDate: Tue Mar 26 22:46:01 2019 +0530

    [CARBONDATA-3333] Fixed No Sort store size issue and compatibility issue when a column added via alter in 1.1 is loaded in 1.5
    
    Issue 1: Load fails in the latest version when the table was altered in an older version.
    This is because the table spec was not created based on the sort column order, and the column pages were not handled when re-arranging the schema while writing.

    Issue 2: After PR#3140 the store size increased.
    The store size increased after PR#3140 and query performance degraded because of it; this PR reverts the changes done in PR#3140.
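
A minimal sketch of the re-ordering idea behind Issue 1 (a simplified model; the Dim class and helper below are illustrative stand-ins for TableSpec.DimensionSpec):

    // Simplified model: dimension specs are emitted sort-columns-first (the order the
    // sort step produces data in), while each spec remembers its schema position so
    // TablePage can place the encoded column pages back into schema order.
    case class Dim(name: String, isSortColumn: Boolean, actualPosition: Int)

    def arrangeForWrite(schemaOrder: Seq[Dim]): Seq[Dim] = {
      val (sortDims, noSortDims) = schemaOrder.partition(_.isSortColumn)
      sortDims ++ noSortDims
    }

    // a table altered in 1.1: column "c" was added as a sort column at the end of the schema
    val schema = Seq(Dim("a", isSortColumn = true, actualPosition = 0),
                     Dim("b", isSortColumn = false, actualPosition = 1),
                     Dim("c", isSortColumn = true, actualPosition = 2))
    arrangeForWrite(schema).map(_.name)   // Seq("a", "c", "b") -- the write order
    // actualPosition (0, 2, 1 in that write order) is what restores schema order later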
---
 .../carbondata/core/datastore/TableSpec.java       | 58 ++++++++++++++------
 .../CarbonRowDataWriterProcessorStepImpl.java      | 61 ++++++++++++----------
 .../carbondata/processing/store/TablePage.java     | 49 ++++++++++++++---
 3 files changed, 117 insertions(+), 51 deletions(-)

diff --git a/core/src/main/java/org/apache/carbondata/core/datastore/TableSpec.java b/core/src/main/java/org/apache/carbondata/core/datastore/TableSpec.java
index a26d6ae..002104a 100644
--- a/core/src/main/java/org/apache/carbondata/core/datastore/TableSpec.java
+++ b/core/src/main/java/org/apache/carbondata/core/datastore/TableSpec.java
@@ -71,30 +71,52 @@ public class TableSpec {
   }
 
   private void addDimensions(List<CarbonDimension> dimensions) {
-    int dimIndex = 0;
+    List<DimensionSpec> sortDimSpec = new ArrayList<>();
+    List<DimensionSpec> noSortDimSpec = new ArrayList<>();
+    List<DimensionSpec> noSortNoDictDimSpec = new ArrayList<>();
+    List<DimensionSpec> sortNoDictDimSpec = new ArrayList<>();
+    DimensionSpec spec;
+    short actualPosition = 0;
+    // sort step's output is based on sort column order i.e sort columns data will be present
+    // ahead of non sort columns, so table spec also need to add dimension spec in same manner
     for (int i = 0; i < dimensions.size(); i++) {
       CarbonDimension dimension = dimensions.get(i);
       if (dimension.isComplex()) {
-        DimensionSpec spec = new DimensionSpec(ColumnType.COMPLEX, dimension);
-        dimensionSpec[dimIndex++] = spec;
-        noDictionaryDimensionSpec.add(spec);
+        spec = new DimensionSpec(ColumnType.COMPLEX, dimension, actualPosition++);
       } else if (dimension.getDataType() == DataTypes.TIMESTAMP && !dimension
           .isDirectDictionaryEncoding()) {
-        DimensionSpec spec = new DimensionSpec(ColumnType.PLAIN_VALUE, dimension);
-        dimensionSpec[dimIndex++] = spec;
-        noDictionaryDimensionSpec.add(spec);
+        spec = new DimensionSpec(ColumnType.PLAIN_VALUE, dimension, actualPosition++);
       } else if (dimension.isDirectDictionaryEncoding()) {
-        DimensionSpec spec = new DimensionSpec(ColumnType.DIRECT_DICTIONARY, dimension);
-        dimensionSpec[dimIndex++] = spec;
+        spec = new DimensionSpec(ColumnType.DIRECT_DICTIONARY, dimension, actualPosition++);
       } else if (dimension.isGlobalDictionaryEncoding()) {
-        DimensionSpec spec = new DimensionSpec(ColumnType.GLOBAL_DICTIONARY, dimension);
-        dimensionSpec[dimIndex++] = spec;
+        spec = new DimensionSpec(ColumnType.GLOBAL_DICTIONARY, dimension, actualPosition++);
       } else {
-        DimensionSpec spec = new DimensionSpec(ColumnType.PLAIN_VALUE, dimension);
-        dimensionSpec[dimIndex++] = spec;
-        noDictionaryDimensionSpec.add(spec);
+        spec = new DimensionSpec(ColumnType.PLAIN_VALUE, dimension, actualPosition++);
+      }
+      if (dimension.isSortColumn()) {
+        sortDimSpec.add(spec);
+        if (!dimension.isDirectDictionaryEncoding() && !dimension.isGlobalDictionaryEncoding()
+            || spec.getColumnType() == ColumnType.COMPLEX) {
+          sortNoDictDimSpec.add(spec);
+        }
+      } else {
+        noSortDimSpec.add(spec);
+        if (!dimension.isDirectDictionaryEncoding() && !dimension.isGlobalDictionaryEncoding()
+            || spec.getColumnType() == ColumnType.COMPLEX) {
+          noSortNoDictDimSpec.add(spec);
+        }
       }
     }
+    // combine the result
+    final DimensionSpec[] sortDimensionSpecs =
+        sortDimSpec.toArray(new DimensionSpec[sortDimSpec.size()]);
+    final DimensionSpec[] noSortDimensionSpecs =
+        noSortDimSpec.toArray(new DimensionSpec[noSortDimSpec.size()]);
+    System.arraycopy(sortDimensionSpecs, 0, dimensionSpec, 0, sortDimensionSpecs.length);
+    System.arraycopy(noSortDimensionSpecs, 0, dimensionSpec, sortDimensionSpecs.length,
+        noSortDimensionSpecs.length);
+    noDictionaryDimensionSpec.addAll(sortNoDictDimSpec);
+    noDictionaryDimensionSpec.addAll(noSortNoDictDimSpec);
   }
 
   private void addMeasures(List<CarbonMeasure> measures) {
@@ -255,10 +277,13 @@ public class TableSpec {
     // indicate whether this dimension need to do inverted index
     private boolean doInvertedIndex;
 
-    DimensionSpec(ColumnType columnType, CarbonDimension dimension) {
+    // indicate the actual postion in blocklet
+    private short actualPostion;
+    DimensionSpec(ColumnType columnType, CarbonDimension dimension, short actualPostion) {
       super(dimension.getColName(), dimension.getDataType(), columnType);
       this.inSortColumns = dimension.isSortColumn();
       this.doInvertedIndex = dimension.isUseInvertedIndex();
+      this.actualPostion = actualPostion;
     }
 
     public boolean isInSortColumns() {
@@ -269,6 +294,9 @@ public class TableSpec {
       return doInvertedIndex;
     }
 
+    public short getActualPostion() {
+      return actualPostion;
+    }
     @Override
     public void write(DataOutput out) throws IOException {
       super.write(out);
diff --git a/processing/src/main/java/org/apache/carbondata/processing/loading/steps/CarbonRowDataWriterProcessorStepImpl.java b/processing/src/main/java/org/apache/carbondata/processing/loading/steps/CarbonRowDataWriterProcessorStepImpl.java
index 6345035..25f7cfb 100644
--- a/processing/src/main/java/org/apache/carbondata/processing/loading/steps/CarbonRowDataWriterProcessorStepImpl.java
+++ b/processing/src/main/java/org/apache/carbondata/processing/loading/steps/CarbonRowDataWriterProcessorStepImpl.java
@@ -18,7 +18,9 @@ package org.apache.carbondata.processing.loading.steps;
 
 import java.io.IOException;
 import java.util.Iterator;
+import java.util.List;
 import java.util.Map;
+import java.util.concurrent.CopyOnWriteArrayList;
 import java.util.concurrent.ExecutorService;
 import java.util.concurrent.Executors;
 import java.util.concurrent.Future;
@@ -81,17 +83,16 @@ public class CarbonRowDataWriterProcessorStepImpl extends AbstractDataLoadProces
 
   private Map<String, LocalDictionaryGenerator> localDictionaryGeneratorMap;
 
-  private CarbonFactHandler dataHandler;
+  private List<CarbonFactHandler> carbonFactHandlers;
 
   private ExecutorService executorService = null;
 
-  private static final Object lock = new Object();
-
   public CarbonRowDataWriterProcessorStepImpl(CarbonDataLoadConfiguration configuration,
       AbstractDataLoadProcessorStep child) {
     super(configuration, child);
     this.localDictionaryGeneratorMap =
         CarbonUtil.getLocalDictionaryModel(configuration.getTableSpec().getCarbonTable());
+    this.carbonFactHandlers = new CopyOnWriteArrayList<>();
   }
 
   @Override public void initialize() throws IOException {
@@ -128,31 +129,20 @@ public class CarbonRowDataWriterProcessorStepImpl extends AbstractDataLoadProces
           .recordDictionaryValue2MdkAdd2FileTime(CarbonTablePath.DEPRECATED_PARTITION_ID,
               System.currentTimeMillis());
 
-      //Creating a Instance of CarbonFacthandler that will be passed to all the threads
-      String[] storeLocation = getStoreLocation();
-      DataMapWriterListener listener = getDataMapWriterListener(0);
-      CarbonFactDataHandlerModel model = CarbonFactDataHandlerModel
-          .createCarbonFactDataHandlerModel(configuration, storeLocation, 0, 0, listener);
-      model.setColumnLocalDictGenMap(localDictionaryGeneratorMap);
-      dataHandler = CarbonFactHandlerFactory.createCarbonFactHandler(model);
-      dataHandler.initialise();
-
       if (iterators.length == 1) {
-        doExecute(iterators[0], 0, dataHandler);
+        doExecute(iterators[0], 0);
       } else {
         executorService = Executors.newFixedThreadPool(iterators.length,
             new CarbonThreadFactory("NoSortDataWriterPool:" + configuration.getTableIdentifier()
                 .getCarbonTableIdentifier().getTableName(), true));
         Future[] futures = new Future[iterators.length];
         for (int i = 0; i < iterators.length; i++) {
-          futures[i] = executorService.submit(new DataWriterRunnable(iterators[i], i, dataHandler));
+          futures[i] = executorService.submit(new DataWriterRunnable(iterators[i], i));
         }
         for (Future future : futures) {
           future.get();
         }
       }
-      finish(dataHandler, 0);
-      dataHandler = null;
     } catch (CarbonDataWriterException e) {
       LOGGER.error("Failed for table: " + tableName + " in DataWriterProcessorStepImpl", e);
       throw new CarbonDataLoadingException(
@@ -167,15 +157,31 @@ public class CarbonRowDataWriterProcessorStepImpl extends AbstractDataLoadProces
     return null;
   }
 
-  private void doExecute(Iterator<CarbonRowBatch> iterator, int iteratorIndex,
-      CarbonFactHandler dataHandler) throws IOException {
+  private void doExecute(Iterator<CarbonRowBatch> iterator, int iteratorIndex) throws IOException {
+    String[] storeLocation = getStoreLocation();
+    DataMapWriterListener listener = getDataMapWriterListener(0);
+    CarbonFactDataHandlerModel model = CarbonFactDataHandlerModel.createCarbonFactDataHandlerModel(
+        configuration, storeLocation, 0, iteratorIndex, listener);
+    model.setColumnLocalDictGenMap(localDictionaryGeneratorMap);
+    CarbonFactHandler dataHandler = null;
     boolean rowsNotExist = true;
     while (iterator.hasNext()) {
       if (rowsNotExist) {
         rowsNotExist = false;
+        dataHandler = CarbonFactHandlerFactory.createCarbonFactHandler(model);
+        this.carbonFactHandlers.add(dataHandler);
+        dataHandler.initialise();
       }
       processBatch(iterator.next(), dataHandler, iteratorIndex);
     }
+    try {
+      if (!rowsNotExist) {
+        finish(dataHandler, iteratorIndex);
+      }
+    } finally {
+      carbonFactHandlers.remove(dataHandler);
+    }
+
 
   }
 
@@ -300,9 +306,7 @@ public class CarbonRowDataWriterProcessorStepImpl extends AbstractDataLoadProces
       while (batch.hasNext()) {
         CarbonRow row = batch.next();
         CarbonRow converted = convertRow(row);
-        synchronized (lock) {
-          dataHandler.addDataToStore(converted);
-        }
+        dataHandler.addDataToStore(converted);
         readCounter[iteratorIndex]++;
       }
       writeCounter[iteratorIndex] += batch.getSize();
@@ -316,18 +320,15 @@ public class CarbonRowDataWriterProcessorStepImpl extends AbstractDataLoadProces
 
     private Iterator<CarbonRowBatch> iterator;
     private int iteratorIndex = 0;
-    private CarbonFactHandler dataHandler = null;
 
-    DataWriterRunnable(Iterator<CarbonRowBatch> iterator, int iteratorIndex,
-        CarbonFactHandler dataHandler) {
+    DataWriterRunnable(Iterator<CarbonRowBatch> iterator, int iteratorIndex) {
       this.iterator = iterator;
       this.iteratorIndex = iteratorIndex;
-      this.dataHandler = dataHandler;
     }
 
     @Override public void run() {
       try {
-        doExecute(this.iterator, iteratorIndex, dataHandler);
+        doExecute(this.iterator, iteratorIndex);
       } catch (IOException e) {
         LOGGER.error(e.getMessage(), e);
         throw new RuntimeException(e);
@@ -341,9 +342,11 @@ public class CarbonRowDataWriterProcessorStepImpl extends AbstractDataLoadProces
       if (null != executorService) {
         executorService.shutdownNow();
       }
-      if (null != dataHandler) {
-        dataHandler.finish();
-        dataHandler.closeHandler();
+      if (null != this.carbonFactHandlers && !this.carbonFactHandlers.isEmpty()) {
+        for (CarbonFactHandler carbonFactHandler : this.carbonFactHandlers) {
+          carbonFactHandler.finish();
+          carbonFactHandler.closeHandler();
+        }
       }
     }
   }
diff --git a/processing/src/main/java/org/apache/carbondata/processing/store/TablePage.java b/processing/src/main/java/org/apache/carbondata/processing/store/TablePage.java
index 7cc8932..5687549 100644
--- a/processing/src/main/java/org/apache/carbondata/processing/store/TablePage.java
+++ b/processing/src/main/java/org/apache/carbondata/processing/store/TablePage.java
@@ -22,7 +22,6 @@ import java.io.DataOutputStream;
 import java.io.IOException;
 import java.nio.ByteBuffer;
 import java.util.ArrayList;
-import java.util.Arrays;
 import java.util.HashMap;
 import java.util.List;
 import java.util.Map;
@@ -393,12 +392,14 @@ public class TablePage {
   private EncodedColumnPage[] encodeAndCompressDimensions()
       throws KeyGenException, IOException, MemoryException {
     List<EncodedColumnPage> encodedDimensions = new ArrayList<>();
-    List<EncodedColumnPage> encodedComplexDimensions = new ArrayList<>();
+    EncodedColumnPage[][] complexColumnPages =
+        new EncodedColumnPage[complexDimensionPages.length][];
     TableSpec tableSpec = model.getTableSpec();
     int dictIndex = 0;
     int noDictIndex = 0;
     int complexDimIndex = 0;
     int numDimensions = tableSpec.getNumDimensions();
+    int totalComplexColumnSize = 0;
     for (int i = 0; i < numDimensions; i++) {
       ColumnPageEncoder columnPageEncoder;
       EncodedColumnPage encodedPage;
@@ -434,17 +435,51 @@ public class TablePage {
           break;
         case COMPLEX:
           EncodedColumnPage[] encodedPages = ColumnPageEncoder.encodeComplexColumn(
-              complexDimensionPages[complexDimIndex++]);
-          encodedComplexDimensions.addAll(Arrays.asList(encodedPages));
+              complexDimensionPages[complexDimIndex]);
+          complexColumnPages[complexDimIndex] = encodedPages;
+          totalComplexColumnSize += encodedPages.length;
+          complexDimIndex++;
           break;
         default:
           throw new IllegalArgumentException("unsupported dimension type:" + spec
               .getColumnType());
       }
     }
-
-    encodedDimensions.addAll(encodedComplexDimensions);
-    return encodedDimensions.toArray(new EncodedColumnPage[encodedDimensions.size()]);
+    // below code is to combine the list based on actual order present in carbon table
+    // in case of older version(eg:1.1) alter add column was supported only with sort columns
+    // and sort step will return the data based on sort column order(sort columns first)
+    // so arranging the column pages based on schema is required otherwise query will
+    // either give wrong result(for string columns) or throw exception in case of non string
+    // column as reading is based on schema order
+    int complexEncodedPageIndex = 0;
+    int normalEncodedPageIndex  = 0;
+    int currentPosition = 0;
+    EncodedColumnPage[] combinedList =
+        new EncodedColumnPage[encodedDimensions.size() + totalComplexColumnSize];
+    for (int i = 0; i < numDimensions; i++) {
+      TableSpec.DimensionSpec spec = tableSpec.getDimensionSpec(i);
+      switch (spec.getColumnType()) {
+        case GLOBAL_DICTIONARY:
+        case DIRECT_DICTIONARY:
+        case PLAIN_VALUE:
+          // add the dimension based on actual postion
+          // current position is considered as complex column will have multiple children
+          combinedList[currentPosition + spec.getActualPostion()] =
+              encodedDimensions.get(normalEncodedPageIndex++);
+          break;
+        case COMPLEX:
+          EncodedColumnPage[] complexColumnPage = complexColumnPages[complexEncodedPageIndex++];
+          for (int j = 0; j < complexColumnPage.length; j++) {
+            combinedList[currentPosition + spec.getActualPostion() + j] = complexColumnPage[j];
+          }
+          // as for complex type 1 position is already considered, so subtract -1
+          currentPosition += complexColumnPage.length - 1;
+          break;
+        default:
+          throw new IllegalArgumentException("unsupported dimension type:" + spec.getColumnType());
+      }
+    }
+    return combinedList;
   }
 
   /**


[carbondata] 41/41: [HOTFIX] Fixed data map loading issue when the number of segments is high

Posted by ra...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.5
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit 6a57b4b5884e2155b8b8f27fedf847a081c25c8f
Author: kumarvishal09 <ku...@gmail.com>
AuthorDate: Fri Mar 29 06:41:00 2019 +0530

    [HOTFIX] Fixed data map loading issue when the number of segments is high
    
    Problem:
    When the number of segments is high, data map loading sometimes throws an NPE.

    Solution:
    If two segments share the same schema, then once the first one is loaded while the second is still loading, the first one tries to clear the segment properties cache and clears the min/max.
    Now a check is added: if the min/max is not present, it is fetched again. Since each segment clears its min/max cache after loading, there will not be any leak.
    
    This closes #3169
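
A hedged Scala sketch of the shape of the fix (the real change is in the Java class SegmentPropertiesAndSchemaHolder; the names below are illustrative): the min/max column list is rebuilt lazily under a lock whenever a concurrent segment load has cleared it.

    // Simplified model of the double-checked re-initialisation.
    class MinMaxHolder(rebuild: () => List[String]) {
      private val lock = new Object
      @volatile private var minMaxCacheColumns: List[String] = _

      // called after each segment finishes loading
      def clear(): Unit = { minMaxCacheColumns = null }

      def get(): List[String] = {
        if (minMaxCacheColumns == null) {
          lock.synchronized {
            if (minMaxCacheColumns == null) {
              // a concurrent segment load cleared the cache: rebuild it instead of returning null (the earlier NPE)
              minMaxCacheColumns = rebuild()
            }
          }
        }
        minMaxCacheColumns
      }
    }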
---
 .../block/SegmentPropertiesAndSchemaHolder.java    | 62 +++++++++++++---------
 .../indexstore/blockletindex/BlockDataMap.java     |  4 +-
 2 files changed, 39 insertions(+), 27 deletions(-)

diff --git a/core/src/main/java/org/apache/carbondata/core/datastore/block/SegmentPropertiesAndSchemaHolder.java b/core/src/main/java/org/apache/carbondata/core/datastore/block/SegmentPropertiesAndSchemaHolder.java
index 6f9a93d..34ce5d0 100644
--- a/core/src/main/java/org/apache/carbondata/core/datastore/block/SegmentPropertiesAndSchemaHolder.java
+++ b/core/src/main/java/org/apache/carbondata/core/datastore/block/SegmentPropertiesAndSchemaHolder.java
@@ -235,20 +235,20 @@ public class SegmentPropertiesAndSchemaHolder {
           segmentPropWrapperToSegmentSetMap.get(segmentPropertiesWrapper);
       synchronized (getOrCreateTableLock(segmentPropertiesWrapper.getTableIdentifier())) {
         segmentIdAndSegmentPropertiesIndexWrapper.removeSegmentId(segmentId);
-      }
-      // if after removal of given SegmentId, the segmentIdSet becomes empty that means this
-      // segmentPropertiesWrapper is not getting used at all. In that case this object can be
-      // removed from all the holders
-      if (clearSegmentWrapperFromMap && segmentIdAndSegmentPropertiesIndexWrapper.segmentIdSet
-          .isEmpty()) {
-        indexToSegmentPropertiesWrapperMapping.remove(segmentPropertiesIndex);
-        segmentPropWrapperToSegmentSetMap.remove(segmentPropertiesWrapper);
-      } else if (!clearSegmentWrapperFromMap
-          && segmentIdAndSegmentPropertiesIndexWrapper.segmentIdSet.isEmpty()) {
-        // min max columns can very when cache is modified. So even though entry is not required
-        // to be deleted from map clear the column cache so that it can filled again
-        segmentPropertiesWrapper.clear();
-        LOGGER.info("cleared min max for segmentProperties at index: " + segmentPropertiesIndex);
+        // if after removal of given SegmentId, the segmentIdSet becomes empty that means this
+        // segmentPropertiesWrapper is not getting used at all. In that case this object can be
+        // removed from all the holders
+        if (clearSegmentWrapperFromMap && segmentIdAndSegmentPropertiesIndexWrapper.segmentIdSet
+            .isEmpty()) {
+          indexToSegmentPropertiesWrapperMapping.remove(segmentPropertiesIndex);
+          segmentPropWrapperToSegmentSetMap.remove(segmentPropertiesWrapper);
+        } else if (!clearSegmentWrapperFromMap
+            && segmentIdAndSegmentPropertiesIndexWrapper.segmentIdSet.isEmpty()) {
+          // min max columns can very when cache is modified. So even though entry is not required
+          // to be deleted from map clear the column cache so that it can filled again
+          segmentPropertiesWrapper.clear();
+          LOGGER.info("cleared min max for segmentProperties at index: " + segmentPropertiesIndex);
+        }
       }
     }
   }
@@ -280,12 +280,13 @@ public class SegmentPropertiesAndSchemaHolder {
 
     private static final Object taskSchemaLock = new Object();
     private static final Object fileFooterSchemaLock = new Object();
+    private static final Object minMaxLock = new Object();
 
-    private AbsoluteTableIdentifier tableIdentifier;
     private List<ColumnSchema> columnsInTable;
     private int[] columnCardinality;
     private SegmentProperties segmentProperties;
     private List<CarbonColumn> minMaxCacheColumns;
+    private CarbonTable carbonTable;
     // in case of hybrid store we can have block as well as blocklet schema
     // Scenario: When there is a hybrid store in which few loads are from legacy store which do
     // not contain the blocklet information and hence they will be, by default have cache_level as
@@ -300,7 +301,7 @@ public class SegmentPropertiesAndSchemaHolder {
 
     public SegmentPropertiesWrapper(CarbonTable carbonTable,
         List<ColumnSchema> columnsInTable, int[] columnCardinality) {
-      this.tableIdentifier = carbonTable.getAbsoluteTableIdentifier();
+      this.carbonTable = carbonTable;
       this.columnsInTable = columnsInTable;
       this.columnCardinality = columnCardinality;
     }
@@ -320,8 +321,9 @@ public class SegmentPropertiesAndSchemaHolder {
      */
     public void clear() {
       if (null != minMaxCacheColumns) {
-        minMaxCacheColumns.clear();
+        minMaxCacheColumns = null;
       }
+
       taskSummarySchemaForBlock = null;
       taskSummarySchemaForBlocklet = null;
       fileFooterEntrySchemaForBlock = null;
@@ -334,7 +336,8 @@ public class SegmentPropertiesAndSchemaHolder {
       }
       SegmentPropertiesAndSchemaHolder.SegmentPropertiesWrapper other =
           (SegmentPropertiesAndSchemaHolder.SegmentPropertiesWrapper) obj;
-      return tableIdentifier.equals(other.tableIdentifier) && checkColumnSchemaEquality(
+      return carbonTable.getAbsoluteTableIdentifier()
+          .equals(other.carbonTable.getAbsoluteTableIdentifier()) && checkColumnSchemaEquality(
           columnsInTable, other.columnsInTable) && Arrays
           .equals(columnCardinality, other.columnCardinality);
     }
@@ -372,12 +375,12 @@ public class SegmentPropertiesAndSchemaHolder {
       for (ColumnSchema columnSchema: columnsInTable) {
         allColumnsHashCode = allColumnsHashCode + columnSchema.strictHashCode();
       }
-      return tableIdentifier.hashCode() + allColumnsHashCode + Arrays
+      return carbonTable.getAbsoluteTableIdentifier().hashCode() + allColumnsHashCode + Arrays
           .hashCode(columnCardinality);
     }
 
     public AbsoluteTableIdentifier getTableIdentifier() {
-      return tableIdentifier;
+      return carbonTable.getAbsoluteTableIdentifier();
     }
 
     public SegmentProperties getSegmentProperties() {
@@ -398,8 +401,8 @@ public class SegmentPropertiesAndSchemaHolder {
         synchronized (taskSchemaLock) {
           if (null == taskSummarySchemaForBlock) {
             taskSummarySchemaForBlock = SchemaGenerator
-                .createTaskSummarySchema(segmentProperties, minMaxCacheColumns, storeBlockletCount,
-                    filePathToBeStored);
+                .createTaskSummarySchema(segmentProperties, getMinMaxCacheColumns(),
+                    storeBlockletCount, filePathToBeStored);
           }
         }
       }
@@ -412,8 +415,8 @@ public class SegmentPropertiesAndSchemaHolder {
         synchronized (taskSchemaLock) {
           if (null == taskSummarySchemaForBlocklet) {
             taskSummarySchemaForBlocklet = SchemaGenerator
-                .createTaskSummarySchema(segmentProperties, minMaxCacheColumns, storeBlockletCount,
-                    filePathToBeStored);
+                .createTaskSummarySchema(segmentProperties, getMinMaxCacheColumns(),
+                    storeBlockletCount, filePathToBeStored);
           }
         }
       }
@@ -425,7 +428,7 @@ public class SegmentPropertiesAndSchemaHolder {
         synchronized (fileFooterSchemaLock) {
           if (null == fileFooterEntrySchemaForBlock) {
             fileFooterEntrySchemaForBlock =
-                SchemaGenerator.createBlockSchema(segmentProperties, minMaxCacheColumns);
+                SchemaGenerator.createBlockSchema(segmentProperties, getMinMaxCacheColumns());
           }
         }
       }
@@ -437,7 +440,7 @@ public class SegmentPropertiesAndSchemaHolder {
         synchronized (fileFooterSchemaLock) {
           if (null == fileFooterEntrySchemaForBlocklet) {
             fileFooterEntrySchemaForBlocklet =
-                SchemaGenerator.createBlockletSchema(segmentProperties, minMaxCacheColumns);
+                SchemaGenerator.createBlockletSchema(segmentProperties, getMinMaxCacheColumns());
           }
         }
       }
@@ -445,6 +448,13 @@ public class SegmentPropertiesAndSchemaHolder {
     }
 
     public List<CarbonColumn> getMinMaxCacheColumns() {
+      if (null == minMaxCacheColumns) {
+        synchronized (minMaxLock) {
+          if (null == minMaxCacheColumns) {
+            addMinMaxColumns(carbonTable);
+          }
+        }
+      }
       return minMaxCacheColumns;
     }
 
diff --git a/core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockDataMap.java b/core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockDataMap.java
index 4b32688..5b2132c 100644
--- a/core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockDataMap.java
+++ b/core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockDataMap.java
@@ -460,7 +460,9 @@ public class BlockDataMap extends CoarseGrainDataMap
       addMinMaxFlagValues(row, schema[ordinal], minMaxFlagValuesForColumnsToBeCached, ordinal);
       memoryDMStore.addIndexRow(schema, row);
     } catch (Exception e) {
-      throw new RuntimeException(e);
+      String message = "Load to unsafe failed for block: " + filePath;
+      LOGGER.error(message, e);
+      throw new RuntimeException(message, e);
     }
     return summaryRow;
   }


[carbondata] 18/41: [CARBONDATA-3304] Distinguish the thread names created by thread pool of CarbonThreadFactory

Posted by ra...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.5
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit 271fd552fe9b870b08dae67c80ed039d8b0bc993
Author: qiuchenjian <80...@qq.com>
AuthorDate: Wed Feb 27 22:04:36 2019 +0800

    [CARBONDATA-3304] Distinguish the thread names created by thread pool of CarbonThreadFactory
    
    This closes #3137
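
For context, a hedged sketch of the general technique (an illustrative stand-in, not CarbonThreadFactory's actual code): giving each pool a name that carries the table name, and numbering the threads it creates, lets thread dumps show which load each worker belongs to.

    import java.util.concurrent.{Executors, ThreadFactory}
    import java.util.concurrent.atomic.AtomicInteger

    // Illustrative stand-in: unique, table-qualified thread names per pool.
    class NamedThreadFactory(poolName: String) extends ThreadFactory {
      private val counter = new AtomicInteger(0)
      override def newThread(r: Runnable): Thread =
        // e.g. "UnsafeSortDataRowPool:sales_table_1", "UnsafeSortDataRowPool:sales_table_2", ...
        new Thread(r, poolName + "_" + counter.incrementAndGet())
    }

    val pool = Executors.newFixedThreadPool(
      4, new NamedThreadFactory("UnsafeSortDataRowPool:" + "sales_table"))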
---
 .../impl/unsafe/UnsafeAbstractDimensionDataChunkStore.java     | 10 +++-------
 .../org/apache/carbondata/core/memory/UnsafeMemoryManager.java |  3 +--
 .../apache/carbondata/hadoop/api/CarbonTableOutputFormat.java  |  3 ++-
 .../processing/loading/TableProcessingOperations.java          |  3 ++-
 .../processing/loading/converter/impl/RowConverterImpl.java    |  2 +-
 .../loading/sort/impl/ParallelReadMergeSorterImpl.java         |  3 ++-
 .../loading/sort/impl/UnsafeParallelReadMergeSorterImpl.java   |  3 ++-
 .../processing/loading/sort/unsafe/UnsafeSortDataRows.java     |  9 ++++-----
 .../sort/unsafe/merger/UnsafeIntermediateFileMerger.java       |  2 +-
 .../loading/sort/unsafe/merger/UnsafeIntermediateMerger.java   |  5 +++--
 .../loading/steps/CarbonRowDataWriterProcessorStepImpl.java    |  2 +-
 .../processing/loading/steps/DataWriterProcessorStepImpl.java  |  2 +-
 .../processing/loading/steps/InputProcessorStepImpl.java       |  2 +-
 .../carbondata/processing/sort/sortdata/SortDataRows.java      |  3 ++-
 .../processing/sort/sortdata/SortIntermediateFileMerger.java   |  3 ++-
 .../processing/sort/sortdata/SortTempFileChunkHolder.java      |  3 ++-
 .../processing/store/writer/AbstractFactDataWriter.java        |  6 ++++--
 17 files changed, 34 insertions(+), 30 deletions(-)

diff --git a/core/src/main/java/org/apache/carbondata/core/datastore/chunk/store/impl/unsafe/UnsafeAbstractDimensionDataChunkStore.java b/core/src/main/java/org/apache/carbondata/core/datastore/chunk/store/impl/unsafe/UnsafeAbstractDimensionDataChunkStore.java
index 0150179..ca1bfa7 100644
--- a/core/src/main/java/org/apache/carbondata/core/datastore/chunk/store/impl/unsafe/UnsafeAbstractDimensionDataChunkStore.java
+++ b/core/src/main/java/org/apache/carbondata/core/datastore/chunk/store/impl/unsafe/UnsafeAbstractDimensionDataChunkStore.java
@@ -21,7 +21,6 @@ import org.apache.carbondata.core.constants.CarbonCommonConstants;
 import org.apache.carbondata.core.datastore.chunk.store.DimensionDataChunkStore;
 import org.apache.carbondata.core.memory.CarbonUnsafe;
 import org.apache.carbondata.core.memory.MemoryBlock;
-import org.apache.carbondata.core.memory.MemoryException;
 import org.apache.carbondata.core.memory.UnsafeMemoryManager;
 import org.apache.carbondata.core.scan.result.vector.CarbonColumnVector;
 import org.apache.carbondata.core.scan.result.vector.ColumnVectorInfo;
@@ -74,12 +73,9 @@ public abstract class UnsafeAbstractDimensionDataChunkStore implements Dimension
    */
   public UnsafeAbstractDimensionDataChunkStore(long totalSize, boolean isInvertedIdex,
       int numberOfRows, int dataLength) {
-    try {
-      // allocating the data page
-      this.dataPageMemoryBlock = UnsafeMemoryManager.allocateMemoryWithRetry(taskId, totalSize);
-    } catch (MemoryException e) {
-      throw new RuntimeException(e);
-    }
+    // allocating the data page
+    this.dataPageMemoryBlock = UnsafeMemoryManager.allocateMemoryWithRetry(taskId, totalSize);
+
     this.dataLength = dataLength;
     this.isExplicitSorted = isInvertedIdex;
   }
diff --git a/core/src/main/java/org/apache/carbondata/core/memory/UnsafeMemoryManager.java b/core/src/main/java/org/apache/carbondata/core/memory/UnsafeMemoryManager.java
index c59698f..f4c4f85 100644
--- a/core/src/main/java/org/apache/carbondata/core/memory/UnsafeMemoryManager.java
+++ b/core/src/main/java/org/apache/carbondata/core/memory/UnsafeMemoryManager.java
@@ -185,8 +185,7 @@ public class UnsafeMemoryManager {
   /**
    * It tries to allocate memory of `size` bytes, keep retry until it allocates successfully.
    */
-  public static MemoryBlock allocateMemoryWithRetry(String taskId, long size)
-      throws MemoryException {
+  public static MemoryBlock allocateMemoryWithRetry(String taskId, long size) {
     return allocateMemoryWithRetry(INSTANCE.memoryType, taskId, size);
   }
 
diff --git a/hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableOutputFormat.java b/hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableOutputFormat.java
index 85fb315..9ba5e97 100644
--- a/hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableOutputFormat.java
+++ b/hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableOutputFormat.java
@@ -262,7 +262,8 @@ public class CarbonTableOutputFormat extends FileOutputFormat<NullWritable, Obje
     DataTypeUtil.clearFormatter();
     final DataLoadExecutor dataLoadExecutor = new DataLoadExecutor();
     final ExecutorService executorService = Executors.newFixedThreadPool(1,
-        new CarbonThreadFactory("CarbonRecordWriter:" + loadModel.getTableName()));
+        new CarbonThreadFactory("CarbonRecordWriter:" + loadModel.getTableName(),
+                true));
     // It should be started in new thread as the underlying iterator uses blocking queue.
     Future future = executorService.submit(new Thread() {
       @Override public void run() {
diff --git a/processing/src/main/java/org/apache/carbondata/processing/loading/TableProcessingOperations.java b/processing/src/main/java/org/apache/carbondata/processing/loading/TableProcessingOperations.java
index f08de59..d67979a 100644
--- a/processing/src/main/java/org/apache/carbondata/processing/loading/TableProcessingOperations.java
+++ b/processing/src/main/java/org/apache/carbondata/processing/loading/TableProcessingOperations.java
@@ -126,7 +126,8 @@ public class TableProcessingOperations {
     }
     // submit local folder clean up in another thread so that main thread execution is not blocked
     ExecutorService localFolderDeletionService = Executors
-        .newFixedThreadPool(1, new CarbonThreadFactory("LocalFolderDeletionPool:" + tableName));
+        .newFixedThreadPool(1, new CarbonThreadFactory("LocalFolderDeletionPool:" + tableName,
+                true));
     try {
       localFolderDeletionService.submit(new Callable<Void>() {
         @Override public Void call() throws Exception {
diff --git a/processing/src/main/java/org/apache/carbondata/processing/loading/converter/impl/RowConverterImpl.java b/processing/src/main/java/org/apache/carbondata/processing/loading/converter/impl/RowConverterImpl.java
index df50e25..ac9413c 100644
--- a/processing/src/main/java/org/apache/carbondata/processing/loading/converter/impl/RowConverterImpl.java
+++ b/processing/src/main/java/org/apache/carbondata/processing/loading/converter/impl/RowConverterImpl.java
@@ -121,7 +121,7 @@ public class RowConverterImpl implements RowConverter {
       if (executorService == null) {
         executorService = Executors.newCachedThreadPool(new CarbonThreadFactory(
             "DictionaryClientPool:" + configuration.getTableIdentifier().getCarbonTableIdentifier()
-                .getTableName()));
+                .getTableName(), true));
       }
       DictionaryOnePassService
           .setDictionaryServiceProvider(configuration.getDictionaryServiceProvider());
diff --git a/processing/src/main/java/org/apache/carbondata/processing/loading/sort/impl/ParallelReadMergeSorterImpl.java b/processing/src/main/java/org/apache/carbondata/processing/loading/sort/impl/ParallelReadMergeSorterImpl.java
index 02d6309..61869c5 100644
--- a/processing/src/main/java/org/apache/carbondata/processing/loading/sort/impl/ParallelReadMergeSorterImpl.java
+++ b/processing/src/main/java/org/apache/carbondata/processing/loading/sort/impl/ParallelReadMergeSorterImpl.java
@@ -94,7 +94,8 @@ public class ParallelReadMergeSorterImpl extends AbstractMergeSorter {
       throw new CarbonDataLoadingException(e);
     }
     this.executorService = Executors.newFixedThreadPool(iterators.length,
-        new CarbonThreadFactory("SafeParallelSorterPool:" + sortParameters.getTableName()));
+        new CarbonThreadFactory("SafeParallelSorterPool:" + sortParameters.getTableName(),
+                true));
     this.threadStatusObserver = new ThreadStatusObserver(executorService);
 
     try {
diff --git a/processing/src/main/java/org/apache/carbondata/processing/loading/sort/impl/UnsafeParallelReadMergeSorterImpl.java b/processing/src/main/java/org/apache/carbondata/processing/loading/sort/impl/UnsafeParallelReadMergeSorterImpl.java
index 8af3ae2..aaa40e0 100644
--- a/processing/src/main/java/org/apache/carbondata/processing/loading/sort/impl/UnsafeParallelReadMergeSorterImpl.java
+++ b/processing/src/main/java/org/apache/carbondata/processing/loading/sort/impl/UnsafeParallelReadMergeSorterImpl.java
@@ -86,7 +86,8 @@ public class UnsafeParallelReadMergeSorterImpl extends AbstractMergeSorter {
       throw new CarbonDataLoadingException(e);
     }
     this.executorService = Executors.newFixedThreadPool(iterators.length,
-        new CarbonThreadFactory("UnsafeParallelSorterPool:" + sortParameters.getTableName()));
+        new CarbonThreadFactory("UnsafeParallelSorterPool:" + sortParameters.getTableName(),
+                true));
     this.threadStatusObserver = new ThreadStatusObserver(executorService);
 
     try {
diff --git a/processing/src/main/java/org/apache/carbondata/processing/loading/sort/unsafe/UnsafeSortDataRows.java b/processing/src/main/java/org/apache/carbondata/processing/loading/sort/unsafe/UnsafeSortDataRows.java
index 87f97be..60dd7f1 100644
--- a/processing/src/main/java/org/apache/carbondata/processing/loading/sort/unsafe/UnsafeSortDataRows.java
+++ b/processing/src/main/java/org/apache/carbondata/processing/loading/sort/unsafe/UnsafeSortDataRows.java
@@ -138,7 +138,8 @@ public class UnsafeSortDataRows {
     CarbonDataProcessorUtil.createLocations(parameters.getTempFileLocation());
     this.dataSorterAndWriterExecutorService = Executors
         .newFixedThreadPool(parameters.getNumberOfCores(),
-            new CarbonThreadFactory("UnsafeSortDataRowPool:" + parameters.getTableName()));
+            new CarbonThreadFactory("UnsafeSortDataRowPool:" + parameters.getTableName(),
+                    true));
     semaphore = new Semaphore(parameters.getNumberOfCores());
   }
 
@@ -206,8 +207,7 @@ public class UnsafeSortDataRows {
         }
         bytesAdded += rowPage.addRow(rowBatch[i], reUsableByteArrayDataOutputStream.get());
       } catch (Exception e) {
-        if (e.getMessage().contains("cannot handle this row. create new page"))
-        {
+        if (e.getMessage().contains("cannot handle this row. create new page")) {
           rowPage.makeCanAddFail();
           // so that same rowBatch will be handled again in new page
           i--;
@@ -243,8 +243,7 @@ public class UnsafeSortDataRows {
       }
       rowPage.addRow(row, reUsableByteArrayDataOutputStream.get());
     } catch (Exception e) {
-      if (e.getMessage().contains("cannot handle this row. create new page"))
-      {
+      if (e.getMessage().contains("cannot handle this row. create new page")) {
         rowPage.makeCanAddFail();
         addRow(row);
       } else {
diff --git a/processing/src/main/java/org/apache/carbondata/processing/loading/sort/unsafe/merger/UnsafeIntermediateFileMerger.java b/processing/src/main/java/org/apache/carbondata/processing/loading/sort/unsafe/merger/UnsafeIntermediateFileMerger.java
index 041544b..f7e38b3 100644
--- a/processing/src/main/java/org/apache/carbondata/processing/loading/sort/unsafe/merger/UnsafeIntermediateFileMerger.java
+++ b/processing/src/main/java/org/apache/carbondata/processing/loading/sort/unsafe/merger/UnsafeIntermediateFileMerger.java
@@ -103,7 +103,7 @@ public class UnsafeIntermediateFileMerger implements Callable<Void> {
       }
       double intermediateMergeCostTime =
           (System.currentTimeMillis() - intermediateMergeStartTime) / 1000.0;
-      LOGGER.info("============================== Intermediate Merge of " + fileConterConst
+      LOGGER.info("Intermediate Merge of " + fileConterConst
           + " Sort Temp Files Cost Time: " + intermediateMergeCostTime + "(s)");
     } catch (Exception e) {
       LOGGER.error("Problem while intermediate merging", e);
diff --git a/processing/src/main/java/org/apache/carbondata/processing/loading/sort/unsafe/merger/UnsafeIntermediateMerger.java b/processing/src/main/java/org/apache/carbondata/processing/loading/sort/unsafe/merger/UnsafeIntermediateMerger.java
index ea12263..1b44cc6 100644
--- a/processing/src/main/java/org/apache/carbondata/processing/loading/sort/unsafe/merger/UnsafeIntermediateMerger.java
+++ b/processing/src/main/java/org/apache/carbondata/processing/loading/sort/unsafe/merger/UnsafeIntermediateMerger.java
@@ -75,7 +75,8 @@ public class UnsafeIntermediateMerger {
     this.rowPages = new ArrayList<UnsafeCarbonRowPage>(CarbonCommonConstants.CONSTANT_SIZE_TEN);
     this.mergedPages = new ArrayList<>();
     this.executorService = Executors.newFixedThreadPool(parameters.getNumberOfCores(),
-        new CarbonThreadFactory("UnsafeIntermediatePool:" + parameters.getTableName()));
+        new CarbonThreadFactory("UnsafeIntermediatePool:" + parameters.getTableName(),
+                true));
     this.procFiles = new ArrayList<>(CarbonCommonConstants.CONSTANT_SIZE_TEN);
     this.mergerTask = new ArrayList<>();
 
@@ -182,7 +183,7 @@ public class UnsafeIntermediateMerger {
    * @param spillDisk whether to spill the merged result to disk
    */
   private void startIntermediateMerging(UnsafeCarbonRowPage[] rowPages, int totalRows,
-      boolean spillDisk) throws CarbonSortKeyAndGroupByException {
+      boolean spillDisk) {
     UnsafeInMemoryIntermediateDataMerger merger =
         new UnsafeInMemoryIntermediateDataMerger(rowPages, totalRows, parameters, spillDisk);
     mergedPages.add(merger);
diff --git a/processing/src/main/java/org/apache/carbondata/processing/loading/steps/CarbonRowDataWriterProcessorStepImpl.java b/processing/src/main/java/org/apache/carbondata/processing/loading/steps/CarbonRowDataWriterProcessorStepImpl.java
index 184248c..6345035 100644
--- a/processing/src/main/java/org/apache/carbondata/processing/loading/steps/CarbonRowDataWriterProcessorStepImpl.java
+++ b/processing/src/main/java/org/apache/carbondata/processing/loading/steps/CarbonRowDataWriterProcessorStepImpl.java
@@ -142,7 +142,7 @@ public class CarbonRowDataWriterProcessorStepImpl extends AbstractDataLoadProces
       } else {
         executorService = Executors.newFixedThreadPool(iterators.length,
             new CarbonThreadFactory("NoSortDataWriterPool:" + configuration.getTableIdentifier()
-                .getCarbonTableIdentifier().getTableName()));
+                .getCarbonTableIdentifier().getTableName(), true));
         Future[] futures = new Future[iterators.length];
         for (int i = 0; i < iterators.length; i++) {
           futures[i] = executorService.submit(new DataWriterRunnable(iterators[i], i, dataHandler));
diff --git a/processing/src/main/java/org/apache/carbondata/processing/loading/steps/DataWriterProcessorStepImpl.java b/processing/src/main/java/org/apache/carbondata/processing/loading/steps/DataWriterProcessorStepImpl.java
index 7beca48..d1b1e76 100644
--- a/processing/src/main/java/org/apache/carbondata/processing/loading/steps/DataWriterProcessorStepImpl.java
+++ b/processing/src/main/java/org/apache/carbondata/processing/loading/steps/DataWriterProcessorStepImpl.java
@@ -115,7 +115,7 @@ public class DataWriterProcessorStepImpl extends AbstractDataLoadProcessorStep {
           .recordDictionaryValue2MdkAdd2FileTime(CarbonTablePath.DEPRECATED_PARTITION_ID,
               System.currentTimeMillis());
       rangeExecutorService = Executors.newFixedThreadPool(iterators.length,
-          new CarbonThreadFactory("WriterForwardPool: " + tableName));
+          new CarbonThreadFactory("WriterForwardPool: " + tableName, true));
       List<Future<Void>> rangeExecutorServiceSubmitList = new ArrayList<>(iterators.length);
       int i = 0;
       // do this concurrently
diff --git a/processing/src/main/java/org/apache/carbondata/processing/loading/steps/InputProcessorStepImpl.java b/processing/src/main/java/org/apache/carbondata/processing/loading/steps/InputProcessorStepImpl.java
index f540b3e..c44c3f5 100644
--- a/processing/src/main/java/org/apache/carbondata/processing/loading/steps/InputProcessorStepImpl.java
+++ b/processing/src/main/java/org/apache/carbondata/processing/loading/steps/InputProcessorStepImpl.java
@@ -71,7 +71,7 @@ public class InputProcessorStepImpl extends AbstractDataLoadProcessorStep {
     rowParser = new RowParserImpl(getOutput(), configuration);
     executorService = Executors.newCachedThreadPool(new CarbonThreadFactory(
         "InputProcessorPool:" + configuration.getTableIdentifier().getCarbonTableIdentifier()
-            .getTableName()));
+            .getTableName(), true));
     // if logger is enabled then raw data will be required.
     this.isRawDataRequired = CarbonDataProcessorUtil.isRawDataRequired(configuration);
   }
diff --git a/processing/src/main/java/org/apache/carbondata/processing/sort/sortdata/SortDataRows.java b/processing/src/main/java/org/apache/carbondata/processing/sort/sortdata/SortDataRows.java
index 128547d..174d5d1 100644
--- a/processing/src/main/java/org/apache/carbondata/processing/sort/sortdata/SortDataRows.java
+++ b/processing/src/main/java/org/apache/carbondata/processing/sort/sortdata/SortDataRows.java
@@ -112,7 +112,8 @@ public class SortDataRows {
     CarbonDataProcessorUtil.createLocations(parameters.getTempFileLocation());
     this.dataSorterAndWriterExecutorService = Executors
         .newFixedThreadPool(parameters.getNumberOfCores(),
-            new CarbonThreadFactory("SortDataRowPool:" + parameters.getTableName()));
+            new CarbonThreadFactory("SortDataRowPool:" + parameters.getTableName(),
+                    true));
     semaphore = new Semaphore(parameters.getNumberOfCores());
   }
 
diff --git a/processing/src/main/java/org/apache/carbondata/processing/sort/sortdata/SortIntermediateFileMerger.java b/processing/src/main/java/org/apache/carbondata/processing/sort/sortdata/SortIntermediateFileMerger.java
index 1f4f1e7..7079443 100644
--- a/processing/src/main/java/org/apache/carbondata/processing/sort/sortdata/SortIntermediateFileMerger.java
+++ b/processing/src/main/java/org/apache/carbondata/processing/sort/sortdata/SortIntermediateFileMerger.java
@@ -61,7 +61,8 @@ public class SortIntermediateFileMerger {
     // processed file list
     this.procFiles = new ArrayList<File>(CarbonCommonConstants.CONSTANT_SIZE_TEN);
     this.executorService = Executors.newFixedThreadPool(parameters.getNumberOfCores(),
-        new CarbonThreadFactory("SafeIntermediateMergerPool:" + parameters.getTableName()));
+        new CarbonThreadFactory("SafeIntermediateMergerPool:" + parameters.getTableName(),
+                true));
     mergerTask = new ArrayList<>();
   }
 
diff --git a/processing/src/main/java/org/apache/carbondata/processing/sort/sortdata/SortTempFileChunkHolder.java b/processing/src/main/java/org/apache/carbondata/processing/sort/sortdata/SortTempFileChunkHolder.java
index eeea2ec..2ae90fa 100644
--- a/processing/src/main/java/org/apache/carbondata/processing/sort/sortdata/SortTempFileChunkHolder.java
+++ b/processing/src/main/java/org/apache/carbondata/processing/sort/sortdata/SortTempFileChunkHolder.java
@@ -124,7 +124,8 @@ public class SortTempFileChunkHolder implements Comparable<SortTempFileChunkHold
     this.compressorName = sortParameters.getSortTempCompressorName();
     this.sortStepRowHandler = new SortStepRowHandler(tableFieldStat);
     this.executorService = Executors
-        .newFixedThreadPool(1, new CarbonThreadFactory("SafeSortTempChunkHolderPool:" + tableName));
+        .newFixedThreadPool(1, new CarbonThreadFactory("SafeSortTempChunkHolderPool:" + tableName,
+                true));
     this.convertToActualField = convertToActualField;
   }
 
diff --git a/processing/src/main/java/org/apache/carbondata/processing/store/writer/AbstractFactDataWriter.java b/processing/src/main/java/org/apache/carbondata/processing/store/writer/AbstractFactDataWriter.java
index 472f143..eb1b15d 100644
--- a/processing/src/main/java/org/apache/carbondata/processing/store/writer/AbstractFactDataWriter.java
+++ b/processing/src/main/java/org/apache/carbondata/processing/store/writer/AbstractFactDataWriter.java
@@ -186,7 +186,8 @@ public abstract class AbstractFactDataWriter implements CarbonFactDataWriter {
     }
 
     this.executorService = Executors.newFixedThreadPool(1,
-        new CarbonThreadFactory("CompleteHDFSBackendPool:" + this.model.getTableName()));
+        new CarbonThreadFactory("CompleteHDFSBackendPool:" + this.model.getTableName(),
+                true));
     executorServiceSubmitList = new ArrayList<>(CarbonCommonConstants.DEFAULT_COLLECTION_SIZE);
     // in case of compaction we will pass the cardinality.
     this.localCardinality = this.model.getColCardinality();
@@ -208,7 +209,8 @@ public abstract class AbstractFactDataWriter implements CarbonFactDataWriter {
         numberOfCores = model.getNumberOfCores() / 2;
       }
       fallbackExecutorService = Executors.newFixedThreadPool(numberOfCores, new CarbonThreadFactory(
-          "FallbackPool:" + model.getTableName() + ", range: " + model.getBucketId()));
+          "FallbackPool:" + model.getTableName() + ", range: " + model.getBucketId(),
+              true));
     }
   }
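
The hunks above all pass an extra boolean to the CarbonThreadFactory constructor so that threads spawned by different pools (and by different loads using the same pool) can be told apart in thread dumps and logs. As an illustration only -- this is not CarbonData's actual implementation, and the exact semantics of the flag are an assumption -- a factory that tags each thread with its pool name and a running index could look like this in Scala:

    import java.util.concurrent.ThreadFactory
    import java.util.concurrent.atomic.AtomicInteger

    // Illustrative stand-in, not the CarbonData class: when `withIndex` is set, the
    // factory appends a per-thread counter to the pool name, e.g. "SortDataRowPool:t1_2".
    class NamedThreadFactory(poolName: String, withIndex: Boolean) extends ThreadFactory {
      private val counter = new AtomicInteger(0)

      override def newThread(r: Runnable): Thread = {
        val name = if (withIndex) s"${poolName}_${counter.incrementAndGet()}" else poolName
        val t = new Thread(r, name)
        t.setDaemon(true) // worker pools should not keep the JVM alive on shutdown
        t
      }
    }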
 


[carbondata] 05/41: [CARBONDATA-2447] Block update operation on range/list/hash partition table

Posted by ra...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.5
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit 1c4c6dc511e461d27eed7699cf87706609fce6ff
Author: qiuchenjian <80...@qq.com>
AuthorDate: Tue Jan 22 10:17:00 2019 +0800

    [CARBONDATA-2447] Block update operation on range/list/hash partition table
    
    [Problem]
    When updating data on a range partition table, data is lost or the update fails; see the JIRA or the new test case.

    [Cause]
    A range partition table uses the taskNo in the file name as the partitionId. During an update the taskNo keeps increasing, so it no longer matches the partitionId.

    [Solution]
    (1) When querying a range partition table, do not match on the partitionId --- this loses the meaning of partitioning
    (2) Let the range partition table use a directory or a separate file-name part as the partitionId --- not necessary; standard partitions are suggested instead
    (3) Make the range partition table reject the update operation --- the approach this PR takes
    
    This closes #3091
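
    A quick usage sketch with hypothetical table names: after this change, an UPDATE on a
    range/hash/list partition table raises UnsupportedOperationException, while a standard
    (hive) partition table still accepts the update, as the new test case below exercises.

        // spark-shell / SparkSession named `spark` assumed
        spark.sql("UPDATE range_part_tbl SET (name) = ('c') WHERE id = 1")  // now throws
        spark.sql("UPDATE std_part_tbl SET (name) = ('c') WHERE id = 1")    // still allowed
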
---
 .../partition/TestUpdateForPartitionTable.scala    | 71 ++++++++++++++++++++++
 .../mutation/CarbonProjectForUpdateCommand.scala   |  9 ++-
 2 files changed, 79 insertions(+), 1 deletion(-)

diff --git a/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/partition/TestUpdateForPartitionTable.scala b/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/partition/TestUpdateForPartitionTable.scala
new file mode 100644
index 0000000..14dab1e
--- /dev/null
+++ b/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/partition/TestUpdateForPartitionTable.scala
@@ -0,0 +1,71 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.carbondata.spark.testsuite.partition
+
+import org.apache.spark.sql.test.util.QueryTest
+import org.scalatest. BeforeAndAfterAll
+
+class TestUpdateForPartitionTable extends QueryTest with BeforeAndAfterAll {
+
+  override def beforeAll: Unit = {
+    dropTable
+
+    sql("create table test_range_partition_table (id int) partitioned by (name string) " +
+      "stored by 'carbondata' TBLPROPERTIES('PARTITION_TYPE' = 'RANGE','RANGE_INFO' = 'a,e,f')")
+    sql("create table test_hive_partition_table (id int) partitioned by (name string) " +
+      "stored by 'carbondata'")
+    sql("create table test_hash_partition_table (id int) partitioned by (name string) " +
+      "stored by 'carbondata' TBLPROPERTIES('PARTITION_TYPE' = 'HASH','NUM_PARTITIONS' = '2')")
+    sql("create table test_list_partition_table (id int) partitioned by (name string) " +
+      "stored by 'carbondata' TBLPROPERTIES('PARTITION_TYPE' = 'LIST','LIST_INFO' = 'a,e,f')")
+  }
+
+  def dropTable = {
+    sql("drop table if exists test_hash_partition_table")
+    sql("drop table if exists test_list_partition_table")
+    sql("drop table if exists test_range_partition_table")
+    sql("drop table if exists test_hive_partition_table")
+  }
+
+
+  test ("test update for unsupported partition table") {
+    val updateTables = Array(
+      "test_range_partition_table",
+      "test_list_partition_table",
+      "test_hash_partition_table")
+
+    updateTables.foreach(table => {
+      sql("insert into " + table + " select 1,'b' ")
+      val ex = intercept[UnsupportedOperationException] {
+        sql("update " + table + " set (name) = ('c') where id = 1").collect()
+      }
+      assertResult("Unsupported update operation for range/hash/list partition table")(ex.getMessage)
+    })
+
+  }
+
+  test ("test update for hive(standard) partition table") {
+
+    sql("insert into test_hive_partition_table select 1,'b' ")
+    sql("update test_hive_partition_table set (name) = ('c') where id = 1").collect()
+    assertResult(1)(sql("select * from test_hive_partition_table where name = 'c'").collect().length)
+  }
+
+  override def afterAll() : Unit = {
+    dropTable
+  }
+}
diff --git a/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/mutation/CarbonProjectForUpdateCommand.scala b/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/mutation/CarbonProjectForUpdateCommand.scala
index 0f23081..e4abae1 100644
--- a/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/mutation/CarbonProjectForUpdateCommand.scala
+++ b/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/mutation/CarbonProjectForUpdateCommand.scala
@@ -32,7 +32,7 @@ import org.apache.carbondata.core.datamap.Segment
 import org.apache.carbondata.core.exception.ConcurrentOperationException
 import org.apache.carbondata.core.features.TableOperation
 import org.apache.carbondata.core.locks.{CarbonLockFactory, CarbonLockUtil, LockUsage}
-import org.apache.carbondata.core.metadata.schema.table.CarbonTable
+import org.apache.carbondata.core.metadata.schema.partition.PartitionType
 import org.apache.carbondata.core.mutate.CarbonUpdateUtil
 import org.apache.carbondata.core.statusmanager.SegmentStatusManager
 import org.apache.carbondata.core.util.CarbonProperties
@@ -60,6 +60,13 @@ private[sql] case class CarbonProjectForUpdateCommand(
       return Seq.empty
     }
     val carbonTable = CarbonEnv.getCarbonTable(databaseNameOp, tableName)(sparkSession)
+    if (carbonTable.getPartitionInfo != null &&
+      (carbonTable.getPartitionInfo.getPartitionType == PartitionType.RANGE ||
+        carbonTable.getPartitionInfo.getPartitionType == PartitionType.HASH ||
+        carbonTable.getPartitionInfo.getPartitionType == PartitionType.LIST)) {
+      throw new UnsupportedOperationException("Unsupported update operation for range/" +
+        "hash/list partition table")
+    }
     setAuditTable(carbonTable)
     setAuditInfo(Map("plan" -> plan.simpleString))
     columns.foreach { col =>


[carbondata] 12/41: [CARBONDATA-3301]Fix inserting null values to Array<date> columns in carbon file format data load

Posted by ra...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.5
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit 6fa9bd2720536956bebb2a3ff37b25946efec545
Author: akashrn5 <ak...@gmail.com>
AuthorDate: Fri Feb 22 17:11:57 2019 +0530

    [CARBONDATA-3301]Fix inserting null values to Array<date> columns in carbon file format data load
    
    Problem:
    When a carbon datasource table contains complex columns such as Array<date> or Array<timestamp> and data is inserted and queried, those columns return null.
    
    Solution:
    In the file format case, before the actual load we get the internal row object from Spark and convert it into an object CarbonData understands, so a date value arrives as an Integer. While inserting, only the Long case was handled and this Integer value was passed to SimpleDateFormat for parsing, which threw an exception, so null was inserted. The Integer case is now handled as well: the surrogate key is assigned directly from the integer value.
    
    This closes #3133
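
    The Integer shows up here because Spark's internal row format stores DateType values
    as an Int (days since 1970-01-01) and TimestampType values as a Long of microseconds.
    A minimal Scala sketch of that encoding, using one of the dates from the test below:

        import java.time.LocalDate

        // Spark's internal row keeps DateType as an Int: days since the Unix epoch.
        val daysSinceEpoch: Int = LocalDate.parse("1994-04-06").toEpochDay.toInt  // 8861
        // The fix below assigns the direct-dictionary surrogate key straight from this
        // integer instead of re-parsing the value with SimpleDateFormat.
        println(daysSinceEpoch)
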
---
 .../sql/carbondata/datasource/SparkCarbonDataSourceTest.scala | 11 +++++++++++
 .../carbondata/processing/datatypes/PrimitiveDataType.java    |  7 ++++++-
 2 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/integration/spark-datasource/src/test/scala/org/apache/spark/sql/carbondata/datasource/SparkCarbonDataSourceTest.scala b/integration/spark-datasource/src/test/scala/org/apache/spark/sql/carbondata/datasource/SparkCarbonDataSourceTest.scala
index fa37548..d25e675 100644
--- a/integration/spark-datasource/src/test/scala/org/apache/spark/sql/carbondata/datasource/SparkCarbonDataSourceTest.scala
+++ b/integration/spark-datasource/src/test/scala/org/apache/spark/sql/carbondata/datasource/SparkCarbonDataSourceTest.scala
@@ -1760,6 +1760,16 @@ class SparkCarbonDataSourceTest extends FunSuite with BeforeAndAfterAll {
     spark.sql("drop table if exists fileformat_drop_hive")
   }
 
+  test("test complexdatype for date and timestamp datatype") {
+    spark.sql("drop table if exists fileformat_date")
+    spark.sql("drop table if exists fileformat_date_hive")
+    spark.sql("create table fileformat_date_hive(name string, age int, dob array<date>, joinTime array<timestamp>) using parquet")
+    spark.sql("create table fileformat_date(name string, age int, dob array<date>, joinTime array<timestamp>) using carbon")
+    spark.sql("insert into fileformat_date_hive select 'joey', 32, array('1994-04-06','1887-05-06'), array('1994-04-06 00:00:05','1887-05-06 00:00:08')")
+    spark.sql("insert into fileformat_date select 'joey', 32, array('1994-04-06','1887-05-06'), array('1994-04-06 00:00:05','1887-05-06 00:00:08')")
+    checkAnswer(spark.sql("select * from fileformat_date_hive"), spark.sql("select * from fileformat_date"))
+  }
+
   test("validate the columns not present in schema") {
     spark.sql("drop table if exists validate")
     spark.sql("create table validate (name string, age int, address string) using carbon options('inverted_index'='abc')")
@@ -1785,5 +1795,6 @@ class SparkCarbonDataSourceTest extends FunSuite with BeforeAndAfterAll {
     spark.sql("drop table if exists par_table")
     spark.sql("drop table if exists sdkout")
     spark.sql("drop table if exists validate")
+    spark.sql("drop table if exists fileformat_date")
   }
 }
diff --git a/processing/src/main/java/org/apache/carbondata/processing/datatypes/PrimitiveDataType.java b/processing/src/main/java/org/apache/carbondata/processing/datatypes/PrimitiveDataType.java
index cfbaa11..18dc89d 100644
--- a/processing/src/main/java/org/apache/carbondata/processing/datatypes/PrimitiveDataType.java
+++ b/processing/src/main/java/org/apache/carbondata/processing/datatypes/PrimitiveDataType.java
@@ -344,7 +344,7 @@ public class PrimitiveDataType implements GenericDataType<Object> {
             byte[] value = null;
             if (isDirectDictionary) {
               int surrogateKey;
-              if (!(input instanceof Long)) {
+              if (!(input instanceof Long) && !(input instanceof Integer)) {
                 SimpleDateFormat parser = new SimpleDateFormat(getDateFormat(carbonDimension));
                 parser.parse(parsedValue);
               }
@@ -353,6 +353,11 @@ public class PrimitiveDataType implements GenericDataType<Object> {
               // using dictionaryGenerator.
               if (dictionaryGenerator instanceof DirectDictionary && input instanceof Long) {
                 surrogateKey = ((DirectDictionary) dictionaryGenerator).generateKey((long) input);
+              } else if (dictionaryGenerator instanceof DirectDictionary
+                  && input instanceof Integer) {
+                // In case of file format, for complex type date or time type, input data comes as a
+                // Integer object, so just assign the surrogate key with the input object value
+                surrogateKey = (int) input;
               } else {
                 surrogateKey = dictionaryGenerator.getOrGenerateKey(parsedValue);
               }


[carbondata] 02/41: [CARBONDATA-3284] [CARBONDATA-3285] Workaround for Create-PreAgg Datamap Fail & Sort-Columns Fix

Posted by ra...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.5
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit 3df5a2f0712bee9c83ba12bf86712c4b10763ce8
Author: namanrastogi <na...@gmail.com>
AuthorDate: Tue Jan 29 15:14:18 2019 +0530

    [CARBONDATA-3284] [CARBONDATA-3285] Workaround for Create-PreAgg Datamap Fail & Sort-Columns Fix
    
    If, for some reason**[1]**, creating a PreAgg datamap failed and dropping it failed as well,
    the datamap can no longer be dropped: it was not registered in the parent table's schema file, but it did get registered in spark-hive, so it shows up as a table, yet Carbon throws an error if we try to drop it as a table.

    Workaround:
    After this change, we can at least drop it via a DROP TABLE command and clear the stale folders.

    **[1]** - The reason could be something like an HDFS quota set on the database folder, so that the parent table schema file could not be modified.
    
    *
    
    This closes #3113
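
    A hedged usage sketch with hypothetical names (a parent table "maintable" and a
    half-created preaggregate datamap "agg_sales"): after this change the stale child
    table can be dropped directly, following the parentTableName_datamapName convention
    described in the FAQ entry below.

        // spark-shell / SparkSession named `spark` assumed
        spark.sql("DROP TABLE IF EXISTS maintable_agg_sales")
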
---
 .../apache/carbondata/core/util/CarbonUtil.java    | 16 +++++++++++
 docs/faq.md                                        | 18 ++++++++++++
 .../cluster/sdv/generated/QueriesBVATestCase.scala |  4 +--
 .../carbondata/cluster/sdv/suite/SDVSuites.scala   |  2 +-
 .../command/management/CarbonLoadDataCommand.scala |  2 +-
 .../schema/CarbonAlterTableRenameCommand.scala     |  2 --
 .../command/table/CarbonDropTableCommand.scala     | 32 +++++++++++++++++++++-
 7 files changed, 69 insertions(+), 7 deletions(-)

diff --git a/core/src/main/java/org/apache/carbondata/core/util/CarbonUtil.java b/core/src/main/java/org/apache/carbondata/core/util/CarbonUtil.java
index 2b1cd6e..7147bd6 100644
--- a/core/src/main/java/org/apache/carbondata/core/util/CarbonUtil.java
+++ b/core/src/main/java/org/apache/carbondata/core/util/CarbonUtil.java
@@ -3348,4 +3348,20 @@ public final class CarbonUtil {
   public static String generateUUID() {
     return UUID.randomUUID().toString();
   }
+
+  /**
+   * Below method will be used to get the datamap schema name from datamap table name
+   * it will split name based on character '_' and get the last name
+   * This is only for pre aggregate and timeseries tables
+   *
+   * @param tableName
+   * @return datamapschema name
+   */
+  public static String getDatamapNameFromTableName(String tableName) {
+    int i = tableName.lastIndexOf('_');
+    if (i != -1) {
+      return tableName.substring(i + 1, tableName.length());
+    }
+    return null;
+  }
 }
diff --git a/docs/faq.md b/docs/faq.md
index 7317d1c..9ba7082 100644
--- a/docs/faq.md
+++ b/docs/faq.md
@@ -43,6 +43,7 @@
 - [Failed to insert data on the cluster](#failed-to-insert-data-on-the-cluster)
 - [Failed to execute Concurrent Operations(Load,Insert,Update) on table by multiple workers](#failed-to-execute-concurrent-operations-on-table-by-multiple-workers)
 - [Failed to create a table with a single numeric column](#failed-to-create-a-table-with-a-single-numeric-column)
+- [Failed to create datamap and drop datamap is also not working](#failed-to-create-datamap-and-drop-datamap-is-also-not-working)
 
 ## 
 
@@ -474,4 +475,21 @@ Note : Refrain from using "mvn clean package" without specifying the profile.
 
   A single column that can be considered as dimension is mandatory for table creation.
 
+## Failed to create datamap and drop datamap is also not working
+  
+  **Symptom**
+
+  Execution fails with the following exception :
+
+  ```
+  HDFS Quota Exceeded
+  ```
+
+  **Possible Cause**
+
+  HDFS Quota is set, and it is not letting carbondata write or modify any files.
+
+  **Procedure**
 
+  Drop that particular datamap using Drop Table command using table name as
+  parentTableName_datamapName so as to clear the stale folders.
diff --git a/integration/spark-common-cluster-test/src/test/scala/org/apache/carbondata/cluster/sdv/generated/QueriesBVATestCase.scala b/integration/spark-common-cluster-test/src/test/scala/org/apache/carbondata/cluster/sdv/generated/QueriesBVATestCase.scala
index 67f4068..130fe08 100644
--- a/integration/spark-common-cluster-test/src/test/scala/org/apache/carbondata/cluster/sdv/generated/QueriesBVATestCase.scala
+++ b/integration/spark-common-cluster-test/src/test/scala/org/apache/carbondata/cluster/sdv/generated/QueriesBVATestCase.scala
@@ -10697,8 +10697,8 @@ class QueriesBVATestCase extends QueryTest with BeforeAndAfterAll {
   //PushUP_FILTER_test_boundary_TC194
   test("PushUP_FILTER_test_boundary_TC194", Include) {
 
-    checkAnswer(s"""select min(c2_Bigint),max(c2_Bigint),sum(c2_Bigint),avg(c2_Bigint) , count(c2_Bigint), variance(c2_Bigint) from Test_Boundary where sin(c1_int)=0.18796200317975467 or sin(c1_int)=-0.18796200317975467""",
-      s"""select min(c2_Bigint),max(c2_Bigint),sum(c2_Bigint),avg(c2_Bigint) , count(c2_Bigint), variance(c2_Bigint) from Test_Boundary_hive where sin(c1_int)=0.18796200317975467 or sin(c1_int)=-0.18796200317975467""", "QueriesBVATestCase_PushUP_FILTER_test_boundary_TC194")
+    checkAnswer(s"""select min(c2_Bigint),max(c2_Bigint),sum(c2_Bigint),avg(c2_Bigint) , count(c2_Bigint), variance(c2_Bigint) from (select c2_Bigint from Test_Boundary where sin(c1_int)=0.18796200317975467 or sin(c1_int)=-0.18796200317975467 order by c2_Bigint""",
+      s"""select min(c2_Bigint),max(c2_Bigint),sum(c2_Bigint),avg(c2_Bigint) , count(c2_Bigint), variance(c2_Bigint) from (select c2_Bigint from Test_Boundary_hive where sin(c1_int)=0.18796200317975467 or sin(c1_int)=-0.18796200317975467 order by c2_Bigint""", "QueriesBVATestCase_PushUP_FILTER_test_boundary_TC194")
 
   }
 
diff --git a/integration/spark-common-cluster-test/src/test/scala/org/apache/carbondata/cluster/sdv/suite/SDVSuites.scala b/integration/spark-common-cluster-test/src/test/scala/org/apache/carbondata/cluster/sdv/suite/SDVSuites.scala
index 5367e0d..7448e95 100644
--- a/integration/spark-common-cluster-test/src/test/scala/org/apache/carbondata/cluster/sdv/suite/SDVSuites.scala
+++ b/integration/spark-common-cluster-test/src/test/scala/org/apache/carbondata/cluster/sdv/suite/SDVSuites.scala
@@ -83,6 +83,7 @@ class SDVSuites extends Suites with BeforeAndAfterAll {
 class SDVSuites1 extends Suites with BeforeAndAfterAll {
 
   val suites =     new BadRecordTestCase ::
+                   new ComplexDataTypeTestCase ::
                    new BatchSortLoad1TestCase ::
                    new BatchSortQueryTestCase ::
                    new DataLoadingTestCase ::
@@ -156,7 +157,6 @@ class SDVSuites3 extends Suites with BeforeAndAfterAll {
                     new TestPartitionWithGlobalSort ::
                     new SDKwriterTestCase ::
                     new SetParameterTestCase ::
-                    new ComplexDataTypeTestCase ::
                     new PartitionWithPreAggregateTestCase ::
                     new CreateTableWithLocalDictionaryTestCase ::
                     new LoadTableWithLocalDictionaryTestCase :: Nil
diff --git a/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonLoadDataCommand.scala b/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonLoadDataCommand.scala
index 307e62d..0c8a1df 100644
--- a/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonLoadDataCommand.scala
+++ b/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonLoadDataCommand.scala
@@ -206,7 +206,7 @@ case class CarbonLoadDataCommand(
     * 4. Session property CARBON_OPTIONS_SORT_SCOPE
     * 5. Default Sort Scope LOAD_SORT_SCOPE
     */
-    if (StringUtils.isBlank(tableProperties.get(CarbonCommonConstants.SORT_COLUMNS))) {
+    if (table.getNumberOfSortColumns == 0) {
       // If tableProperties.SORT_COLUMNS is null
       optionsFinal.put(CarbonCommonConstants.SORT_SCOPE,
         SortScopeOptions.SortScope.NO_SORT.name)
diff --git a/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/schema/CarbonAlterTableRenameCommand.scala b/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/schema/CarbonAlterTableRenameCommand.scala
index 33f3cd9..f41cfc1 100644
--- a/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/schema/CarbonAlterTableRenameCommand.scala
+++ b/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/schema/CarbonAlterTableRenameCommand.scala
@@ -127,8 +127,6 @@ private[sql] case class CarbonAlterTableRenameCommand(
       schemaEvolutionEntry.setTime_stamp(timeStamp)
       val newCarbonTableIdentifier = new CarbonTableIdentifier(oldDatabaseName,
         newTableName, carbonTable.getCarbonTableIdentifier.getTableId)
-      val oldIdentifier = TableIdentifier(oldTableName, Some(oldDatabaseName))
-      val newIdentifier = TableIdentifier(newTableName, Some(oldDatabaseName))
       metastore.removeTableFromMetadata(oldDatabaseName, oldTableName)
       var partitions: Seq[CatalogTablePartition] = Seq.empty
       if (carbonTable.isHivePartitionTable) {
diff --git a/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/table/CarbonDropTableCommand.scala b/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/table/CarbonDropTableCommand.scala
index f69ef9e..a117814 100644
--- a/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/table/CarbonDropTableCommand.scala
+++ b/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/table/CarbonDropTableCommand.scala
@@ -77,7 +77,37 @@ case class CarbonDropTableCommand(
       }
       val relationIdentifiers = carbonTable.getTableInfo.getParentRelationIdentifiers
       if (relationIdentifiers != null && !relationIdentifiers.isEmpty) {
-        if (!dropChildTable) {
+        var ignoreParentTableCheck = false
+        if (carbonTable.getTableInfo.getParentRelationIdentifiers.size() == 1) {
+          /**
+           * below handling in case when pre aggregation creation failed in scenario
+           * while creating a pre aggregate data map it created pre aggregate table and registered
+           * in hive, but failed to register in main table because of some exception.
+           * in this case if it will not allow user to drop datamap and data map table
+           * for this if user run drop table command for pre aggregate it should allow user to drop
+           * the same
+           */
+          val parentDbName =
+            carbonTable.getTableInfo.getParentRelationIdentifiers.get(0).getDatabaseName
+          val parentTableName =
+            carbonTable.getTableInfo.getParentRelationIdentifiers.get(0).getTableName
+          val parentCarbonTable = try {
+            Some(CarbonEnv.getCarbonTable(Some(parentDbName), parentTableName)(sparkSession))
+          } catch {
+            case _: Exception => None
+          }
+          if (parentCarbonTable.isDefined) {
+            val dataMapSchemaName = CarbonUtil.getDatamapNameFromTableName(carbonTable.getTableName)
+            if (null != dataMapSchemaName) {
+              val dataMapSchema = parentCarbonTable.get.getDataMapSchema(dataMapSchemaName)
+              if (null == dataMapSchema) {
+                LOGGER.info(s"Force dropping datamap ${carbonTable.getTableName}")
+                ignoreParentTableCheck = true
+              }
+            }
+          }
+        }
+        if (!ignoreParentTableCheck && !dropChildTable) {
           if (!ifExistsSet) {
             throwMetadataException(dbName, tableName,
               "Child table which is associated with datamap cannot be dropped, " +


[carbondata] 08/41: [CARBONDATA-3298]Removed Log Message for Already Deleted Segments

Posted by ra...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.5
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit 02cc2b29800799b23d9b07990a46817c4653b7c0
Author: Indhumathi27 <in...@gmail.com>
AuthorDate: Tue Feb 19 18:20:40 2019 +0530

    [CARBONDATA-3298]Removed Log Message for Already Deleted Segments
    
    Problem:
    In an old store, create a table and perform one insert. Then update and delete
    that record, which marks the segment as "MARKED FOR DELETE", and run the
    "clean files" command to delete the segment.
    Note: in this case the Metadata folder does not contain a segment file.
    In a new store, refresh the table and perform IUD operations again. When the
    "clean files" command is executed, we check whether the segment file physically
    exists; if it does not, we log a warning saying the file is not present.
    If the old store contains many segments, this warning gets printed for every
    segment, which is not required.
    
    Solution:
    Removed log message for already deleted segments
    
    This closes #3131
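
    A short usage sketch with a hypothetical table name, showing the command on whose
    code path the warning used to be emitted once per already-deleted segment:

        // spark-shell / SparkSession named `spark` assumed
        spark.sql("CLEAN FILES FOR TABLE old_store_table")
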
---
 .../main/java/org/apache/carbondata/core/util/DeleteLoadFolders.java  | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/core/src/main/java/org/apache/carbondata/core/util/DeleteLoadFolders.java b/core/src/main/java/org/apache/carbondata/core/util/DeleteLoadFolders.java
index b614f55..21c504b 100644
--- a/core/src/main/java/org/apache/carbondata/core/util/DeleteLoadFolders.java
+++ b/core/src/main/java/org/apache/carbondata/core/util/DeleteLoadFolders.java
@@ -156,11 +156,7 @@ public final class DeleteLoadFolders {
                 }
               }
 
-            } else {
-              LOGGER.warn("Files are not found in segment " + path
-                  + " it seems, files are already being deleted");
             }
-
           }
           List<Segment> segments = new ArrayList<>(1);
           for (TableDataMap dataMap : indexDataMaps) {


[carbondata] 09/41: [CARBONDATA-3305] Support show metacache command to list the cache sizes for all tables

Posted by ra...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.5
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit fdb48d0d05bd7f6923b8d6b7493f13beb2f56d93
Author: QiangCai <qi...@qq.com>
AuthorDate: Thu Jan 17 15:39:33 2019 +0800

    [CARBONDATA-3305] Support show metacache command to list the cache sizes for all tables
    
    >>> SHOW METACACHE
    +--------+--------+----------+------------+---------------+
    |Database|Table   |Index size|Datamap size|Dictionary size|
    +--------+--------+----------+------------+---------------+
    |ALL     |ALL     |842 Bytes |982 Bytes   |80.34 KB       |
    |default |ALL     |842 Bytes |982 Bytes   |80.34 KB       |
    |default |t1      |225 Bytes |982 Bytes   |0              |
    |default |t1_dpagg|259 Bytes |0           |0              |
    |default |t2      |358 Bytes |0           |80.34 KB       |
    +--------+--------+----------+------------+---------------+
    
    >>> SHOW METACACHE FOR TABLE t1
    +----------+---------+----------------------+
    |Field     |Size     |Comment               |
    +----------+---------+----------------------+
    |Index     |225 Bytes|1/1 index files cached|
    |Dictionary|0        |                      |
    |dpagg     |259 Bytes|preaggregate          |
    |dblom     |982 Bytes|bloomfilter           |
    +----------+---------+----------------------+
    
    >>> SHOW METACACHE FOR TABLE t2
    +----------+---------+----------------------+
    |Field     |Size     |Comment               |
    +----------+---------+----------------------+
    |Index     |358 Bytes|2/2 index files cached|
    |Dictionary|80.34 KB |                      |
    +----------+---------+----------------------+
    
    This closes #3078
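
    A brief usage sketch (hypothetical table name): run a query first so that index,
    dictionary and datamap entries land in the LRU cache, then inspect the totals and
    the per-table breakdown, following the syntax exercised by the new test below.

        // spark-shell / SparkSession named `spark` assumed
        spark.sql("SELECT count(*) FROM cache_demo").show()
        spark.sql("SHOW METACACHE").show(truncate = false)
        spark.sql("SHOW METACACHE ON TABLE cache_demo").show(truncate = false)
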
---
 .../carbondata/core/cache/CacheProvider.java       |   4 +
 .../carbondata/core/cache/CarbonLRUCache.java      |   4 +
 docs/ddl-of-carbondata.md                          |  21 ++
 .../sql/commands/TestCarbonShowCacheCommand.scala  | 163 +++++++++++
 .../spark/sql/catalyst/CarbonDDLSqlParser.scala    |   1 +
 .../command/cache/CarbonShowCacheCommand.scala     | 312 +++++++++++++++++++++
 .../spark/sql/parser/CarbonSpark2SqlParser.scala   |  12 +-
 7 files changed, 515 insertions(+), 2 deletions(-)

diff --git a/core/src/main/java/org/apache/carbondata/core/cache/CacheProvider.java b/core/src/main/java/org/apache/carbondata/core/cache/CacheProvider.java
index 99b1693..deb48e2 100644
--- a/core/src/main/java/org/apache/carbondata/core/cache/CacheProvider.java
+++ b/core/src/main/java/org/apache/carbondata/core/cache/CacheProvider.java
@@ -195,4 +195,8 @@ public class CacheProvider {
     }
     cacheTypeToCacheMap.clear();
   }
+
+  public CarbonLRUCache getCarbonCache() {
+    return carbonLRUCache;
+  }
 }
diff --git a/core/src/main/java/org/apache/carbondata/core/cache/CarbonLRUCache.java b/core/src/main/java/org/apache/carbondata/core/cache/CarbonLRUCache.java
index 87254e3..74ff8a0 100644
--- a/core/src/main/java/org/apache/carbondata/core/cache/CarbonLRUCache.java
+++ b/core/src/main/java/org/apache/carbondata/core/cache/CarbonLRUCache.java
@@ -305,4 +305,8 @@ public final class CarbonLRUCache {
       lruCacheMap.clear();
     }
   }
+
+  public Map<String, Cacheable> getCacheMap() {
+    return lruCacheMap;
+  }
 }
diff --git a/docs/ddl-of-carbondata.md b/docs/ddl-of-carbondata.md
index 0d0e5bd..3476475 100644
--- a/docs/ddl-of-carbondata.md
+++ b/docs/ddl-of-carbondata.md
@@ -67,6 +67,7 @@ CarbonData DDL statements are documented here,which includes:
   * [SPLIT PARTITION](#split-a-partition)
   * [DROP PARTITION](#drop-a-partition)
 * [BUCKETING](#bucketing)
+* [CACHE](#cache)
 
 ## CREATE TABLE
 
@@ -1088,4 +1089,24 @@ Users can specify which columns to include and exclude for local dictionary gene
   TBLPROPERTIES ('BUCKETNUMBER'='4', 'BUCKETCOLUMNS'='productName')
   ```
 
+## CACHE
 
+  CarbonData internally uses LRU caching to improve the performance. The user can get information 
+  about current cache used status in memory through the following command:
+
+  ```sql
+  SHOW METADATA
+  ``` 
+  
+  This shows the overall memory consumed in the cache by categories - index files, dictionary and 
+  datamaps. This also shows the cache usage by all the tables and children tables in the current 
+  database.
+  
+  ```sql
+  SHOW METADATA ON TABLE tableName
+  ```
+  
+  This shows detailed information on cache usage by the table `tableName` and its carbonindex files, 
+  its dictionary files, its datamaps and children tables.
+  
+  This command is not allowed on child tables.
diff --git a/integration/spark-common-test/src/test/scala/org/apache/carbondata/sql/commands/TestCarbonShowCacheCommand.scala b/integration/spark-common-test/src/test/scala/org/apache/carbondata/sql/commands/TestCarbonShowCacheCommand.scala
new file mode 100644
index 0000000..0e1cd00
--- /dev/null
+++ b/integration/spark-common-test/src/test/scala/org/apache/carbondata/sql/commands/TestCarbonShowCacheCommand.scala
@@ -0,0 +1,163 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.sql.commands
+
+import org.apache.spark.sql.Row
+import org.apache.spark.sql.test.util.QueryTest
+import org.scalatest.BeforeAndAfterAll
+
+class TestCarbonShowCacheCommand extends QueryTest with BeforeAndAfterAll {
+  override protected def beforeAll(): Unit = {
+    // use new database
+    sql("drop database if exists cache_db cascade").collect()
+    sql("drop database if exists cache_empty_db cascade").collect()
+    sql("create database cache_db").collect()
+    sql("create database cache_empty_db").collect()
+    dropTable
+    sql("use cache_db").collect()
+    sql(
+      """
+        | CREATE TABLE cache_db.cache_1
+        | (empno int, empname String, designation String, doj Timestamp, workgroupcategory int,
+        |  workgroupcategoryname String, deptno int, deptname String, projectcode int,
+        |  projectjoindate Timestamp, projectenddate Timestamp,attendance int,utilization int,
+        |  salary int)
+        | STORED BY 'org.apache.carbondata.format'
+        | TBLPROPERTIES('DICTIONARY_INCLUDE'='deptname')
+      """.stripMargin)
+    // bloom
+    sql("CREATE DATAMAP IF NOT EXISTS cache_1_bloom ON TABLE cache_db.cache_1 USING 'bloomfilter' " +
+        "DMPROPERTIES('INDEX_COLUMNS'='deptno')")
+    sql(s"LOAD DATA INPATH '$resourcesPath/data.csv' INTO TABLE cache_1 ")
+
+    sql(
+      """
+        | CREATE TABLE cache_2
+        | (empno int, empname String, designation String, doj Timestamp, workgroupcategory int,
+        |  workgroupcategoryname String, deptno int, deptname String, projectcode int,
+        |  projectjoindate Timestamp, projectenddate Timestamp,attendance int,utilization int,
+        |  salary int)
+        | STORED BY 'org.apache.carbondata.format'
+      """.stripMargin)
+    sql(s"LOAD DATA INPATH '$resourcesPath/data.csv' INTO TABLE cache_db.cache_2 ")
+    sql("insert into table cache_2 select * from cache_1").collect()
+
+    sql(
+      """
+        | CREATE TABLE cache_3
+        | (empno int, empname String, designation String, doj Timestamp, workgroupcategory int,
+        |  workgroupcategoryname String, deptno int, deptname String, projectcode int,
+        |  projectjoindate Timestamp, projectenddate Timestamp,attendance int,utilization int,
+        |  salary int)
+        | STORED BY 'org.apache.carbondata.format'
+      """.stripMargin)
+    sql(s"LOAD DATA INPATH '$resourcesPath/data.csv' INTO TABLE cache_3 ")
+
+    // use default database
+    sql("use default").collect()
+    sql(
+      """
+        | CREATE TABLE cache_4
+        | (empno int, empname String, designation String, doj Timestamp, workgroupcategory int,
+        |  workgroupcategoryname String, deptno int, deptname String, projectcode int,
+        |  projectjoindate Timestamp, projectenddate Timestamp,attendance int,utilization int,
+        |  salary int)
+        | STORED BY 'org.apache.carbondata.format'
+      """.stripMargin)
+    sql("insert into table cache_4 select * from cache_db.cache_2").collect()
+
+    // standard partition table
+    sql(
+      """
+        | CREATE TABLE cache_5
+        | (empno int, empname String, designation String, doj Timestamp, workgroupcategory int,
+        |  workgroupcategoryname String, deptname String, projectcode int,
+        |  projectjoindate Timestamp, projectenddate Timestamp,attendance int,utilization int,
+        |  salary int)
+        | PARTITIONED BY (deptno int)
+        | STORED BY 'org.apache.carbondata.format'
+      """.stripMargin)
+    sql(
+      "insert into table cache_5 select empno,empname,designation,doj,workgroupcategory," +
+      "workgroupcategoryname,deptname,projectcode,projectjoindate,projectenddate,attendance," +
+      "utilization,salary,deptno from cache_4").collect()
+
+    // datamap
+    sql("create datamap cache_4_count on table cache_4 using 'preaggregate' as " +
+        "select workgroupcategoryname,count(empname) as count from cache_4 group by workgroupcategoryname")
+
+    // count star to cache index
+    sql("select max(deptname) from cache_db.cache_1").collect()
+    sql("SELECT deptno FROM cache_db.cache_1 where deptno=10").collect()
+    sql("select count(*) from cache_db.cache_2").collect()
+    sql("select count(*) from cache_4").collect()
+    sql("select count(*) from cache_5").collect()
+    sql("select workgroupcategoryname,count(empname) as count from cache_4 group by workgroupcategoryname").collect()
+  }
+
+
+  override protected def afterAll(): Unit = {
+    sql("use default").collect()
+    dropTable
+  }
+
+  private def dropTable = {
+    sql("DROP TABLE IF EXISTS cache_db.cache_1")
+    sql("DROP TABLE IF EXISTS cache_db.cache_2")
+    sql("DROP TABLE IF EXISTS cache_db.cache_3")
+    sql("DROP TABLE IF EXISTS default.cache_4")
+    sql("DROP TABLE IF EXISTS default.cache_5")
+  }
+
+  test("show cache") {
+    sql("use cache_empty_db").collect()
+    val result1 = sql("show metacache").collect()
+    assertResult(2)(result1.length)
+    assertResult(Row("cache_empty_db", "ALL", "0", "0", "0"))(result1(1))
+
+    sql("use cache_db").collect()
+    val result2 = sql("show metacache").collect()
+    assertResult(4)(result2.length)
+
+    sql("use default").collect()
+    val result3 = sql("show metacache").collect()
+    val dataMapCacheInfo = result3
+      .map(row => row.getString(1))
+      .filter(table => table.equals("cache_4_cache_4_count"))
+    assertResult(1)(dataMapCacheInfo.length)
+  }
+
+  test("show metacache on table") {
+    sql("use cache_db").collect()
+    val result1 = sql("show metacache on table cache_1").collect()
+    assertResult(3)(result1.length)
+
+    val result2 = sql("show metacache on table cache_db.cache_2").collect()
+    assertResult(2)(result2.length)
+
+    checkAnswer(sql("show metacache on table cache_db.cache_3"),
+      Seq(Row("Index", "0 bytes", "0/1 index files cached"), Row("Dictionary", "0 bytes", "")))
+
+    val result4 = sql("show metacache on table default.cache_4").collect()
+    assertResult(3)(result4.length)
+
+    sql("use default").collect()
+    val result5 = sql("show metacache on table cache_5").collect()
+    assertResult(2)(result5.length)
+  }
+}
diff --git a/integration/spark-common/src/main/scala/org/apache/spark/sql/catalyst/CarbonDDLSqlParser.scala b/integration/spark-common/src/main/scala/org/apache/spark/sql/catalyst/CarbonDDLSqlParser.scala
index dc75243..e03bebd 100644
--- a/integration/spark-common/src/main/scala/org/apache/spark/sql/catalyst/CarbonDDLSqlParser.scala
+++ b/integration/spark-common/src/main/scala/org/apache/spark/sql/catalyst/CarbonDDLSqlParser.scala
@@ -155,6 +155,7 @@ abstract class CarbonDDLSqlParser extends AbstractCarbonSparkSQLParser {
   protected val HISTORY = carbonKeyWord("HISTORY")
   protected val SEGMENTS = carbonKeyWord("SEGMENTS")
   protected val SEGMENT = carbonKeyWord("SEGMENT")
+  protected val METACACHE = carbonKeyWord("METACACHE")
 
   protected val STRING = carbonKeyWord("STRING")
   protected val INTEGER = carbonKeyWord("INTEGER")
diff --git a/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/cache/CarbonShowCacheCommand.scala b/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/cache/CarbonShowCacheCommand.scala
new file mode 100644
index 0000000..e937c32
--- /dev/null
+++ b/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/cache/CarbonShowCacheCommand.scala
@@ -0,0 +1,312 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.command.cache
+
+import scala.collection.mutable
+import scala.collection.JavaConverters._
+
+import org.apache.commons.io.FileUtils.byteCountToDisplaySize
+import org.apache.spark.sql.{CarbonEnv, Row, SparkSession}
+import org.apache.spark.sql.catalyst.TableIdentifier
+import org.apache.spark.sql.catalyst.analysis.NoSuchTableException
+import org.apache.spark.sql.catalyst.expressions.{Attribute, AttributeReference}
+import org.apache.spark.sql.execution.command.MetadataCommand
+import org.apache.spark.sql.types.{LongType, StringType}
+
+import org.apache.carbondata.core.cache.CacheProvider
+import org.apache.carbondata.core.cache.dictionary.AbstractColumnDictionaryInfo
+import org.apache.carbondata.core.constants.CarbonCommonConstants
+import org.apache.carbondata.core.datamap.DataMapStoreManager
+import org.apache.carbondata.core.indexstore.BlockletDataMapIndexWrapper
+import org.apache.carbondata.core.metadata.AbsoluteTableIdentifier
+import org.apache.carbondata.core.metadata.schema.table.{CarbonTable, DataMapSchema}
+import org.apache.carbondata.datamap.bloom.BloomCacheKeyValue
+import org.apache.carbondata.processing.merger.CarbonDataMergerUtil
+
+/**
+ * SHOW CACHE
+ */
+case class CarbonShowCacheCommand(tableIdentifier: Option[TableIdentifier])
+  extends MetadataCommand {
+
+  override def output: Seq[AttributeReference] = {
+    if (tableIdentifier.isEmpty) {
+      Seq(
+        AttributeReference("Database", StringType, nullable = false)(),
+        AttributeReference("Table", StringType, nullable = false)(),
+        AttributeReference("Index size", StringType, nullable = false)(),
+        AttributeReference("Datamap size", StringType, nullable = false)(),
+        AttributeReference("Dictionary size", StringType, nullable = false)())
+    } else {
+      Seq(
+        AttributeReference("Field", StringType, nullable = false)(),
+        AttributeReference("Size", StringType, nullable = false)(),
+        AttributeReference("Comment", StringType, nullable = false)())
+    }
+  }
+
+  override protected def opName: String = "SHOW CACHE"
+
+  def showAllTablesCache(sparkSession: SparkSession): Seq[Row] = {
+    val currentDatabase = sparkSession.sessionState.catalog.getCurrentDatabase
+    val cache = CacheProvider.getInstance().getCarbonCache()
+    if (cache == null) {
+      Seq(Row("ALL", "ALL", 0L, 0L, 0L),
+        Row(currentDatabase, "ALL", 0L, 0L, 0L))
+    } else {
+      val tableIdents = sparkSession.sessionState.catalog.listTables(currentDatabase).toArray
+      val dbLocation = CarbonEnv.getDatabaseLocation(currentDatabase, sparkSession)
+      val tempLocation = dbLocation.replace(
+        CarbonCommonConstants.WINDOWS_FILE_SEPARATOR, CarbonCommonConstants.FILE_SEPARATOR)
+      val tablePaths = tableIdents.map { tableIdent =>
+        (tempLocation + CarbonCommonConstants.FILE_SEPARATOR +
+         tableIdent.table + CarbonCommonConstants.FILE_SEPARATOR,
+          CarbonEnv.getDatabaseName(tableIdent.database)(sparkSession) + "." + tableIdent.table)
+      }
+
+      val dictIds = tableIdents
+        .map { tableIdent =>
+          var table: CarbonTable = null
+          try {
+            table = CarbonEnv.getCarbonTable(tableIdent)(sparkSession)
+          } catch {
+            case _ =>
+          }
+          table
+        }
+        .filter(_ != null)
+        .flatMap { table =>
+          table
+            .getAllDimensions
+            .asScala
+            .filter(_.isGlobalDictionaryEncoding)
+            .toArray
+            .map(dim => (dim.getColumnId, table.getDatabaseName + "." + table.getTableName))
+        }
+
+      // all databases
+      var (allIndexSize, allDatamapSize, allDictSize) = (0L, 0L, 0L)
+      // current database
+      var (dbIndexSize, dbDatamapSize, dbDictSize) = (0L, 0L, 0L)
+      val tableMapIndexSize = mutable.HashMap[String, Long]()
+      val tableMapDatamapSize = mutable.HashMap[String, Long]()
+      val tableMapDictSize = mutable.HashMap[String, Long]()
+      val cacheIterator = cache.getCacheMap.entrySet().iterator()
+      while (cacheIterator.hasNext) {
+        val entry = cacheIterator.next()
+        val cache = entry.getValue
+        if (cache.isInstanceOf[BlockletDataMapIndexWrapper]) {
+          // index
+          allIndexSize = allIndexSize + cache.getMemorySize
+          val indexPath = entry.getKey.replace(
+            CarbonCommonConstants.WINDOWS_FILE_SEPARATOR, CarbonCommonConstants.FILE_SEPARATOR)
+          val tablePath = tablePaths.find(path => indexPath.startsWith(path._1))
+          if (tablePath.isDefined) {
+            dbIndexSize = dbIndexSize + cache.getMemorySize
+            val memorySize = tableMapIndexSize.get(tablePath.get._2)
+            if (memorySize.isEmpty) {
+              tableMapIndexSize.put(tablePath.get._2, cache.getMemorySize)
+            } else {
+              tableMapIndexSize.put(tablePath.get._2, memorySize.get + cache.getMemorySize)
+            }
+          }
+        } else if (cache.isInstanceOf[BloomCacheKeyValue.CacheValue]) {
+          // bloom datamap
+          allDatamapSize = allDatamapSize + cache.getMemorySize
+          val shardPath = entry.getKey.replace(CarbonCommonConstants.WINDOWS_FILE_SEPARATOR,
+            CarbonCommonConstants.FILE_SEPARATOR)
+          val tablePath = tablePaths.find(path => shardPath.contains(path._1))
+          if (tablePath.isDefined) {
+            dbDatamapSize = dbDatamapSize + cache.getMemorySize
+            val memorySize = tableMapDatamapSize.get(tablePath.get._2)
+            if (memorySize.isEmpty) {
+              tableMapDatamapSize.put(tablePath.get._2, cache.getMemorySize)
+            } else {
+              tableMapDatamapSize.put(tablePath.get._2, memorySize.get + cache.getMemorySize)
+            }
+          }
+        } else if (cache.isInstanceOf[AbstractColumnDictionaryInfo]) {
+          // dictionary
+          allDictSize = allDictSize + cache.getMemorySize
+          val dictId = dictIds.find(id => entry.getKey.startsWith(id._1))
+          if (dictId.isDefined) {
+            dbDictSize = dbDictSize + cache.getMemorySize
+            val memorySize = tableMapDictSize.get(dictId.get._2)
+            if (memorySize.isEmpty) {
+              tableMapDictSize.put(dictId.get._2, cache.getMemorySize)
+            } else {
+              tableMapDictSize.put(dictId.get._2, memorySize.get + cache.getMemorySize)
+            }
+          }
+        }
+      }
+      if (tableMapIndexSize.isEmpty && tableMapDatamapSize.isEmpty && tableMapDictSize.isEmpty) {
+        Seq(
+          Row("ALL", "ALL", byteCountToDisplaySize(allIndexSize),
+            byteCountToDisplaySize(allDatamapSize), byteCountToDisplaySize(allDictSize)),
+          Row(currentDatabase, "ALL", "0", "0", "0"))
+      } else {
+        val tableList = tableMapIndexSize
+          .map(_._1)
+          .toSeq
+          .union(tableMapDictSize.map(_._1).toSeq)
+          .distinct
+          .sorted
+          .map { uniqueName =>
+            val values = uniqueName.split("\\.")
+            val indexSize = tableMapIndexSize.getOrElse(uniqueName, 0L)
+            val datamapSize = tableMapDatamapSize.getOrElse(uniqueName, 0L)
+            val dictSize = tableMapDictSize.getOrElse(uniqueName, 0L)
+            Row(values(0), values(1), byteCountToDisplaySize(indexSize),
+              byteCountToDisplaySize(datamapSize), byteCountToDisplaySize(dictSize))
+          }
+
+        Seq(
+          Row("ALL", "ALL", byteCountToDisplaySize(allIndexSize),
+            byteCountToDisplaySize(allDatamapSize), byteCountToDisplaySize(allDictSize)),
+          Row(currentDatabase, "ALL", byteCountToDisplaySize(dbIndexSize),
+            byteCountToDisplaySize(dbDatamapSize), byteCountToDisplaySize(dbDictSize))
+        ) ++ tableList
+      }
+    }
+  }
+
+  def showTableCache(sparkSession: SparkSession, carbonTable: CarbonTable): Seq[Row] = {
+    val tableName = carbonTable.getTableName
+    val databaseName = carbonTable.getDatabaseName
+    val cache = CacheProvider.getInstance().getCarbonCache()
+    if (cache == null) {
+      Seq.empty
+    } else {
+      val dbLocation = CarbonEnv
+        .getDatabaseLocation(databaseName, sparkSession)
+        .replace(CarbonCommonConstants.WINDOWS_FILE_SEPARATOR, CarbonCommonConstants.FILE_SEPARATOR)
+      val tablePath = dbLocation + CarbonCommonConstants.FILE_SEPARATOR +
+                      tableName + CarbonCommonConstants.FILE_SEPARATOR
+      var numIndexFilesCached = 0
+
+      // Path -> Name, Type
+      val datamapName = mutable.Map[String, (String, String)]()
+      // Path -> Size
+      val datamapSize = mutable.Map[String, Long]()
+      // parent table
+      datamapName.put(tablePath, ("", ""))
+      datamapSize.put(tablePath, 0)
+      // children tables
+      for( schema <- carbonTable.getTableInfo.getDataMapSchemaList.asScala ) {
+        val path = dbLocation + CarbonCommonConstants.FILE_SEPARATOR + tableName + "_" +
+                   schema.getDataMapName + CarbonCommonConstants.FILE_SEPARATOR
+        val name = schema.getDataMapName
+        val dmType = schema.getProviderName
+        datamapName.put(path, (name, dmType))
+        datamapSize.put(path, 0)
+      }
+      // index schemas
+      for (schema <- DataMapStoreManager.getInstance().getDataMapSchemasOfTable(carbonTable)
+        .asScala) {
+        val path = dbLocation + CarbonCommonConstants.FILE_SEPARATOR + tableName +
+                   CarbonCommonConstants.FILE_SEPARATOR + schema.getDataMapName +
+                   CarbonCommonConstants.FILE_SEPARATOR
+        val name = schema.getDataMapName
+        val dmType = schema.getProviderName
+        datamapName.put(path, (name, dmType))
+        datamapSize.put(path, 0)
+      }
+
+      var dictSize = 0L
+
+      // dictionary column ids
+      val dictIds = carbonTable
+        .getAllDimensions
+        .asScala
+        .filter(_.isGlobalDictionaryEncoding)
+        .map(_.getColumnId)
+        .toArray
+
+      val cacheIterator = cache.getCacheMap.entrySet().iterator()
+      while (cacheIterator.hasNext) {
+        val entry = cacheIterator.next()
+        val cache = entry.getValue
+
+        if (cache.isInstanceOf[BlockletDataMapIndexWrapper]) {
+          // index
+          val indexPath = entry.getKey.replace(CarbonCommonConstants.WINDOWS_FILE_SEPARATOR,
+            CarbonCommonConstants.FILE_SEPARATOR)
+          val pathEntry = datamapSize.filter(entry => indexPath.startsWith(entry._1))
+          if(pathEntry.nonEmpty) {
+            val (path, size) = pathEntry.iterator.next()
+            datamapSize.put(path, size + cache.getMemorySize)
+          }
+          if(indexPath.startsWith(tablePath)) {
+            numIndexFilesCached += 1
+          }
+        } else if (cache.isInstanceOf[BloomCacheKeyValue.CacheValue]) {
+          // bloom datamap
+          val shardPath = entry.getKey.replace(CarbonCommonConstants.WINDOWS_FILE_SEPARATOR,
+            CarbonCommonConstants.FILE_SEPARATOR)
+          val pathEntry = datamapSize.filter(entry => shardPath.contains(entry._1))
+          if(pathEntry.nonEmpty) {
+            val (path, size) = pathEntry.iterator.next()
+            datamapSize.put(path, size + cache.getMemorySize)
+          }
+        } else if (cache.isInstanceOf[AbstractColumnDictionaryInfo]) {
+          // dictionary
+          val dictId = dictIds.find(id => entry.getKey.startsWith(id))
+          if (dictId.isDefined) {
+            dictSize = dictSize + cache.getMemorySize
+          }
+        }
+      }
+
+      // get all index files
+      val absoluteTableIdentifier = AbsoluteTableIdentifier.from(tablePath)
+      val numIndexFilesAll = CarbonDataMergerUtil.getValidSegmentList(absoluteTableIdentifier)
+        .asScala.map {
+          segment =>
+            segment.getCommittedIndexFile
+        }.flatMap {
+        indexFilesMap => indexFilesMap.keySet().toArray
+      }.size
+
+      var result = Seq(
+        Row("Index", byteCountToDisplaySize(datamapSize.get(tablePath).get),
+          numIndexFilesCached + "/" + numIndexFilesAll + " index files cached"),
+        Row("Dictionary", byteCountToDisplaySize(dictSize), "")
+      )
+      for ((path, size) <- datamapSize) {
+        if (path != tablePath) {
+          val (dmName, dmType) = datamapName.get(path).get
+          result = result :+ Row(dmName, byteCountToDisplaySize(size), dmType)
+        }
+      }
+      result
+    }
+  }
+
+  override def processMetadata(sparkSession: SparkSession): Seq[Row] = {
+    if (tableIdentifier.isEmpty) {
+      showAllTablesCache(sparkSession)
+    } else {
+      val carbonTable = CarbonEnv.getCarbonTable(tableIdentifier.get)(sparkSession)
+      if (carbonTable.isChildDataMap) {
+        throw new UnsupportedOperationException("Operation not allowed on child table.")
+      }
+      showTableCache(sparkSession, carbonTable)
+    }
+  }
+}
diff --git a/integration/spark2/src/main/scala/org/apache/spark/sql/parser/CarbonSpark2SqlParser.scala b/integration/spark2/src/main/scala/org/apache/spark/sql/parser/CarbonSpark2SqlParser.scala
index d1023fa..a2923b8 100644
--- a/integration/spark2/src/main/scala/org/apache/spark/sql/parser/CarbonSpark2SqlParser.scala
+++ b/integration/spark2/src/main/scala/org/apache/spark/sql/parser/CarbonSpark2SqlParser.scala
@@ -33,11 +33,11 @@ import org.apache.spark.sql.execution.command.table.CarbonCreateTableCommand
 import org.apache.spark.sql.types.StructField
 import org.apache.spark.sql.CarbonExpressions.CarbonUnresolvedRelation
 import org.apache.spark.sql.catalyst.analysis.UnresolvedRelation
+import org.apache.spark.sql.execution.command.cache.CarbonShowCacheCommand
 import org.apache.spark.sql.execution.command.stream.{CarbonCreateStreamCommand, CarbonDropStreamCommand, CarbonShowStreamsCommand}
 import org.apache.spark.sql.util.CarbonException
 import org.apache.spark.util.CarbonReflectionUtils
 
-import org.apache.carbondata.api.CarbonStore.LOGGER
 import org.apache.carbondata.common.exceptions.sql.MalformedCarbonCommandException
 import org.apache.carbondata.core.constants.CarbonCommonConstants
 import org.apache.carbondata.spark.CarbonOption
@@ -77,7 +77,7 @@ class CarbonSpark2SqlParser extends CarbonDDLSqlParser {
 
   protected lazy val startCommand: Parser[LogicalPlan] =
     loadManagement | showLoads | alterTable | restructure | updateTable | deleteRecords |
-    alterPartition | datamapManagement | alterTableFinishStreaming | stream | cli
+    alterPartition | datamapManagement | alterTableFinishStreaming | stream | cli | cacheManagement
 
   protected lazy val loadManagement: Parser[LogicalPlan] =
     deleteLoadsByID | deleteLoadsByLoadDate | cleanFiles | loadDataNew
@@ -94,6 +94,9 @@ class CarbonSpark2SqlParser extends CarbonDDLSqlParser {
   protected lazy val stream: Parser[LogicalPlan] =
     createStream | dropStream | showStreams
 
+  protected lazy val cacheManagement: Parser[LogicalPlan] =
+    showCache
+
   protected lazy val alterAddPartition: Parser[LogicalPlan] =
     ALTER ~> TABLE ~> (ident <~ ".").? ~ ident ~ (ADD ~> PARTITION ~>
       "(" ~> repsep(stringLit, ",") <~ ")") <~ opt(";") ^^ {
@@ -494,6 +497,11 @@ class CarbonSpark2SqlParser extends CarbonDDLSqlParser {
           showHistory.isDefined)
     }
 
+  protected lazy val showCache: Parser[LogicalPlan] =
+    SHOW ~> METACACHE ~> opt(ontable) <~ opt(";") ^^ {
+      case table =>
+        CarbonShowCacheCommand(table)
+    }
 
   protected lazy val cli: Parser[LogicalPlan] =
     (CARBONCLI ~> FOR ~> TABLE) ~> (ident <~ ".").? ~ ident ~
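
For reference, a minimal sketch of exercising the SHOW METACACHE DDL added above (not part of the patch; the store path and table name are placeholders, and the session bootstrap follows the CarbonSession builder used on this branch):

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.CarbonSession._

    val carbon = SparkSession.builder()
      .master("local[*]")
      .appName("show-metacache-sketch")
      .getOrCreateCarbonSession("/tmp/carbon.store")

    // summary view: one row for ALL, one for the current database, then one row
    // per table that currently has entries in the driver LRU cache
    carbon.sql("SHOW METACACHE").show(false)

    // per-table view (Field, Size, Comment): index, dictionary and datamap sizes
    carbon.sql("SHOW METACACHE ON TABLE sample_table").show(false)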


[carbondata] 33/41: [CARBONDATA-3328]Fixed performance issue with merge small files distribution

Posted by ra...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.5
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit f4141cbc5365e56c54762d2c1221a6666b11d084
Author: kumarvishal09 <ku...@gmail.com>
AuthorDate: Mon Mar 25 12:36:34 2019 +0530

    [CARBONDATA-3328]Fixed performance issue with merge small files distribution
    
    Problem
    After PR#3154, in the merge small files case the split length was coming out as 0.
    Because of this all the files were being merged together, which degraded query
    performance when the merge small files distribution is enabled.
    
    Solution
    Now, in the CarbonInputSplit getLength method, if the stored length is -1 it is
    read from the dataMapRow, and if the dataMapRow is null it is read from the
    detailInfo (see the sketch after this message).
    
    This closes #3161
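    
    A minimal sketch of the fallback order (simplified and illustrative; the real
    change is in the Java getLength() shown in the diff below, and the helper name
    here is hypothetical):
    
        // -1 is the "length not yet resolved" sentinel now passed from ExtendedBlocklet
        def resolveSplitLength(serializedLength: Long,
            dataMapRowLength: Option[Long],
            detailInfoLength: Option[Long]): Long = {
          if (serializedLength != -1) {
            serializedLength
          } else {
            // prefer the datamap row, then fall back to the detail info
            dataMapRowLength.orElse(detailInfoLength).getOrElse(-1L)
          }
        }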
---
 .../carbondata/core/indexstore/ExtendedBlocklet.java    |  2 +-
 .../org/apache/carbondata/hadoop/CarbonInputSplit.java  | 17 +++++++++++++----
 .../apache/carbondata/hadoop/CarbonMultiBlockSplit.java |  8 ++------
 .../org/apache/carbondata/spark/rdd/CarbonScanRDD.scala |  2 +-
 4 files changed, 17 insertions(+), 12 deletions(-)

diff --git a/core/src/main/java/org/apache/carbondata/core/indexstore/ExtendedBlocklet.java b/core/src/main/java/org/apache/carbondata/core/indexstore/ExtendedBlocklet.java
index 8c4ea06..3d6cedd 100644
--- a/core/src/main/java/org/apache/carbondata/core/indexstore/ExtendedBlocklet.java
+++ b/core/src/main/java/org/apache/carbondata/core/indexstore/ExtendedBlocklet.java
@@ -38,7 +38,7 @@ public class ExtendedBlocklet extends Blocklet {
       boolean compareBlockletIdForObjectMatching, ColumnarFormatVersion version) {
     super(filePath, blockletId, compareBlockletIdForObjectMatching);
     try {
-      this.inputSplit = CarbonInputSplit.from(null, blockletId, filePath, 0, 0, version, null);
+      this.inputSplit = CarbonInputSplit.from(null, blockletId, filePath, 0, -1, version, null);
     } catch (IOException e) {
       throw new RuntimeException(e);
     }
diff --git a/core/src/main/java/org/apache/carbondata/hadoop/CarbonInputSplit.java b/core/src/main/java/org/apache/carbondata/hadoop/CarbonInputSplit.java
index bb1742c..406456f 100644
--- a/core/src/main/java/org/apache/carbondata/hadoop/CarbonInputSplit.java
+++ b/core/src/main/java/org/apache/carbondata/hadoop/CarbonInputSplit.java
@@ -96,7 +96,7 @@ public class CarbonInputSplit extends FileSplit
 
   private transient List<ColumnSchema> columnSchema;
 
-  private transient boolean useMinMaxForPruning;
+  private boolean useMinMaxForPruning = true;
 
   private boolean isBlockCache = true;
 
@@ -534,7 +534,7 @@ public class CarbonInputSplit extends FileSplit
       out.writeInt(blockletInfoBinary.length);
       out.write(blockletInfoBinary);
     }
-    out.writeLong(this.dataMapRow.getLong(BlockletDataMapRowIndexes.BLOCK_LENGTH));
+    out.writeLong(getLength());
     out.writeBoolean(this.isLegacyStore);
     out.writeBoolean(this.useMinMaxForPruning);
   }
@@ -553,7 +553,7 @@ public class CarbonInputSplit extends FileSplit
       detailInfo.setBlockFooterOffset(
           this.dataMapRow.getLong(BlockletDataMapRowIndexes.BLOCK_FOOTER_OFFSET));
       detailInfo
-          .setBlockSize(this.dataMapRow.getLong(BlockletDataMapRowIndexes.BLOCK_LENGTH));
+          .setBlockSize(getLength());
       detailInfo.setLegacyStore(isLegacyStore);
       detailInfo.setUseMinMaxForPruning(useMinMaxForPruning);
       if (!this.isBlockCache) {
@@ -602,7 +602,16 @@ public class CarbonInputSplit extends FileSplit
   public long getStart() { return start; }
 
   @Override
-  public long getLength() { return length; }
+  public long getLength() {
+    if (length == -1) {
+      if (null != dataMapRow) {
+        length = this.dataMapRow.getLong(BlockletDataMapRowIndexes.BLOCK_LENGTH);
+      } else if (null != detailInfo) {
+        length = detailInfo.getBlockSize();
+      }
+    }
+    return length;
+  }
 
   @Override
   public String toString() { return filePath + ":" + start + "+" + length; }
diff --git a/hadoop/src/main/java/org/apache/carbondata/hadoop/CarbonMultiBlockSplit.java b/hadoop/src/main/java/org/apache/carbondata/hadoop/CarbonMultiBlockSplit.java
index 4c99c4f..7ac5bc0 100644
--- a/hadoop/src/main/java/org/apache/carbondata/hadoop/CarbonMultiBlockSplit.java
+++ b/hadoop/src/main/java/org/apache/carbondata/hadoop/CarbonMultiBlockSplit.java
@@ -100,18 +100,14 @@ public class CarbonMultiBlockSplit extends InputSplit implements Serializable, W
 
   public void calculateLength() {
     long total = 0;
-    if (splitList.size() > 1 && splitList.get(0).getDetailInfo() != null) {
+    if (splitList.size() > 0) {
       Map<String, Long> blockSizes = new HashMap<>();
       for (CarbonInputSplit split : splitList) {
-        blockSizes.put(split.getBlockPath(), split.getDetailInfo().getBlockSize());
+        blockSizes.put(split.getFilePath(), split.getLength());
       }
       for (Map.Entry<String, Long> entry : blockSizes.entrySet()) {
         total += entry.getValue();
       }
-    } else {
-      for (CarbonInputSplit split : splitList) {
-        total += split.getLength();
-      }
     }
     length = total;
   }
diff --git a/integration/spark-common/src/main/scala/org/apache/carbondata/spark/rdd/CarbonScanRDD.scala b/integration/spark-common/src/main/scala/org/apache/carbondata/spark/rdd/CarbonScanRDD.scala
index 9e66139..6cee8dc 100644
--- a/integration/spark-common/src/main/scala/org/apache/carbondata/spark/rdd/CarbonScanRDD.scala
+++ b/integration/spark-common/src/main/scala/org/apache/carbondata/spark/rdd/CarbonScanRDD.scala
@@ -304,7 +304,7 @@ class CarbonScanRDD[T: ClassTag](
           val blockSplits = splits
             .asScala
             .map(_.asInstanceOf[CarbonInputSplit])
-            .groupBy(f => f.getBlockPath)
+            .groupBy(f => f.getFilePath)
             .map { blockSplitEntry =>
               new CarbonMultiBlockSplit(
                 blockSplitEntry._2.asJava,


[carbondata] 36/41: [CARBONDATA-3330] Fix Invalid Exception while clearing datamap from SDK carbon reader

Posted by ra...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.5
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit 09c598f9cb16cf040d2e2ee0707113be56a67445
Author: ajantha-bhat <aj...@gmail.com>
AuthorDate: Mon Mar 25 16:27:14 2019 +0800

    [CARBONDATA-3330] Fix Invalid Exception while clearing datamap from SDK carbon reader
    
    Problem:
    When clearing the datamaps in the SDK CarbonReader close(), it always looked for
    the schema file to build the carbon table. For the SDK there is no schema file,
    hence the exception.

    Cause: This was an already existing bug that became visible after #2878.
    The carbon table is needed only for launching the datamap clearing job, and for
    the SDK no job needs to be launched.

    Solution: Look up the carbon table only when the datamap clearing job actually
    has to be launched (a minimal SDK read sketch follows this message).
    
    This closes #3162
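    
    A minimal sketch of the SDK read path that hits this code (illustrative only; the
    output directory mirrors the tests in the diff below, and the "_temp" table name
    is just a label for the reader):
    
        import org.apache.carbondata.sdk.file.CarbonReader
        
        // files written earlier by the SDK CarbonWriter; no schema file exists on disk
        val reader: CarbonReader[Array[AnyRef]] =
          CarbonReader.builder("./testWriteFiles", "_temp").build()
        while (reader.hasNext) {
          val row = reader.readNextRow   // one row as an array of column values
          println(row.mkString(","))
        }
        // close() clears the datamaps with launchJob = false, so after this fix no
        // carbon table (and therefore no schema file) lookup is attempted
        reader.close()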
---
 .../core/datamap/DataMapStoreManager.java          | 16 +++++----
 .../carbondata/sdk/file/CarbonReaderTest.java      | 38 +++++++++++-----------
 2 files changed, 28 insertions(+), 26 deletions(-)

diff --git a/core/src/main/java/org/apache/carbondata/core/datamap/DataMapStoreManager.java b/core/src/main/java/org/apache/carbondata/core/datamap/DataMapStoreManager.java
index 524d8b0..a797b11 100644
--- a/core/src/main/java/org/apache/carbondata/core/datamap/DataMapStoreManager.java
+++ b/core/src/main/java/org/apache/carbondata/core/datamap/DataMapStoreManager.java
@@ -482,7 +482,6 @@ public final class DataMapStoreManager {
    * @param identifier Table identifier
    */
   public void clearDataMaps(AbsoluteTableIdentifier identifier, boolean launchJob) {
-    CarbonTable carbonTable = getCarbonTable(identifier);
     String tableUniqueName = identifier.getCarbonTableIdentifier().getTableUniqueName();
     List<TableDataMap> tableIndices = allDataMaps.get(tableUniqueName);
     if (tableIndices == null) {
@@ -492,12 +491,15 @@ public final class DataMapStoreManager {
         tableIndices = allDataMaps.get(tableUniqueName);
       }
     }
-    if (null != carbonTable && tableIndices != null && launchJob) {
-      try {
-        DataMapUtil.executeDataMapJobForClearingDataMaps(carbonTable);
-      } catch (IOException e) {
-        LOGGER.error("clear dataMap job failed", e);
-        // ignoring the exception
+    if (launchJob && tableIndices != null) {
+      CarbonTable carbonTable = getCarbonTable(identifier);
+      if (null != carbonTable) {
+        try {
+          DataMapUtil.executeDataMapJobForClearingDataMaps(carbonTable);
+        } catch (IOException e) {
+          LOGGER.error("clear dataMap job failed", e);
+          // ignoring the exception
+        }
       }
     }
     segmentRefreshMap.remove(identifier.uniqueName());
diff --git a/store/sdk/src/test/java/org/apache/carbondata/sdk/file/CarbonReaderTest.java b/store/sdk/src/test/java/org/apache/carbondata/sdk/file/CarbonReaderTest.java
index 871d51b..f09581a 100644
--- a/store/sdk/src/test/java/org/apache/carbondata/sdk/file/CarbonReaderTest.java
+++ b/store/sdk/src/test/java/org/apache/carbondata/sdk/file/CarbonReaderTest.java
@@ -65,7 +65,7 @@ public class CarbonReaderTest extends TestCase {
     String path = "./testWriteFiles";
     FileUtils.deleteDirectory(new File(path));
     DataMapStoreManager.getInstance()
-        .clearDataMaps(AbsoluteTableIdentifier.from(path));
+        .clearDataMaps(AbsoluteTableIdentifier.from(path), false);
     Field[] fields = new Field[2];
     fields[0] = new Field("name", DataTypes.STRING);
     fields[1] = new Field("age", DataTypes.INT);
@@ -107,7 +107,7 @@ public class CarbonReaderTest extends TestCase {
   @Test public void testReadWithZeroBatchSize() throws Exception {
     String path = "./testWriteFiles";
     FileUtils.deleteDirectory(new File(path));
-    DataMapStoreManager.getInstance().clearDataMaps(AbsoluteTableIdentifier.from(path));
+    DataMapStoreManager.getInstance().clearDataMaps(AbsoluteTableIdentifier.from(path), false);
     Field[] fields = new Field[2];
     fields[0] = new Field("name", DataTypes.STRING);
     fields[1] = new Field("age", DataTypes.INT);
@@ -132,7 +132,7 @@ public class CarbonReaderTest extends TestCase {
   public void testReadBatchWithZeroBatchSize() throws Exception {
     String path = "./testWriteFiles";
     FileUtils.deleteDirectory(new File(path));
-    DataMapStoreManager.getInstance().clearDataMaps(AbsoluteTableIdentifier.from(path));
+    DataMapStoreManager.getInstance().clearDataMaps(AbsoluteTableIdentifier.from(path), false);
     Field[] fields = new Field[2];
     fields[0] = new Field("name", DataTypes.STRING);
     fields[1] = new Field("age", DataTypes.INT);
@@ -156,7 +156,7 @@ public class CarbonReaderTest extends TestCase {
     String path = "./testWriteFiles";
     FileUtils.deleteDirectory(new File(path));
     DataMapStoreManager.getInstance()
-        .clearDataMaps(AbsoluteTableIdentifier.from(path));
+        .clearDataMaps(AbsoluteTableIdentifier.from(path), false);
     String path1 = path + "/0testdir";
     String path2 = path + "/testdir";
 
@@ -203,7 +203,7 @@ public class CarbonReaderTest extends TestCase {
     String path = "./testWriteFiles";
     FileUtils.deleteDirectory(new File(path));
     DataMapStoreManager.getInstance()
-        .clearDataMaps(AbsoluteTableIdentifier.from(path));
+        .clearDataMaps(AbsoluteTableIdentifier.from(path), false);
     Field[] fields = new Field[2];
     fields[0] = new Field("name", DataTypes.STRING);
     fields[1] = new Field("age", DataTypes.INT);
@@ -240,7 +240,7 @@ public class CarbonReaderTest extends TestCase {
     String path = "./testWriteFiles";
     FileUtils.deleteDirectory(new File(path));
     DataMapStoreManager.getInstance()
-        .clearDataMaps(AbsoluteTableIdentifier.from(path));
+        .clearDataMaps(AbsoluteTableIdentifier.from(path), false);
     Field[] fields = new Field[3];
     fields[0] = new Field("name", DataTypes.STRING);
     fields[1] = new Field("age", DataTypes.INT);
@@ -283,7 +283,7 @@ public class CarbonReaderTest extends TestCase {
     String path = "./testWriteFiles";
     FileUtils.deleteDirectory(new File(path));
     DataMapStoreManager.getInstance()
-        .clearDataMaps(AbsoluteTableIdentifier.from(path));
+        .clearDataMaps(AbsoluteTableIdentifier.from(path), false);
     Field[] fields = new Field[3];
     fields[0] = new Field("name", DataTypes.STRING);
     fields[1] = new Field("age", DataTypes.INT);
@@ -326,7 +326,7 @@ public class CarbonReaderTest extends TestCase {
     String path = "./testWriteFiles";
     FileUtils.deleteDirectory(new File(path));
     DataMapStoreManager.getInstance()
-        .clearDataMaps(AbsoluteTableIdentifier.from(path));
+        .clearDataMaps(AbsoluteTableIdentifier.from(path), false);
     Field[] fields = new Field[3];
     fields[0] = new Field("name", DataTypes.STRING);
     fields[1] = new Field("age", DataTypes.INT);
@@ -369,7 +369,7 @@ public class CarbonReaderTest extends TestCase {
     String path = "./testWriteFiles";
     FileUtils.deleteDirectory(new File(path));
     DataMapStoreManager.getInstance()
-        .clearDataMaps(AbsoluteTableIdentifier.from(path));
+        .clearDataMaps(AbsoluteTableIdentifier.from(path), false);
     Field[] fields = new Field[3];
     fields[0] = new Field("name", DataTypes.STRING);
     fields[1] = new Field("age", DataTypes.INT);
@@ -412,7 +412,7 @@ public class CarbonReaderTest extends TestCase {
     String path = "./testWriteFiles";
     FileUtils.deleteDirectory(new File(path));
     DataMapStoreManager.getInstance()
-        .clearDataMaps(AbsoluteTableIdentifier.from(path));
+        .clearDataMaps(AbsoluteTableIdentifier.from(path), false);
     Field[] fields = new Field[3];
     fields[0] = new Field("name", DataTypes.STRING);
     fields[1] = new Field("age", DataTypes.INT);
@@ -462,7 +462,7 @@ public class CarbonReaderTest extends TestCase {
 
     TestUtil.writeFilesAndVerify(200, new Schema(fields), path);
     DataMapStoreManager.getInstance()
-        .clearDataMaps(AbsoluteTableIdentifier.from(path));
+        .clearDataMaps(AbsoluteTableIdentifier.from(path), false);
     ColumnExpression columnExpression = new ColumnExpression("doubleField", DataTypes.DOUBLE);
     LessThanExpression lessThanExpression = new LessThanExpression(columnExpression,
         new LiteralExpression("13.5", DataTypes.DOUBLE));
@@ -498,7 +498,7 @@ public class CarbonReaderTest extends TestCase {
     String path = "./testWriteFiles";
     FileUtils.deleteDirectory(new File(path));
     DataMapStoreManager.getInstance()
-        .clearDataMaps(AbsoluteTableIdentifier.from(path));
+        .clearDataMaps(AbsoluteTableIdentifier.from(path), false);
     Field[] fields = new Field[3];
     fields[0] = new Field("name", DataTypes.STRING);
     fields[1] = new Field("age", DataTypes.INT);
@@ -543,9 +543,9 @@ public class CarbonReaderTest extends TestCase {
     FileUtils.deleteDirectory(new File(path1));
     FileUtils.deleteDirectory(new File(path2));
     DataMapStoreManager.getInstance()
-        .clearDataMaps(AbsoluteTableIdentifier.from(path1));
+        .clearDataMaps(AbsoluteTableIdentifier.from(path1), false);
     DataMapStoreManager.getInstance()
-        .clearDataMaps(AbsoluteTableIdentifier.from(path2));
+        .clearDataMaps(AbsoluteTableIdentifier.from(path2), false);
     Field[] fields = new Field[] { new Field("c1", "string"),
          new Field("c2", "int") };
     Schema schema = new Schema(fields);
@@ -607,7 +607,7 @@ public class CarbonReaderTest extends TestCase {
     String path = "./testWriteFiles";
     FileUtils.deleteDirectory(new File(path));
     DataMapStoreManager.getInstance()
-        .clearDataMaps(AbsoluteTableIdentifier.from(path));
+        .clearDataMaps(AbsoluteTableIdentifier.from(path), false);
     Field[] fields = new Field[2];
     fields[0] = new Field("name", DataTypes.STRING);
     fields[1] = new Field("age", DataTypes.INT);
@@ -644,7 +644,7 @@ public class CarbonReaderTest extends TestCase {
     String path = "./testWriteFiles";
     FileUtils.deleteDirectory(new File(path));
     DataMapStoreManager.getInstance()
-        .clearDataMaps(AbsoluteTableIdentifier.from(path));
+        .clearDataMaps(AbsoluteTableIdentifier.from(path), false);
     Field[] fields = new Field[2];
     fields[0] = new Field("name", DataTypes.STRING);
     fields[1] = new Field("age", DataTypes.INT);
@@ -680,7 +680,7 @@ public class CarbonReaderTest extends TestCase {
     String path = "./testWriteFiles";
     FileUtils.deleteDirectory(new File(path));
     DataMapStoreManager.getInstance()
-        .clearDataMaps(AbsoluteTableIdentifier.from(path));
+        .clearDataMaps(AbsoluteTableIdentifier.from(path), false);
     Field[] fields = new Field[2];
     fields[0] = new Field("name", DataTypes.STRING);
     fields[1] = new Field("age", DataTypes.INT);
@@ -721,7 +721,7 @@ public class CarbonReaderTest extends TestCase {
     String path = "./testWriteFiles";
     FileUtils.deleteDirectory(new File(path));
     DataMapStoreManager.getInstance()
-        .clearDataMaps(AbsoluteTableIdentifier.from(path));
+        .clearDataMaps(AbsoluteTableIdentifier.from(path), false);
     Field[] fields = new Field[2];
     fields[0] = new Field("name", DataTypes.STRING);
     fields[1] = new Field("age", DataTypes.INT);
@@ -751,7 +751,7 @@ public class CarbonReaderTest extends TestCase {
     String path = "./testWriteFiles";
     FileUtils.deleteDirectory(new File(path));
     DataMapStoreManager.getInstance()
-        .clearDataMaps(AbsoluteTableIdentifier.from(path));
+        .clearDataMaps(AbsoluteTableIdentifier.from(path), false);
     Field[] fields = new Field[2];
     fields[0] = new Field("name", DataTypes.STRING);
     fields[1] = new Field("age", DataTypes.INT);


[carbondata] 11/41: [CARBONDATA-3281] Add validation for the size of the LRU cache

Posted by ra...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.5
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit ffa77304a794ec5efa9eabe73a6eef68740e48fd
Author: litt <li...@126.com>
AuthorDate: Tue Jan 29 16:34:40 2019 +0800

    [CARBONDATA-3281] Add validation for the size of the LRU cache
    
    If the configured LRU cache size is bigger than the JVM Xmx size, fall back to a
    fraction of the JVM max heap (60% by default) instead. If the LRU cache is set
    bigger than the Xmx size and we query a big table with too many carbon files, it
    may cause "Error: java.io.IOException: Problem in loading segment blocks: GC
    overhead limit exceeded (state=,code=0)" and the JDBC server will restart (see
    the sketch after this message).
    
    This closes #3118
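    
    A rough illustration of the new validation (the 200000 MB value matches the new
    test in the diff below; the 60% factor is CARBON_LRU_CACHE_PERCENT_OVER_MAX_SIZE,
    and the property name in the comment is the usual driver-side setting):
    
        val bytesPerMb = 1024L * 1024L                  // mirrors BYTE_CONVERSION_CONSTANT
        val configuredLruMb = 200000L                   // e.g. carbon.max.driver.lru.cache.size
        val maxHeapBytes = Runtime.getRuntime.maxMemory
        val effectiveLruMb =
          if (configuredLruMb * bytesPerMb >= maxHeapBytes) {
            (maxHeapBytes / bytesPerMb * 0.6).toLong    // fall back to 60% of -Xmx
          } else {
            configuredLruMb
          }
        println(s"effective LRU cache size: $effectiveLruMb MB")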
---
 .../carbondata/core/cache/CarbonLRUCache.java      | 32 ++++++++++++++++++++++
 .../core/constants/CarbonCommonConstants.java      |  5 ++++
 .../carbondata/core/cache/CarbonLRUCacheTest.java  |  7 +++++
 3 files changed, 44 insertions(+)

diff --git a/core/src/main/java/org/apache/carbondata/core/cache/CarbonLRUCache.java b/core/src/main/java/org/apache/carbondata/core/cache/CarbonLRUCache.java
index 0c75173..3371d0d 100644
--- a/core/src/main/java/org/apache/carbondata/core/cache/CarbonLRUCache.java
+++ b/core/src/main/java/org/apache/carbondata/core/cache/CarbonLRUCache.java
@@ -70,6 +70,16 @@ public final class CarbonLRUCache {
           + CarbonCommonConstants.CARBON_MAX_LRU_CACHE_SIZE_DEFAULT);
       lruCacheMemorySize = Long.parseLong(defaultPropertyName);
     }
+
+    // if lru cache is bigger than jvm max heap then set part size of max heap (60% default)
+    if (isBeyondMaxMemory()) {
+      double changeSize = getPartOfXmx();
+      LOGGER.warn("Configured LRU size " + lruCacheMemorySize +
+              "MB exceeds the max size of JVM heap. Carbon will fallback to use " +
+              changeSize + " MB instead");
+      lruCacheMemorySize = (long)changeSize;
+    }
+
     initCache();
     if (lruCacheMemorySize > 0) {
       LOGGER.info("Configured LRU cache size is " + lruCacheMemorySize + " MB");
@@ -326,4 +336,26 @@ public final class CarbonLRUCache {
   public Map<String, Cacheable> getCacheMap() {
     return lruCacheMap;
   }
+
+  /**
+   * Check if LRU cache setting is bigger than max memory of jvm.
+   * if LRU cache is bigger than max memory of jvm when query for a big segments table,
+   * may cause JDBC server crash.
+   * @return true LRU cache is bigger than max memory of jvm, false otherwise
+   */
+  private boolean isBeyondMaxMemory() {
+    long mSize = Runtime.getRuntime().maxMemory();
+    long lruSize = lruCacheMemorySize * BYTE_CONVERSION_CONSTANT;
+    return lruSize >= mSize;
+  }
+
+  /**
+   * when LRU cache is bigger than max heap of jvm.
+   * set to part of  max heap size, use CARBON_LRU_CACHE_PERCENT_OVER_MAX_SIZE default 60%.
+   * @return the LRU cache size
+   */
+  private double getPartOfXmx() {
+    long mSizeMB = Runtime.getRuntime().maxMemory() / BYTE_CONVERSION_CONSTANT;
+    return mSizeMB * CarbonCommonConstants.CARBON_LRU_CACHE_PERCENT_OVER_MAX_SIZE;
+  }
 }
diff --git a/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java b/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
index f5c07a4..69374ad 100644
--- a/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
+++ b/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
@@ -1257,6 +1257,11 @@ public final class CarbonCommonConstants {
   public static final String CARBON_MAX_LRU_CACHE_SIZE_DEFAULT = "-1";
 
   /**
+   * when LRU cache if beyond the jvm max memory size,set 60% percent of max size
+   */
+  public static final double CARBON_LRU_CACHE_PERCENT_OVER_MAX_SIZE = 0.6d;
+
+  /**
    * property to enable min max during filter query
    */
   @CarbonProperty
diff --git a/core/src/test/java/org/apache/carbondata/core/cache/CarbonLRUCacheTest.java b/core/src/test/java/org/apache/carbondata/core/cache/CarbonLRUCacheTest.java
index 0493655..8ef6684 100644
--- a/core/src/test/java/org/apache/carbondata/core/cache/CarbonLRUCacheTest.java
+++ b/core/src/test/java/org/apache/carbondata/core/cache/CarbonLRUCacheTest.java
@@ -60,6 +60,13 @@ public class CarbonLRUCacheTest {
     assertNull(carbonLRUCache.get("Column2"));
   }
 
+  @Test public void testBiggerThanMaxSizeConfiguration() {
+    CarbonLRUCache carbonLRUCacheForConfig =
+            new CarbonLRUCache("prop2", "200000");//200GB
+    assertTrue(carbonLRUCacheForConfig.put("Column1", cacheable, 10L));
+    assertFalse(carbonLRUCacheForConfig.put("Column2", cacheable, 107374182400L));//100GB
+  }
+
   @AfterClass public static void cleanUp() {
     carbonLRUCache.clear();
     assertNull(carbonLRUCache.get("Column1"));