Posted to issues@carbondata.apache.org by dhatchayani <gi...@git.apache.org> on 2018/07/10 14:33:28 UTC
[GitHub] carbondata pull request #2482: [CARBONDATA-2714] Support merge index files f...
GitHub user dhatchayani opened a pull request:
https://github.com/apache/carbondata/pull/2482
[CARBONDATA-2714] Support merge index files for the segment
Completing and stabilizing the Merge Index feature.
This PR depends on [PR#2307](https://github.com/apache/carbondata/pull/2307)
- [ ] Any interfaces changed?
- [ ] Any backward compatibility impacted?
- [ ] Document update required?
- [x] Testing done
UT and SDV test cases added
- [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/dhatchayani/carbondata CARBONDATA-2714
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/carbondata/pull/2482.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #2482
----
commit e214b7b04fe28328b7d9836c7a1b8b54b3a30012
Author: dhatchayani <dh...@...>
Date: 2018-05-15T06:54:01Z
[CARBONDATA-2482] Pass uuid while writing segment file if possible
commit e434e61d7502c9e73e3e001c369e499b757474f6
Author: dhatchayani <dh...@...>
Date: 2018-07-10T14:30:41Z
[CARBONDATA-2714] Support merge index files for the segment
----
---
[GitHub] carbondata issue #2482: [CARBONDATA-2714] Support merge index files for the ...
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2482
SDV Build Fail, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5867/
---
[GitHub] carbondata issue #2482: [CARBONDATA-2714] Support merge index files for the ...
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2482
SDV Build Success, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5830/
---
[GitHub] carbondata issue #2482: [CARBONDATA-2714] Support merge index files for the ...
Posted by dhatchayani <gi...@git.apache.org>.
Github user dhatchayani commented on the issue:
https://github.com/apache/carbondata/pull/2482
retest this please
---
[GitHub] carbondata issue #2482: [CARBONDATA-2714] Support merge index files for the ...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2482
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7089/
---
[GitHub] carbondata issue #2482: [CARBONDATA-2714] Support merge index files for the ...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2482
Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7013/
---
[GitHub] carbondata issue #2482: [CARBONDATA-2714] Support merge index files for the ...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2482
Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6997/
---
[GitHub] carbondata issue #2482: [CARBONDATA-2714] Support merge index files for the ...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2482
Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7031/
---
[GitHub] carbondata issue #2482: [CARBONDATA-2714] Support merge index files for the ...
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2482
SDV Build Success, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5806/
---
[GitHub] carbondata issue #2482: [CARBONDATA-2714] Support merge index files for the ...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2482
Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5866/
---
[GitHub] carbondata pull request #2482: [CARBONDATA-2714] Support merge index files f...
Posted by chenliang613 <gi...@git.apache.org>.
Github user chenliang613 commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2482#discussion_r202602660
--- Diff: core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java ---
@@ -1871,6 +1868,16 @@
*/
public static final String CACHE_LEVEL_DEFAULT_VALUE = "BLOCK";
+ /**
+ * It is internal configuration and used only for test purpose.
--- End diff --
Internal? Then how would users enable or disable this feature?
---
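For context on the question above: elsewhere in this thread the feature is toggled through the `CARBON_MERGE_INDEX_IN_SEGMENT` property on `CarbonProperties`. Below is a minimal, illustrative sketch of that toggle pattern in Python, with a plain class standing in for the Java singleton; the key string and the default-enabled behavior are assumptions for illustration, not the verified CarbonData API.

```python
# Illustrative stand-in for org.apache.carbondata.core.util.CarbonProperties.
# The key string below is an assumption, not copied from CarbonCommonConstants.
CARBON_MERGE_INDEX_IN_SEGMENT = "carbon.merge.index.in.segment"

class CarbonPropertiesSketch:
    """Singleton-style property store, mirroring addProperty/getProperty usage."""
    _instance = None

    def __init__(self):
        self._props = {}

    @classmethod
    def get_instance(cls):
        if cls._instance is None:
            cls._instance = cls()
        return cls._instance

    def add_property(self, key, value):
        self._props[key] = value
        return self

    def get_property(self, key, default="true"):
        # Assumption: merge index is enabled by default when the property is unset.
        return self._props.get(key, default)

props = CarbonPropertiesSketch.get_instance()
props.add_property(CARBON_MERGE_INDEX_IN_SEGMENT, "false")  # disable for a load
assert props.get_property(CARBON_MERGE_INDEX_IN_SEGMENT) == "false"
props.add_property(CARBON_MERGE_INDEX_IN_SEGMENT, "true")   # re-enable
```

This mirrors how the quoted tests flip the property around individual loads and compactions.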
[GitHub] carbondata issue #2482: [CARBONDATA-2714] Support merge index files for the ...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2482
Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5845/
---
[GitHub] carbondata pull request #2482: [CARBONDATA-2714] Support merge index files f...
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2482#discussion_r202298843
--- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/events/AlterTableMergeIndexEventListener.scala ---
@@ -0,0 +1,95 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.events
+
+import java.util
+
+import scala.collection.JavaConverters._
+import scala.collection.mutable
+
+import org.apache.spark.internal.Logging
+import org.apache.spark.sql.util.CarbonException
+
+import org.apache.carbondata.common.logging.{LogService, LogServiceFactory}
+import org.apache.carbondata.core.datamap.Segment
+import org.apache.carbondata.core.locks.{CarbonLockFactory, LockUsage}
+import org.apache.carbondata.core.statusmanager.SegmentStatusManager
+import org.apache.carbondata.events.{AlterTableMergeIndexEvent, Event, OperationContext, OperationEventListener}
+import org.apache.carbondata.processing.merger.CarbonDataMergerUtil
+import org.apache.carbondata.spark.util.CommonUtil
+
+class AlterTableMergeIndexEventListener extends OperationEventListener with Logging {
--- End diff --
This listener can also be merged into `MergeIndexEventListener` as another `case`.
---
[GitHub] carbondata pull request #2482: [CARBONDATA-2714] Support merge index files f...
Posted by xubo245 <gi...@git.apache.org>.
Github user xubo245 commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2482#discussion_r237740164
--- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/datacompaction/CarbonIndexFileMergeTestCase.scala ---
@@ -215,43 +249,215 @@ class CarbonIndexFileMergeTestCase
Assert
.assertEquals(getIndexOrMergeIndexFileSize(table, "0", CarbonTablePath.INDEX_FILE_EXT),
segment0.head.getIndexSize.toLong)
- new CarbonIndexFileMergeWriter(table)
- .mergeCarbonIndexFilesOfSegment("0", table.getTablePath, false, String.valueOf(System.currentTimeMillis()))
+ sql("Alter table fileSize compact 'segment_index'")
loadMetadataDetails = SegmentStatusManager
.readTableStatusFile(CarbonTablePath.getTableStatusFilePath(table.getTablePath))
segment0 = loadMetadataDetails.filter(x=> x.getLoadName.equalsIgnoreCase("0"))
Assert
.assertEquals(getIndexOrMergeIndexFileSize(table, "0", CarbonTablePath.MERGE_INDEX_FILE_EXT),
segment0.head.getIndexSize.toLong)
+ CarbonProperties.getInstance().addProperty(CarbonCommonConstants.CARBON_MERGE_INDEX_IN_SEGMENT, "true")
sql("DROP TABLE IF EXISTS fileSize")
}
- private def getIndexFileCount(tableName: String, segmentNo: String): Int = {
- val carbonTable = CarbonMetadata.getInstance().getCarbonTable(tableName)
- val segmentDir = CarbonTablePath.getSegmentPath(carbonTable.getTablePath, segmentNo)
- if (FileFactory.isFileExist(segmentDir)) {
- val indexFiles = new SegmentIndexFileStore().getIndexFilesFromSegment(segmentDir)
- indexFiles.asScala.map { f =>
- if (f._2 == null) {
- 1
- } else {
- 0
- }
- }.sum
- } else {
- val segment = Segment.getSegment(segmentNo, carbonTable.getTablePath)
- if (segment != null) {
- val store = new SegmentFileStore(carbonTable.getTablePath, segment.getSegmentFileName)
- store.getSegmentFile.getLocationMap.values().asScala.map { f =>
- if (f.getMergeFileName == null) {
- f.getFiles.size()
- } else {
- 0
- }
- }.sum
- } else {
- 0
+ test("Verify index merge for compacted segments MINOR - level 2") {
+ CarbonProperties.getInstance()
+ .addProperty(CarbonCommonConstants.COMPACTION_SEGMENT_LEVEL_THRESHOLD, "2,2")
+ CarbonProperties.getInstance()
+ .addProperty(CarbonCommonConstants.CARBON_MERGE_INDEX_IN_SEGMENT, "false")
+ sql("DROP TABLE IF EXISTS nonindexmerge")
+ sql(
+ """
+ | CREATE TABLE nonindexmerge(id INT, name STRING, city STRING, age INT)
+ | STORED BY 'org.apache.carbondata.format'
+ | TBLPROPERTIES('SORT_COLUMNS'='city,name', 'SORT_SCOPE'='GLOBAL_SORT')
+ """.stripMargin)
+ sql(s"LOAD DATA LOCAL INPATH '$file2' INTO TABLE nonindexmerge OPTIONS('header'='false', " +
+ s"'GLOBAL_SORT_PARTITIONS'='100')")
+ sql(s"LOAD DATA LOCAL INPATH '$file2' INTO TABLE nonindexmerge OPTIONS('header'='false', " +
+ s"'GLOBAL_SORT_PARTITIONS'='100')")
+ sql(s"LOAD DATA LOCAL INPATH '$file2' INTO TABLE nonindexmerge OPTIONS('header'='false', " +
+ s"'GLOBAL_SORT_PARTITIONS'='100')")
+ sql(s"LOAD DATA LOCAL INPATH '$file2' INTO TABLE nonindexmerge OPTIONS('header'='false', " +
+ s"'GLOBAL_SORT_PARTITIONS'='100')")
+ val rows = sql("""Select count(*) from nonindexmerge""").collect()
+ assert(getIndexFileCount("default_nonindexmerge", "0") == 100)
+ assert(getIndexFileCount("default_nonindexmerge", "1") == 100)
+ assert(getIndexFileCount("default_nonindexmerge", "2") == 100)
+ assert(getIndexFileCount("default_nonindexmerge", "3") == 100)
+ CarbonProperties.getInstance()
+ .addProperty(CarbonCommonConstants.CARBON_MERGE_INDEX_IN_SEGMENT, "true")
+ sql("ALTER TABLE nonindexmerge COMPACT 'minor'").collect()
+ assert(getIndexFileCount("default_nonindexmerge", "0") == 100)
+ assert(getIndexFileCount("default_nonindexmerge", "1") == 100)
+ assert(getIndexFileCount("default_nonindexmerge", "2") == 100)
+ assert(getIndexFileCount("default_nonindexmerge", "3") == 100)
+ assert(getIndexFileCount("default_nonindexmerge", "0.1") == 100)
+ assert(getIndexFileCount("default_nonindexmerge", "2.1") == 100)
+ assert(getIndexFileCount("default_nonindexmerge", "0.2") == 0)
+ checkAnswer(sql("""Select count(*) from nonindexmerge"""), rows)
+ }
+
+ test("Verify index merge for compacted segments Auto Compaction") {
+ CarbonProperties.getInstance()
+ .addProperty(CarbonCommonConstants.COMPACTION_SEGMENT_LEVEL_THRESHOLD, "2,3")
+ CarbonProperties.getInstance()
+ .addProperty(CarbonCommonConstants.CARBON_MERGE_INDEX_IN_SEGMENT, "false")
+ sql("DROP TABLE IF EXISTS nonindexmerge")
+ sql(
+ """
+ | CREATE TABLE nonindexmerge(id INT, name STRING, city STRING, age INT)
+ | STORED BY 'org.apache.carbondata.format'
+ | TBLPROPERTIES('SORT_COLUMNS'='city,name', 'SORT_SCOPE'='GLOBAL_SORT')
+ """.stripMargin)
+ sql(s"LOAD DATA LOCAL INPATH '$file2' INTO TABLE nonindexmerge OPTIONS('header'='false', " +
+ s"'GLOBAL_SORT_PARTITIONS'='100')")
+ sql(s"LOAD DATA LOCAL INPATH '$file2' INTO TABLE nonindexmerge OPTIONS('header'='false', " +
+ s"'GLOBAL_SORT_PARTITIONS'='100')")
+ sql(s"LOAD DATA LOCAL INPATH '$file2' INTO TABLE nonindexmerge OPTIONS('header'='false', " +
+ s"'GLOBAL_SORT_PARTITIONS'='100')")
+ sql(s"LOAD DATA LOCAL INPATH '$file2' INTO TABLE nonindexmerge OPTIONS('header'='false', " +
+ s"'GLOBAL_SORT_PARTITIONS'='100')")
+ val rows = sql("""Select count(*) from nonindexmerge""").collect()
+ assert(getIndexFileCount("default_nonindexmerge", "0") == 100)
+ assert(getIndexFileCount("default_nonindexmerge", "1") == 100)
+ assert(getIndexFileCount("default_nonindexmerge", "2") == 100)
+ assert(getIndexFileCount("default_nonindexmerge", "3") == 100)
+ CarbonProperties.getInstance()
+ .addProperty(CarbonCommonConstants.CARBON_MERGE_INDEX_IN_SEGMENT, "true")
+ CarbonProperties.getInstance()
+ .addProperty(CarbonCommonConstants.DEFAULT_ENABLE_AUTO_LOAD_MERGE, "true")
+ sql(s"LOAD DATA LOCAL INPATH '$file2' INTO TABLE nonindexmerge OPTIONS('header'='false', " +
+ s"'GLOBAL_SORT_PARTITIONS'='100')"
+ )
+ assert(getIndexFileCount("default_nonindexmerge", "0") == 100)
+ assert(getIndexFileCount("default_nonindexmerge", "1") == 100)
+ assert(getIndexFileCount("default_nonindexmerge", "2") == 100)
+ assert(getIndexFileCount("default_nonindexmerge", "3") == 100)
+ assert(getIndexFileCount("default_nonindexmerge", "4") == 0)
+ assert(getIndexFileCount("default_nonindexmerge", "0.1") == 0)
+ assert(getIndexFileCount("default_nonindexmerge", "2.1") == 0)
+ checkAnswer(sql("""Select count(*) from nonindexmerge"""), Seq(Row(3000000)))
+ CarbonProperties.getInstance()
+ .addProperty(CarbonCommonConstants.DEFAULT_ENABLE_AUTO_LOAD_MERGE, "false")
+ }
+
+ test("Verify index merge for compacted segments Auto Compaction - level 2") {
+ CarbonProperties.getInstance()
+ .addProperty(CarbonCommonConstants.COMPACTION_SEGMENT_LEVEL_THRESHOLD, "2,2")
+ CarbonProperties.getInstance()
+ .addProperty(CarbonCommonConstants.CARBON_MERGE_INDEX_IN_SEGMENT, "false")
+ sql("DROP TABLE IF EXISTS nonindexmerge")
+ sql(
+ """
+ | CREATE TABLE nonindexmerge(id INT, name STRING, city STRING, age INT)
+ | STORED BY 'org.apache.carbondata.format'
+ | TBLPROPERTIES('SORT_COLUMNS'='city,name', 'SORT_SCOPE'='GLOBAL_SORT')
+ """.stripMargin)
+ sql(s"LOAD DATA LOCAL INPATH '$file2' INTO TABLE nonindexmerge OPTIONS('header'='false', " +
+ s"'GLOBAL_SORT_PARTITIONS'='100')")
+ sql(s"LOAD DATA LOCAL INPATH '$file2' INTO TABLE nonindexmerge OPTIONS('header'='false', " +
+ s"'GLOBAL_SORT_PARTITIONS'='100')")
+ sql(s"LOAD DATA LOCAL INPATH '$file2' INTO TABLE nonindexmerge OPTIONS('header'='false', " +
+ s"'GLOBAL_SORT_PARTITIONS'='100')")
+ sql(s"LOAD DATA LOCAL INPATH '$file2' INTO TABLE nonindexmerge OPTIONS('header'='false', " +
+ s"'GLOBAL_SORT_PARTITIONS'='100')")
+ val rows = sql("""Select count(*) from nonindexmerge""").collect()
+ assert(getIndexFileCount("default_nonindexmerge", "0") == 100)
+ assert(getIndexFileCount("default_nonindexmerge", "1") == 100)
+ assert(getIndexFileCount("default_nonindexmerge", "2") == 100)
+ assert(getIndexFileCount("default_nonindexmerge", "3") == 100)
+ CarbonProperties.getInstance()
+ .addProperty(CarbonCommonConstants.CARBON_MERGE_INDEX_IN_SEGMENT, "true")
+ CarbonProperties.getInstance()
+ .addProperty(CarbonCommonConstants.DEFAULT_ENABLE_AUTO_LOAD_MERGE, "true")
--- End diff --
Why is the key CarbonCommonConstants.DEFAULT_ENABLE_AUTO_LOAD_MERGE? DEFAULT_ENABLE_AUTO_LOAD_MERGE should be the value, not the key.
---
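The reviewer's point is that, by the naming convention in `CarbonCommonConstants`, a `DEFAULT_*` constant holds a property's default *value*, while a separate constant holds the property *key*; passing the default-value constant as the key silently stores the setting under the wrong name. A hedged sketch of the distinction, with both constant values assumed from the usual convention rather than verified against the source:

```python
# Assumed constants, following the CarbonCommonConstants naming convention:
# *_DEFAULT / DEFAULT_* constants hold default values, not property keys.
ENABLE_AUTO_LOAD_MERGE = "carbon.enable.auto.load.merge"   # the property key (assumed)
DEFAULT_ENABLE_AUTO_LOAD_MERGE = "false"                   # the default value (assumed)

props = {}

# What the quoted test does: uses the default-value constant as the key,
# which stores the setting under the literal string "false" -- a silent no-op.
props[DEFAULT_ENABLE_AUTO_LOAD_MERGE] = "true"

# What the reviewer suggests was intended:
props[ENABLE_AUTO_LOAD_MERGE] = "true"

assert props.get(ENABLE_AUTO_LOAD_MERGE) == "true"
assert props.get("false") == "true"  # stray entry left by the buggy call
```

The bug would go unnoticed because setting an unknown key does not fail; the real property simply keeps its default.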
[GitHub] carbondata issue #2482: [CARBONDATA-2714] Support merge index files for the ...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2482
Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5952/
---
[GitHub] carbondata issue #2482: [CARBONDATA-2714] Support merge index files for the ...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2482
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7063/
---
[GitHub] carbondata pull request #2482: [CARBONDATA-2714] Support merge index files f...
Posted by xubo245 <gi...@git.apache.org>.
Github user xubo245 commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2482#discussion_r237740134
--- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/datacompaction/CarbonIndexFileMergeTestCase.scala ---
@@ -215,43 +249,215 @@ class CarbonIndexFileMergeTestCase
+ CarbonProperties.getInstance()
+ .addProperty(CarbonCommonConstants.CARBON_MERGE_INDEX_IN_SEGMENT, "true")
+ CarbonProperties.getInstance()
+ .addProperty(CarbonCommonConstants.DEFAULT_ENABLE_AUTO_LOAD_MERGE, "true")
+ sql(s"LOAD DATA LOCAL INPATH '$file2' INTO TABLE nonindexmerge OPTIONS('header'='false', " +
+ s"'GLOBAL_SORT_PARTITIONS'='100')"
+ )
+ assert(getIndexFileCount("default_nonindexmerge", "0") == 100)
+ assert(getIndexFileCount("default_nonindexmerge", "1") == 100)
+ assert(getIndexFileCount("default_nonindexmerge", "2") == 100)
+ assert(getIndexFileCount("default_nonindexmerge", "3") == 100)
+ assert(getIndexFileCount("default_nonindexmerge", "4") == 0)
+ assert(getIndexFileCount("default_nonindexmerge", "0.1") == 0)
+ assert(getIndexFileCount("default_nonindexmerge", "2.1") == 0)
+ checkAnswer(sql("""Select count(*) from nonindexmerge"""), Seq(Row(3000000)))
+ CarbonProperties.getInstance()
+ .addProperty(CarbonCommonConstants.DEFAULT_ENABLE_AUTO_LOAD_MERGE, "false")
--- End diff --
Why is the key CarbonCommonConstants.DEFAULT_ENABLE_AUTO_LOAD_MERGE? DEFAULT_ENABLE_AUTO_LOAD_MERGE should be the value, not the key.
---
[GitHub] carbondata issue #2482: [CARBONDATA-2714] Support merge index files for the ...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2482
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7202/
---
[GitHub] carbondata pull request #2482: [CARBONDATA-2714] Support merge index files f...
Posted by xubo245 <gi...@git.apache.org>.
Github user xubo245 commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2482#discussion_r237740112
--- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/datacompaction/CarbonIndexFileMergeTestCase.scala ---
@@ -215,43 +249,215 @@ class CarbonIndexFileMergeTestCase
+ CarbonProperties.getInstance()
+ .addProperty(CarbonCommonConstants.CARBON_MERGE_INDEX_IN_SEGMENT, "true")
+ CarbonProperties.getInstance()
+ .addProperty(CarbonCommonConstants.DEFAULT_ENABLE_AUTO_LOAD_MERGE, "true")
--- End diff --
Why is the key CarbonCommonConstants.DEFAULT_ENABLE_AUTO_LOAD_MERGE? DEFAULT_ENABLE_AUTO_LOAD_MERGE should be the value, not the key.
---
[GitHub] carbondata issue #2482: [CARBONDATA-2714] Support merge index files for the ...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2482
Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5829/
---
[GitHub] carbondata issue #2482: [CARBONDATA-2714] Support merge index files for the ...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2482
Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5849/
---
[GitHub] carbondata issue #2482: [CARBONDATA-2714] Support merge index files for the ...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2482
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7176/
---
[GitHub] carbondata issue #2482: [CARBONDATA-2714] Support merge index files for the ...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2482
Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5825/
---
[GitHub] carbondata issue #2482: [CARBONDATA-2714] Support merge index files for the ...
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2482
SDV Build Fail, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5765/
---
[GitHub] carbondata issue #2482: [CARBONDATA-2714] Support merge index files for the ...
Posted by chenliang613 <gi...@git.apache.org>.
Github user chenliang613 commented on the issue:
https://github.com/apache/carbondata/pull/2482
@dhatchayani Can you provide more detail? Does this PR include both level 1 and level 2 (merging index files within each segment, and across segments), as per the mailing list discussion: http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/Discussion-Merging-carbonindex-files-for-each-segments-and-across-segments-td24441.html
---
[GitHub] carbondata issue #2482: [CARBONDATA-2714] Support merge index files for the ...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2482
Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5975/
---
[GitHub] carbondata pull request #2482: [CARBONDATA-2714] Support merge index files f...
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2482#discussion_r202301535
--- Diff: core/src/main/java/org/apache/carbondata/core/metadata/SegmentFileStore.java ---
@@ -488,6 +485,49 @@ private void readIndexFiles(SegmentStatus status, boolean ignoreStatus) throws I
}
}
+ /**
+ * Reads all merge index / index files according to the segment status.
+ * If @ignoreStatus is true, it reads all merge index / index files
+ * regardless of status.
+ *
+ * @param status segment status to filter the files by
+ * @param ignoreStatus when true, read all files irrespective of status
+ * @return list of index file paths
+ * @throws IOException
+ */
+ private List<String> readIndexOrMergeFiles(SegmentStatus status, boolean ignoreStatus)
--- End diff --
why do you need this method, Already merge files are available in `SegmentIndexFileStore.getCarbonMergeFileToIndexFilesMap`
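For readers following this review thread: the mapping the comment refers to associates each merge index file with the original `.carbonindex` files it covers. The snippet below is a hand-filled sketch of the shape such a map might take — it is not the actual `SegmentIndexFileStore` API, and the file names are hypothetical.

```java
import java.util.*;

// Illustrative sketch of the mergefile -> indexfiles mapping the review
// comment refers to. The real SegmentIndexFileStore fills this while
// reading a segment; here it is populated by hand purely for illustration.
public class MergeMapSketch {

  // Flattens the map back to the full list of physical .carbonindex files.
  public static List<String> allIndexFiles(Map<String, List<String>> mergeToIndex) {
    List<String> all = new ArrayList<>();
    for (List<String> files : mergeToIndex.values()) {
      all.addAll(files);
    }
    return all;
  }

  public static void main(String[] args) {
    Map<String, List<String>> mergeToIndex = new LinkedHashMap<>();
    // One merge index file covering two original index files (hypothetical names).
    mergeToIndex.put("0_batchno0.carbonindexmerge",
        Arrays.asList("part-0-0.carbonindex", "part-1-0.carbonindex"));
    System.out.println(allIndexFiles(mergeToIndex).size()); // prints 2
  }
}
```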
---
[GitHub] carbondata issue #2482: [CARBONDATA-2714] Support merge index files for the ...
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2482
LGTM
---
[GitHub] carbondata issue #2482: [CARBONDATA-2714] Support merge index files for the ...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2482
Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7048/
---
[GitHub] carbondata issue #2482: [CARBONDATA-2714] Support merge index files for the ...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2482
Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5777/
---
[GitHub] carbondata issue #2482: [CARBONDATA-2714] Support merge index files for the ...
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2482
SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5757/
---
[GitHub] carbondata pull request #2482: [CARBONDATA-2714] Support merge index files f...
Posted by xubo245 <gi...@git.apache.org>.
Github user xubo245 commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2482#discussion_r237739793
--- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/datacompaction/CarbonIndexFileMergeTestCase.scala ---
@@ -215,43 +249,215 @@ class CarbonIndexFileMergeTestCase
Assert
.assertEquals(getIndexOrMergeIndexFileSize(table, "0", CarbonTablePath.INDEX_FILE_EXT),
segment0.head.getIndexSize.toLong)
- new CarbonIndexFileMergeWriter(table)
- .mergeCarbonIndexFilesOfSegment("0", table.getTablePath, false, String.valueOf(System.currentTimeMillis()))
+ sql("Alter table fileSize compact 'segment_index'")
loadMetadataDetails = SegmentStatusManager
.readTableStatusFile(CarbonTablePath.getTableStatusFilePath(table.getTablePath))
segment0 = loadMetadataDetails.filter(x=> x.getLoadName.equalsIgnoreCase("0"))
Assert
.assertEquals(getIndexOrMergeIndexFileSize(table, "0", CarbonTablePath.MERGE_INDEX_FILE_EXT),
segment0.head.getIndexSize.toLong)
+ CarbonProperties.getInstance().addProperty(CarbonCommonConstants.CARBON_MERGE_INDEX_IN_SEGMENT, "true")
sql("DROP TABLE IF EXISTS fileSize")
}
- private def getIndexFileCount(tableName: String, segmentNo: String): Int = {
- val carbonTable = CarbonMetadata.getInstance().getCarbonTable(tableName)
- val segmentDir = CarbonTablePath.getSegmentPath(carbonTable.getTablePath, segmentNo)
- if (FileFactory.isFileExist(segmentDir)) {
- val indexFiles = new SegmentIndexFileStore().getIndexFilesFromSegment(segmentDir)
- indexFiles.asScala.map { f =>
- if (f._2 == null) {
- 1
- } else {
- 0
- }
- }.sum
- } else {
- val segment = Segment.getSegment(segmentNo, carbonTable.getTablePath)
- if (segment != null) {
- val store = new SegmentFileStore(carbonTable.getTablePath, segment.getSegmentFileName)
- store.getSegmentFile.getLocationMap.values().asScala.map { f =>
- if (f.getMergeFileName == null) {
- f.getFiles.size()
- } else {
- 0
- }
- }.sum
- } else {
- 0
+ test("Verify index merge for compacted segments MINOR - level 2") {
+ CarbonProperties.getInstance()
+ .addProperty(CarbonCommonConstants.COMPACTION_SEGMENT_LEVEL_THRESHOLD, "2,2")
+ CarbonProperties.getInstance()
+ .addProperty(CarbonCommonConstants.CARBON_MERGE_INDEX_IN_SEGMENT, "false")
+ sql("DROP TABLE IF EXISTS nonindexmerge")
+ sql(
+ """
+ | CREATE TABLE nonindexmerge(id INT, name STRING, city STRING, age INT)
+ | STORED BY 'org.apache.carbondata.format'
+ | TBLPROPERTIES('SORT_COLUMNS'='city,name', 'SORT_SCOPE'='GLOBAL_SORT')
+ """.stripMargin)
+ sql(s"LOAD DATA LOCAL INPATH '$file2' INTO TABLE nonindexmerge OPTIONS('header'='false', " +
+ s"'GLOBAL_SORT_PARTITIONS'='100')")
+ sql(s"LOAD DATA LOCAL INPATH '$file2' INTO TABLE nonindexmerge OPTIONS('header'='false', " +
+ s"'GLOBAL_SORT_PARTITIONS'='100')")
+ sql(s"LOAD DATA LOCAL INPATH '$file2' INTO TABLE nonindexmerge OPTIONS('header'='false', " +
+ s"'GLOBAL_SORT_PARTITIONS'='100')")
+ sql(s"LOAD DATA LOCAL INPATH '$file2' INTO TABLE nonindexmerge OPTIONS('header'='false', " +
+ s"'GLOBAL_SORT_PARTITIONS'='100')")
+ val rows = sql("""Select count(*) from nonindexmerge""").collect()
+ assert(getIndexFileCount("default_nonindexmerge", "0") == 100)
+ assert(getIndexFileCount("default_nonindexmerge", "1") == 100)
+ assert(getIndexFileCount("default_nonindexmerge", "2") == 100)
+ assert(getIndexFileCount("default_nonindexmerge", "3") == 100)
+ CarbonProperties.getInstance()
+ .addProperty(CarbonCommonConstants.CARBON_MERGE_INDEX_IN_SEGMENT, "true")
+ sql("ALTER TABLE nonindexmerge COMPACT 'minor'").collect()
+ assert(getIndexFileCount("default_nonindexmerge", "0") == 100)
+ assert(getIndexFileCount("default_nonindexmerge", "1") == 100)
+ assert(getIndexFileCount("default_nonindexmerge", "2") == 100)
+ assert(getIndexFileCount("default_nonindexmerge", "3") == 100)
+ assert(getIndexFileCount("default_nonindexmerge", "0.1") == 100)
+ assert(getIndexFileCount("default_nonindexmerge", "2.1") == 100)
+ assert(getIndexFileCount("default_nonindexmerge", "0.2") == 0)
+ checkAnswer(sql("""Select count(*) from nonindexmerge"""), rows)
+ }
+
+ test("Verify index merge for compacted segments Auto Compaction") {
+ CarbonProperties.getInstance()
+ .addProperty(CarbonCommonConstants.COMPACTION_SEGMENT_LEVEL_THRESHOLD, "2,3")
+ CarbonProperties.getInstance()
+ .addProperty(CarbonCommonConstants.CARBON_MERGE_INDEX_IN_SEGMENT, "false")
+ sql("DROP TABLE IF EXISTS nonindexmerge")
+ sql(
+ """
+ | CREATE TABLE nonindexmerge(id INT, name STRING, city STRING, age INT)
+ | STORED BY 'org.apache.carbondata.format'
+ | TBLPROPERTIES('SORT_COLUMNS'='city,name', 'SORT_SCOPE'='GLOBAL_SORT')
+ """.stripMargin)
+ sql(s"LOAD DATA LOCAL INPATH '$file2' INTO TABLE nonindexmerge OPTIONS('header'='false', " +
+ s"'GLOBAL_SORT_PARTITIONS'='100')")
+ sql(s"LOAD DATA LOCAL INPATH '$file2' INTO TABLE nonindexmerge OPTIONS('header'='false', " +
+ s"'GLOBAL_SORT_PARTITIONS'='100')")
+ sql(s"LOAD DATA LOCAL INPATH '$file2' INTO TABLE nonindexmerge OPTIONS('header'='false', " +
+ s"'GLOBAL_SORT_PARTITIONS'='100')")
+ sql(s"LOAD DATA LOCAL INPATH '$file2' INTO TABLE nonindexmerge OPTIONS('header'='false', " +
+ s"'GLOBAL_SORT_PARTITIONS'='100')")
+ val rows = sql("""Select count(*) from nonindexmerge""").collect()
+ assert(getIndexFileCount("default_nonindexmerge", "0") == 100)
+ assert(getIndexFileCount("default_nonindexmerge", "1") == 100)
+ assert(getIndexFileCount("default_nonindexmerge", "2") == 100)
+ assert(getIndexFileCount("default_nonindexmerge", "3") == 100)
+ CarbonProperties.getInstance()
+ .addProperty(CarbonCommonConstants.CARBON_MERGE_INDEX_IN_SEGMENT, "true")
+ CarbonProperties.getInstance()
+ .addProperty(CarbonCommonConstants.DEFAULT_ENABLE_AUTO_LOAD_MERGE, "true")
+ sql(s"LOAD DATA LOCAL INPATH '$file2' INTO TABLE nonindexmerge OPTIONS('header'='false', " +
+ s"'GLOBAL_SORT_PARTITIONS'='100')"
+ )
+ assert(getIndexFileCount("default_nonindexmerge", "0") == 100)
+ assert(getIndexFileCount("default_nonindexmerge", "1") == 100)
+ assert(getIndexFileCount("default_nonindexmerge", "2") == 100)
+ assert(getIndexFileCount("default_nonindexmerge", "3") == 100)
+ assert(getIndexFileCount("default_nonindexmerge", "4") == 0)
+ assert(getIndexFileCount("default_nonindexmerge", "0.1") == 0)
+ assert(getIndexFileCount("default_nonindexmerge", "2.1") == 0)
+ checkAnswer(sql("""Select count(*) from nonindexmerge"""), Seq(Row(3000000)))
+ CarbonProperties.getInstance()
+ .addProperty(CarbonCommonConstants.DEFAULT_ENABLE_AUTO_LOAD_MERGE, "false")
+ }
+
+ test("Verify index merge for compacted segments Auto Compaction - level 2") {
+ CarbonProperties.getInstance()
+ .addProperty(CarbonCommonConstants.COMPACTION_SEGMENT_LEVEL_THRESHOLD, "2,2")
+ CarbonProperties.getInstance()
+ .addProperty(CarbonCommonConstants.CARBON_MERGE_INDEX_IN_SEGMENT, "false")
+ sql("DROP TABLE IF EXISTS nonindexmerge")
+ sql(
+ """
+ | CREATE TABLE nonindexmerge(id INT, name STRING, city STRING, age INT)
+ | STORED BY 'org.apache.carbondata.format'
+ | TBLPROPERTIES('SORT_COLUMNS'='city,name', 'SORT_SCOPE'='GLOBAL_SORT')
+ """.stripMargin)
+ sql(s"LOAD DATA LOCAL INPATH '$file2' INTO TABLE nonindexmerge OPTIONS('header'='false', " +
+ s"'GLOBAL_SORT_PARTITIONS'='100')")
+ sql(s"LOAD DATA LOCAL INPATH '$file2' INTO TABLE nonindexmerge OPTIONS('header'='false', " +
+ s"'GLOBAL_SORT_PARTITIONS'='100')")
+ sql(s"LOAD DATA LOCAL INPATH '$file2' INTO TABLE nonindexmerge OPTIONS('header'='false', " +
+ s"'GLOBAL_SORT_PARTITIONS'='100')")
+ sql(s"LOAD DATA LOCAL INPATH '$file2' INTO TABLE nonindexmerge OPTIONS('header'='false', " +
+ s"'GLOBAL_SORT_PARTITIONS'='100')")
+ val rows = sql("""Select count(*) from nonindexmerge""").collect()
+ assert(getIndexFileCount("default_nonindexmerge", "0") == 100)
+ assert(getIndexFileCount("default_nonindexmerge", "1") == 100)
+ assert(getIndexFileCount("default_nonindexmerge", "2") == 100)
+ assert(getIndexFileCount("default_nonindexmerge", "3") == 100)
+ CarbonProperties.getInstance()
+ .addProperty(CarbonCommonConstants.CARBON_MERGE_INDEX_IN_SEGMENT, "true")
+ CarbonProperties.getInstance()
+ .addProperty(CarbonCommonConstants.DEFAULT_ENABLE_AUTO_LOAD_MERGE, "true")
--- End diff --
Why is the key CarbonCommonConstants.DEFAULT_ENABLE_AUTO_LOAD_MERGE?
---
[GitHub] carbondata issue #2482: [CARBONDATA-2714] Support merge index files for the ...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2482
Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7052/
---
[GitHub] carbondata issue #2482: [CARBONDATA-2714] Support merge index files for the ...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2482
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7072/
---
[GitHub] carbondata pull request #2482: [CARBONDATA-2714] Support merge index files f...
Posted by dhatchayani <gi...@git.apache.org>.
Github user dhatchayani commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2482#discussion_r202608397
--- Diff: core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java ---
@@ -1871,6 +1868,16 @@
*/
public static final String CACHE_LEVEL_DEFAULT_VALUE = "BLOCK";
+ /**
+ * It is an internal configuration used only for test purposes.
--- End diff --
We are not exposing this to the user. For the user, this will always be TRUE; the user should not change it.
---
[GitHub] carbondata issue #2482: [CARBONDATA-2714] Support merge index files for the ...
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2482
SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5868/
---
[GitHub] carbondata issue #2482: [CARBONDATA-2714] Support merge index files for the ...
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2482
SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5866/
---
[GitHub] carbondata issue #2482: [CARBONDATA-2714] Support merge index files for the ...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2482
Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7201/
---
[GitHub] carbondata pull request #2482: [CARBONDATA-2714] Support merge index files f...
Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:
https://github.com/apache/carbondata/pull/2482
---
[GitHub] carbondata issue #2482: [CARBONDATA-2714] Support merge index files for the ...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2482
Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5814/
---
[GitHub] carbondata issue #2482: [CARBONDATA-2714] Support merge index files for the ...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2482
Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5978/
---
[GitHub] carbondata issue #2482: [CARBONDATA-2714] Support merge index files for the ...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2482
Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7056/
---
[GitHub] carbondata issue #2482: [CARBONDATA-2714] Support merge index files for the ...
Posted by dhatchayani <gi...@git.apache.org>.
Github user dhatchayani commented on the issue:
https://github.com/apache/carbondata/pull/2482
retest this please
---
[GitHub] carbondata issue #2482: [CARBONDATA-2714] Support merge index files for the ...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2482
Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5790/
---
[GitHub] carbondata issue #2482: [CARBONDATA-2714] Support merge index files for the ...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2482
Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5838/
---
[GitHub] carbondata issue #2482: [CARBONDATA-2714] Support merge index files for the ...
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2482
SDV Build Failed, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5791/
---
[GitHub] carbondata issue #2482: [CARBONDATA-2714] Support merge index files for the ...
Posted by dhatchayani <gi...@git.apache.org>.
Github user dhatchayani commented on the issue:
https://github.com/apache/carbondata/pull/2482
@chenliang613 We are supporting only level 1 (within a segment), as discussed in the community. Level 2 (across segments) is not supported, as that is already taken care of by compaction.
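To make the level-1 idea concrete: within a segment, many small per-block index files are combined into a single merge file, while a name-to-(offset, length) map preserves the ability to read each original index back individually. The sketch below illustrates that mechanism only — it is not CarbonData's actual merge index file format or its `CarbonIndexFileMergeWriter` implementation, and all file names are made up.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.*;

// Minimal illustration of a level-1 (within-segment) index merge:
// several small index files are concatenated into one merge file,
// with a name -> {offset, length} map so each original index can
// still be located inside the merged file.
public class IndexMergeSketch {

  public static Map<String, long[]> merge(List<Path> indexFiles, Path mergeFile)
      throws IOException {
    Map<String, long[]> locations = new LinkedHashMap<>();
    long offset = 0;
    try (OutputStream out = Files.newOutputStream(mergeFile)) {
      for (Path f : indexFiles) {
        byte[] content = Files.readAllBytes(f);
        out.write(content);
        locations.put(f.getFileName().toString(), new long[] {offset, content.length});
        offset += content.length;
      }
    }
    return locations;
  }

  public static void main(String[] args) throws IOException {
    Path dir = Files.createTempDirectory("segment_0");
    List<Path> indexFiles = new ArrayList<>();
    for (int i = 0; i < 3; i++) {
      Path f = dir.resolve("part-" + i + ".carbonindex");
      Files.write(f, ("index-" + i).getBytes());
      indexFiles.add(f);
    }
    Map<String, long[]> map = merge(indexFiles, dir.resolve("0.carbonindexmerge"));
    System.out.println(map.size());                        // prints 3
    System.out.println(map.get("part-1.carbonindex")[0]);  // prints 7 (offset of 2nd entry)
  }
}
```

After such a merge, the per-file `.carbonindex` entries can be deleted, which is why the tests above expect an index file count of 0 once `CARBON_MERGE_INDEX_IN_SEGMENT` is in effect.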
---
[GitHub] carbondata issue #2482: [CARBONDATA-2714] Support merge index files for the ...
Posted by dhatchayani <gi...@git.apache.org>.
Github user dhatchayani commented on the issue:
https://github.com/apache/carbondata/pull/2482
retest this please
---