Posted to issues@carbondata.apache.org by xubo245 <gi...@git.apache.org> on 2017/09/15 14:59:12 UTC
[GitHub] carbondata pull request #1361: [CARBONDATA-1481]Compaction support global so...
GitHub user xubo245 opened a pull request:
https://github.com/apache/carbondata/pull/1361
[CARBONDATA-1481]Compaction support global sort
We should add test cases and evaluate performance for compaction support of global_sort.
Test cases:
1. compaction type: major and minor
2. parameter: global_sort_partitions
3. sort_columns: data type
4. data size
5. load times
6. clean files and delete by segment.id
Performance evaluation:
Querying should be faster after loading 1000 times and compacting.
This PR is based on https://github.com/apache/carbondata/pull/1321
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/xubo245/carbondata compactionSupportGlobalSort
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/carbondata/pull/1361.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #1361
----
commit 528deb7157f84b52890aea2751290793d1ae793f
Author: chenerlu <ch...@huawei.com>
Date: 2017-09-04T12:54:55Z
Unify the sort column and sort scope in create table command
commit 0982d96efaef6d39a69922d2e3e3d7e274cf900d
Author: chenerlu <ch...@huawei.com>
Date: 2017-09-11T09:29:48Z
move sort scope from segment level to table level
commit 506096dc6edb2c504092725033613fdf4f8c6984
Author: chenerlu <ch...@huawei.com>
Date: 2017-09-11T15:59:50Z
move sort scope from segment level to table level
commit 2094f70658a54a75fd9bdb20781cb4c243a3506a
Author: chenerlu <ch...@huawei.com>
Date: 2017-09-11T16:57:06Z
move sort scope from segment level to table level
commit 8944905cec30a900d32c94aea30456f20e0ae968
Author: chenerlu <ch...@huawei.com>
Date: 2017-09-11T17:10:07Z
Fix scala stype
commit cad37cd9cdc1c62dfb8bc961011760dfc4e41f84
Author: chenerlu <ch...@huawei.com>
Date: 2017-09-12T00:28:43Z
fix compile error
commit 867b7b35b8b9da8891d07e12adbedf949125258d
Author: chenerlu <ch...@huawei.com>
Date: 2017-09-12T01:59:10Z
fix TestCreateTableWithSortScope
commit eb21c48481631995444138ff9d52d43eee45ead3
Author: xubo245 <60...@qq.com>
Date: 2017-09-13T12:15:39Z
compaction supprt global_sort
----
---
[GitHub] carbondata issue #1361: [CARBONDATA-1481]Compaction support global sort
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1361
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/180/
---
[GitHub] carbondata pull request #1361: [CARBONDATA-1481] Add test cases for compacti...
Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:
https://github.com/apache/carbondata/pull/1361
---
[GitHub] carbondata issue #1361: [CARBONDATA-1481] Compaction support global sort
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1361
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/250/
---
[GitHub] carbondata pull request #1361: [CARBONDATA-1481] Add test cases for compacti...
Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/1361#discussion_r143776859
--- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/datacompaction/CompactionSupportGlobalSortFunctionTest.scala ---
@@ -0,0 +1,535 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.spark.testsuite.datacompaction
+
+import java.io.{File, FilenameFilter}
+
+import org.apache.spark.sql.Row
+import org.apache.spark.sql.test.util.QueryTest
+import org.scalatest.{BeforeAndAfterAll, BeforeAndAfterEach}
+
+import org.apache.carbondata.core.constants.CarbonCommonConstants
+import org.apache.carbondata.core.util.CarbonProperties
+
+class CompactionSupportGlobalSortFunctionTest extends QueryTest with BeforeAndAfterEach with BeforeAndAfterAll {
+ val filePath: String = s"$resourcesPath/globalsort"
+ val file1: String = resourcesPath + "/globalsort/sample1.csv"
+ val file2: String = resourcesPath + "/globalsort/sample2.csv"
+ val file3: String = resourcesPath + "/globalsort/sample3.csv"
+
+ override def beforeEach {
+ resetConf
+ sql("DROP TABLE IF EXISTS compaction_globalsort")
+ sql(
+ """
+ | CREATE TABLE compaction_globalsort(id INT, name STRING, city STRING, age INT)
+ | STORED BY 'org.apache.carbondata.format'
+ | TBLPROPERTIES('SORT_COLUMNS'='city,name', 'SORT_SCOPE'='GLOBAL_SORT')
+ """.stripMargin)
+
+ sql("DROP TABLE IF EXISTS carbon_localsort")
+ sql(
+ """
+ | CREATE TABLE carbon_localsort(id INT, name STRING, city STRING, age INT)
+ | STORED BY 'org.apache.carbondata.format'
+ """.stripMargin)
+ }
+
+ override def afterEach {
+ sql("DROP TABLE IF EXISTS compaction_globalsort")
+ sql("DROP TABLE IF EXISTS carbon_localsort")
+ }
+
+ test("Compaction type: major") {
+ sql(s"LOAD DATA LOCAL INPATH '$file1' INTO TABLE carbon_localsort")
+ sql(s"LOAD DATA LOCAL INPATH '$file2' INTO TABLE carbon_localsort")
+ sql(s"LOAD DATA LOCAL INPATH '$file3' INTO TABLE carbon_localsort")
+
+ sql(s"LOAD DATA LOCAL INPATH '$file1' INTO TABLE compaction_globalsort")
+ sql(s"LOAD DATA LOCAL INPATH '$file2' INTO TABLE compaction_globalsort")
+ sql(s"LOAD DATA LOCAL INPATH '$file3' INTO TABLE compaction_globalsort")
+
+ sql("ALTER TABLE compaction_globalsort COMPACT 'MAJOR'")
--- End diff --
Can you also configure the parameter for major compaction?
---
[GitHub] carbondata issue #1361: [CARBONDATA-1481] Compaction support global sort
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/1361
SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/866/
---
[GitHub] carbondata issue #1361: [CARBONDATA-1481]Compaction support global sort
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1361
Build Success with Spark 1.6, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/81/
---
[GitHub] carbondata issue #1361: [CARBONDATA-1481]Compaction support global sort
Posted by xubo245 <gi...@git.apache.org>.
Github user xubo245 commented on the issue:
https://github.com/apache/carbondata/pull/1361
Does SDV have an intermittent error? I haven't changed the SDV code since the SDV build failed, but now it shows Success.
---
[GitHub] carbondata issue #1361: [CARBONDATA-1481] Add test cases for compaction of g...
Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on the issue:
https://github.com/apache/carbondata/pull/1361
LGTM
---
[GitHub] carbondata issue #1361: [CARBONDATA-1481] Compaction support global sort
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/1361
SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/881/
---
[GitHub] carbondata issue #1361: [CARBONDATA-1481] Compaction support global sort
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1361
Build Success with Spark 1.6, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/103/
---
[GitHub] carbondata issue #1361: [CARBONDATA-1481] Add test cases for compaction of g...
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/1361
SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/1029/
---
[GitHub] carbondata issue #1361: [CARBONDATA-1481] Add test cases for compaction of g...
Posted by xubo245 <gi...@git.apache.org>.
Github user xubo245 commented on the issue:
https://github.com/apache/carbondata/pull/1361
@jackylk I have changed the title and content of this PR.
---
[GitHub] carbondata issue #1361: [CARBONDATA-1481] Add test cases for compaction of g...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1361
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/421/
---
[GitHub] carbondata issue #1361: [CARBONDATA-1481] Add test cases for compaction of g...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1361
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/398/
---
[GitHub] carbondata issue #1361: [CARBONDATA-1481] Compaction support global sort
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1361
Build Success with Spark 1.6, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/124/
---
[GitHub] carbondata issue #1361: [CARBONDATA-1481] Compaction support global sort
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1361
Build Success with Spark 1.6, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/121/
---
[GitHub] carbondata issue #1361: [CARBONDATA-1481] Compaction support global sort
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1361
Build Failed with Spark 1.6, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/118/
---
[GitHub] carbondata issue #1361: [CARBONDATA-1481] Compaction support global sort
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1361
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/235/
---
[GitHub] carbondata issue #1361: [CARBONDATA-1481] Compaction support global sort
Posted by QiangCai <gi...@git.apache.org>.
Github user QiangCai commented on the issue:
https://github.com/apache/carbondata/pull/1361
@xubo245 better to use the input files in integration/spark-common-test/src/test/resources/compaction
---
[GitHub] carbondata issue #1361: [CARBONDATA-1481] Compaction support global sort
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1361
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/271/
---
[GitHub] carbondata issue #1361: [CARBONDATA-1481]Compaction support global sort
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1361
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/191/
---
[GitHub] carbondata issue #1361: [CARBONDATA-1481]Compaction support global sort
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/1361
SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/837/
---
[GitHub] carbondata issue #1361: [CARBONDATA-1481] Compaction support global sort
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1361
Build Failed with Spark 1.6, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/111/
---
[GitHub] carbondata issue #1361: [CARBONDATA-1481] Compaction support global sort
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1361
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/245/
---
[GitHub] carbondata issue #1361: [CARBONDATA-1481] Compaction support global sort
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/1361
SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/876/
---
[GitHub] carbondata pull request #1361: [CARBONDATA-1481] Add test cases for compacti...
Posted by xubo245 <gi...@git.apache.org>.
Github user xubo245 commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/1361#discussion_r143898738
--- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/datacompaction/CompactionSupportGlobalSortBigFileTest.scala ---
@@ -0,0 +1,136 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.spark.testsuite.datacompaction
+
+import java.io.{File, PrintWriter}
+
+import scala.util.Random
+
+import org.apache.spark.sql.test.util.QueryTest
+import org.scalatest.{BeforeAndAfterAll, BeforeAndAfterEach}
+
+import org.apache.carbondata.core.constants.CarbonCommonConstants
+import org.apache.carbondata.core.util.CarbonProperties
+
+class CompactionSupportGlobalSortBigFileTest extends QueryTest with BeforeAndAfterEach with BeforeAndAfterAll {
+ val file1 = resourcesPath + "/compaction/fil1.csv"
+ val file2 = resourcesPath + "/compaction/fil2.csv"
+ val file3 = resourcesPath + "/compaction/fil3.csv"
+ val file4 = resourcesPath + "/compaction/fil4.csv"
+ val file5 = resourcesPath + "/compaction/fil5.csv"
+
+ override protected def beforeAll(): Unit = {
+ resetConf("10")
+ // n should be about 5000000 if the size is reset to the default 1024
+ val n = 150000
+ CompactionSupportGlobalSortBigFileTest.createFile(file1, n, 0)
+ CompactionSupportGlobalSortBigFileTest.createFile(file2, n * 4, n)
+ CompactionSupportGlobalSortBigFileTest.createFile(file3, n * 3, n * 5)
+ CompactionSupportGlobalSortBigFileTest.createFile(file4, n * 2, n * 8)
+ CompactionSupportGlobalSortBigFileTest.createFile(file5, n * 2, n * 13)
+ }
+
+ override protected def afterAll(): Unit = {
+ CompactionSupportGlobalSortBigFileTest.deleteFile(file1)
+ CompactionSupportGlobalSortBigFileTest.deleteFile(file2)
+ CompactionSupportGlobalSortBigFileTest.deleteFile(file3)
+ CompactionSupportGlobalSortBigFileTest.deleteFile(file4)
+ CompactionSupportGlobalSortBigFileTest.deleteFile(file5)
+ resetConf(CarbonCommonConstants.DEFAULT_MAJOR_COMPACTION_SIZE)
+ }
+
+ override def beforeEach {
+ sql("DROP TABLE IF EXISTS compaction_globalsort")
+ sql(
+ """
+ | CREATE TABLE compaction_globalsort(id INT, name STRING, city STRING, age INT)
+ | STORED BY 'org.apache.carbondata.format'
+ | TBLPROPERTIES('SORT_COLUMNS'='city,name', 'SORT_SCOPE'='GLOBAL_SORT')
+ """.stripMargin)
+
+ sql("DROP TABLE IF EXISTS carbon_localsort")
+ sql(
+ """
+ | CREATE TABLE carbon_localsort(id INT, name STRING, city STRING, age INT)
+ | STORED BY 'org.apache.carbondata.format'
+ """.stripMargin)
+ }
+
+ override def afterEach {
+ sql("DROP TABLE IF EXISTS compaction_globalsort")
+ sql("DROP TABLE IF EXISTS carbon_localsort")
+ }
+
+ test("Compaction major: segments size is bigger than default compaction size") {
+ sql(s"LOAD DATA LOCAL INPATH '$file1' INTO TABLE carbon_localsort OPTIONS('header'='false')")
+ sql(s"LOAD DATA LOCAL INPATH '$file2' INTO TABLE carbon_localsort OPTIONS('header'='false')")
+ sql(s"LOAD DATA LOCAL INPATH '$file3' INTO TABLE carbon_localsort OPTIONS('header'='false')")
+ sql(s"LOAD DATA LOCAL INPATH '$file4' INTO TABLE carbon_localsort OPTIONS('header'='false')")
+ sql(s"LOAD DATA LOCAL INPATH '$file5' INTO TABLE carbon_localsort OPTIONS('header'='false')")
+
+ sql(s"LOAD DATA LOCAL INPATH '$file1' INTO TABLE compaction_globalsort OPTIONS('header'='false')")
+ sql(s"LOAD DATA LOCAL INPATH '$file2' INTO TABLE compaction_globalsort OPTIONS('header'='false')")
+ sql(s"LOAD DATA LOCAL INPATH '$file3' INTO TABLE compaction_globalsort OPTIONS('header'='false')")
+ sql(s"LOAD DATA LOCAL INPATH '$file4' INTO TABLE compaction_globalsort OPTIONS('header'='false')")
+ sql(s"LOAD DATA LOCAL INPATH '$file5' INTO TABLE compaction_globalsort OPTIONS('header'='false')")
+
+ sql("ALTER TABLE compaction_globalsort COMPACT 'MAJOR'")
+ checkExistence(sql("DESCRIBE FORMATTED compaction_globalsort"), true, "global_sort")
+
+ checkExistence(sql("DESCRIBE FORMATTED compaction_globalsort"), true, "city,name")
+
+ checkExistence(sql("SHOW SEGMENTS FOR TABLE compaction_globalsort"), true, "Compacted")
+
+ checkAnswer(sql("select count(*) from compaction_globalsort"),sql("select count(*) from carbon_localsort"))
+ val segments = sql("SHOW SEGMENTS FOR TABLE compaction_globalsort")
+ val SegmentSequenceIds = segments.collect().map { each => (each.toSeq) (0) }
+ assert(SegmentSequenceIds.contains("0.1"))
+ }
+
+ private def resetConf(size:String) {
+ CarbonProperties.getInstance()
+ .addProperty(CarbonCommonConstants.MAJOR_COMPACTION_SIZE, size)
+ }
+}
+
+object CompactionSupportGlobalSortBigFileTest {
+ def createFile(fileName: String, line: Int = 10000, start: Int = 0): Boolean = {
+ try {
+ val write = new PrintWriter(fileName);
+ for (i <- start until (start + line)) {
+ write.println(i + "," + "n" + i + "," + "c" + Random.nextInt(line) + "," + Random.nextInt(80))
+ }
+ write.close()
+ } catch {
+ case _: Exception => return false
+ }
+ return true
--- End diff --
Ok
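The `createFile` helper in the quoted diff uses explicit `return` statements and skips `write.close()` if the loop throws. A minimal sketch of an alternative follows, assuming the same signature and row layout; the object name `CsvFixtureSketch` is hypothetical and not part of this PR.

```scala
import java.io.PrintWriter
import scala.util.{Random, Try}

// Hypothetical helper, not the code merged in this PR.
object CsvFixtureSketch {
  // Writes `line` rows of "id,name,city,age", with ids starting at `start`.
  // Returns true on success and false if the file could not be written.
  def createFile(fileName: String, line: Int = 10000, start: Int = 0): Boolean =
    Try {
      val writer = new PrintWriter(fileName)
      try {
        for (i <- start until (start + line)) {
          writer.println(s"$i,n$i,c${Random.nextInt(line)},${Random.nextInt(80)}")
        }
      } finally {
        writer.close() // always release the file handle
      }
    }.isSuccess
}
```

Wrapping the body in `Try` keeps the Boolean result without `return`, and the `finally` block closes the writer on both the success and failure paths.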
---
[GitHub] carbondata pull request #1361: [CARBONDATA-1481] Add test cases for compacti...
Posted by xubo245 <gi...@git.apache.org>.
Github user xubo245 commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/1361#discussion_r143902415
--- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/datacompaction/CompactionSupportGlobalSortParameterTest.scala ---
@@ -0,0 +1,298 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.spark.testsuite.datacompaction
+
+import java.io.{File, FilenameFilter}
+
+import org.apache.carbondata.core.constants.CarbonCommonConstants
+import org.apache.carbondata.core.util.CarbonProperties
+import org.apache.spark.sql.Row
+import org.apache.spark.sql.test.util.QueryTest
+import org.scalatest.{BeforeAndAfterAll, BeforeAndAfterEach}
+
+class CompactionSupportGlobalSortParameterTest extends QueryTest with BeforeAndAfterEach with BeforeAndAfterAll {
+ val filePath: String = s"$resourcesPath/globalsort"
+ val file1: String = resourcesPath + "/globalsort/sample1.csv"
+ val file2: String = resourcesPath + "/globalsort/sample2.csv"
+ val file3: String = resourcesPath + "/globalsort/sample3.csv"
+
+ override def beforeEach {
+ resetConf
+ sql("DROP TABLE IF EXISTS compaction_globalsort")
+ sql(
+ """
+ | CREATE TABLE compaction_globalsort(id INT, name STRING, city STRING, age INT)
+ | STORED BY 'org.apache.carbondata.format'
+ | TBLPROPERTIES('SORT_COLUMNS'='city,name', 'SORT_SCOPE'='GLOBAL_SORT')
+ """.stripMargin)
+
+ sql("DROP TABLE IF EXISTS carbon_localsort")
+ sql(
+ """
+ | CREATE TABLE carbon_localsort(id INT, name STRING, city STRING, age INT)
+ | STORED BY 'org.apache.carbondata.format'
+ """.stripMargin)
+ }
+
+ override def afterEach {
+ sql("DROP TABLE IF EXISTS compaction_globalsort")
+ sql("DROP TABLE IF EXISTS carbon_localsort")
+ }
+
+ test("ENABLE_AUTO_LOAD_MERGE: false") {
+ CarbonProperties.getInstance().addProperty(CarbonCommonConstants.ENABLE_AUTO_LOAD_MERGE, "false")
+ for (i <- 0 until 2) {
+ sql(s"LOAD DATA LOCAL INPATH '$file1' INTO TABLE carbon_localsort")
+ sql(s"LOAD DATA LOCAL INPATH '$file2' INTO TABLE carbon_localsort")
+ sql(s"LOAD DATA LOCAL INPATH '$file3' INTO TABLE carbon_localsort")
+
+ sql(s"LOAD DATA LOCAL INPATH '$file1' INTO TABLE compaction_globalsort OPTIONS('GLOBAL_SORT_PARTITIONS'='2')")
+ sql(s"LOAD DATA LOCAL INPATH '$file2' INTO TABLE compaction_globalsort OPTIONS('GLOBAL_SORT_PARTITIONS'='2')")
+ sql(s"LOAD DATA LOCAL INPATH '$file3' INTO TABLE compaction_globalsort OPTIONS('GLOBAL_SORT_PARTITIONS'='2')")
+ }
+ checkExistence(sql("DESCRIBE FORMATTED compaction_globalsort"), true, "global_sort")
+
+ checkExistence(sql("DESCRIBE FORMATTED compaction_globalsort"), true, "city,name")
+
+ sql("delete from table compaction_globalsort where SEGMENT.ID in (1,2,3)")
+ sql("delete from table carbon_localsort where SEGMENT.ID in (1,2,3)")
+ sql("ALTER TABLE compaction_globalsort COMPACT 'minor'")
+ checkExistence(sql("SHOW SEGMENTS FOR TABLE compaction_globalsort"), false, "Compacted")
+
+ val segments = sql("SHOW SEGMENTS FOR TABLE compaction_globalsort")
+ val SegmentSequenceIds = segments.collect().map { each => (each.toSeq) (0) }
+ assert(!SegmentSequenceIds.contains("0.1"))
+ assert(SegmentSequenceIds.length == 6)
+
+ checkAnswer(sql("SELECT COUNT(*) FROM compaction_globalsort"), Seq(Row(12)))
+
+ checkAnswer(sql("SELECT * FROM compaction_globalsort"),
+ sql("SELECT * FROM carbon_localsort"))
+
+ checkExistence(sql("SHOW SEGMENTS FOR TABLE compaction_globalsort"), true, "Success")
+ checkExistence(sql("SHOW SEGMENTS FOR TABLE compaction_globalsort"), true, "Marked for Delete")
+ CarbonProperties.getInstance().addProperty(CarbonCommonConstants.ENABLE_AUTO_LOAD_MERGE,
+ CarbonCommonConstants.DEFAULT_ENABLE_AUTO_LOAD_MERGE)
+ }
+
+ test("ENABLE_AUTO_LOAD_MERGE: true") {
+ CarbonProperties.getInstance().addProperty(CarbonCommonConstants.ENABLE_AUTO_LOAD_MERGE, "true")
+ for (i <- 0 until 2) {
+ sql(s"LOAD DATA LOCAL INPATH '$file1' INTO TABLE carbon_localsort")
+ sql(s"LOAD DATA LOCAL INPATH '$file2' INTO TABLE carbon_localsort")
+ sql(s"LOAD DATA LOCAL INPATH '$file3' INTO TABLE carbon_localsort")
+
+ sql(s"LOAD DATA LOCAL INPATH '$file1' INTO TABLE compaction_globalsort OPTIONS('GLOBAL_SORT_PARTITIONS'='2')")
+ sql(s"LOAD DATA LOCAL INPATH '$file2' INTO TABLE compaction_globalsort OPTIONS('GLOBAL_SORT_PARTITIONS'='2')")
+ sql(s"LOAD DATA LOCAL INPATH '$file3' INTO TABLE compaction_globalsort OPTIONS('GLOBAL_SORT_PARTITIONS'='2')")
+ }
+ checkExistence(sql("DESCRIBE FORMATTED compaction_globalsort"), true, "global_sort")
+
+ checkExistence(sql("DESCRIBE FORMATTED compaction_globalsort"), true, "city,name")
+
+ checkExistence(sql("SHOW SEGMENTS FOR TABLE compaction_globalsort"), true, "Compacted")
+
+ val segments = sql("SHOW SEGMENTS FOR TABLE compaction_globalsort")
+ val SegmentSequenceIds = segments.collect().map { each => (each.toSeq) (0) }
+ assert(SegmentSequenceIds.contains("0.1"))
+ assert(SegmentSequenceIds.length == 7)
--- End diff --
// Loaded 6 times, which produced 6 segments.
// Auto merge compacts them and produces 1 more segment, because 6 is bigger than 4 (the default value for minor compaction),
// so the total number of segments is 7.
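The segment arithmetic in the comment above can be checked with a small stand-alone model. This is only a sketch under simplifying assumptions: a single merge level, a threshold of 4 taken from the comment, and hypothetical names (`Segment`, `SegmentCountSketch`, `simulateLoads`). It does not call CarbonData and is not how CarbonData implements compaction.

```scala
// Simplified model of the segment list; not CarbonData's implementation.
final case class Segment(id: String, status: String)

object SegmentCountSketch {
  // Assumed minor compaction threshold (the comment above cites 4 as the default).
  val MinorThreshold = 4

  def simulateLoads(loads: Int, autoMerge: Boolean): Vector[Segment] =
    (0 until loads).foldLeft(Vector.empty[Segment]) { (segments, loadId) =>
      val afterLoad = segments :+ Segment(loadId.toString, "Success")
      // Only plain load segments (ids without ".") that are still active can be merged.
      val candidates = afterLoad.filter(s => s.status == "Success" && !s.id.contains("."))
      if (autoMerge && candidates.size >= MinorThreshold) {
        val merged = Segment(candidates.head.id + ".1", "Success")
        val marked = afterLoad.map { s =>
          if (candidates.contains(s)) s.copy(status = "Compacted") else s
        }
        marked :+ merged
      } else {
        afterLoad
      }
    }

  def main(args: Array[String]): Unit = {
    val withMerge = simulateLoads(loads = 6, autoMerge = true)
    println(withMerge.map(_.id).mkString(", ")) // 0, 1, 2, 3, 0.1, 4, 5
    println(withMerge.size)                     // 7
  }
}
```

In this model the merge fires after the fourth load, so the listing holds four compacted segments, the merged segment `0.1`, and two later loads, which matches the asserted length of 7. With `autoMerge = false` the same six loads leave six entries.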
---
[GitHub] carbondata pull request #1361: [CARBONDATA-1481] Add test cases for compacti...
Posted by xubo245 <gi...@git.apache.org>.
Github user xubo245 commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/1361#discussion_r143902837
--- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/datacompaction/CompactionSupportGlobalSortFunctionTest.scala ---
@@ -0,0 +1,535 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.spark.testsuite.datacompaction
+
+import java.io.{File, FilenameFilter}
+
+import org.apache.spark.sql.Row
+import org.apache.spark.sql.test.util.QueryTest
+import org.scalatest.{BeforeAndAfterAll, BeforeAndAfterEach}
+
+import org.apache.carbondata.core.constants.CarbonCommonConstants
+import org.apache.carbondata.core.util.CarbonProperties
+
+class CompactionSupportGlobalSortFunctionTest extends QueryTest with BeforeAndAfterEach with BeforeAndAfterAll {
+ val filePath: String = s"$resourcesPath/globalsort"
+ val file1: String = resourcesPath + "/globalsort/sample1.csv"
+ val file2: String = resourcesPath + "/globalsort/sample2.csv"
+ val file3: String = resourcesPath + "/globalsort/sample3.csv"
+
+ override def beforeEach {
+ resetConf
+ sql("DROP TABLE IF EXISTS compaction_globalsort")
+ sql(
+ """
+ | CREATE TABLE compaction_globalsort(id INT, name STRING, city STRING, age INT)
+ | STORED BY 'org.apache.carbondata.format'
+ | TBLPROPERTIES('SORT_COLUMNS'='city,name', 'SORT_SCOPE'='GLOBAL_SORT')
+ """.stripMargin)
+
+ sql("DROP TABLE IF EXISTS carbon_localsort")
+ sql(
+ """
+ | CREATE TABLE carbon_localsort(id INT, name STRING, city STRING, age INT)
+ | STORED BY 'org.apache.carbondata.format'
+ """.stripMargin)
+ }
+
+ override def afterEach {
+ sql("DROP TABLE IF EXISTS compaction_globalsort")
+ sql("DROP TABLE IF EXISTS carbon_localsort")
+ }
+
+ test("Compaction type: major") {
+ sql(s"LOAD DATA LOCAL INPATH '$file1' INTO TABLE carbon_localsort")
+ sql(s"LOAD DATA LOCAL INPATH '$file2' INTO TABLE carbon_localsort")
+ sql(s"LOAD DATA LOCAL INPATH '$file3' INTO TABLE carbon_localsort")
+
+ sql(s"LOAD DATA LOCAL INPATH '$file1' INTO TABLE compaction_globalsort")
+ sql(s"LOAD DATA LOCAL INPATH '$file2' INTO TABLE compaction_globalsort")
+ sql(s"LOAD DATA LOCAL INPATH '$file3' INTO TABLE compaction_globalsort")
+
+ sql("ALTER TABLE compaction_globalsort COMPACT 'MAJOR'")
--- End diff --
I added some parameter tests for major compaction in CompactionSupportGlobalSortParameterTest.scala
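The parameter tests set a global property at the start of a test body and restore it on the last line, so a failed assertion in between skips the restore. One way to guard against that is a set-and-restore wrapper. The sketch below uses `java.util.Properties` as a stand-in for the real property store; `PropertyScopeSketch`, `withProperty`, and the key `example.auto.merge` are hypothetical and not part of CarbonData or this PR.

```scala
import java.util.Properties

// Hypothetical stand-in for a global property store; not CarbonProperties.
object PropertyScopeSketch {
  val props = new Properties()

  // Sets `key` to `value`, runs `body`, then restores the previous value
  // (or removes the key if it was unset), even when `body` throws.
  def withProperty[T](key: String, value: String)(body: => T): T = {
    val previous = Option(props.getProperty(key))
    props.setProperty(key, value)
    try {
      body
    } finally {
      previous match {
        case Some(old) => props.setProperty(key, old)
        case None => props.remove(key)
      }
    }
  }

  def main(args: Array[String]): Unit = {
    val seen = withProperty("example.auto.merge", "true") {
      props.getProperty("example.auto.merge")
    }
    println(seen)                                    // true
    println(props.getProperty("example.auto.merge")) // null
  }
}
```

A test would wrap its loads and assertions in the `withProperty` block, so the property is restored whether the body passes or fails.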
---
[GitHub] carbondata issue #1361: [CARBONDATA-1481] Compaction support global sort
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/1361
SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/858/
---
[GitHub] carbondata issue #1361: [CARBONDATA-1481] Add test cases for compaction of g...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1361
Build Success with Spark 1.6, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/301/
---
[GitHub] carbondata issue #1361: [CARBONDATA-1481] Add test cases for compaction of g...
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/1361
SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/1055/
---
[GitHub] carbondata issue #1361: [CARBONDATA-1481]Compaction support global sort
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/1361
SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/810/
---
[GitHub] carbondata issue #1361: [CARBONDATA-1481] Compaction support global sort
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1361
Build Success with Spark 1.6, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/126/
---
[GitHub] carbondata issue #1361: [CARBONDATA-1481] Compaction support global sort
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/1361
SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/879/
---
[GitHub] carbondata issue #1361: [CARBONDATA-1481]Compaction support global sort
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1361
Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/175/
---
[GitHub] carbondata issue #1361: [CARBONDATA-1481] Compaction support global sort
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/1361
SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/873/
---
[GitHub] carbondata issue #1361: [CARBONDATA-1481] Add test cases for compaction of g...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1361
Build Failed with Spark 1.6, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/297/
---
[GitHub] carbondata issue #1361: [CARBONDATA-1481]Compaction support global sort
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/1361
SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/836/
---
[GitHub] carbondata issue #1361: [CARBONDATA-1481] Add test cases for compaction of g...
Posted by xubo245 <gi...@git.apache.org>.
Github user xubo245 commented on the issue:
https://github.com/apache/carbondata/pull/1361
@jackylk CI passed.
---
[GitHub] carbondata pull request #1361: [CARBONDATA-1481] Add test cases for compacti...
Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/1361#discussion_r143775848
--- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/datacompaction/CompactionSupportGlobalSortBigFileTest.scala ---
@@ -0,0 +1,136 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.spark.testsuite.datacompaction
+
+import java.io.{File, PrintWriter}
+
+import scala.util.Random
+
+import org.apache.spark.sql.test.util.QueryTest
+import org.scalatest.{BeforeAndAfterAll, BeforeAndAfterEach}
+
+import org.apache.carbondata.core.constants.CarbonCommonConstants
+import org.apache.carbondata.core.util.CarbonProperties
+
+class CompactionSupportGlobalSortBigFileTest extends QueryTest with BeforeAndAfterEach with BeforeAndAfterAll {
+ val file1 = resourcesPath + "/compaction/fil1.csv"
+ val file2 = resourcesPath + "/compaction/fil2.csv"
+ val file3 = resourcesPath + "/compaction/fil3.csv"
+ val file4 = resourcesPath + "/compaction/fil4.csv"
+ val file5 = resourcesPath + "/compaction/fil5.csv"
+
+ override protected def beforeAll(): Unit = {
+ resetConf("10")
+ //n should be about 5000000 of reset if size is default 1024
+ val n = 150000
+ CompactionSupportGlobalSortBigFileTest.createFile(file1, n, 0)
+ CompactionSupportGlobalSortBigFileTest.createFile(file2, n * 4, n)
+ CompactionSupportGlobalSortBigFileTest.createFile(file3, n * 3, n * 5)
+ CompactionSupportGlobalSortBigFileTest.createFile(file4, n * 2, n * 8)
+ CompactionSupportGlobalSortBigFileTest.createFile(file5, n * 2, n * 13)
+ }
+
+ override protected def afterAll(): Unit = {
+ CompactionSupportGlobalSortBigFileTest.deleteFile(file1)
+ CompactionSupportGlobalSortBigFileTest.deleteFile(file2)
+ CompactionSupportGlobalSortBigFileTest.deleteFile(file3)
+ CompactionSupportGlobalSortBigFileTest.deleteFile(file4)
+ CompactionSupportGlobalSortBigFileTest.deleteFile(file5)
+ resetConf(CarbonCommonConstants.DEFAULT_MAJOR_COMPACTION_SIZE)
+ }
+
+ override def beforeEach {
+ sql("DROP TABLE IF EXISTS compaction_globalsort")
+ sql(
+ """
+ | CREATE TABLE compaction_globalsort(id INT, name STRING, city STRING, age INT)
+ | STORED BY 'org.apache.carbondata.format'
+ | TBLPROPERTIES('SORT_COLUMNS'='city,name', 'SORT_SCOPE'='GLOBAL_SORT')
+ """.stripMargin)
+
+ sql("DROP TABLE IF EXISTS carbon_localsort")
+ sql(
+ """
+ | CREATE TABLE carbon_localsort(id INT, name STRING, city STRING, age INT)
+ | STORED BY 'org.apache.carbondata.format'
+ """.stripMargin)
+ }
+
+ override def afterEach {
+ sql("DROP TABLE IF EXISTS compaction_globalsort")
+ sql("DROP TABLE IF EXISTS carbon_localsort")
+ }
+
+ test("Compaction major: segments size is bigger than default compaction size") {
+ sql(s"LOAD DATA LOCAL INPATH '$file1' INTO TABLE carbon_localsort OPTIONS('header'='false')")
+ sql(s"LOAD DATA LOCAL INPATH '$file2' INTO TABLE carbon_localsort OPTIONS('header'='false')")
+ sql(s"LOAD DATA LOCAL INPATH '$file3' INTO TABLE carbon_localsort OPTIONS('header'='false')")
+ sql(s"LOAD DATA LOCAL INPATH '$file4' INTO TABLE carbon_localsort OPTIONS('header'='false')")
+ sql(s"LOAD DATA LOCAL INPATH '$file5' INTO TABLE carbon_localsort OPTIONS('header'='false')")
+
+ sql(s"LOAD DATA LOCAL INPATH '$file1' INTO TABLE compaction_globalsort OPTIONS('header'='false')")
+ sql(s"LOAD DATA LOCAL INPATH '$file2' INTO TABLE compaction_globalsort OPTIONS('header'='false')")
+ sql(s"LOAD DATA LOCAL INPATH '$file3' INTO TABLE compaction_globalsort OPTIONS('header'='false')")
+ sql(s"LOAD DATA LOCAL INPATH '$file4' INTO TABLE compaction_globalsort OPTIONS('header'='false')")
+ sql(s"LOAD DATA LOCAL INPATH '$file5' INTO TABLE compaction_globalsort OPTIONS('header'='false')")
+
+ sql("ALTER TABLE compaction_globalsort COMPACT 'MAJOR'")
+ checkExistence(sql("DESCRIBE FORMATTED compaction_globalsort"), true, "global_sort")
+
+ checkExistence(sql("DESCRIBE FORMATTED compaction_globalsort"), true, "city,name")
+
+ checkExistence(sql("SHOW SEGMENTS FOR TABLE compaction_globalsort"), true, "Compacted")
+
+ checkAnswer(sql("select count(*) from compaction_globalsort"),sql("select count(*) from carbon_localsort"))
+ val segments = sql("SHOW SEGMENTS FOR TABLE compaction_globalsort")
+ val SegmentSequenceIds = segments.collect().map { each => (each.toSeq) (0) }
+ assert(SegmentSequenceIds.contains("0.1"))
+ }
+
+ private def resetConf(size:String) {
+ CarbonProperties.getInstance()
+ .addProperty(CarbonCommonConstants.MAJOR_COMPACTION_SIZE, size)
+ }
+}
+
+object CompactionSupportGlobalSortBigFileTest {
+ def createFile(fileName: String, line: Int = 10000, start: Int = 0): Boolean = {
+ try {
+ val write = new PrintWriter(fileName);
+ for (i <- start until (start + line)) {
+ write.println(i + "," + "n" + i + "," + "c" + Random.nextInt(line) + "," + Random.nextInt(80))
+ }
+ write.close()
+ } catch {
+ case _: Exception => return false
+ }
+ return true
--- End diff --
remove `return`
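A minimal sketch of what dropping `return` could look like for `createFile`, assuming the behavior in the diff should be kept (true on success, false when anything throws). The use of `scala.util.Try` and the object name `CreateFileSketch` are illustrative choices, not taken from the PR; note that `Try` only catches non-fatal throwables, which is slightly narrower than `case _: Exception`.

```scala
import java.io.PrintWriter

import scala.util.{Random, Try}

// Hypothetical stand-in for the companion object in the diff.
object CreateFileSketch {
  // Writes `line` CSV rows starting at id `start`. The Try block is the
  // method's last expression, so no explicit `return` is needed.
  def createFile(fileName: String, line: Int = 10000, start: Int = 0): Boolean = {
    Try {
      val writer = new PrintWriter(fileName)
      try {
        for (i <- start until (start + line)) {
          writer.println(s"$i,n$i,c${Random.nextInt(line)},${Random.nextInt(80)}")
        }
      } finally {
        writer.close() // close the file even if a write fails
      }
    }.isSuccess
  }
}
```

The `finally` block is an addition over the original code: it closes the writer even when the loop fails part-way.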
---
[GitHub] carbondata issue #1361: [CARBONDATA-1481] Compaction support global sort
Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on the issue:
https://github.com/apache/carbondata/pull/1361
Please change the title to "Add test cases for compaction of global sorted segment", and mention that this PR only adds test cases.
---
[GitHub] carbondata issue #1361: [CARBONDATA-1481] Compaction support global sort
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1361
Build Success with Spark 1.6, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/147/
---
[GitHub] carbondata issue #1361: [CARBONDATA-1481]Compaction support global sort
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/1361
SDV Build Fail, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/822/
---
[GitHub] carbondata pull request #1361: [CARBONDATA-1481] Add test cases for compacti...
Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/1361#discussion_r143777369
--- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/datacompaction/CompactionSupportGlobalSortParameterTest.scala ---
@@ -0,0 +1,298 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.spark.testsuite.datacompaction
+
+import java.io.{File, FilenameFilter}
+
+import org.apache.carbondata.core.constants.CarbonCommonConstants
+import org.apache.carbondata.core.util.CarbonProperties
+import org.apache.spark.sql.Row
+import org.apache.spark.sql.test.util.QueryTest
+import org.scalatest.{BeforeAndAfterAll, BeforeAndAfterEach}
+
+class CompactionSupportGlobalSortParameterTest extends QueryTest with BeforeAndAfterEach with BeforeAndAfterAll {
+ val filePath: String = s"$resourcesPath/globalsort"
+ val file1: String = resourcesPath + "/globalsort/sample1.csv"
+ val file2: String = resourcesPath + "/globalsort/sample2.csv"
+ val file3: String = resourcesPath + "/globalsort/sample3.csv"
+
+ override def beforeEach {
+ resetConf
+ sql("DROP TABLE IF EXISTS compaction_globalsort")
+ sql(
+ """
+ | CREATE TABLE compaction_globalsort(id INT, name STRING, city STRING, age INT)
+ | STORED BY 'org.apache.carbondata.format'
+ | TBLPROPERTIES('SORT_COLUMNS'='city,name', 'SORT_SCOPE'='GLOBAL_SORT')
+ """.stripMargin)
+
+ sql("DROP TABLE IF EXISTS carbon_localsort")
+ sql(
+ """
+ | CREATE TABLE carbon_localsort(id INT, name STRING, city STRING, age INT)
+ | STORED BY 'org.apache.carbondata.format'
+ """.stripMargin)
+ }
+
+ override def afterEach {
+ sql("DROP TABLE IF EXISTS compaction_globalsort")
+ sql("DROP TABLE IF EXISTS carbon_localsort")
+ }
+
+ test("ENABLE_AUTO_LOAD_MERGE: false") {
+ CarbonProperties.getInstance().addProperty(CarbonCommonConstants.ENABLE_AUTO_LOAD_MERGE, "false")
+ for (i <- 0 until 2) {
+ sql(s"LOAD DATA LOCAL INPATH '$file1' INTO TABLE carbon_localsort")
+ sql(s"LOAD DATA LOCAL INPATH '$file2' INTO TABLE carbon_localsort")
+ sql(s"LOAD DATA LOCAL INPATH '$file3' INTO TABLE carbon_localsort")
+
+ sql(s"LOAD DATA LOCAL INPATH '$file1' INTO TABLE compaction_globalsort OPTIONS('GLOBAL_SORT_PARTITIONS'='2')")
+ sql(s"LOAD DATA LOCAL INPATH '$file2' INTO TABLE compaction_globalsort OPTIONS('GLOBAL_SORT_PARTITIONS'='2')")
+ sql(s"LOAD DATA LOCAL INPATH '$file3' INTO TABLE compaction_globalsort OPTIONS('GLOBAL_SORT_PARTITIONS'='2')")
+ }
+ checkExistence(sql("DESCRIBE FORMATTED compaction_globalsort"), true, "global_sort")
+
+ checkExistence(sql("DESCRIBE FORMATTED compaction_globalsort"), true, "city,name")
+
+ sql("delete from table compaction_globalsort where SEGMENT.ID in (1,2,3)")
+ sql("delete from table carbon_localsort where SEGMENT.ID in (1,2,3)")
+ sql("ALTER TABLE compaction_globalsort COMPACT 'minor'")
+ checkExistence(sql("SHOW SEGMENTS FOR TABLE compaction_globalsort"), false, "Compacted")
+
+ val segments = sql("SHOW SEGMENTS FOR TABLE compaction_globalsort")
+ val SegmentSequenceIds = segments.collect().map { each => (each.toSeq) (0) }
+ assert(!SegmentSequenceIds.contains("0.1"))
+ assert(SegmentSequenceIds.length == 6)
+
+ checkAnswer(sql("SELECT COUNT(*) FROM compaction_globalsort"), Seq(Row(12)))
+
+ checkAnswer(sql("SELECT * FROM compaction_globalsort"),
+ sql("SELECT * FROM carbon_localsort"))
+
+ checkExistence(sql("SHOW SEGMENTS FOR TABLE compaction_globalsort"), true, "Success")
+ checkExistence(sql("SHOW SEGMENTS FOR TABLE compaction_globalsort"), true, "Marked for Delete")
+ CarbonProperties.getInstance().addProperty(CarbonCommonConstants.ENABLE_AUTO_LOAD_MERGE,
+ CarbonCommonConstants.DEFAULT_ENABLE_AUTO_LOAD_MERGE)
+ }
+
+ test("ENABLE_AUTO_LOAD_MERGE: true") {
+ CarbonProperties.getInstance().addProperty(CarbonCommonConstants.ENABLE_AUTO_LOAD_MERGE, "true")
+ for (i <- 0 until 2) {
+ sql(s"LOAD DATA LOCAL INPATH '$file1' INTO TABLE carbon_localsort")
+ sql(s"LOAD DATA LOCAL INPATH '$file2' INTO TABLE carbon_localsort")
+ sql(s"LOAD DATA LOCAL INPATH '$file3' INTO TABLE carbon_localsort")
+
+ sql(s"LOAD DATA LOCAL INPATH '$file1' INTO TABLE compaction_globalsort OPTIONS('GLOBAL_SORT_PARTITIONS'='2')")
+ sql(s"LOAD DATA LOCAL INPATH '$file2' INTO TABLE compaction_globalsort OPTIONS('GLOBAL_SORT_PARTITIONS'='2')")
+ sql(s"LOAD DATA LOCAL INPATH '$file3' INTO TABLE compaction_globalsort OPTIONS('GLOBAL_SORT_PARTITIONS'='2')")
+ }
+ checkExistence(sql("DESCRIBE FORMATTED compaction_globalsort"), true, "global_sort")
+
+ checkExistence(sql("DESCRIBE FORMATTED compaction_globalsort"), true, "city,name")
+
+ checkExistence(sql("SHOW SEGMENTS FOR TABLE compaction_globalsort"), true, "Compacted")
+
+ val segments = sql("SHOW SEGMENTS FOR TABLE compaction_globalsort")
+ val SegmentSequenceIds = segments.collect().map { each => (each.toSeq) (0) }
+ assert(SegmentSequenceIds.contains("0.1"))
+ assert(SegmentSequenceIds.length == 7)
--- End diff --
why is it 7?
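One plausible accounting for the 7, assuming the default `carbon.compaction.level.threshold` of `4,3` applies in this test: the loop performs six loads, auto merge compacts the first four into segment `0.1`, and `SHOW SEGMENTS` still lists the four compacted source segments alongside it. The model below is a hypothetical simplification (level-1 merges only) with made-up names; it is not CarbonData code and does not reflect how the segment status file is actually maintained.

```scala
// Hypothetical model: which segment ids would SHOW SEGMENTS list after
// `loads` sequential loads, if every full group of `threshold` loaded
// segments is merged into one segment named "<first id>.1"?
object SegmentCountSketch {
  def listedSegments(loads: Int, threshold: Int): Seq[String] = {
    // One row per load: "0", "1", ..., kept even after being compacted.
    val loaded = (0 until loads).map(_.toString).toVector
    // One extra row per complete group of `threshold` loads.
    val merged = loaded
      .grouped(threshold)
      .filter(_.size == threshold)
      .map(group => group.head + ".1")
      .toVector
    loaded ++ merged
  }
}
```

Under these assumptions, `SegmentCountSketch.listedSegments(6, 4)` yields the six loaded ids plus `0.1`, which matches both assertions in the test; with auto merge off there is no merged row, matching the `== 6` check in the previous test.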
---
[GitHub] carbondata pull request #1361: [CARBONDATA-1481] Add test cases for compacti...
Posted by xubo245 <gi...@git.apache.org>.
Github user xubo245 commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/1361#discussion_r143898742
--- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/datacompaction/CompactionSupportGlobalSortBigFileTest.scala ---
@@ -0,0 +1,136 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.spark.testsuite.datacompaction
+
+import java.io.{File, PrintWriter}
+
+import scala.util.Random
+
+import org.apache.spark.sql.test.util.QueryTest
+import org.scalatest.{BeforeAndAfterAll, BeforeAndAfterEach}
+
+import org.apache.carbondata.core.constants.CarbonCommonConstants
+import org.apache.carbondata.core.util.CarbonProperties
+
+class CompactionSupportGlobalSortBigFileTest extends QueryTest with BeforeAndAfterEach with BeforeAndAfterAll {
+ val file1 = resourcesPath + "/compaction/fil1.csv"
+ val file2 = resourcesPath + "/compaction/fil2.csv"
+ val file3 = resourcesPath + "/compaction/fil3.csv"
+ val file4 = resourcesPath + "/compaction/fil4.csv"
+ val file5 = resourcesPath + "/compaction/fil5.csv"
+
+ override protected def beforeAll(): Unit = {
+ resetConf("10")
+ //n should be about 5000000 of reset if size is default 1024
+ val n = 150000
+ CompactionSupportGlobalSortBigFileTest.createFile(file1, n, 0)
+ CompactionSupportGlobalSortBigFileTest.createFile(file2, n * 4, n)
+ CompactionSupportGlobalSortBigFileTest.createFile(file3, n * 3, n * 5)
+ CompactionSupportGlobalSortBigFileTest.createFile(file4, n * 2, n * 8)
+ CompactionSupportGlobalSortBigFileTest.createFile(file5, n * 2, n * 13)
+ }
+
+ override protected def afterAll(): Unit = {
+ CompactionSupportGlobalSortBigFileTest.deleteFile(file1)
+ CompactionSupportGlobalSortBigFileTest.deleteFile(file2)
+ CompactionSupportGlobalSortBigFileTest.deleteFile(file3)
+ CompactionSupportGlobalSortBigFileTest.deleteFile(file4)
+ CompactionSupportGlobalSortBigFileTest.deleteFile(file5)
+ resetConf(CarbonCommonConstants.DEFAULT_MAJOR_COMPACTION_SIZE)
+ }
+
+ override def beforeEach {
+ sql("DROP TABLE IF EXISTS compaction_globalsort")
+ sql(
+ """
+ | CREATE TABLE compaction_globalsort(id INT, name STRING, city STRING, age INT)
+ | STORED BY 'org.apache.carbondata.format'
+ | TBLPROPERTIES('SORT_COLUMNS'='city,name', 'SORT_SCOPE'='GLOBAL_SORT')
+ """.stripMargin)
+
+ sql("DROP TABLE IF EXISTS carbon_localsort")
+ sql(
+ """
+ | CREATE TABLE carbon_localsort(id INT, name STRING, city STRING, age INT)
+ | STORED BY 'org.apache.carbondata.format'
+ """.stripMargin)
+ }
+
+ override def afterEach {
+ sql("DROP TABLE IF EXISTS compaction_globalsort")
+ sql("DROP TABLE IF EXISTS carbon_localsort")
+ }
+
+ test("Compaction major: segments size is bigger than default compaction size") {
+ sql(s"LOAD DATA LOCAL INPATH '$file1' INTO TABLE carbon_localsort OPTIONS('header'='false')")
+ sql(s"LOAD DATA LOCAL INPATH '$file2' INTO TABLE carbon_localsort OPTIONS('header'='false')")
+ sql(s"LOAD DATA LOCAL INPATH '$file3' INTO TABLE carbon_localsort OPTIONS('header'='false')")
+ sql(s"LOAD DATA LOCAL INPATH '$file4' INTO TABLE carbon_localsort OPTIONS('header'='false')")
+ sql(s"LOAD DATA LOCAL INPATH '$file5' INTO TABLE carbon_localsort OPTIONS('header'='false')")
+
+ sql(s"LOAD DATA LOCAL INPATH '$file1' INTO TABLE compaction_globalsort OPTIONS('header'='false')")
+ sql(s"LOAD DATA LOCAL INPATH '$file2' INTO TABLE compaction_globalsort OPTIONS('header'='false')")
+ sql(s"LOAD DATA LOCAL INPATH '$file3' INTO TABLE compaction_globalsort OPTIONS('header'='false')")
+ sql(s"LOAD DATA LOCAL INPATH '$file4' INTO TABLE compaction_globalsort OPTIONS('header'='false')")
+ sql(s"LOAD DATA LOCAL INPATH '$file5' INTO TABLE compaction_globalsort OPTIONS('header'='false')")
+
+ sql("ALTER TABLE compaction_globalsort COMPACT 'MAJOR'")
+ checkExistence(sql("DESCRIBE FORMATTED compaction_globalsort"), true, "global_sort")
+
+ checkExistence(sql("DESCRIBE FORMATTED compaction_globalsort"), true, "city,name")
+
+ checkExistence(sql("SHOW SEGMENTS FOR TABLE compaction_globalsort"), true, "Compacted")
+
+ checkAnswer(sql("select count(*) from compaction_globalsort"),sql("select count(*) from carbon_localsort"))
+ val segments = sql("SHOW SEGMENTS FOR TABLE compaction_globalsort")
+ val SegmentSequenceIds = segments.collect().map { each => (each.toSeq) (0) }
+ assert(SegmentSequenceIds.contains("0.1"))
+ }
+
+ private def resetConf(size:String) {
+ CarbonProperties.getInstance()
+ .addProperty(CarbonCommonConstants.MAJOR_COMPACTION_SIZE, size)
+ }
+}
+
+object CompactionSupportGlobalSortBigFileTest {
+ def createFile(fileName: String, line: Int = 10000, start: Int = 0): Boolean = {
+ try {
+ val write = new PrintWriter(fileName);
+ for (i <- start until (start + line)) {
+ write.println(i + "," + "n" + i + "," + "c" + Random.nextInt(line) + "," + Random.nextInt(80))
+ }
+ write.close()
+ } catch {
+ case _: Exception => return false
+ }
+ return true
+ }
+
+ def deleteFile(fileName: String): Boolean = {
+ try {
+ val file = new File(fileName)
+ if (file.exists()) {
+ file.delete()
+ }
+ } catch {
+ case _: Exception => return false
+ }
+ return true
--- End diff --
Ok
---
[GitHub] carbondata issue #1361: [CARBONDATA-1481]Compaction support global sort
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/1361
SDV Build Fail, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/824/
---
[GitHub] carbondata issue #1361: [CARBONDATA-1481] Compaction support global sort
Posted by xubo245 <gi...@git.apache.org>.
Github user xubo245 commented on the issue:
https://github.com/apache/carbondata/pull/1361
@jackylk Please review it
---
[GitHub] carbondata issue #1361: [CARBONDATA-1481] Add test cases for compaction of g...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1361
Build Success with Spark 1.6, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/276/
---
[GitHub] carbondata issue #1361: [CARBONDATA-1481] Compaction support global sort
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1361
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/227/
---
[GitHub] carbondata issue #1361: [CARBONDATA-1481]Compaction support global sort
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1361
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/204/
---
[GitHub] carbondata issue #1361: [CARBONDATA-1481] Compaction support global sort
Posted by xubo245 <gi...@git.apache.org>.
Github user xubo245 commented on the issue:
https://github.com/apache/carbondata/pull/1361
@QiangCai OK, it has been changed.
---
[GitHub] carbondata issue #1361: [CARBONDATA-1481] Add test cases for compaction of g...
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/1361
SDV Build Success, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/1051/
---
[GitHub] carbondata issue #1361: [CARBONDATA-1481] Compaction support global sort
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1361
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/248/
---
[GitHub] carbondata pull request #1361: [CARBONDATA-1481] Compaction support global s...
Posted by QiangCai <gi...@git.apache.org>.
Github user QiangCai commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/1361#discussion_r139949848
--- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/datacompaction/CompactionSupportGlobalSortPerformanceTest.scala ---
@@ -0,0 +1,106 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.carbondata.spark.testsuite.datacompaction
+
+import org.scalatest.{BeforeAndAfterAll, BeforeAndAfterEach}
+
+import org.apache.spark.sql.test.TestQueryExecutor
+import org.apache.spark.sql.test.util.QueryTest
+
+class CompactionSupportGlobalSortPerformanceTest extends QueryTest with BeforeAndAfterEach with BeforeAndAfterAll {
--- End diff --
remove this test case
---
[GitHub] carbondata issue #1361: [CARBONDATA-1481]Compaction support global sort
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/1361
SDV Build Fail, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/805/
---
[GitHub] carbondata issue #1361: [CARBONDATA-1481]Compaction support global sort
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1361
Build Failed with Spark 1.6, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/56/
---
[GitHub] carbondata issue #1361: [CARBONDATA-1481] Compaction support global sort
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1361
Build Success with Spark 1.6, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/209/
---
[GitHub] carbondata issue #1361: [CARBONDATA-1481] Add test cases for compaction of g...
Posted by xubo245 <gi...@git.apache.org>.
Github user xubo245 commented on the issue:
https://github.com/apache/carbondata/pull/1361
please review it @jackylk
---
[GitHub] carbondata issue #1361: [CARBONDATA-1481] Compaction support global sort
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1361
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/333/
---
[GitHub] carbondata issue #1361: [CARBONDATA-1481]Compaction support global sort
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1361
Build Success with Spark 1.6, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/82/
---
[GitHub] carbondata issue #1361: [CARBONDATA-1481]Compaction support global sort
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1361
Build Failed with Spark 1.6, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/51/
---
[GitHub] carbondata issue #1361: [CARBONDATA-1481]Compaction support global sort
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1361
Build Failed with Spark 1.6, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/68/
---
[GitHub] carbondata issue #1361: [CARBONDATA-1481] Compaction support global sort
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/1361
SDV Build Success, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/961/
---
[GitHub] carbondata issue #1361: [CARBONDATA-1481]Compaction support global sort
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/1361
SDV Build Fail, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/823/
---
[GitHub] carbondata issue #1361: [CARBONDATA-1481] Compaction support global sort
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/1361
SDV Build Success, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/903/
---
[GitHub] carbondata issue #1361: [CARBONDATA-1481] Add test cases for compaction of g...
Posted by xubo245 <gi...@git.apache.org>.
Github user xubo245 commented on the issue:
https://github.com/apache/carbondata/pull/1361
Please review it again @jackylk
---
[GitHub] carbondata issue #1361: [CARBONDATA-1481] Add test cases for compaction of g...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1361
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/426/
---
[GitHub] carbondata pull request #1361: [CARBONDATA-1481] Add test cases for compacti...
Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/1361#discussion_r143775897
--- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/datacompaction/CompactionSupportGlobalSortBigFileTest.scala ---
@@ -0,0 +1,136 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.spark.testsuite.datacompaction
+
+import java.io.{File, PrintWriter}
+
+import scala.util.Random
+
+import org.apache.spark.sql.test.util.QueryTest
+import org.scalatest.{BeforeAndAfterAll, BeforeAndAfterEach}
+
+import org.apache.carbondata.core.constants.CarbonCommonConstants
+import org.apache.carbondata.core.util.CarbonProperties
+
+class CompactionSupportGlobalSortBigFileTest extends QueryTest with BeforeAndAfterEach with BeforeAndAfterAll {
+ val file1 = resourcesPath + "/compaction/fil1.csv"
+ val file2 = resourcesPath + "/compaction/fil2.csv"
+ val file3 = resourcesPath + "/compaction/fil3.csv"
+ val file4 = resourcesPath + "/compaction/fil4.csv"
+ val file5 = resourcesPath + "/compaction/fil5.csv"
+
+ override protected def beforeAll(): Unit = {
+ resetConf("10")
+ //n should be about 5000000 of reset if size is default 1024
+ val n = 150000
+ CompactionSupportGlobalSortBigFileTest.createFile(file1, n, 0)
+ CompactionSupportGlobalSortBigFileTest.createFile(file2, n * 4, n)
+ CompactionSupportGlobalSortBigFileTest.createFile(file3, n * 3, n * 5)
+ CompactionSupportGlobalSortBigFileTest.createFile(file4, n * 2, n * 8)
+ CompactionSupportGlobalSortBigFileTest.createFile(file5, n * 2, n * 13)
+ }
+
+ override protected def afterAll(): Unit = {
+ CompactionSupportGlobalSortBigFileTest.deleteFile(file1)
+ CompactionSupportGlobalSortBigFileTest.deleteFile(file2)
+ CompactionSupportGlobalSortBigFileTest.deleteFile(file3)
+ CompactionSupportGlobalSortBigFileTest.deleteFile(file4)
+ CompactionSupportGlobalSortBigFileTest.deleteFile(file5)
+ resetConf(CarbonCommonConstants.DEFAULT_MAJOR_COMPACTION_SIZE)
+ }
+
+ override def beforeEach {
+ sql("DROP TABLE IF EXISTS compaction_globalsort")
+ sql(
+ """
+ | CREATE TABLE compaction_globalsort(id INT, name STRING, city STRING, age INT)
+ | STORED BY 'org.apache.carbondata.format'
+ | TBLPROPERTIES('SORT_COLUMNS'='city,name', 'SORT_SCOPE'='GLOBAL_SORT')
+ """.stripMargin)
+
+ sql("DROP TABLE IF EXISTS carbon_localsort")
+ sql(
+ """
+ | CREATE TABLE carbon_localsort(id INT, name STRING, city STRING, age INT)
+ | STORED BY 'org.apache.carbondata.format'
+ """.stripMargin)
+ }
+
+ override def afterEach {
+ sql("DROP TABLE IF EXISTS compaction_globalsort")
+ sql("DROP TABLE IF EXISTS carbon_localsort")
+ }
+
+ test("Compaction major: segments size is bigger than default compaction size") {
+ sql(s"LOAD DATA LOCAL INPATH '$file1' INTO TABLE carbon_localsort OPTIONS('header'='false')")
+ sql(s"LOAD DATA LOCAL INPATH '$file2' INTO TABLE carbon_localsort OPTIONS('header'='false')")
+ sql(s"LOAD DATA LOCAL INPATH '$file3' INTO TABLE carbon_localsort OPTIONS('header'='false')")
+ sql(s"LOAD DATA LOCAL INPATH '$file4' INTO TABLE carbon_localsort OPTIONS('header'='false')")
+ sql(s"LOAD DATA LOCAL INPATH '$file5' INTO TABLE carbon_localsort OPTIONS('header'='false')")
+
+ sql(s"LOAD DATA LOCAL INPATH '$file1' INTO TABLE compaction_globalsort OPTIONS('header'='false')")
+ sql(s"LOAD DATA LOCAL INPATH '$file2' INTO TABLE compaction_globalsort OPTIONS('header'='false')")
+ sql(s"LOAD DATA LOCAL INPATH '$file3' INTO TABLE compaction_globalsort OPTIONS('header'='false')")
+ sql(s"LOAD DATA LOCAL INPATH '$file4' INTO TABLE compaction_globalsort OPTIONS('header'='false')")
+ sql(s"LOAD DATA LOCAL INPATH '$file5' INTO TABLE compaction_globalsort OPTIONS('header'='false')")
+
+ sql("ALTER TABLE compaction_globalsort COMPACT 'MAJOR'")
+ checkExistence(sql("DESCRIBE FORMATTED compaction_globalsort"), true, "global_sort")
+
+ checkExistence(sql("DESCRIBE FORMATTED compaction_globalsort"), true, "city,name")
+
+ checkExistence(sql("SHOW SEGMENTS FOR TABLE compaction_globalsort"), true, "Compacted")
+
+ checkAnswer(sql("select count(*) from compaction_globalsort"),sql("select count(*) from carbon_localsort"))
+ val segments = sql("SHOW SEGMENTS FOR TABLE compaction_globalsort")
+ val SegmentSequenceIds = segments.collect().map { each => (each.toSeq) (0) }
+ assert(SegmentSequenceIds.contains("0.1"))
+ }
+
+ private def resetConf(size:String) {
+ CarbonProperties.getInstance()
+ .addProperty(CarbonCommonConstants.MAJOR_COMPACTION_SIZE, size)
+ }
+}
+
+object CompactionSupportGlobalSortBigFileTest {
+ def createFile(fileName: String, line: Int = 10000, start: Int = 0): Boolean = {
+ try {
+ val write = new PrintWriter(fileName);
+ for (i <- start until (start + line)) {
+ write.println(i + "," + "n" + i + "," + "c" + Random.nextInt(line) + "," + Random.nextInt(80))
+ }
+ write.close()
+ } catch {
+ case _: Exception => return false
+ }
+ return true
+ }
+
+ def deleteFile(fileName: String): Boolean = {
+ try {
+ val file = new File(fileName)
+ if (file.exists()) {
+ file.delete()
+ }
+ } catch {
+ case _: Exception => return false
+ }
+ return true
--- End diff --
Please remove the explicit return statements; in Scala the last expression of a method is its result.
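One possible way to act on this comment, shown only as a sketch and not as the change that was actually made in the PR: wrap the body in scala.util.Try and use isSuccess as the last expression, so no return keyword is needed. The object name FileHelperSketch is hypothetical. Note one behavioural difference that is an assumption worth checking: Try only catches non-fatal throwables, so it does not swallow InterruptedException the way `case _: Exception` does.

```scala
import java.io.{File, PrintWriter}
import scala.util.{Random, Try}

// Hypothetical name; stands in for the companion object in the diff above.
object FileHelperSketch {

  // Writes `line` CSV rows (id, name, city, age) starting at id `start`.
  // Evaluates to true on success and false if a non-fatal exception was thrown.
  def createFile(fileName: String, line: Int = 10000, start: Int = 0): Boolean = {
    Try {
      val writer = new PrintWriter(fileName)
      try {
        for (i <- start until (start + line)) {
          writer.println(s"$i,n$i,c${Random.nextInt(line)},${Random.nextInt(80)}")
        }
      } finally {
        // Close the writer even if writing fails part-way through.
        writer.close()
      }
    }.isSuccess
  }

  // Deletes the file if it exists. As in the original, the result of
  // File.delete() is ignored; only a thrown exception yields false.
  def deleteFile(fileName: String): Boolean = {
    Try {
      val file = new File(fileName)
      if (file.exists()) file.delete()
    }.isSuccess
  }
}
```

With this shape the Boolean result is simply the value of the final expression, and the try/finally also closes the writer when an exception occurs, which the version in the diff does not do.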
---
[GitHub] carbondata pull request #1361: [CARBONDATA-1481]Compaction support global so...
Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/1361#discussion_r139307748
--- Diff: integration/spark-common-cluster-test/src/test/scala/org/apache/carbondata/cluster/sdv/generated/BatchSortLoad3TestCase.scala ---
@@ -112,33 +112,15 @@ class BatchSortLoad3TestCase extends QueryTest with BeforeAndAfterAll {
sql(s"""drop table if exists t3""").collect
}
-
- //Batch_sort_Loading_001-01-01-01_001-TC_056
- test("Batch_sort_Loading_001-01-01-01_001-TC_056", Include) {
--- End diff --
Why is this test being removed?
---
[GitHub] carbondata issue #1361: [CARBONDATA-1481]Compaction support global sort
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1361
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/192/
---
[GitHub] carbondata issue #1361: [CARBONDATA-1481]Compaction support global sort
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1361
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/205/
---
[GitHub] carbondata issue #1361: [CARBONDATA-1481] Compaction support global sort
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1361
Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/242/
---
[GitHub] carbondata issue #1361: [CARBONDATA-1481] Compaction support global sort
Posted by xubo245 <gi...@git.apache.org>.
Github user xubo245 commented on the issue:
https://github.com/apache/carbondata/pull/1361
@QiangCai Please review it
---
[GitHub] carbondata issue #1361: [CARBONDATA-1481]Compaction support global sort
Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on the issue:
https://github.com/apache/carbondata/pull/1361
Please rebase onto master.
---
[GitHub] carbondata issue #1361: [CARBONDATA-1481] Compaction support global sort
Posted by xubo245 <gi...@git.apache.org>.
Github user xubo245 commented on the issue:
https://github.com/apache/carbondata/pull/1361
@QiangCai Please review it
---
[GitHub] carbondata issue #1361: [CARBONDATA-1481]Compaction support global sort
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1361
Build Failed with Spark 1.6, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/69/
---