Posted to issues@carbondata.apache.org by xubo245 <gi...@git.apache.org> on 2017/09/15 14:59:12 UTC

[GitHub] carbondata pull request #1361: [CARBONDATA-1481]Compaction support global so...

GitHub user xubo245 opened a pull request:

    https://github.com/apache/carbondata/pull/1361

    [CARBONDATA-1481]Compaction support global sort

    We should add test cases and evaluate performance for compaction of tables loaded with global_sort.
    
    Test cases:
    1. compaction type: major and minor
    2. parameter: global_sort_partitions
    3. sort_columns: data type
    4. data size
    5. load times
    6. clean files and delete by segment.id
    
    Performance evaluation:
    Queries should be faster after loading 1000 times and then compacting.
    
    This PR is based on https://github.com/apache/carbondata/pull/1321
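
    For reference, a minimal sketch of the load/compact/clean-files flow these tests exercise, written against the same QueryTest helpers (sql, resourcesPath) as the suites quoted later in this thread; the segment id used in the delete is illustrative:

        val file1 = resourcesPath + "/globalsort/sample1.csv"
        val file2 = resourcesPath + "/globalsort/sample2.csv"
        val file3 = resourcesPath + "/globalsort/sample3.csv"
        sql(
          """
            | CREATE TABLE compaction_globalsort(id INT, name STRING, city STRING, age INT)
            | STORED BY 'org.apache.carbondata.format'
            | TBLPROPERTIES('SORT_COLUMNS'='city,name', 'SORT_SCOPE'='GLOBAL_SORT')
          """.stripMargin)
        // each load creates one segment; GLOBAL_SORT_PARTITIONS controls the number of sort partitions
        for (f <- Seq(file1, file2, file3)) {
          sql(s"LOAD DATA LOCAL INPATH '$f' INTO TABLE compaction_globalsort OPTIONS('GLOBAL_SORT_PARTITIONS'='2')")
        }
        // delete by segment id, compact the remaining segments, then clean up stale files
        sql("DELETE FROM TABLE compaction_globalsort WHERE SEGMENT.ID IN (0)")
        sql("ALTER TABLE compaction_globalsort COMPACT 'MAJOR'")
        sql("CLEAN FILES FOR TABLE compaction_globalsort")
        sql("SHOW SEGMENTS FOR TABLE compaction_globalsort").show()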

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/xubo245/carbondata compactionSupportGlobalSort

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/1361.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1361
    
----
commit 528deb7157f84b52890aea2751290793d1ae793f
Author: chenerlu <ch...@huawei.com>
Date:   2017-09-04T12:54:55Z

    Unify the sort column and sort scope in create table command

commit 0982d96efaef6d39a69922d2e3e3d7e274cf900d
Author: chenerlu <ch...@huawei.com>
Date:   2017-09-11T09:29:48Z

    move sort scope from segment level to table level

commit 506096dc6edb2c504092725033613fdf4f8c6984
Author: chenerlu <ch...@huawei.com>
Date:   2017-09-11T15:59:50Z

    move sort scope from segment level to table level

commit 2094f70658a54a75fd9bdb20781cb4c243a3506a
Author: chenerlu <ch...@huawei.com>
Date:   2017-09-11T16:57:06Z

    move sort scope from segment level to table level

commit 8944905cec30a900d32c94aea30456f20e0ae968
Author: chenerlu <ch...@huawei.com>
Date:   2017-09-11T17:10:07Z

    Fix scala stype

commit cad37cd9cdc1c62dfb8bc961011760dfc4e41f84
Author: chenerlu <ch...@huawei.com>
Date:   2017-09-12T00:28:43Z

    fix compile error

commit 867b7b35b8b9da8891d07e12adbedf949125258d
Author: chenerlu <ch...@huawei.com>
Date:   2017-09-12T01:59:10Z

    fix TestCreateTableWithSortScope

commit eb21c48481631995444138ff9d52d43eee45ead3
Author: xubo245 <60...@qq.com>
Date:   2017-09-13T12:15:39Z

    compaction supprt global_sort

----


---

[GitHub] carbondata issue #1361: [CARBONDATA-1481]Compaction support global sort

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1361
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/180/



---

[GitHub] carbondata pull request #1361: [CARBONDATA-1481] Add test cases for compacti...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/carbondata/pull/1361


---

[GitHub] carbondata issue #1361: [CARBONDATA-1481] Compaction support global sort

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1361
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/250/



---

[GitHub] carbondata pull request #1361: [CARBONDATA-1481] Add test cases for compacti...

Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1361#discussion_r143776859
  
    --- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/datacompaction/CompactionSupportGlobalSortFunctionTest.scala ---
    @@ -0,0 +1,535 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the"License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an"AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.carbondata.spark.testsuite.datacompaction
    +
    +import java.io.{File, FilenameFilter}
    +
    +import org.apache.spark.sql.Row
    +import org.apache.spark.sql.test.util.QueryTest
    +import org.scalatest.{BeforeAndAfterAll, BeforeAndAfterEach}
    +
    +import org.apache.carbondata.core.constants.CarbonCommonConstants
    +import org.apache.carbondata.core.util.CarbonProperties
    +
    +class CompactionSupportGlobalSortFunctionTest extends QueryTest with BeforeAndAfterEach with BeforeAndAfterAll {
    +  val filePath: String = s"$resourcesPath/globalsort"
    +  val file1: String = resourcesPath + "/globalsort/sample1.csv"
    +  val file2: String = resourcesPath + "/globalsort/sample2.csv"
    +  val file3: String = resourcesPath + "/globalsort/sample3.csv"
    +
    +  override def beforeEach {
    +    resetConf
    +    sql("DROP TABLE IF EXISTS compaction_globalsort")
    +    sql(
    +      """
    +        | CREATE TABLE compaction_globalsort(id INT, name STRING, city STRING, age INT)
    +        | STORED BY 'org.apache.carbondata.format'
    +        | TBLPROPERTIES('SORT_COLUMNS'='city,name', 'SORT_SCOPE'='GLOBAL_SORT')
    +      """.stripMargin)
    +
    +    sql("DROP TABLE IF EXISTS carbon_localsort")
    +    sql(
    +      """
    +        | CREATE TABLE carbon_localsort(id INT, name STRING, city STRING, age INT)
    +        | STORED BY 'org.apache.carbondata.format'
    +      """.stripMargin)
    +  }
    +
    +  override def afterEach {
    +    sql("DROP TABLE IF EXISTS compaction_globalsort")
    +    sql("DROP TABLE IF EXISTS carbon_localsort")
    +  }
    +
    +  test("Compaction type: major") {
    +    sql(s"LOAD DATA LOCAL INPATH '$file1' INTO TABLE carbon_localsort")
    +    sql(s"LOAD DATA LOCAL INPATH '$file2' INTO TABLE carbon_localsort")
    +    sql(s"LOAD DATA LOCAL INPATH '$file3' INTO TABLE carbon_localsort")
    +
    +    sql(s"LOAD DATA LOCAL INPATH '$file1' INTO TABLE compaction_globalsort")
    +    sql(s"LOAD DATA LOCAL INPATH '$file2' INTO TABLE compaction_globalsort")
    +    sql(s"LOAD DATA LOCAL INPATH '$file3' INTO TABLE compaction_globalsort")
    +
    +    sql("ALTER TABLE compaction_globalsort COMPACT 'MAJOR'")
    --- End diff --
    
    Can you also configure the parameter for major compaction?
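    
    (One way a test could do that, following the CarbonProperties pattern used in CompactionSupportGlobalSortBigFileTest quoted in this thread; the size value is illustrative:)
    
        // sketch: lower the major compaction size threshold (in MB) so small test segments qualify
        CarbonProperties.getInstance()
          .addProperty(CarbonCommonConstants.MAJOR_COMPACTION_SIZE, "10")
        sql("ALTER TABLE compaction_globalsort COMPACT 'MAJOR'")
        // restore the default afterwards so other tests are not affected
        CarbonProperties.getInstance()
          .addProperty(CarbonCommonConstants.MAJOR_COMPACTION_SIZE,
            CarbonCommonConstants.DEFAULT_MAJOR_COMPACTION_SIZE)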


---

[GitHub] carbondata issue #1361: [CARBONDATA-1481] Compaction support global sort

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1361
  
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/866/



---

[GitHub] carbondata issue #1361: [CARBONDATA-1481]Compaction support global sort

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1361
  
    Build Success with Spark 1.6, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/81/



---

[GitHub] carbondata issue #1361: [CARBONDATA-1481]Compaction support global sort

Posted by xubo245 <gi...@git.apache.org>.
Github user xubo245 commented on the issue:

    https://github.com/apache/carbondata/pull/1361
  
    Is the SDV failure intermittent? I haven't changed any SDV-related code since the SDV build failed, but now it shows Success.


---

[GitHub] carbondata issue #1361: [CARBONDATA-1481] Add test cases for compaction of g...

Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on the issue:

    https://github.com/apache/carbondata/pull/1361
  
    LGTM


---

[GitHub] carbondata issue #1361: [CARBONDATA-1481] Compaction support global sort

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1361
  
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/881/



---

[GitHub] carbondata issue #1361: [CARBONDATA-1481] Compaction support global sort

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1361
  
    Build Success with Spark 1.6, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/103/



---

[GitHub] carbondata issue #1361: [CARBONDATA-1481] Add test cases for compaction of g...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1361
  
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/1029/



---

[GitHub] carbondata issue #1361: [CARBONDATA-1481] Add test cases for compaction of g...

Posted by xubo245 <gi...@git.apache.org>.
Github user xubo245 commented on the issue:

    https://github.com/apache/carbondata/pull/1361
  
    @jackylk I have changed the title and content of this PR.


---

[GitHub] carbondata issue #1361: [CARBONDATA-1481] Add test cases for compaction of g...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1361
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/421/



---

[GitHub] carbondata issue #1361: [CARBONDATA-1481] Add test cases for compaction of g...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1361
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/398/



---

[GitHub] carbondata issue #1361: [CARBONDATA-1481] Compaction support global sort

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1361
  
    Build Success with Spark 1.6, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/124/



---

[GitHub] carbondata issue #1361: [CARBONDATA-1481] Compaction support global sort

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1361
  
    Build Success with Spark 1.6, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/121/



---

[GitHub] carbondata issue #1361: [CARBONDATA-1481] Compaction support global sort

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1361
  
    Build Failed with Spark 1.6, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/118/



---

[GitHub] carbondata issue #1361: [CARBONDATA-1481] Compaction support global sort

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1361
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/235/



---

[GitHub] carbondata issue #1361: [CARBONDATA-1481] Compaction support global sort

Posted by QiangCai <gi...@git.apache.org>.
Github user QiangCai commented on the issue:

    https://github.com/apache/carbondata/pull/1361
  
    @xubo245 it would be better to use the input files in integration/spark-common-test/src/test/resources/compaction.
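    
    (For illustration, the big-file test quoted later in this thread resolves its input under that directory; a minimal sketch, assuming resourcesPath points at the spark-common-test resources:)
    
        val file1 = resourcesPath + "/compaction/fil1.csv"
        sql(s"LOAD DATA LOCAL INPATH '$file1' INTO TABLE compaction_globalsort OPTIONS('header'='false')")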


---

[GitHub] carbondata issue #1361: [CARBONDATA-1481] Compaction support global sort

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1361
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/271/



---

[GitHub] carbondata issue #1361: [CARBONDATA-1481]Compaction support global sort

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1361
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/191/



---

[GitHub] carbondata issue #1361: [CARBONDATA-1481]Compaction support global sort

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1361
  
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/837/



---

[GitHub] carbondata issue #1361: [CARBONDATA-1481] Compaction support global sort

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1361
  
    Build Failed with Spark 1.6, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/111/



---

[GitHub] carbondata issue #1361: [CARBONDATA-1481] Compaction support global sort

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1361
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/245/



---

[GitHub] carbondata issue #1361: [CARBONDATA-1481] Compaction support global sort

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1361
  
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/876/



---

[GitHub] carbondata pull request #1361: [CARBONDATA-1481] Add test cases for compacti...

Posted by xubo245 <gi...@git.apache.org>.
Github user xubo245 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1361#discussion_r143898738
  
    --- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/datacompaction/CompactionSupportGlobalSortBigFileTest.scala ---
    @@ -0,0 +1,136 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.carbondata.spark.testsuite.datacompaction
    +
    +import java.io.{File, PrintWriter}
    +
    +import scala.util.Random
    +
    +import org.apache.spark.sql.test.util.QueryTest
    +import org.scalatest.{BeforeAndAfterAll, BeforeAndAfterEach}
    +
    +import org.apache.carbondata.core.constants.CarbonCommonConstants
    +import org.apache.carbondata.core.util.CarbonProperties
    +
    +class CompactionSupportGlobalSortBigFileTest extends QueryTest with BeforeAndAfterEach with BeforeAndAfterAll {
    +  val file1 = resourcesPath + "/compaction/fil1.csv"
    +  val file2 = resourcesPath + "/compaction/fil2.csv"
    +  val file3 = resourcesPath + "/compaction/fil3.csv"
    +  val file4 = resourcesPath + "/compaction/fil4.csv"
    +  val file5 = resourcesPath + "/compaction/fil5.csv"
    +
    +  override protected def beforeAll(): Unit = {
    +    resetConf("10")
    +    //n should be about 5000000 of reset if size is default 1024
    +    val n = 150000
    +    CompactionSupportGlobalSortBigFileTest.createFile(file1, n, 0)
    +    CompactionSupportGlobalSortBigFileTest.createFile(file2, n * 4, n)
    +    CompactionSupportGlobalSortBigFileTest.createFile(file3, n * 3, n * 5)
    +    CompactionSupportGlobalSortBigFileTest.createFile(file4, n * 2, n * 8)
    +    CompactionSupportGlobalSortBigFileTest.createFile(file5, n * 2, n * 13)
    +  }
    +
    +  override protected def afterAll(): Unit = {
    +    CompactionSupportGlobalSortBigFileTest.deleteFile(file1)
    +    CompactionSupportGlobalSortBigFileTest.deleteFile(file2)
    +    CompactionSupportGlobalSortBigFileTest.deleteFile(file3)
    +    CompactionSupportGlobalSortBigFileTest.deleteFile(file4)
    +    CompactionSupportGlobalSortBigFileTest.deleteFile(file5)
    +    resetConf(CarbonCommonConstants.DEFAULT_MAJOR_COMPACTION_SIZE)
    +  }
    +
    +  override def beforeEach {
    +    sql("DROP TABLE IF EXISTS compaction_globalsort")
    +    sql(
    +      """
    +        | CREATE TABLE compaction_globalsort(id INT, name STRING, city STRING, age INT)
    +        | STORED BY 'org.apache.carbondata.format'
    +        | TBLPROPERTIES('SORT_COLUMNS'='city,name', 'SORT_SCOPE'='GLOBAL_SORT')
    +      """.stripMargin)
    +
    +    sql("DROP TABLE IF EXISTS carbon_localsort")
    +    sql(
    +      """
    +        | CREATE TABLE carbon_localsort(id INT, name STRING, city STRING, age INT)
    +        | STORED BY 'org.apache.carbondata.format'
    +      """.stripMargin)
    +  }
    +
    +  override def afterEach {
    +    sql("DROP TABLE IF EXISTS compaction_globalsort")
    +    sql("DROP TABLE IF EXISTS carbon_localsort")
    +  }
    +
    +  test("Compaction major:  segments size is bigger than default compaction size") {
    +    sql(s"LOAD DATA LOCAL INPATH '$file1' INTO TABLE carbon_localsort OPTIONS('header'='false')")
    +    sql(s"LOAD DATA LOCAL INPATH '$file2' INTO TABLE carbon_localsort OPTIONS('header'='false')")
    +    sql(s"LOAD DATA LOCAL INPATH '$file3' INTO TABLE carbon_localsort OPTIONS('header'='false')")
    +    sql(s"LOAD DATA LOCAL INPATH '$file4' INTO TABLE carbon_localsort OPTIONS('header'='false')")
    +    sql(s"LOAD DATA LOCAL INPATH '$file5' INTO TABLE carbon_localsort OPTIONS('header'='false')")
    +
    +    sql(s"LOAD DATA LOCAL INPATH '$file1' INTO TABLE compaction_globalsort OPTIONS('header'='false')")
    +    sql(s"LOAD DATA LOCAL INPATH '$file2' INTO TABLE compaction_globalsort OPTIONS('header'='false')")
    +    sql(s"LOAD DATA LOCAL INPATH '$file3' INTO TABLE compaction_globalsort OPTIONS('header'='false')")
    +    sql(s"LOAD DATA LOCAL INPATH '$file4' INTO TABLE compaction_globalsort OPTIONS('header'='false')")
    +    sql(s"LOAD DATA LOCAL INPATH '$file5' INTO TABLE compaction_globalsort OPTIONS('header'='false')")
    +
    +    sql("ALTER TABLE compaction_globalsort COMPACT 'MAJOR'")
    +    checkExistence(sql("DESCRIBE FORMATTED compaction_globalsort"), true, "global_sort")
    +
    +    checkExistence(sql("DESCRIBE FORMATTED compaction_globalsort"), true, "city,name")
    +
    +    checkExistence(sql("SHOW SEGMENTS FOR TABLE compaction_globalsort"), true, "Compacted")
    +
    +    checkAnswer(sql("select count(*) from compaction_globalsort"),sql("select count(*) from carbon_localsort"))
    +    val segments = sql("SHOW SEGMENTS FOR TABLE compaction_globalsort")
    +    val SegmentSequenceIds = segments.collect().map { each => (each.toSeq) (0) }
    +    assert(SegmentSequenceIds.contains("0.1"))
    +  }
    +
    +  private def resetConf(size:String) {
    +    CarbonProperties.getInstance()
    +      .addProperty(CarbonCommonConstants.MAJOR_COMPACTION_SIZE, size)
    +  }
    +}
    +
    +object CompactionSupportGlobalSortBigFileTest {
    +  def createFile(fileName: String, line: Int = 10000, start: Int = 0): Boolean = {
    +    try {
    +      val write = new PrintWriter(fileName);
    +      for (i <- start until (start + line)) {
    +        write.println(i + "," + "n" + i + "," + "c" + Random.nextInt(line) + "," + Random.nextInt(80))
    +      }
    +      write.close()
    +    } catch {
    +      case _: Exception => return false
    +    }
    +    return true
    --- End diff --
    
    Ok


---

[GitHub] carbondata pull request #1361: [CARBONDATA-1481] Add test cases for compacti...

Posted by xubo245 <gi...@git.apache.org>.
Github user xubo245 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1361#discussion_r143902415
  
    --- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/datacompaction/CompactionSupportGlobalSortParameterTest.scala ---
    @@ -0,0 +1,298 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the"License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an"AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.carbondata.spark.testsuite.datacompaction
    +
    +import java.io.{File, FilenameFilter}
    +
    +import org.apache.carbondata.core.constants.CarbonCommonConstants
    +import org.apache.carbondata.core.util.CarbonProperties
    +import org.apache.spark.sql.Row
    +import org.apache.spark.sql.test.util.QueryTest
    +import org.scalatest.{BeforeAndAfterAll, BeforeAndAfterEach}
    +
    +class CompactionSupportGlobalSortParameterTest extends QueryTest with BeforeAndAfterEach with BeforeAndAfterAll {
    +  val filePath: String = s"$resourcesPath/globalsort"
    +  val file1: String = resourcesPath + "/globalsort/sample1.csv"
    +  val file2: String = resourcesPath + "/globalsort/sample2.csv"
    +  val file3: String = resourcesPath + "/globalsort/sample3.csv"
    +
    +  override def beforeEach {
    +    resetConf
    +    sql("DROP TABLE IF EXISTS compaction_globalsort")
    +    sql(
    +      """
    +        | CREATE TABLE compaction_globalsort(id INT, name STRING, city STRING, age INT)
    +        | STORED BY 'org.apache.carbondata.format'
    +        | TBLPROPERTIES('SORT_COLUMNS'='city,name', 'SORT_SCOPE'='GLOBAL_SORT')
    +      """.stripMargin)
    +
    +    sql("DROP TABLE IF EXISTS carbon_localsort")
    +    sql(
    +      """
    +        | CREATE TABLE carbon_localsort(id INT, name STRING, city STRING, age INT)
    +        | STORED BY 'org.apache.carbondata.format'
    +      """.stripMargin)
    +  }
    +
    +  override def afterEach {
    +    sql("DROP TABLE IF EXISTS compaction_globalsort")
    +    sql("DROP TABLE IF EXISTS carbon_localsort")
    +  }
    +
    +  test("ENABLE_AUTO_LOAD_MERGE: false") {
    +    CarbonProperties.getInstance().addProperty(CarbonCommonConstants.ENABLE_AUTO_LOAD_MERGE, "false")
    +    for (i <- 0 until 2) {
    +      sql(s"LOAD DATA LOCAL INPATH '$file1' INTO TABLE carbon_localsort")
    +      sql(s"LOAD DATA LOCAL INPATH '$file2' INTO TABLE carbon_localsort")
    +      sql(s"LOAD DATA LOCAL INPATH '$file3' INTO TABLE carbon_localsort")
    +
    +      sql(s"LOAD DATA LOCAL INPATH '$file1' INTO TABLE compaction_globalsort OPTIONS('GLOBAL_SORT_PARTITIONS'='2')")
    +      sql(s"LOAD DATA LOCAL INPATH '$file2' INTO TABLE compaction_globalsort OPTIONS('GLOBAL_SORT_PARTITIONS'='2')")
    +      sql(s"LOAD DATA LOCAL INPATH '$file3' INTO TABLE compaction_globalsort OPTIONS('GLOBAL_SORT_PARTITIONS'='2')")
    +    }
    +    checkExistence(sql("DESCRIBE FORMATTED compaction_globalsort"), true, "global_sort")
    +
    +    checkExistence(sql("DESCRIBE FORMATTED compaction_globalsort"), true, "city,name")
    +
    +    sql("delete from table compaction_globalsort where SEGMENT.ID in (1,2,3)")
    +    sql("delete from table carbon_localsort where SEGMENT.ID in (1,2,3)")
    +    sql("ALTER TABLE compaction_globalsort COMPACT 'minor'")
    +    checkExistence(sql("SHOW SEGMENTS FOR TABLE compaction_globalsort"), false, "Compacted")
    +
    +    val segments = sql("SHOW SEGMENTS FOR TABLE compaction_globalsort")
    +    val SegmentSequenceIds = segments.collect().map { each => (each.toSeq) (0) }
    +    assert(!SegmentSequenceIds.contains("0.1"))
    +    assert(SegmentSequenceIds.length == 6)
    +
    +    checkAnswer(sql("SELECT COUNT(*) FROM compaction_globalsort"), Seq(Row(12)))
    +
    +    checkAnswer(sql("SELECT * FROM compaction_globalsort"),
    +      sql("SELECT * FROM carbon_localsort"))
    +
    +    checkExistence(sql("SHOW SEGMENTS FOR TABLE compaction_globalsort"), true, "Success")
    +    checkExistence(sql("SHOW SEGMENTS FOR TABLE compaction_globalsort"), true, "Marked for Delete")
    +    CarbonProperties.getInstance().addProperty(CarbonCommonConstants.ENABLE_AUTO_LOAD_MERGE,
    +      CarbonCommonConstants.DEFAULT_ENABLE_AUTO_LOAD_MERGE)
    +  }
    +
    +  test("ENABLE_AUTO_LOAD_MERGE: true") {
    +    CarbonProperties.getInstance().addProperty(CarbonCommonConstants.ENABLE_AUTO_LOAD_MERGE, "true")
    +    for (i <- 0 until 2) {
    +      sql(s"LOAD DATA LOCAL INPATH '$file1' INTO TABLE carbon_localsort")
    +      sql(s"LOAD DATA LOCAL INPATH '$file2' INTO TABLE carbon_localsort")
    +      sql(s"LOAD DATA LOCAL INPATH '$file3' INTO TABLE carbon_localsort")
    +
    +      sql(s"LOAD DATA LOCAL INPATH '$file1' INTO TABLE compaction_globalsort OPTIONS('GLOBAL_SORT_PARTITIONS'='2')")
    +      sql(s"LOAD DATA LOCAL INPATH '$file2' INTO TABLE compaction_globalsort OPTIONS('GLOBAL_SORT_PARTITIONS'='2')")
    +      sql(s"LOAD DATA LOCAL INPATH '$file3' INTO TABLE compaction_globalsort OPTIONS('GLOBAL_SORT_PARTITIONS'='2')")
    +    }
    +    checkExistence(sql("DESCRIBE FORMATTED compaction_globalsort"), true, "global_sort")
    +
    +    checkExistence(sql("DESCRIBE FORMATTED compaction_globalsort"), true, "city,name")
    +
    +    checkExistence(sql("SHOW SEGMENTS FOR TABLE compaction_globalsort"), true, "Compacted")
    +
    +    val segments = sql("SHOW SEGMENTS FOR TABLE compaction_globalsort")
    +    val SegmentSequenceIds = segments.collect().map { each => (each.toSeq) (0) }
    +    assert(SegmentSequenceIds.contains("0.1"))
    +    assert(SegmentSequenceIds.length == 7)
    --- End diff --
    
    // Loaded 6 times, producing 6 segments; with auto load merge enabled, 4 of them are
    // compacted into one additional segment (4 is the default minor compaction threshold),
    // so SHOW SEGMENTS lists 7 entries in total.
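    
    (For context, the "4" above comes from the minor compaction threshold; a hedged sketch of how a test could set it explicitly — the constant name COMPACTION_SEGMENT_LEVEL_THRESHOLD is assumed to match CarbonCommonConstants in this tree:)
    
        // assumed constant name; default is "4,3": level 1 merges every 4 loaded segments,
        // level 2 merges every 3 level-1 compacted segments
        CarbonProperties.getInstance()
          .addProperty(CarbonCommonConstants.COMPACTION_SEGMENT_LEVEL_THRESHOLD, "4,3")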


---

[GitHub] carbondata pull request #1361: [CARBONDATA-1481] Add test cases for compacti...

Posted by xubo245 <gi...@git.apache.org>.
Github user xubo245 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1361#discussion_r143902837
  
    --- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/datacompaction/CompactionSupportGlobalSortFunctionTest.scala ---
    @@ -0,0 +1,535 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the"License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an"AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.carbondata.spark.testsuite.datacompaction
    +
    +import java.io.{File, FilenameFilter}
    +
    +import org.apache.spark.sql.Row
    +import org.apache.spark.sql.test.util.QueryTest
    +import org.scalatest.{BeforeAndAfterAll, BeforeAndAfterEach}
    +
    +import org.apache.carbondata.core.constants.CarbonCommonConstants
    +import org.apache.carbondata.core.util.CarbonProperties
    +
    +class CompactionSupportGlobalSortFunctionTest extends QueryTest with BeforeAndAfterEach with BeforeAndAfterAll {
    +  val filePath: String = s"$resourcesPath/globalsort"
    +  val file1: String = resourcesPath + "/globalsort/sample1.csv"
    +  val file2: String = resourcesPath + "/globalsort/sample2.csv"
    +  val file3: String = resourcesPath + "/globalsort/sample3.csv"
    +
    +  override def beforeEach {
    +    resetConf
    +    sql("DROP TABLE IF EXISTS compaction_globalsort")
    +    sql(
    +      """
    +        | CREATE TABLE compaction_globalsort(id INT, name STRING, city STRING, age INT)
    +        | STORED BY 'org.apache.carbondata.format'
    +        | TBLPROPERTIES('SORT_COLUMNS'='city,name', 'SORT_SCOPE'='GLOBAL_SORT')
    +      """.stripMargin)
    +
    +    sql("DROP TABLE IF EXISTS carbon_localsort")
    +    sql(
    +      """
    +        | CREATE TABLE carbon_localsort(id INT, name STRING, city STRING, age INT)
    +        | STORED BY 'org.apache.carbondata.format'
    +      """.stripMargin)
    +  }
    +
    +  override def afterEach {
    +    sql("DROP TABLE IF EXISTS compaction_globalsort")
    +    sql("DROP TABLE IF EXISTS carbon_localsort")
    +  }
    +
    +  test("Compaction type: major") {
    +    sql(s"LOAD DATA LOCAL INPATH '$file1' INTO TABLE carbon_localsort")
    +    sql(s"LOAD DATA LOCAL INPATH '$file2' INTO TABLE carbon_localsort")
    +    sql(s"LOAD DATA LOCAL INPATH '$file3' INTO TABLE carbon_localsort")
    +
    +    sql(s"LOAD DATA LOCAL INPATH '$file1' INTO TABLE compaction_globalsort")
    +    sql(s"LOAD DATA LOCAL INPATH '$file2' INTO TABLE compaction_globalsort")
    +    sql(s"LOAD DATA LOCAL INPATH '$file3' INTO TABLE compaction_globalsort")
    +
    +    sql("ALTER TABLE compaction_globalsort COMPACT 'MAJOR'")
    --- End diff --
    
    I added some parameter tests for major compaction in CompactionSupportGlobalSortParameterTest.scala.


---

[GitHub] carbondata issue #1361: [CARBONDATA-1481] Compaction support global sort

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1361
  
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/858/



---

[GitHub] carbondata issue #1361: [CARBONDATA-1481] Add test cases for compaction of g...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1361
  
    Build Success with Spark 1.6, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/301/



---

[GitHub] carbondata issue #1361: [CARBONDATA-1481] Add test cases for compaction of g...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1361
  
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/1055/



---

[GitHub] carbondata issue #1361: [CARBONDATA-1481]Compaction support global sort

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1361
  
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/810/



---

[GitHub] carbondata issue #1361: [CARBONDATA-1481] Compaction support global sort

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1361
  
    Build Success with Spark 1.6, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/126/



---

[GitHub] carbondata issue #1361: [CARBONDATA-1481] Compaction support global sort

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1361
  
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/879/



---

[GitHub] carbondata issue #1361: [CARBONDATA-1481]Compaction support global sort

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1361
  
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/175/



---

[GitHub] carbondata issue #1361: [CARBONDATA-1481] Compaction support global sort

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1361
  
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/873/



---

[GitHub] carbondata issue #1361: [CARBONDATA-1481] Add test cases for compaction of g...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1361
  
    Build Failed with Spark 1.6, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/297/



---

[GitHub] carbondata issue #1361: [CARBONDATA-1481]Compaction support global sort

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1361
  
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/836/



---

[GitHub] carbondata issue #1361: [CARBONDATA-1481] Add test cases for compaction of g...

Posted by xubo245 <gi...@git.apache.org>.
Github user xubo245 commented on the issue:

    https://github.com/apache/carbondata/pull/1361
  
    @jackylk CI passed.


---

[GitHub] carbondata pull request #1361: [CARBONDATA-1481] Add test cases for compacti...

Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1361#discussion_r143775848
  
    --- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/datacompaction/CompactionSupportGlobalSortBigFileTest.scala ---
    @@ -0,0 +1,136 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.carbondata.spark.testsuite.datacompaction
    +
    +import java.io.{File, PrintWriter}
    +
    +import scala.util.Random
    +
    +import org.apache.spark.sql.test.util.QueryTest
    +import org.scalatest.{BeforeAndAfterAll, BeforeAndAfterEach}
    +
    +import org.apache.carbondata.core.constants.CarbonCommonConstants
    +import org.apache.carbondata.core.util.CarbonProperties
    +
    +class CompactionSupportGlobalSortBigFileTest extends QueryTest with BeforeAndAfterEach with BeforeAndAfterAll {
    +  val file1 = resourcesPath + "/compaction/fil1.csv"
    +  val file2 = resourcesPath + "/compaction/fil2.csv"
    +  val file3 = resourcesPath + "/compaction/fil3.csv"
    +  val file4 = resourcesPath + "/compaction/fil4.csv"
    +  val file5 = resourcesPath + "/compaction/fil5.csv"
    +
    +  override protected def beforeAll(): Unit = {
    +    resetConf("10")
    +    //n should be about 5000000 of reset if size is default 1024
    +    val n = 150000
    +    CompactionSupportGlobalSortBigFileTest.createFile(file1, n, 0)
    +    CompactionSupportGlobalSortBigFileTest.createFile(file2, n * 4, n)
    +    CompactionSupportGlobalSortBigFileTest.createFile(file3, n * 3, n * 5)
    +    CompactionSupportGlobalSortBigFileTest.createFile(file4, n * 2, n * 8)
    +    CompactionSupportGlobalSortBigFileTest.createFile(file5, n * 2, n * 13)
    +  }
    +
    +  override protected def afterAll(): Unit = {
    +    CompactionSupportGlobalSortBigFileTest.deleteFile(file1)
    +    CompactionSupportGlobalSortBigFileTest.deleteFile(file2)
    +    CompactionSupportGlobalSortBigFileTest.deleteFile(file3)
    +    CompactionSupportGlobalSortBigFileTest.deleteFile(file4)
    +    CompactionSupportGlobalSortBigFileTest.deleteFile(file5)
    +    resetConf(CarbonCommonConstants.DEFAULT_MAJOR_COMPACTION_SIZE)
    +  }
    +
    +  override def beforeEach {
    +    sql("DROP TABLE IF EXISTS compaction_globalsort")
    +    sql(
    +      """
    +        | CREATE TABLE compaction_globalsort(id INT, name STRING, city STRING, age INT)
    +        | STORED BY 'org.apache.carbondata.format'
    +        | TBLPROPERTIES('SORT_COLUMNS'='city,name', 'SORT_SCOPE'='GLOBAL_SORT')
    +      """.stripMargin)
    +
    +    sql("DROP TABLE IF EXISTS carbon_localsort")
    +    sql(
    +      """
    +        | CREATE TABLE carbon_localsort(id INT, name STRING, city STRING, age INT)
    +        | STORED BY 'org.apache.carbondata.format'
    +      """.stripMargin)
    +  }
    +
    +  override def afterEach {
    +    sql("DROP TABLE IF EXISTS compaction_globalsort")
    +    sql("DROP TABLE IF EXISTS carbon_localsort")
    +  }
    +
    +  test("Compaction major:  segments size is bigger than default compaction size") {
    +    sql(s"LOAD DATA LOCAL INPATH '$file1' INTO TABLE carbon_localsort OPTIONS('header'='false')")
    +    sql(s"LOAD DATA LOCAL INPATH '$file2' INTO TABLE carbon_localsort OPTIONS('header'='false')")
    +    sql(s"LOAD DATA LOCAL INPATH '$file3' INTO TABLE carbon_localsort OPTIONS('header'='false')")
    +    sql(s"LOAD DATA LOCAL INPATH '$file4' INTO TABLE carbon_localsort OPTIONS('header'='false')")
    +    sql(s"LOAD DATA LOCAL INPATH '$file5' INTO TABLE carbon_localsort OPTIONS('header'='false')")
    +
    +    sql(s"LOAD DATA LOCAL INPATH '$file1' INTO TABLE compaction_globalsort OPTIONS('header'='false')")
    +    sql(s"LOAD DATA LOCAL INPATH '$file2' INTO TABLE compaction_globalsort OPTIONS('header'='false')")
    +    sql(s"LOAD DATA LOCAL INPATH '$file3' INTO TABLE compaction_globalsort OPTIONS('header'='false')")
    +    sql(s"LOAD DATA LOCAL INPATH '$file4' INTO TABLE compaction_globalsort OPTIONS('header'='false')")
    +    sql(s"LOAD DATA LOCAL INPATH '$file5' INTO TABLE compaction_globalsort OPTIONS('header'='false')")
    +
    +    sql("ALTER TABLE compaction_globalsort COMPACT 'MAJOR'")
    +    checkExistence(sql("DESCRIBE FORMATTED compaction_globalsort"), true, "global_sort")
    +
    +    checkExistence(sql("DESCRIBE FORMATTED compaction_globalsort"), true, "city,name")
    +
    +    checkExistence(sql("SHOW SEGMENTS FOR TABLE compaction_globalsort"), true, "Compacted")
    +
    +    checkAnswer(sql("select count(*) from compaction_globalsort"),sql("select count(*) from carbon_localsort"))
    +    val segments = sql("SHOW SEGMENTS FOR TABLE compaction_globalsort")
    +    val SegmentSequenceIds = segments.collect().map { each => (each.toSeq) (0) }
    +    assert(SegmentSequenceIds.contains("0.1"))
    +  }
    +
    +  private def resetConf(size:String) {
    +    CarbonProperties.getInstance()
    +      .addProperty(CarbonCommonConstants.MAJOR_COMPACTION_SIZE, size)
    +  }
    +}
    +
    +object CompactionSupportGlobalSortBigFileTest {
    +  def createFile(fileName: String, line: Int = 10000, start: Int = 0): Boolean = {
    +    try {
    +      val write = new PrintWriter(fileName);
    +      for (i <- start until (start + line)) {
    +        write.println(i + "," + "n" + i + "," + "c" + Random.nextInt(line) + "," + Random.nextInt(80))
    +      }
    +      write.close()
    +    } catch {
    +      case _: Exception => return false
    +    }
    +    return true
    --- End diff --
    
    remove `return`
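    
    (For reference, a sketch of createFile written without explicit `return`, using the try/catch expression value directly; it assumes the same java.io.PrintWriter and scala.util.Random imports as the quoted file:)
    
        def createFile(fileName: String, line: Int = 10000, start: Int = 0): Boolean = {
          try {
            val write = new PrintWriter(fileName)
            for (i <- start until (start + line)) {
              write.println(i + "," + "n" + i + "," + "c" + Random.nextInt(line) + "," + Random.nextInt(80))
            }
            write.close()
            true  // the try block's value is the method's result
          } catch {
            case _: Exception => false
          }
        }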


---

[GitHub] carbondata issue #1361: [CARBONDATA-1481] Compaction support global sort

Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on the issue:

    https://github.com/apache/carbondata/pull/1361
  
    Please change the title to "Add test cases for compaction of global sorted segment", and mention that this PR only adds test cases.


---

[GitHub] carbondata issue #1361: [CARBONDATA-1481] Compaction support global sort

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1361
  
    Build Success with Spark 1.6, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/147/



---

[GitHub] carbondata issue #1361: [CARBONDATA-1481]Compaction support global sort

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1361
  
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/822/



---

[GitHub] carbondata pull request #1361: [CARBONDATA-1481] Add test cases for compacti...

Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1361#discussion_r143777369
  
    --- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/datacompaction/CompactionSupportGlobalSortParameterTest.scala ---
    @@ -0,0 +1,298 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the"License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an"AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.carbondata.spark.testsuite.datacompaction
    +
    +import java.io.{File, FilenameFilter}
    +
    +import org.apache.carbondata.core.constants.CarbonCommonConstants
    +import org.apache.carbondata.core.util.CarbonProperties
    +import org.apache.spark.sql.Row
    +import org.apache.spark.sql.test.util.QueryTest
    +import org.scalatest.{BeforeAndAfterAll, BeforeAndAfterEach}
    +
    +class CompactionSupportGlobalSortParameterTest extends QueryTest with BeforeAndAfterEach with BeforeAndAfterAll {
    +  val filePath: String = s"$resourcesPath/globalsort"
    +  val file1: String = resourcesPath + "/globalsort/sample1.csv"
    +  val file2: String = resourcesPath + "/globalsort/sample2.csv"
    +  val file3: String = resourcesPath + "/globalsort/sample3.csv"
    +
    +  override def beforeEach {
    +    resetConf
    +    sql("DROP TABLE IF EXISTS compaction_globalsort")
    +    sql(
    +      """
    +        | CREATE TABLE compaction_globalsort(id INT, name STRING, city STRING, age INT)
    +        | STORED BY 'org.apache.carbondata.format'
    +        | TBLPROPERTIES('SORT_COLUMNS'='city,name', 'SORT_SCOPE'='GLOBAL_SORT')
    +      """.stripMargin)
    +
    +    sql("DROP TABLE IF EXISTS carbon_localsort")
    +    sql(
    +      """
    +        | CREATE TABLE carbon_localsort(id INT, name STRING, city STRING, age INT)
    +        | STORED BY 'org.apache.carbondata.format'
    +      """.stripMargin)
    +  }
    +
    +  override def afterEach {
    +    sql("DROP TABLE IF EXISTS compaction_globalsort")
    +    sql("DROP TABLE IF EXISTS carbon_localsort")
    +  }
    +
    +  test("ENABLE_AUTO_LOAD_MERGE: false") {
    +    CarbonProperties.getInstance().addProperty(CarbonCommonConstants.ENABLE_AUTO_LOAD_MERGE, "false")
    +    for (i <- 0 until 2) {
    +      sql(s"LOAD DATA LOCAL INPATH '$file1' INTO TABLE carbon_localsort")
    +      sql(s"LOAD DATA LOCAL INPATH '$file2' INTO TABLE carbon_localsort")
    +      sql(s"LOAD DATA LOCAL INPATH '$file3' INTO TABLE carbon_localsort")
    +
    +      sql(s"LOAD DATA LOCAL INPATH '$file1' INTO TABLE compaction_globalsort OPTIONS('GLOBAL_SORT_PARTITIONS'='2')")
    +      sql(s"LOAD DATA LOCAL INPATH '$file2' INTO TABLE compaction_globalsort OPTIONS('GLOBAL_SORT_PARTITIONS'='2')")
    +      sql(s"LOAD DATA LOCAL INPATH '$file3' INTO TABLE compaction_globalsort OPTIONS('GLOBAL_SORT_PARTITIONS'='2')")
    +    }
    +    checkExistence(sql("DESCRIBE FORMATTED compaction_globalsort"), true, "global_sort")
    +
    +    checkExistence(sql("DESCRIBE FORMATTED compaction_globalsort"), true, "city,name")
    +
    +    sql("delete from table compaction_globalsort where SEGMENT.ID in (1,2,3)")
    +    sql("delete from table carbon_localsort where SEGMENT.ID in (1,2,3)")
    +    sql("ALTER TABLE compaction_globalsort COMPACT 'minor'")
    +    checkExistence(sql("SHOW SEGMENTS FOR TABLE compaction_globalsort"), false, "Compacted")
    +
    +    val segments = sql("SHOW SEGMENTS FOR TABLE compaction_globalsort")
    +    val SegmentSequenceIds = segments.collect().map { each => (each.toSeq) (0) }
    +    assert(!SegmentSequenceIds.contains("0.1"))
    +    assert(SegmentSequenceIds.length == 6)
    +
    +    checkAnswer(sql("SELECT COUNT(*) FROM compaction_globalsort"), Seq(Row(12)))
    +
    +    checkAnswer(sql("SELECT * FROM compaction_globalsort"),
    +      sql("SELECT * FROM carbon_localsort"))
    +
    +    checkExistence(sql("SHOW SEGMENTS FOR TABLE compaction_globalsort"), true, "Success")
    +    checkExistence(sql("SHOW SEGMENTS FOR TABLE compaction_globalsort"), true, "Marked for Delete")
    +    CarbonProperties.getInstance().addProperty(CarbonCommonConstants.ENABLE_AUTO_LOAD_MERGE,
    +      CarbonCommonConstants.DEFAULT_ENABLE_AUTO_LOAD_MERGE)
    +  }
    +
    +  test("ENABLE_AUTO_LOAD_MERGE: true") {
    +    CarbonProperties.getInstance().addProperty(CarbonCommonConstants.ENABLE_AUTO_LOAD_MERGE, "true")
    +    for (i <- 0 until 2) {
    +      sql(s"LOAD DATA LOCAL INPATH '$file1' INTO TABLE carbon_localsort")
    +      sql(s"LOAD DATA LOCAL INPATH '$file2' INTO TABLE carbon_localsort")
    +      sql(s"LOAD DATA LOCAL INPATH '$file3' INTO TABLE carbon_localsort")
    +
    +      sql(s"LOAD DATA LOCAL INPATH '$file1' INTO TABLE compaction_globalsort OPTIONS('GLOBAL_SORT_PARTITIONS'='2')")
    +      sql(s"LOAD DATA LOCAL INPATH '$file2' INTO TABLE compaction_globalsort OPTIONS('GLOBAL_SORT_PARTITIONS'='2')")
    +      sql(s"LOAD DATA LOCAL INPATH '$file3' INTO TABLE compaction_globalsort OPTIONS('GLOBAL_SORT_PARTITIONS'='2')")
    +    }
    +    checkExistence(sql("DESCRIBE FORMATTED compaction_globalsort"), true, "global_sort")
    +
    +    checkExistence(sql("DESCRIBE FORMATTED compaction_globalsort"), true, "city,name")
    +
    +    checkExistence(sql("SHOW SEGMENTS FOR TABLE compaction_globalsort"), true, "Compacted")
    +
    +    val segments = sql("SHOW SEGMENTS FOR TABLE compaction_globalsort")
    +    val SegmentSequenceIds = segments.collect().map { each => (each.toSeq) (0) }
    +    assert(SegmentSequenceIds.contains("0.1"))
    +    assert(SegmentSequenceIds.length == 7)
    --- End diff --
    
    why is it 7?


---

[GitHub] carbondata pull request #1361: [CARBONDATA-1481] Add test cases for compacti...

Posted by xubo245 <gi...@git.apache.org>.
Github user xubo245 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1361#discussion_r143898742
  
    --- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/datacompaction/CompactionSupportGlobalSortBigFileTest.scala ---
    @@ -0,0 +1,136 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.carbondata.spark.testsuite.datacompaction
    +
    +import java.io.{File, PrintWriter}
    +
    +import scala.util.Random
    +
    +import org.apache.spark.sql.test.util.QueryTest
    +import org.scalatest.{BeforeAndAfterAll, BeforeAndAfterEach}
    +
    +import org.apache.carbondata.core.constants.CarbonCommonConstants
    +import org.apache.carbondata.core.util.CarbonProperties
    +
    +class CompactionSupportGlobalSortBigFileTest extends QueryTest with BeforeAndAfterEach with BeforeAndAfterAll {
    +  val file1 = resourcesPath + "/compaction/fil1.csv"
    +  val file2 = resourcesPath + "/compaction/fil2.csv"
    +  val file3 = resourcesPath + "/compaction/fil3.csv"
    +  val file4 = resourcesPath + "/compaction/fil4.csv"
    +  val file5 = resourcesPath + "/compaction/fil5.csv"
    +
    +  override protected def beforeAll(): Unit = {
    +    resetConf("10")
    +    //n should be about 5000000 of reset if size is default 1024
    +    val n = 150000
    +    CompactionSupportGlobalSortBigFileTest.createFile(file1, n, 0)
    +    CompactionSupportGlobalSortBigFileTest.createFile(file2, n * 4, n)
    +    CompactionSupportGlobalSortBigFileTest.createFile(file3, n * 3, n * 5)
    +    CompactionSupportGlobalSortBigFileTest.createFile(file4, n * 2, n * 8)
    +    CompactionSupportGlobalSortBigFileTest.createFile(file5, n * 2, n * 13)
    +  }
    +
    +  override protected def afterAll(): Unit = {
    +    CompactionSupportGlobalSortBigFileTest.deleteFile(file1)
    +    CompactionSupportGlobalSortBigFileTest.deleteFile(file2)
    +    CompactionSupportGlobalSortBigFileTest.deleteFile(file3)
    +    CompactionSupportGlobalSortBigFileTest.deleteFile(file4)
    +    CompactionSupportGlobalSortBigFileTest.deleteFile(file5)
    +    resetConf(CarbonCommonConstants.DEFAULT_MAJOR_COMPACTION_SIZE)
    +  }
    +
    +  override def beforeEach {
    +    sql("DROP TABLE IF EXISTS compaction_globalsort")
    +    sql(
    +      """
    +        | CREATE TABLE compaction_globalsort(id INT, name STRING, city STRING, age INT)
    +        | STORED BY 'org.apache.carbondata.format'
    +        | TBLPROPERTIES('SORT_COLUMNS'='city,name', 'SORT_SCOPE'='GLOBAL_SORT')
    +      """.stripMargin)
    +
    +    sql("DROP TABLE IF EXISTS carbon_localsort")
    +    sql(
    +      """
    +        | CREATE TABLE carbon_localsort(id INT, name STRING, city STRING, age INT)
    +        | STORED BY 'org.apache.carbondata.format'
    +      """.stripMargin)
    +  }
    +
    +  override def afterEach {
    +    sql("DROP TABLE IF EXISTS compaction_globalsort")
    +    sql("DROP TABLE IF EXISTS carbon_localsort")
    +  }
    +
    +  test("Compaction major:  segments size is bigger than default compaction size") {
    +    sql(s"LOAD DATA LOCAL INPATH '$file1' INTO TABLE carbon_localsort OPTIONS('header'='false')")
    +    sql(s"LOAD DATA LOCAL INPATH '$file2' INTO TABLE carbon_localsort OPTIONS('header'='false')")
    +    sql(s"LOAD DATA LOCAL INPATH '$file3' INTO TABLE carbon_localsort OPTIONS('header'='false')")
    +    sql(s"LOAD DATA LOCAL INPATH '$file4' INTO TABLE carbon_localsort OPTIONS('header'='false')")
    +    sql(s"LOAD DATA LOCAL INPATH '$file5' INTO TABLE carbon_localsort OPTIONS('header'='false')")
    +
    +    sql(s"LOAD DATA LOCAL INPATH '$file1' INTO TABLE compaction_globalsort OPTIONS('header'='false')")
    +    sql(s"LOAD DATA LOCAL INPATH '$file2' INTO TABLE compaction_globalsort OPTIONS('header'='false')")
    +    sql(s"LOAD DATA LOCAL INPATH '$file3' INTO TABLE compaction_globalsort OPTIONS('header'='false')")
    +    sql(s"LOAD DATA LOCAL INPATH '$file4' INTO TABLE compaction_globalsort OPTIONS('header'='false')")
    +    sql(s"LOAD DATA LOCAL INPATH '$file5' INTO TABLE compaction_globalsort OPTIONS('header'='false')")
    +
    +    sql("ALTER TABLE compaction_globalsort COMPACT 'MAJOR'")
    +    checkExistence(sql("DESCRIBE FORMATTED compaction_globalsort"), true, "global_sort")
    +
    +    checkExistence(sql("DESCRIBE FORMATTED compaction_globalsort"), true, "city,name")
    +
    +    checkExistence(sql("SHOW SEGMENTS FOR TABLE compaction_globalsort"), true, "Compacted")
    +
    +    checkAnswer(sql("select count(*) from compaction_globalsort"),sql("select count(*) from carbon_localsort"))
    +    val segments = sql("SHOW SEGMENTS FOR TABLE compaction_globalsort")
    +    val SegmentSequenceIds = segments.collect().map { each => (each.toSeq) (0) }
    +    assert(SegmentSequenceIds.contains("0.1"))
    +  }
    +
    +  private def resetConf(size:String) {
    +    CarbonProperties.getInstance()
    +      .addProperty(CarbonCommonConstants.MAJOR_COMPACTION_SIZE, size)
    +  }
    +}
    +
    +object CompactionSupportGlobalSortBigFileTest {
    +  def createFile(fileName: String, line: Int = 10000, start: Int = 0): Boolean = {
    +    try {
    +      val write = new PrintWriter(fileName);
    +      for (i <- start until (start + line)) {
    +        write.println(i + "," + "n" + i + "," + "c" + Random.nextInt(line) + "," + Random.nextInt(80))
    +      }
    +      write.close()
    +    } catch {
    +      case _: Exception => return false
    +    }
    +    return true
    +  }
    +
    +  def deleteFile(fileName: String): Boolean = {
    +    try {
    +      val file = new File(fileName)
    +      if (file.exists()) {
    +        file.delete()
    +      }
    +    } catch {
    +      case _: Exception => return false
    +    }
    +    return true
    --- End diff --
    
    Ok


---

[GitHub] carbondata issue #1361: [CARBONDATA-1481]Compaction support global sort

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1361
  
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/824/



---

[GitHub] carbondata issue #1361: [CARBONDATA-1481] Compaction support global sort

Posted by xubo245 <gi...@git.apache.org>.
Github user xubo245 commented on the issue:

    https://github.com/apache/carbondata/pull/1361
  
    @jackylk Please review it


---

[GitHub] carbondata issue #1361: [CARBONDATA-1481] Add test cases for compaction of g...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1361
  
    Build Success with Spark 1.6, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/276/



---

[GitHub] carbondata issue #1361: [CARBONDATA-1481] Compaction support global sort

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1361
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/227/



---

[GitHub] carbondata issue #1361: [CARBONDATA-1481]Compaction support global sort

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1361
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/204/



---

[GitHub] carbondata issue #1361: [CARBONDATA-1481] Compaction support global sort

Posted by xubo245 <gi...@git.apache.org>.
Github user xubo245 commented on the issue:

    https://github.com/apache/carbondata/pull/1361
  
    @QiangCai OK, it has been changed.


---

[GitHub] carbondata issue #1361: [CARBONDATA-1481] Add test cases for compaction of g...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1361
  
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/1051/



---

[GitHub] carbondata issue #1361: [CARBONDATA-1481] Compaction support global sort

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1361
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/248/



---

[GitHub] carbondata pull request #1361: [CARBONDATA-1481] Compaction support global s...

Posted by QiangCai <gi...@git.apache.org>.
Github user QiangCai commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1361#discussion_r139949848
  
    --- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/datacompaction/CompactionSupportGlobalSortPerformanceTest.scala ---
    @@ -0,0 +1,106 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.carbondata.spark.testsuite.datacompaction
    +
    +import org.scalatest.{BeforeAndAfterAll, BeforeAndAfterEach}
    +
    +import org.apache.spark.sql.test.TestQueryExecutor
    +import org.apache.spark.sql.test.util.QueryTest
    +
    +class CompactionSupportGlobalSortPerformanceTest extends QueryTest with BeforeAndAfterEach with BeforeAndAfterAll {
    --- End diff --
    
    remove this test case 


---

[GitHub] carbondata issue #1361: [CARBONDATA-1481]Compaction support global sort

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1361
  
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/805/



---

[GitHub] carbondata issue #1361: [CARBONDATA-1481]Compaction support global sort

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1361
  
    Build Failed with Spark 1.6, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/56/



---

[GitHub] carbondata issue #1361: [CARBONDATA-1481] Compaction support global sort

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1361
  
    Build Success with Spark 1.6, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/209/



---

[GitHub] carbondata issue #1361: [CARBONDATA-1481] Add test cases for compaction of g...

Posted by xubo245 <gi...@git.apache.org>.
Github user xubo245 commented on the issue:

    https://github.com/apache/carbondata/pull/1361
  
    Please review it @jackylk


---

[GitHub] carbondata issue #1361: [CARBONDATA-1481] Compaction support global sort

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1361
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/333/



---

[GitHub] carbondata issue #1361: [CARBONDATA-1481]Compaction support global sort

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1361
  
    Build Success with Spark 1.6, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/82/



---

[GitHub] carbondata issue #1361: [CARBONDATA-1481]Compaction support global sort

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1361
  
    Build Failed with Spark 1.6, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/51/



---

[GitHub] carbondata issue #1361: [CARBONDATA-1481]Compaction support global sort

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1361
  
    Build Failed with Spark 1.6, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/68/



---

[GitHub] carbondata issue #1361: [CARBONDATA-1481] Compaction support global sort

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1361
  
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/961/



---

[GitHub] carbondata issue #1361: [CARBONDATA-1481]Compaction support global sort

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1361
  
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/823/



---

[GitHub] carbondata issue #1361: [CARBONDATA-1481] Compaction support global sort

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1361
  
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/903/



---

[GitHub] carbondata issue #1361: [CARBONDATA-1481] Add test cases for compaction of g...

Posted by xubo245 <gi...@git.apache.org>.
Github user xubo245 commented on the issue:

    https://github.com/apache/carbondata/pull/1361
  
    Please review it again @jackylk 


---

[GitHub] carbondata issue #1361: [CARBONDATA-1481] Add test cases for compaction of g...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1361
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/426/



---

[GitHub] carbondata pull request #1361: [CARBONDATA-1481] Add test cases for compacti...

Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1361#discussion_r143775897
  
    --- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/datacompaction/CompactionSupportGlobalSortBigFileTest.scala ---
    @@ -0,0 +1,136 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.carbondata.spark.testsuite.datacompaction
    +
    +import java.io.{File, PrintWriter}
    +
    +import scala.util.Random
    +
    +import org.apache.spark.sql.test.util.QueryTest
    +import org.scalatest.{BeforeAndAfterAll, BeforeAndAfterEach}
    +
    +import org.apache.carbondata.core.constants.CarbonCommonConstants
    +import org.apache.carbondata.core.util.CarbonProperties
    +
    +class CompactionSupportGlobalSortBigFileTest extends QueryTest with BeforeAndAfterEach with BeforeAndAfterAll {
    +  val file1 = resourcesPath + "/compaction/fil1.csv"
    +  val file2 = resourcesPath + "/compaction/fil2.csv"
    +  val file3 = resourcesPath + "/compaction/fil3.csv"
    +  val file4 = resourcesPath + "/compaction/fil4.csv"
    +  val file5 = resourcesPath + "/compaction/fil5.csv"
    +
    +  override protected def beforeAll(): Unit = {
    +    resetConf("10")
    +    // n would need to be about 5000000 if MAJOR_COMPACTION_SIZE kept its default (1024 MB) instead of being reset to 10
    +    val n = 150000
    +    CompactionSupportGlobalSortBigFileTest.createFile(file1, n, 0)
    +    CompactionSupportGlobalSortBigFileTest.createFile(file2, n * 4, n)
    +    CompactionSupportGlobalSortBigFileTest.createFile(file3, n * 3, n * 5)
    +    CompactionSupportGlobalSortBigFileTest.createFile(file4, n * 2, n * 8)
    +    CompactionSupportGlobalSortBigFileTest.createFile(file5, n * 2, n * 13)
    +  }
    +
    +  override protected def afterAll(): Unit = {
    +    CompactionSupportGlobalSortBigFileTest.deleteFile(file1)
    +    CompactionSupportGlobalSortBigFileTest.deleteFile(file2)
    +    CompactionSupportGlobalSortBigFileTest.deleteFile(file3)
    +    CompactionSupportGlobalSortBigFileTest.deleteFile(file4)
    +    CompactionSupportGlobalSortBigFileTest.deleteFile(file5)
    +    resetConf(CarbonCommonConstants.DEFAULT_MAJOR_COMPACTION_SIZE)
    +  }
    +
    +  override def beforeEach {
    +    sql("DROP TABLE IF EXISTS compaction_globalsort")
    +    sql(
    +      """
    +        | CREATE TABLE compaction_globalsort(id INT, name STRING, city STRING, age INT)
    +        | STORED BY 'org.apache.carbondata.format'
    +        | TBLPROPERTIES('SORT_COLUMNS'='city,name', 'SORT_SCOPE'='GLOBAL_SORT')
    +      """.stripMargin)
    +
    +    sql("DROP TABLE IF EXISTS carbon_localsort")
    +    sql(
    +      """
    +        | CREATE TABLE carbon_localsort(id INT, name STRING, city STRING, age INT)
    +        | STORED BY 'org.apache.carbondata.format'
    +      """.stripMargin)
    +  }
    +
    +  override def afterEach {
    +    sql("DROP TABLE IF EXISTS compaction_globalsort")
    +    sql("DROP TABLE IF EXISTS carbon_localsort")
    +  }
    +
    +  test("Compaction major:  segments size is bigger than default compaction size") {
    +    sql(s"LOAD DATA LOCAL INPATH '$file1' INTO TABLE carbon_localsort OPTIONS('header'='false')")
    +    sql(s"LOAD DATA LOCAL INPATH '$file2' INTO TABLE carbon_localsort OPTIONS('header'='false')")
    +    sql(s"LOAD DATA LOCAL INPATH '$file3' INTO TABLE carbon_localsort OPTIONS('header'='false')")
    +    sql(s"LOAD DATA LOCAL INPATH '$file4' INTO TABLE carbon_localsort OPTIONS('header'='false')")
    +    sql(s"LOAD DATA LOCAL INPATH '$file5' INTO TABLE carbon_localsort OPTIONS('header'='false')")
    +
    +    sql(s"LOAD DATA LOCAL INPATH '$file1' INTO TABLE compaction_globalsort OPTIONS('header'='false')")
    +    sql(s"LOAD DATA LOCAL INPATH '$file2' INTO TABLE compaction_globalsort OPTIONS('header'='false')")
    +    sql(s"LOAD DATA LOCAL INPATH '$file3' INTO TABLE compaction_globalsort OPTIONS('header'='false')")
    +    sql(s"LOAD DATA LOCAL INPATH '$file4' INTO TABLE compaction_globalsort OPTIONS('header'='false')")
    +    sql(s"LOAD DATA LOCAL INPATH '$file5' INTO TABLE compaction_globalsort OPTIONS('header'='false')")
    +
    +    sql("ALTER TABLE compaction_globalsort COMPACT 'MAJOR'")
    +    checkExistence(sql("DESCRIBE FORMATTED compaction_globalsort"), true, "global_sort")
    +
    +    checkExistence(sql("DESCRIBE FORMATTED compaction_globalsort"), true, "city,name")
    +
    +    checkExistence(sql("SHOW SEGMENTS FOR TABLE compaction_globalsort"), true, "Compacted")
    +
    +    checkAnswer(sql("select count(*) from compaction_globalsort"),sql("select count(*) from carbon_localsort"))
    +    val segments = sql("SHOW SEGMENTS FOR TABLE compaction_globalsort")
    +    val SegmentSequenceIds = segments.collect().map { each => (each.toSeq) (0) }
    +    assert(SegmentSequenceIds.contains("0.1"))
    +  }
    +
    +  private def resetConf(size:String) {
    +    CarbonProperties.getInstance()
    +      .addProperty(CarbonCommonConstants.MAJOR_COMPACTION_SIZE, size)
    +  }
    +}
    +
    +object CompactionSupportGlobalSortBigFileTest {
    +  def createFile(fileName: String, line: Int = 10000, start: Int = 0): Boolean = {
    +    try {
    +      val write = new PrintWriter(fileName);
    +      for (i <- start until (start + line)) {
    +        write.println(i + "," + "n" + i + "," + "c" + Random.nextInt(line) + "," + Random.nextInt(80))
    +      }
    +      write.close()
    +    } catch {
    +      case _: Exception => return false
    +    }
    +    return true
    +  }
    +
    +  def deleteFile(fileName: String): Boolean = {
    +    try {
    +      val file = new File(fileName)
    +      if (file.exists()) {
    +        file.delete()
    +      }
    +    } catch {
    +      case _: Exception => return false
    +    }
    +    return true
    --- End diff --
    
    remove return
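    
    For reference, a minimal sketch of what the two helpers could look like once the explicit
    return keywords are dropped. This is only an illustration, not the final code of this PR:
    wrapping the bodies in scala.util.Try and closing the writer in a finally block are my own
    assumptions; the existing contract of returning false on any exception is kept.
    
        import java.io.{File, PrintWriter}
        
        import scala.util.{Random, Try}
        
        object CompactionSupportGlobalSortBigFileTest {
          // Write `line` rows of "id,n<id>,c<random city>,<random age>" starting at `start`.
          // The Try expression is the last expression of the method, so no explicit return is needed.
          def createFile(fileName: String, line: Int = 10000, start: Int = 0): Boolean =
            Try {
              val writer = new PrintWriter(fileName)
              try {
                for (i <- start until (start + line)) {
                  writer.println(s"$i,n$i,c${Random.nextInt(line)},${Random.nextInt(80)}")
                }
              } finally {
                writer.close()
              }
            }.isSuccess
        
          // Delete the file if it exists; isSuccess yields false when an exception is thrown.
          def deleteFile(fileName: String): Boolean =
            Try {
              val file = new File(fileName)
              if (file.exists()) {
                file.delete()
              }
            }.isSuccess
        }
    
    The default arguments stay the same, so the existing calls in beforeAll/afterAll would not need to change.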


---

[GitHub] carbondata pull request #1361: [CARBONDATA-1481]Compaction support global so...

Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1361#discussion_r139307748
  
    --- Diff: integration/spark-common-cluster-test/src/test/scala/org/apache/carbondata/cluster/sdv/generated/BatchSortLoad3TestCase.scala ---
    @@ -112,33 +112,15 @@ class BatchSortLoad3TestCase extends QueryTest with BeforeAndAfterAll {
         sql(s"""drop table if exists t3""").collect
       }
     
    -
    -  //Batch_sort_Loading_001-01-01-01_001-TC_056
    -  test("Batch_sort_Loading_001-01-01-01_001-TC_056", Include) {
    --- End diff --
    
    Why is it being removed?


---

[GitHub] carbondata issue #1361: [CARBONDATA-1481]Compaction support global sort

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1361
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/192/



---

[GitHub] carbondata issue #1361: [CARBONDATA-1481]Compaction support global sort

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1361
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/205/



---

[GitHub] carbondata issue #1361: [CARBONDATA-1481] Compaction support global sort

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1361
  
    Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/242/



---

[GitHub] carbondata issue #1361: [CARBONDATA-1481] Compaction support global sort

Posted by xubo245 <gi...@git.apache.org>.
Github user xubo245 commented on the issue:

    https://github.com/apache/carbondata/pull/1361
  
    @QiangCai  Please review it


---

[GitHub] carbondata issue #1361: [CARBONDATA-1481]Compaction support global sort

Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on the issue:

    https://github.com/apache/carbondata/pull/1361
  
    please rebase to master


---

[GitHub] carbondata issue #1361: [CARBONDATA-1481] Compaction support global sort

Posted by xubo245 <gi...@git.apache.org>.
Github user xubo245 commented on the issue:

    https://github.com/apache/carbondata/pull/1361
  
    
    @QiangCai Please review it


---

[GitHub] carbondata issue #1361: [CARBONDATA-1481]Compaction support global sort

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1361
  
    Build Failed with Spark 1.6, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/69/



---