Posted to issues@carbondata.apache.org by kevinjmh <gi...@git.apache.org> on 2018/07/16 12:20:21 UTC

[GitHub] carbondata pull request #2512: [CARBONDATA-2746][BloomDataMap] Fix bug for g...

GitHub user kevinjmh opened a pull request:

    https://github.com/apache/carbondata/pull/2512

    [CARBONDATA-2746][BloomDataMap] Fix bug for getting datamap file when table has multiple datamaps

    Currently, if a table has multiple bloom datamaps and carbon is configured to use distributed datamap, a query throws an exception when accessing the index file, because carbon collects all the datamaps but assigns them the same datamap schema. The error appears when the full path of the bloom index is built by concatenating the index directory and the index column. This PR fixes the problem by filtering the index directories down to those of the target datamap when distributed datamap is used.
    
    Tests show that lucene is not affected by this problem. On the other hand, lucene returns a wrong result if we apply this filter to it.
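
The failure mode described above can be sketched in a few lines of Scala. This is only an illustration under stated assumptions, not the code changed by the PR: the `DataMapSchema` case class, the `indexFile`, `pathsWithoutFilter` and `pathsWithFilter` helpers, the directory layout `<segment>/<datamap name>/<shard>` and the `.bloomindex` file suffix are all hypothetical stand-ins chosen to show why pairing every index directory with a single schema yields a path that does not exist, and how filtering by the target datamap avoids it.

```scala
// Hypothetical model of the path problem; none of these names are CarbonData APIs.
object BloomIndexPathSketch {
  // A datamap schema reduced to what matters here: its name and its index column.
  final case class DataMapSchema(name: String, indexColumn: String)

  // Assumed layout: the index file lives at <indexDir>/<indexColumn>.bloomindex.
  def indexFile(indexDir: String, schema: DataMapSchema): String =
    s"$indexDir/${schema.indexColumn}.bloomindex"

  // Buggy behaviour: every index directory is combined with the same schema,
  // so directories of other datamaps get a column they never indexed.
  def pathsWithoutFilter(allIndexDirs: Seq[String], schema: DataMapSchema): Seq[String] =
    allIndexDirs.map(dir => indexFile(dir, schema))

  // Fixed behaviour: keep only directories that belong to the target datamap
  // (assumed to contain the datamap name as one path component).
  def pathsWithFilter(allIndexDirs: Seq[String], schema: DataMapSchema): Seq[String] =
    allIndexDirs
      .filter(dir => dir.split('/').contains(schema.name))
      .map(dir => indexFile(dir, schema))

  def main(args: Array[String]): Unit = {
    val dirs = Seq("segment_0/datamap1/shard_0", "segment_0/datamap2/shard_0")
    val target = DataMapSchema("datamap1", "name")
    // Includes segment_0/datamap2/shard_0/name.bloomindex, which was never written.
    pathsWithoutFilter(dirs, target).foreach(println)
    // Only the path under datamap1 remains.
    pathsWithFilter(dirs, target).foreach(println)
  }
}
```

In this sketch `datamap2` indexes `city`, so a file named after the `name` column under its directory cannot exist; that mismatch is the kind of access that would fail at query time.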

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/kevinjmh/carbondata fix_multidm_path

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/2512.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2512
    
----
commit f1c50af176fc792c2fbdbe7c2114954b545ca723
Author: Manhua <ke...@...>
Date:   2018-07-16T11:29:07Z

    fix for datamap path problem

----


---

[GitHub] carbondata pull request #2512: [CARBONDATA-2746][BloomDataMap] Fix bug for g...

Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2512#discussion_r202708377
  
    --- Diff: integration/spark2/src/test/scala/org/apache/carbondata/datamap/bloom/BloomCoarseGrainDataMapSuite.scala ---
    @@ -413,6 +414,46 @@ class BloomCoarseGrainDataMapSuite extends QueryTest with BeforeAndAfterAll with
         checkQuery("fakeDm", shouldHit = false)
       }
     
    +  test("test create datamaps on different column but hit only one") {
    +    val originDistributedDatamapStatus = CarbonProperties.getInstance().getProperty(
    +      CarbonCommonConstants.USE_DISTRIBUTED_DATAMAP,
    +      CarbonCommonConstants.USE_DISTRIBUTED_DATAMAP_DEFAULT
    +    )
    +
    +    CarbonProperties.getInstance()
    +      .addProperty(CarbonCommonConstants.USE_DISTRIBUTED_DATAMAP, "true")
    +    val datamap1 = "datamap1"
    +    val datamap2 = "datamap2"
    +    sql(
    +      s"""
    +         | CREATE TABLE $bloomDMSampleTable(id INT, name STRING, city STRING, age INT)
    +         | STORED BY 'carbondata'
    +         |  """.stripMargin)
    +    sql(
    +      s"""
    +         | CREATE DATAMAP $datamap1 ON TABLE $bloomDMSampleTable
    +         | USING 'bloomfilter'
    +         | DMProperties('INDEX_COLUMNS'='name', 'BLOOM_SIZE'='64000', 'BLOOM_FPP'='0.00001')
    +      """.stripMargin)
    +    sql(
    +      s"""
    +         | CREATE DATAMAP $datamap2 ON TABLE $bloomDMSampleTable
    +         | USING 'bloomfilter'
    +         | DMProperties('INDEX_COLUMNS'='city', 'BLOOM_SIZE'='64000', 'BLOOM_FPP'='0.00001')
    +      """.stripMargin)
    +
    +    sql(
    +      s"""
    +         | INSERT INTO $bloomDMSampleTable
    +         | VALUES(5,'a','beijing',21),(6,'b','shanghai',25),(7,'b','guangzhou',28)
    +      """.stripMargin)
    +    sql(s"SELECT * FROM $bloomDMSampleTable WHERE name='shanghai'").show()
    --- End diff --
    
    Does the lucene datamap require a similar fix? Please raise a JIRA issue to track it. Thanks.


---

[GitHub] carbondata issue #2512: [CARBONDATA-2746][BloomDataMap] Fix bug for getting ...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2512
  
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5884/



---

[GitHub] carbondata pull request #2512: [CARBONDATA-2746][BloomDataMap] Fix bug for g...

Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2512#discussion_r202708129
  
    --- Diff: integration/spark2/src/test/scala/org/apache/carbondata/datamap/bloom/BloomCoarseGrainDataMapSuite.scala ---
    @@ -413,6 +414,46 @@ class BloomCoarseGrainDataMapSuite extends QueryTest with BeforeAndAfterAll with
         checkQuery("fakeDm", shouldHit = false)
       }
     
    +  test("test create datamaps on different column but hit only one") {
    +    val originDistributedDatamapStatus = CarbonProperties.getInstance().getProperty(
    +      CarbonCommonConstants.USE_DISTRIBUTED_DATAMAP,
    +      CarbonCommonConstants.USE_DISTRIBUTED_DATAMAP_DEFAULT
    +    )
    +
    +    CarbonProperties.getInstance()
    +      .addProperty(CarbonCommonConstants.USE_DISTRIBUTED_DATAMAP, "true")
    +    val datamap1 = "datamap1"
    +    val datamap2 = "datamap2"
    +    sql(
    +      s"""
    +         | CREATE TABLE $bloomDMSampleTable(id INT, name STRING, city STRING, age INT)
    +         | STORED BY 'carbondata'
    +         |  """.stripMargin)
    +    sql(
    +      s"""
    +         | CREATE DATAMAP $datamap1 ON TABLE $bloomDMSampleTable
    +         | USING 'bloomfilter'
    +         | DMProperties('INDEX_COLUMNS'='name', 'BLOOM_SIZE'='64000', 'BLOOM_FPP'='0.00001')
    +      """.stripMargin)
    +    sql(
    +      s"""
    +         | CREATE DATAMAP $datamap2 ON TABLE $bloomDMSampleTable
    +         | USING 'bloomfilter'
    +         | DMProperties('INDEX_COLUMNS'='city', 'BLOOM_SIZE'='64000', 'BLOOM_FPP'='0.00001')
    +      """.stripMargin)
    +
    +    sql(
    +      s"""
    +         | INSERT INTO $bloomDMSampleTable
    +         | VALUES(5,'a','beijing',21),(6,'b','shanghai',25),(7,'b','guangzhou',28)
    +      """.stripMargin)
    +    sql(s"SELECT * FROM $bloomDMSampleTable WHERE name='shanghai'").show()
    --- End diff --
    
    Validate the query result against an expected value instead of calling show().
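
A minimal sketch of what asserting on an expected result looks like, using an in-memory stand-in for the table rather than Spark. The `Record` case class and the `selectByName` helper are hypothetical; in the real suite the comparison would be made on the query's DataFrame, for example with whatever result-checking helper the suite's `QueryTest` base class offers. The rows are the ones inserted by the test in the diff.

```scala
// In-memory stand-in for the test table; not Spark or CarbonData code.
object ValidateInsteadOfShow {
  final case class Record(id: Int, name: String, city: String, age: Int)

  // The three rows inserted by the test in the diff above.
  val rows: Seq[Record] = Seq(
    Record(5, "a", "beijing", 21),
    Record(6, "b", "shanghai", 25),
    Record(7, "b", "guangzhou", 28)
  )

  // Hypothetical equivalent of: SELECT * FROM table WHERE name = <name>
  def selectByName(name: String): Seq[Record] = rows.filter(_.name == name)

  def main(args: Array[String]): Unit = {
    // 'shanghai' is a city, not a name, so the expected result is empty.
    assert(selectByName("shanghai").isEmpty)
    // A filter on an existing name returns exactly the matching ids.
    assert(selectByName("b").map(_.id) == Seq(6, 7))
  }
}
```

Unlike show(), which only prints, an assertion fails the test when the result differs from what is expected. With the data above, the predicate name='shanghai' matches no row, so the expected result for that query is empty.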


---

[GitHub] carbondata issue #2512: [CARBONDATA-2746][BloomDataMap] Fix bug for getting ...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2512
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7240/



---

[GitHub] carbondata issue #2512: [CARBONDATA-2746][BloomDataMap] Fix bug for getting ...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2512
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7250/



---

[GitHub] carbondata issue #2512: [CARBONDATA-2746][BloomDataMap] Fix bug for getting ...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2512
  
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6026/



---

[GitHub] carbondata issue #2512: [CARBONDATA-2746][BloomDataMap] Fix bug for getting ...

Posted by xuchuanyin <gi...@git.apache.org>.
Github user xuchuanyin commented on the issue:

    https://github.com/apache/carbondata/pull/2512
  
    retest this please


---

[GitHub] carbondata pull request #2512: [CARBONDATA-2746][BloomDataMap] Fix bug for g...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/carbondata/pull/2512


---

[GitHub] carbondata pull request #2512: [CARBONDATA-2746][BloomDataMap] Fix bug for g...

Posted by kevinjmh <gi...@git.apache.org>.
Github user kevinjmh commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2512#discussion_r202869553
  
    --- Diff: integration/spark2/src/test/scala/org/apache/carbondata/datamap/bloom/BloomCoarseGrainDataMapSuite.scala ---
    @@ -413,6 +414,46 @@ class BloomCoarseGrainDataMapSuite extends QueryTest with BeforeAndAfterAll with
         checkQuery("fakeDm", shouldHit = false)
       }
     
    +  test("test create datamaps on different column but hit only one") {
    +    val originDistributedDatamapStatus = CarbonProperties.getInstance().getProperty(
    +      CarbonCommonConstants.USE_DISTRIBUTED_DATAMAP,
    +      CarbonCommonConstants.USE_DISTRIBUTED_DATAMAP_DEFAULT
    +    )
    +
    +    CarbonProperties.getInstance()
    +      .addProperty(CarbonCommonConstants.USE_DISTRIBUTED_DATAMAP, "true")
    +    val datamap1 = "datamap1"
    +    val datamap2 = "datamap2"
    +    sql(
    +      s"""
    +         | CREATE TABLE $bloomDMSampleTable(id INT, name STRING, city STRING, age INT)
    +         | STORED BY 'carbondata'
    +         |  """.stripMargin)
    +    sql(
    +      s"""
    +         | CREATE DATAMAP $datamap1 ON TABLE $bloomDMSampleTable
    +         | USING 'bloomfilter'
    +         | DMProperties('INDEX_COLUMNS'='name', 'BLOOM_SIZE'='64000', 'BLOOM_FPP'='0.00001')
    +      """.stripMargin)
    +    sql(
    +      s"""
    +         | CREATE DATAMAP $datamap2 ON TABLE $bloomDMSampleTable
    +         | USING 'bloomfilter'
    +         | DMProperties('INDEX_COLUMNS'='city', 'BLOOM_SIZE'='64000', 'BLOOM_FPP'='0.00001')
    +      """.stripMargin)
    +
    +    sql(
    +      s"""
    +         | INSERT INTO $bloomDMSampleTable
    +         | VALUES(5,'a','beijing',21),(6,'b','shanghai',25),(7,'b','guangzhou',28)
    +      """.stripMargin)
    +    sql(s"SELECT * FROM $bloomDMSampleTable WHERE name='shanghai'").show()
    --- End diff --
    
    Tests show that lucene is not affected by this. On the other hand, lucene returns a wrong (empty) result if we apply this filter (test name: "test lucene fine grain multiple data map on table"). Issue 2747 has been raised to track it.


---

[GitHub] carbondata issue #2512: [CARBONDATA-2746][BloomDataMap] Fix bug for getting ...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2512
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7232/



---

[GitHub] carbondata issue #2512: [CARBONDATA-2746][BloomDataMap] Fix bug for getting ...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2512
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7217/



---

[GitHub] carbondata issue #2512: [CARBONDATA-2746][BloomDataMap] Fix bug for getting ...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2512
  
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5880/



---

[GitHub] carbondata pull request #2512: [CARBONDATA-2746][BloomDataMap] Fix bug for g...

Posted by kevinjmh <gi...@git.apache.org>.
Github user kevinjmh commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2512#discussion_r202864952
  
    --- Diff: integration/spark2/src/test/scala/org/apache/carbondata/datamap/bloom/BloomCoarseGrainDataMapSuite.scala ---
    @@ -413,6 +414,46 @@ class BloomCoarseGrainDataMapSuite extends QueryTest with BeforeAndAfterAll with
         checkQuery("fakeDm", shouldHit = false)
       }
     
    +  test("test create datamaps on different column but hit only one") {
    +    val originDistributedDatamapStatus = CarbonProperties.getInstance().getProperty(
    +      CarbonCommonConstants.USE_DISTRIBUTED_DATAMAP,
    +      CarbonCommonConstants.USE_DISTRIBUTED_DATAMAP_DEFAULT
    +    )
    +
    +    CarbonProperties.getInstance()
    +      .addProperty(CarbonCommonConstants.USE_DISTRIBUTED_DATAMAP, "true")
    +    val datamap1 = "datamap1"
    +    val datamap2 = "datamap2"
    +    sql(
    +      s"""
    +         | CREATE TABLE $bloomDMSampleTable(id INT, name STRING, city STRING, age INT)
    +         | STORED BY 'carbondata'
    +         |  """.stripMargin)
    +    sql(
    +      s"""
    +         | CREATE DATAMAP $datamap1 ON TABLE $bloomDMSampleTable
    +         | USING 'bloomfilter'
    +         | DMProperties('INDEX_COLUMNS'='name', 'BLOOM_SIZE'='64000', 'BLOOM_FPP'='0.00001')
    +      """.stripMargin)
    +    sql(
    +      s"""
    +         | CREATE DATAMAP $datamap2 ON TABLE $bloomDMSampleTable
    +         | USING 'bloomfilter'
    +         | DMProperties('INDEX_COLUMNS'='city', 'BLOOM_SIZE'='64000', 'BLOOM_FPP'='0.00001')
    +      """.stripMargin)
    +
    +    sql(
    +      s"""
    +         | INSERT INTO $bloomDMSampleTable
    +         | VALUES(5,'a','beijing',21),(6,'b','shanghai',25),(7,'b','guangzhou',28)
    +      """.stripMargin)
    +    sql(s"SELECT * FROM $bloomDMSampleTable WHERE name='shanghai'").show()
    --- End diff --
    
    Tests show that lucene is not affected by this. On the other hand, lucene returns a wrong result if we apply this filter. The reason is still to be determined.


---

[GitHub] carbondata issue #2512: [CARBONDATA-2746][BloomDataMap] Fix bug for getting ...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2512
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7266/



---

[GitHub] carbondata issue #2512: [CARBONDATA-2746][BloomDataMap] Fix bug for getting ...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2512
  
    Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6035/



---

[GitHub] carbondata issue #2512: [CARBONDATA-2746][BloomDataMap] Fix bug for getting ...

Posted by kevinjmh <gi...@git.apache.org>.
Github user kevinjmh commented on the issue:

    https://github.com/apache/carbondata/pull/2512
  
    retest this please


---

[GitHub] carbondata issue #2512: [CARBONDATA-2746][BloomDataMap] Fix bug for getting ...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2512
  
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7227/



---

[GitHub] carbondata issue #2512: [CARBONDATA-2746][BloomDataMap] Fix bug for getting ...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2512
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7261/



---

[GitHub] carbondata issue #2512: [CARBONDATA-2746][BloomDataMap] Fix bug for getting ...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2512
  
    Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6013/



---

[GitHub] carbondata issue #2512: [CARBONDATA-2746][BloomDataMap] Fix bug for getting ...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2512
  
    Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6000/



---

[GitHub] carbondata issue #2512: [CARBONDATA-2746][BloomDataMap] Fix bug for getting ...

Posted by brijoobopanna <gi...@git.apache.org>.
Github user brijoobopanna commented on the issue:

    https://github.com/apache/carbondata/pull/2512
  
    retest this please


---

[GitHub] carbondata issue #2512: [CARBONDATA-2746][BloomDataMap] Fix bug for getting ...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2512
  
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5895/



---

[GitHub] carbondata issue #2512: [CARBONDATA-2746][BloomDataMap] Fix bug for getting ...

Posted by xuchuanyin <gi...@git.apache.org>.
Github user xuchuanyin commented on the issue:

    https://github.com/apache/carbondata/pull/2512
  
    LGTM


---

[GitHub] carbondata issue #2512: [CARBONDATA-2746][BloomDataMap] Fix bug for getting ...

Posted by brijoobopanna <gi...@git.apache.org>.
Github user brijoobopanna commented on the issue:

    https://github.com/apache/carbondata/pull/2512
  
    retest sdv please


---

[GitHub] carbondata issue #2512: [CARBONDATA-2746][BloomDataMap] Fix bug for getting ...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2512
  
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5992/



---

[GitHub] carbondata issue #2512: [CARBONDATA-2746][BloomDataMap] Fix bug for getting ...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2512
  
    Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6005/



---