You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@carbondata.apache.org by bhavya411 <gi...@git.apache.org> on 2018/08/03 12:38:00 UTC

[GitHub] carbondata pull request #2607: Presto Upgrade to 0.206

GitHub user bhavya411 opened a pull request:

    https://github.com/apache/carbondata/pull/2607

    Presto Upgrade to 0.206

    This PR upgraded the Presto version from 0.187 to 0.206, the CarbondataSplitManager has been changed to accommodate the SplitSchedulingStrategy, also the SliceArrayBlock has now been removed from this version so we have to add the VariableWidthBlock for Dictionary. BlockBuilderStatus access has also been changed to protected instead of public so we have to pass null. Also, now the implementation of Stream Readers uses Block.
    
     - [ Y] Any interfaces changed?
     
     - [ Y] Any backward compatibility impacted?
     
     - [ N] Document update required?
    
     - [ Y] Testing was done
    All Unit test cases are passing and have run queries manually to test them.
           
    
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/bhavya411/incubator-carbondata CARBONDATA-2818

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/2607.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2607
    
----
commit a1b698aa58a714ebb21dfcc01b5bd2d610378649
Author: Bhavya <bh...@...>
Date:   2018-08-03T08:10:14Z

    Presto Upgrade to 0.206

----


---

[GitHub] carbondata issue #2607: [CARBONDATA-2818] Presto Upgrade to 0.206

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2607
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/333/



---

[GitHub] carbondata pull request #2607: [CARBONDATA-2818] Presto Upgrade to 0.206

Posted by bhavya411 <gi...@git.apache.org>.
Github user bhavya411 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2607#discussion_r207804289
  
    --- Diff: integration/presto/src/main/scala/org/apache/carbondata/presto/CarbonDictionaryDecodeReadSupport.scala ---
    @@ -84,25 +85,31 @@ class CarbonDictionaryDecodeReadSupport[T] extends CarbonReadSupport[T] {
        * @param dictionaryData
        * @return
        */
    -  private def createSliceArrayBlock(dictionaryData: Dictionary): SliceArrayBlock = {
    +  private def createSliceArrayBlock(dictionaryData: Dictionary): Block = {
         val chunks: DictionaryChunksWrapper = dictionaryData.getDictionaryChunks
    -    val sliceArray = new Array[Slice](chunks.getSize + 1)
    -    // Initialize Slice Array with Empty Slice as per Presto's code
    -    sliceArray(0) = Slices.EMPTY_SLICE
    -    var count = 1
    +    val positionCount = chunks.getSize;
    +    val offsetVector : Array[Int] = new Array[Int](positionCount + 2 )
    +    val isNullVector: Array[Boolean] = new Array[Boolean](positionCount + 1)
    +    isNullVector(0) = true
    +    isNullVector(1) = true
    --- End diff --
    
    We are talking about dictionary here , so In dictionary there will be only one null and the key value will be 1 by default in CarbonData, hence the isNullVector will be populated only once with null value it has no bearing on actual data. The Carbondata key starts from 1 so we need a filler at 0th position and 1 index is actually Null to map to carbondata null values . The offset index will be like 0th Position  ->  0  (As it is filler)
    1st Position -> 0 (For actual Null)
    2nd Postion -> 0 as the byte[] is still null so starting point will be 0 only


---

[GitHub] carbondata issue #2607: [CARBONDATA-2818] Presto Upgrade to 0.206

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2607
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7774/



---

[GitHub] carbondata issue #2607: [CARBONDATA-2818] Presto Upgrade to 0.206

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2607
  
    Build Failed  with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/8580/



---

[GitHub] carbondata pull request #2607: [CARBONDATA-2818] Presto Upgrade to 0.206

Posted by ajantha-bhat <gi...@git.apache.org>.
Github user ajantha-bhat commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2607#discussion_r207812426
  
    --- Diff: integration/presto/src/main/scala/org/apache/carbondata/presto/CarbonDictionaryDecodeReadSupport.scala ---
    @@ -84,25 +85,31 @@ class CarbonDictionaryDecodeReadSupport[T] extends CarbonReadSupport[T] {
        * @param dictionaryData
        * @return
        */
    -  private def createSliceArrayBlock(dictionaryData: Dictionary): SliceArrayBlock = {
    +  private def createSliceArrayBlock(dictionaryData: Dictionary): Block = {
         val chunks: DictionaryChunksWrapper = dictionaryData.getDictionaryChunks
    -    val sliceArray = new Array[Slice](chunks.getSize + 1)
    -    // Initialize Slice Array with Empty Slice as per Presto's code
    -    sliceArray(0) = Slices.EMPTY_SLICE
    -    var count = 1
    +    val positionCount = chunks.getSize;
    +    val offsetVector : Array[Int] = new Array[Int](positionCount + 2 )
    +    val isNullVector: Array[Boolean] = new Array[Boolean](positionCount + 1)
    +    isNullVector(0) = true
    +    isNullVector(1) = true
    --- End diff --
    
    ok.


---

[GitHub] carbondata issue #2607: [CARBONDATA-2818] Presto Upgrade to 0.206

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2607
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/224/



---

[GitHub] carbondata issue #2607: [CARBONDATA-2818] Presto Upgrade to 0.206

Posted by zzcclp <gi...@git.apache.org>.
Github user zzcclp commented on the issue:

    https://github.com/apache/carbondata/pull/2607
  
    @chenliang613 can it use the 0.210 as default version, there are some bugs on 0.206,


---

[GitHub] carbondata issue #2607: [CARBONDATA-2818] Presto Upgrade to 0.206

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2607
  
    Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/510/



---

[GitHub] carbondata issue #2607: [CARBONDATA-2818] Presto Upgrade to 0.206

Posted by chenliang613 <gi...@git.apache.org>.
Github user chenliang613 commented on the issue:

    https://github.com/apache/carbondata/pull/2607
  
    @bhavya411  I tested this pr,  the performance(simple aggregation) not getting any improvement(0.206 compare to 0.187)
    
    Just i checked 0.207 and 0.208, there are fixed many memory issues, so propose to upgrade to 0.208 for CarbonData integration.


---

[GitHub] carbondata issue #2607: [CARBONDATA-2818] Presto Upgrade to 0.206

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2607
  
    Build Failed  with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.3/8463/



---

[GitHub] carbondata pull request #2607: [CARBONDATA-2818] Presto Upgrade to 0.206

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/carbondata/pull/2607


---

[GitHub] carbondata issue #2607: [CARBONDATA-2818] Presto Upgrade to 0.206

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2607
  
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6498/



---

[GitHub] carbondata pull request #2607: [CARBONDATA-2818] Presto Upgrade to 0.206

Posted by ajantha-bhat <gi...@git.apache.org>.
Github user ajantha-bhat commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2607#discussion_r207774205
  
    --- Diff: integration/presto/src/main/scala/org/apache/carbondata/presto/CarbonDictionaryDecodeReadSupport.scala ---
    @@ -84,25 +85,31 @@ class CarbonDictionaryDecodeReadSupport[T] extends CarbonReadSupport[T] {
        * @param dictionaryData
        * @return
        */
    -  private def createSliceArrayBlock(dictionaryData: Dictionary): SliceArrayBlock = {
    +  private def createSliceArrayBlock(dictionaryData: Dictionary): Block = {
         val chunks: DictionaryChunksWrapper = dictionaryData.getDictionaryChunks
    -    val sliceArray = new Array[Slice](chunks.getSize + 1)
    -    // Initialize Slice Array with Empty Slice as per Presto's code
    -    sliceArray(0) = Slices.EMPTY_SLICE
    -    var count = 1
    +    val positionCount = chunks.getSize;
    +    val offsetVector : Array[Int] = new Array[Int](positionCount + 2 )
    +    val isNullVector: Array[Boolean] = new Array[Boolean](positionCount + 1)
    +    isNullVector(0) = true
    +    isNullVector(1) = true
    --- End diff --
    
    Can you please add a comment about this logic. It is hard to understand. 
    Why **isNullVector** is not filled inside while() if content is null. 
    Also why first 3 elements of offset vector is set to 0 ?


---

[GitHub] carbondata issue #2607: [CARBONDATA-2818] Presto Upgrade to 0.206

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2607
  
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6615/



---

[GitHub] carbondata issue #2607: [CARBONDATA-2818] Presto Upgrade to 0.206

Posted by chenliang613 <gi...@git.apache.org>.
Github user chenliang613 commented on the issue:

    https://github.com/apache/carbondata/pull/2607
  
    retest this please


---

[GitHub] carbondata issue #2607: [CARBONDATA-2818] Presto Upgrade to 0.206

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2607
  
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/32/



---

[GitHub] carbondata issue #2607: [CARBONDATA-2818] Presto Upgrade to 0.206

Posted by chenliang613 <gi...@git.apache.org>.
Github user chenliang613 commented on the issue:

    https://github.com/apache/carbondata/pull/2607
  
    @bhavya411 any new progress ?


---

[GitHub] carbondata issue #2607: [CARBONDATA-2818] Presto Upgrade to 0.206

Posted by chenliang613 <gi...@git.apache.org>.
Github user chenliang613 commented on the issue:

    https://github.com/apache/carbondata/pull/2607
  
    LGTM, spark 2.3.1 CI is another issue.


---

[GitHub] carbondata issue #2607: [CARBONDATA-2818] Presto Upgrade to 0.206

Posted by chenliang613 <gi...@git.apache.org>.
Github user chenliang613 commented on the issue:

    https://github.com/apache/carbondata/pull/2607
  
    retest this please


---

[GitHub] carbondata issue #2607: [CARBONDATA-2818] Presto Upgrade to 0.206

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2607
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7891/



---

[GitHub] carbondata pull request #2607: [CARBONDATA-2818] Presto Upgrade to 0.206

Posted by ajantha-bhat <gi...@git.apache.org>.
Github user ajantha-bhat commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2607#discussion_r207771253
  
    --- Diff: integration/presto/src/main/java/org/apache/carbondata/presto/CarbonVectorBatch.java ---
    @@ -73,7 +73,7 @@ public static CarbonVectorBatch allocate(StructField[] schema,
       }
     
       private CarbonColumnVectorImpl createDirectStreamReader(int batchSize, DataType dataType,
    -      StructField field, Dictionary dictionary, SliceArrayBlock dictionarySliceArrayBlock) {
    +      StructField field, Dictionary dictionary, Block dictionarySliceArrayBlock) {
    --- End diff --
    
    better to modify the names also for readability and maintainability purpose.  
    dictionarySliceArrayBlock --> dictionaryBlock


---

[GitHub] carbondata issue #2607: [CARBONDATA-2818] Presto Upgrade to 0.206

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2607
  
    Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/393/



---

[GitHub] carbondata issue #2607: [CARBONDATA-2818] Presto Upgrade to 0.206

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2607
  
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/6157/



---