Posted to issues@carbondata.apache.org by xuchuanyin <gi...@git.apache.org> on 2017/12/04 10:48:10 UTC

[GitHub] carbondata pull request #1606: WIP:[CARBONDATA-1839] Fix bugs in compressing...

GitHub user xuchuanyin opened a pull request:

    https://github.com/apache/carbondata/pull/1606

    WIP:[CARBONDATA-1839] Fix bugs in compressing sort temp files

    Be sure to do all of the following checklist to help us incorporate 
    your contribution quickly and easily:
    
     - [X] Any interfaces changed?
      `YES, ONLY CHANGE INTERNAL INTERFACES`
     - [X] Any backward compatibility impacted?
      `NO`
     - [X] Document update required?
      `YES`
     - [X] Testing done
            Please provide details on 
            - Whether new unit test cases have been added or why no new tests are required?
            `ADDED TESTS`
            - How it is tested? Please attach test report.
            `TESTED IN LOCAL CLUSTER`
            - Is it a performance related change? Please attach the performance test report.
            `YES`
            - Any additional information to help reviewers in testing this change.
            `While fixing this bug I found some duplicate code for writing sort temp files. I plan to clean it up in a follow-up PR, not in this one.`
     - [X] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. 
            `NOT RELATED`
    
    RESOLVE
    ===
    1. fix bugs in compressing sort temp files
    
    2. optimize parameters
    
    3. add tests

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/xuchuanyin/carbondata bug_compress_sort_temp

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/1606.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1606
    
----
commit 83e109b6d332dafcc10e243e8bc5a21ac3617ac6
Author: xuchuanyin <xu...@hust.edu.cn>
Date:   2017-12-01T11:32:00Z

    fix bugs in compressing sort temp files
    
    1. fix bugs in compressing sort temp files
    
    2. optimize parameters
    
    3. add tests

----


---

[GitHub] carbondata issue #1606: WIP:[CARBONDATA-1839] Fix bugs in compressing sort t...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1606
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1674/



---

[GitHub] carbondata issue #1606: [CARBONDATA-1839] Fix bugs in compressing sort temp ...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1606
  
    Build Success with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/429/



---

[GitHub] carbondata pull request #1606: [CARBONDATA-1839] [DataLoad]Fix bugs in compr...

Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1606#discussion_r155000018
  
    --- Diff: processing/src/main/java/org/apache/carbondata/processing/sort/sortdata/TableFieldStat.java ---
    @@ -0,0 +1,139 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.carbondata.processing.sort.sortdata;
    +
    +import java.util.Objects;
    +
    +import org.apache.carbondata.core.metadata.datatype.DataType;
    +
    +public class TableFieldStat {
    --- End diff --
    
    Add comment


---

[GitHub] carbondata issue #1606: [CARBONDATA-1839] [DataLoad]Fix bugs in compressing ...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1606
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1714/



---

[GitHub] carbondata issue #1606: [CARBONDATA-1839] [DataLoad]Fix bugs in compressing ...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1606
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1757/



---

[GitHub] carbondata issue #1606: WIP:[CARBONDATA-1839] Fix bugs in compressing sort t...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1606
  
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2061/



---

[GitHub] carbondata issue #1606: [CARBONDATA-1839] Fix bugs in compressing sort temp ...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1606
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1700/



---

[GitHub] carbondata issue #1606: [CARBONDATA-1839] [DataLoad]Fix bugs in compressing ...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1606
  
    Build Success with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/495/



---

[GitHub] carbondata issue #1606: [CARBONDATA-1839] Fix bugs in compressing sort temp ...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1606
  
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2085/



---

[GitHub] carbondata issue #1606: [CARBONDATA-1839] [DataLoad]Fix bugs in compressing ...

Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on the issue:

    https://github.com/apache/carbondata/pull/1606
  
    Is the rework completed for this PR?


---

[GitHub] carbondata pull request #1606: WIP: [CARBONDATA-1839] [DataLoad]Fix bugs in ...

Posted by xuchuanyin <gi...@git.apache.org>.
Github user xuchuanyin closed the pull request at:

    https://github.com/apache/carbondata/pull/1606


---

[GitHub] carbondata issue #1606: [CARBONDATA-1839] Fix bugs in compressing sort temp ...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1606
  
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2095/



---

[GitHub] carbondata issue #1606: [CARBONDATA-1839] Fix bugs in compressing sort temp ...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1606
  
    Build Success with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/445/



---

[GitHub] carbondata issue #1606: [CARBONDATA-1839] [DataLoad]Fix bugs in compressing ...

Posted by xuchuanyin <gi...@git.apache.org>.
Github user xuchuanyin commented on the issue:

    https://github.com/apache/carbondata/pull/1606
  
    @jackylk Thanks for your review. Since another PR, #1594, has been merged, I'll rework this PR to reduce the duplicate code in writing sort temp files.


---

[GitHub] carbondata pull request #1606: [CARBONDATA-1839] [DataLoad]Fix bugs in compr...

Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1606#discussion_r154999809
  
    --- Diff: processing/src/main/java/org/apache/carbondata/processing/sort/sortdata/SortTempFileChunkHolder.java ---
    @@ -381,6 +351,51 @@ private void fillDataForPrefetch() {
         return holder;
       }
     
    +  private Object[][] getBatchedRowFromStream(int expected) throws CarbonSortKeyAndGroupByException {
    --- End diff --
    
    please add comment for this function
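
For illustration only, the kind of method comment requested here might look like the sketch below. It is not the PR's code: the class name, the extra parameters, and the row layout (a fixed number of ints per row) are all assumptions made so that the example is self-contained. Only the idea of reading a batch of `expected` rows into an `Object[][]` comes from the signature in the diff.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical sketch; not the actual SortTempFileChunkHolder code.
final class BatchedRowReaderSketch {

  /**
   * Reads a batch of rows from the stream.
   *
   * Assumed layout (hypothetical): each row is columnCount consecutive ints.
   *
   * @param in          stream positioned at the start of a row
   * @param expected    number of rows to read in this batch
   * @param columnCount number of int columns per row (assumption)
   * @return the rows read, one Object[] per row, in stream order
   * @throws IOException if the stream ends before all expected rows are read
   */
  static Object[][] readBatch(DataInputStream in, int expected, int columnCount)
      throws IOException {
    Object[][] batch = new Object[expected][];
    for (int i = 0; i < expected; i++) {
      Object[] row = new Object[columnCount];
      for (int c = 0; c < columnCount; c++) {
        row[c] = in.readInt();
      }
      batch[i] = row;
    }
    return batch;
  }

  /** Writes rows in the assumed layout so the reader can be exercised. */
  static byte[] encode(int[][] rows) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (DataOutputStream out = new DataOutputStream(bytes)) {
      for (int[] row : rows) {
        for (int value : row) {
          out.writeInt(value);
        }
      }
    }
    return bytes.toByteArray();
  }

  public static void main(String[] args) throws IOException {
    byte[] data = encode(new int[][] {{1, 2}, {3, 4}});
    try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
      Object[][] batch = readBatch(in, 2, 2);
      System.out.println(batch.length + " rows read");
    }
  }
}
```

A comment along these lines states what `expected` means and what happens when the stream runs short, which are the two things a caller cannot infer from the signature alone.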


---

[GitHub] carbondata issue #1606: WIP: [CARBONDATA-1839] [DataLoad]Fix bugs in compres...

Posted by xuchuanyin <gi...@git.apache.org>.
Github user xuchuanyin commented on the issue:

    https://github.com/apache/carbondata/pull/1606
  
    @jackylk I'm closing this PR and raising #1632 instead.


---

[GitHub] carbondata issue #1606: [CARBONDATA-1839] [DataLoad]Fix bugs in compressing ...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1606
  
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2141/



---

[GitHub] carbondata pull request #1606: [CARBONDATA-1839] [DataLoad]Fix bugs in compr...

Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1606#discussion_r154999120
  
    --- Diff: processing/src/main/java/org/apache/carbondata/processing/loading/sort/impl/ParallelReadMergeSorterImpl.java ---
    @@ -80,11 +81,7 @@ public void initialize(SortParameters sortParameters) {
             File.separator, CarbonCommonConstants.SORT_TEMP_FILE_LOCATION);
         finalMerger =
             new SingleThreadFinalSortFilesMerger(dataFolderLocations, sortParameters.getTableName(),
    -            sortParameters.getDimColCount(),
    -            sortParameters.getComplexDimColCount(), sortParameters.getMeasureColCount(),
    -            sortParameters.getNoDictionaryCount(), sortParameters.getMeasureDataType(),
    -            sortParameters.getNoDictionaryDimnesionColumn(),
    -            sortParameters.getNoDictionarySortColumn());
    +            new TableFieldStat(sortParameters));
    --- End diff --
    
    I think the indentation is not correct
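
The diff above replaces seven separate constructor arguments with a single `new TableFieldStat(sortParameters)`. The general pattern (a parameter object) can be sketched as follows. This is a minimal illustration, not the contents of `TableFieldStat`: the class names, fields, and methods below are hypothetical and cover only two counts and one flag array.

```java
import java.util.Arrays;
import java.util.Objects;

// Hypothetical parameter object; field names are illustrative and are not
// taken from the real TableFieldStat class.
final class FieldStatSketch {
  private final int dimColCount;
  private final int measureColCount;
  private final boolean[] noDictionaryFlags;

  FieldStatSketch(int dimColCount, int measureColCount, boolean[] noDictionaryFlags) {
    this.dimColCount = dimColCount;
    this.measureColCount = measureColCount;
    // Defensive copy so later changes by the caller do not leak in.
    this.noDictionaryFlags = noDictionaryFlags.clone();
  }

  int getDimColCount() {
    return dimColCount;
  }

  int getMeasureColCount() {
    return measureColCount;
  }

  @Override
  public boolean equals(Object o) {
    if (this == o) {
      return true;
    }
    if (!(o instanceof FieldStatSketch)) {
      return false;
    }
    FieldStatSketch that = (FieldStatSketch) o;
    return dimColCount == that.dimColCount
        && measureColCount == that.measureColCount
        && Arrays.equals(noDictionaryFlags, that.noDictionaryFlags);
  }

  @Override
  public int hashCode() {
    return 31 * Objects.hash(dimColCount, measureColCount)
        + Arrays.hashCode(noDictionaryFlags);
  }
}

// Hypothetical consumer: it takes one object instead of a long argument list.
final class MergerSketch {
  private final String tableName;
  private final FieldStatSketch fieldStat;

  MergerSketch(String tableName, FieldStatSketch fieldStat) {
    this.tableName = tableName;
    this.fieldStat = fieldStat;
  }

  String describe() {
    return tableName + ": " + fieldStat.getDimColCount() + " dimension(s), "
        + fieldStat.getMeasureColCount() + " measure(s)";
  }
}
```

With this shape, adding a new statistic changes only the parameter object and the code that reads it, and call sites such as the constructor call in the diff stay on one short line.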


---

[GitHub] carbondata issue #1606: WIP:[CARBONDATA-1839] Fix bugs in compressing sort t...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1606
  
    Build Failed with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/401/



---

[GitHub] carbondata pull request #1606: [CARBONDATA-1839] [DataLoad]Fix bugs in compr...

Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1606#discussion_r154999525
  
    --- Diff: processing/src/main/java/org/apache/carbondata/processing/loading/sort/unsafe/holder/UnsafeSortTempFileChunkHolder.java ---
    @@ -352,6 +330,49 @@ private void fillDataForPrefetch() {
         }
       }
     
    +  private Object[][] getBatchedRowFromStream(int expected) throws CarbonSortKeyAndGroupByException {
    --- End diff --
    
    please add comment for this function


---

[GitHub] carbondata issue #1606: [CARBONDATA-1839] [DataLoad]Fix bugs in compressing ...

Posted by xuchuanyin <gi...@git.apache.org>.
Github user xuchuanyin commented on the issue:

    https://github.com/apache/carbondata/pull/1606
  
    No, reducing the duplicate code is a little complex. I'll test it as thoroughly as I can.


---