Posted to issues@carbondata.apache.org by xuchuanyin <gi...@git.apache.org> on 2018/01/11 07:39:02 UTC

[GitHub] carbondata pull request #1792: [CARBONDATA-2018][DataLoad] Optimization in r...

GitHub user xuchuanyin opened a pull request:

    https://github.com/apache/carbondata/pull/1792

    [CARBONDATA-2018][DataLoad] Optimization in reading/writing for sort temp row

    Pack the no-sort fields of each row into a byte array during merge sort
    to reduce CPU consumption.
    
    I tested it on my cluster and saw about an 8% performance gain (74MB/s/Node -> 81MB/s/Node) in data loading. Note that global_sort will not benefit from this change, since that procedure produces no sort temp files.
    
    Be sure to do all of the following checklist to help us incorporate 
    your contribution quickly and easily:
    
     - [x] Any interfaces changed?
     `Some internally used interfaces have been changed`
     - [x] Any backward compatibility impacted?
     `No`
     - [x] Document update required?
    `No`
     - [x] Testing done
            Please provide details on 
            - Whether new unit test cases have been added or why no new tests are required?
    `No`
            - How it is tested? Please attach test report.
    `Tested in 3-node cluster with real business data`
            - Is it a performance related change? Please attach the performance test report.
    `Yes, I tested it on my cluster and saw about an 8% performance gain (74MB/s/Node -> 81MB/s/Node) in data loading.`
            - Any additional information to help reviewers in testing this change.
    `No`
     - [x] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. 
    `Unrelated`
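
The idea in the description above (pack the no-sort fields into a byte array so that the merge sort only has to look at the sort fields) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in this pull request: `PackedRowSketch`, `PackedRow`, `pack`, `unpack` and the row layout (int sort keys, one long measure, one string) are all hypothetical names and choices made for the example.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical illustration only; these are not CarbonData's actual classes. */
final class PackedRowSketch {

  /** A row whose sort keys stay typed and whose remaining fields are opaque bytes. */
  static final class PackedRow {
    final int[] sortKeys;      // fields that take part in comparisons
    final byte[] noSortBytes;  // all other fields, serialized once

    PackedRow(int[] sortKeys, byte[] noSortBytes) {
      this.sortKeys = sortKeys;
      this.noSortBytes = noSortBytes;
    }
  }

  /** Serialize the no-sort fields (assumed here: one long and one string) a single time. */
  static PackedRow pack(int[] sortKeys, long measure, String text) {
    byte[] textBytes = text.getBytes(StandardCharsets.UTF_8);
    ByteBuffer buf = ByteBuffer.allocate(Long.BYTES + Integer.BYTES + textBytes.length);
    buf.putLong(measure).putInt(textBytes.length).put(textBytes);
    return new PackedRow(sortKeys, buf.array());
  }

  /** Decode the packed part only when the fully sorted row is finally consumed. */
  static Object[] unpack(PackedRow row) {
    ByteBuffer buf = ByteBuffer.wrap(row.noSortBytes);
    long measure = buf.getLong();
    byte[] textBytes = new byte[buf.getInt()];
    buf.get(textBytes);
    return new Object[] {measure, new String(textBytes, StandardCharsets.UTF_8)};
  }

  /** Compares sort keys only; the packed bytes are never inspected. */
  static final Comparator<PackedRow> BY_SORT_KEYS = (a, b) -> {
    for (int i = 0; i < a.sortKeys.length; i++) {
      int c = Integer.compare(a.sortKeys[i], b.sortKeys[i]);
      if (c != 0) {
        return c;
      }
    }
    return 0;
  };

  /** Sorts a copy of the rows; stands in for the intermediate merge passes. */
  static List<PackedRow> sort(List<PackedRow> rows) {
    List<PackedRow> sorted = new ArrayList<>(rows);
    sorted.sort(BY_SORT_KEYS);
    return sorted;
  }
}
```

In this sketch the no-sort fields are encoded once in `pack` and decoded once in `unpack`; every comparison in between touches only `sortKeys`, and the byte array is carried along unchanged. Avoiding per-field handling of the no-sort columns on each read and write of a temp row is presumably where the CPU saving described above comes from.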


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/xuchuanyin/carbondata opt_sort_temp_serializeation

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/1792.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1792
    
----
commit 1cf4efbd5f3065cb996fa4d6a133df68f2cca585
Author: xuchuanyin <xu...@...>
Date:   2018-01-10T12:39:02Z

    pack no sort fields
    
    pack the no-sort fields in the row as a byte array during merge sort
    to save CPU consumption

----


---

[GitHub] carbondata issue #1792: [CARBONDATA-2018][DataLoad] Optimization in reading/...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1792
  
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2708/



---

[GitHub] carbondata issue #1792: [CARBONDATA-2018][DataLoad] Optimization in reading/...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1792
  
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/3316/



---

[GitHub] carbondata pull request #1792: [CARBONDATA-2018][DataLoad] Optimization in r...

Posted by xuchuanyin <gi...@git.apache.org>.
Github user xuchuanyin closed the pull request at:

    https://github.com/apache/carbondata/pull/1792


---

[GitHub] carbondata issue #1792: [CARBONDATA-2018][DataLoad] Optimization in reading/...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1792
  
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1467/



---

[GitHub] carbondata issue #1792: [CARBONDATA-2018][DataLoad] Optimization in reading/...

Posted by xuchuanyin <gi...@git.apache.org>.
Github user xuchuanyin commented on the issue:

    https://github.com/apache/carbondata/pull/1792
  
    retest this please


---

[GitHub] carbondata issue #1792: [CARBONDATA-2018][DataLoad] Optimization in reading/...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1792
  
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2938/



---

[GitHub] carbondata issue #1792: [CARBONDATA-2018][DataLoad] Optimization in reading/...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1792
  
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/2438/



---

[GitHub] carbondata issue #1792: [CARBONDATA-2018][DataLoad] Optimization in reading/...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1792
  
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2854/



---

[GitHub] carbondata issue #1792: [CARBONDATA-2018][DataLoad] Optimization in reading/...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1792
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2955/



---

[GitHub] carbondata issue #1792: [CARBONDATA-2018][DataLoad] Optimization in reading/...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1792
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/3677/



---

[GitHub] carbondata issue #1792: [CARBONDATA-2018][DataLoad] Optimization in reading/...

Posted by xuchuanyin <gi...@git.apache.org>.
Github user xuchuanyin commented on the issue:

    https://github.com/apache/carbondata/pull/1792
  
    retest this please


---

[GitHub] carbondata issue #1792: [CARBONDATA-2018][DataLoad] Optimization in reading/...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1792
  
    Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/2338/



---

[GitHub] carbondata issue #1792: [CARBONDATA-2018][DataLoad] Optimization in reading/...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1792
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2817/



---

[GitHub] carbondata issue #1792: [CARBONDATA-2018][DataLoad] Optimization in reading/...

Posted by xuchuanyin <gi...@git.apache.org>.
Github user xuchuanyin commented on the issue:

    https://github.com/apache/carbondata/pull/1792
  
    retest this please


---

[GitHub] carbondata issue #1792: [CARBONDATA-2018][DataLoad] Optimization in reading/...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1792
  
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1741/



---

[GitHub] carbondata issue #1792: [CARBONDATA-2018][DataLoad] Optimization in reading/...

Posted by xuchuanyin <gi...@git.apache.org>.
Github user xuchuanyin commented on the issue:

    https://github.com/apache/carbondata/pull/1792
  
    retest this please


---

[GitHub] carbondata issue #1792: [CARBONDATA-2018][DataLoad] Optimization in reading/...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1792
  
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/3371/



---

[GitHub] carbondata issue #1792: [CARBONDATA-2018][DataLoad] Optimization in reading/...

Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on the issue:

    https://github.com/apache/carbondata/pull/1792
  
    retest this please


---

[GitHub] carbondata issue #1792: [CARBONDATA-2018][DataLoad] Optimization in reading/...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1792
  
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2946/



---

[GitHub] carbondata issue #1792: [CARBONDATA-2018][DataLoad] Optimization in reading/...

Posted by xuchuanyin <gi...@git.apache.org>.
Github user xuchuanyin commented on the issue:

    https://github.com/apache/carbondata/pull/1792
  
    retest this please


---

[GitHub] carbondata issue #1792: [CARBONDATA-2018][DataLoad] Optimization in reading/...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1792
  
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1475/



---

[GitHub] carbondata issue #1792: [CARBONDATA-2018][DataLoad] Optimization in reading/...

Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on the issue:

    https://github.com/apache/carbondata/pull/1792
  
    retest this please


---

[GitHub] carbondata issue #1792: [CARBONDATA-2018][DataLoad] Optimization in reading/...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1792
  
    Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/2088/



---

[GitHub] carbondata issue #1792: [CARBONDATA-2018][DataLoad] Optimization in reading/...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1792
  
    Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/2080/



---

[GitHub] carbondata issue #1792: [CARBONDATA-2018][DataLoad] Optimization in reading/...

Posted by xuchuanyin <gi...@git.apache.org>.
Github user xuchuanyin commented on the issue:

    https://github.com/apache/carbondata/pull/1792
  
    this PR depends on #1952 


---

[GitHub] carbondata issue #1792: [CARBONDATA-2018][DataLoad] Optimization in reading/...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1792
  
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1726/



---

[GitHub] carbondata issue #1792: [CARBONDATA-2018][DataLoad] Optimization in reading/...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1792
  
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1471/



---

[GitHub] carbondata issue #1792: [CARBONDATA-2018][DataLoad] Optimization in reading/...

Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on the issue:

    https://github.com/apache/carbondata/pull/1792
  
    retest this please


---

[GitHub] carbondata issue #1792: [CARBONDATA-2018][DataLoad] Optimization in reading/...

Posted by xuchuanyin <gi...@git.apache.org>.
Github user xuchuanyin commented on the issue:

    https://github.com/apache/carbondata/pull/1792
  
    retest sdv please


---

[GitHub] carbondata issue #1792: [CARBONDATA-2018][DataLoad] Optimization in reading/...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1792
  
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/3327/



---

[GitHub] carbondata issue #1792: [CARBONDATA-2018][DataLoad] Optimization in reading/...

Posted by xuchuanyin <gi...@git.apache.org>.
Github user xuchuanyin commented on the issue:

    https://github.com/apache/carbondata/pull/1792
  
    retest this please


---

[GitHub] carbondata issue #1792: [CARBONDATA-2018][DataLoad] Optimization in reading/...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1792
  
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2985/



---

[GitHub] carbondata issue #1792: [CARBONDATA-2018][DataLoad] Optimization in reading/...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1792
  
    Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1704/



---

[GitHub] carbondata issue #1792: [CARBONDATA-2018][DataLoad] Optimization in reading/...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1792
  
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1646/



---

[GitHub] carbondata issue #1792: [CARBONDATA-2018][DataLoad] Optimization in reading/...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1792
  
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2697/



---

[GitHub] carbondata issue #1792: [CARBONDATA-2018][DataLoad] Optimization in reading/...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1792
  
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1682/



---

[GitHub] carbondata issue #1792: [CARBONDATA-2018][DataLoad] Optimization in reading/...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1792
  
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2876/



---

[GitHub] carbondata issue #1792: [CARBONDATA-2018][DataLoad] Optimization in reading/...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1792
  
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1631/



---

[GitHub] carbondata issue #1792: [CARBONDATA-2018][DataLoad] Optimization in reading/...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1792
  
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1480/



---

[GitHub] carbondata issue #1792: [CARBONDATA-2018][DataLoad] Optimization in reading/...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1792
  
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2965/



---

[GitHub] carbondata pull request #1792: [CARBONDATA-2018][DataLoad] Optimization in r...

Posted by xuchuanyin <gi...@git.apache.org>.
Github user xuchuanyin commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1792#discussion_r164714917
  
    --- Diff: processing/src/main/java/org/apache/carbondata/processing/sort/sortdata/NewIntermediateSortTempRowComparator.java ---
    @@ -0,0 +1,73 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.carbondata.processing.sort.sortdata;
    +
    +import java.util.Comparator;
    +
    +import org.apache.carbondata.core.util.ByteUtil.UnsafeComparer;
    +import org.apache.carbondata.processing.loading.row.IntermediateSortTempRow;
    +
    +/**
    + * This class is used as comparator for comparing intermediate sort temp row
    + */
    +public class NewIntermediateSortTempRowComparator implements Comparator<IntermediateSortTempRow> {
    --- End diff --
    
    OK, I'll fix it


---

[GitHub] carbondata issue #1792: [CARBONDATA-2018][DataLoad] Optimization in reading/...

Posted by xuchuanyin <gi...@git.apache.org>.
Github user xuchuanyin commented on the issue:

    https://github.com/apache/carbondata/pull/1792
  
    retest this please


---

[GitHub] carbondata issue #1792: [CARBONDATA-2018][DataLoad] Optimization in reading/...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1792
  
    Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1464/



---

[GitHub] carbondata issue #1792: [CARBONDATA-2018][DataLoad] Optimization in reading/...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1792
  
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2965/



---

[GitHub] carbondata issue #1792: [CARBONDATA-2018][DataLoad] Optimization in reading/...

Posted by xuchuanyin <gi...@git.apache.org>.
Github user xuchuanyin commented on the issue:

    https://github.com/apache/carbondata/pull/1792
  
    retest this please


---

[GitHub] carbondata issue #1792: [CARBONDATA-2018][DataLoad] Optimization in reading/...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1792
  
    Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/2135/



---

[GitHub] carbondata issue #1792: [CARBONDATA-2018][DataLoad] Optimization in reading/...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1792
  
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1591/



---

[GitHub] carbondata issue #1792: [CARBONDATA-2018][DataLoad] Optimization in reading/...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1792
  
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2704/



---

[GitHub] carbondata issue #1792: [CARBONDATA-2018][DataLoad] Optimization in reading/...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1792
  
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1582/



---

[GitHub] carbondata issue #1792: [CARBONDATA-2018][DataLoad] Optimization in reading/...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1792
  
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2700/



---

[GitHub] carbondata issue #1792: [CARBONDATA-2018][DataLoad] Optimization in reading/...

Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on the issue:

    https://github.com/apache/carbondata/pull/1792
  
    merged into carbonstore branch


---

[GitHub] carbondata issue #1792: [CARBONDATA-2018][DataLoad] Optimization in reading/...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1792
  
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2910/



---

[GitHub] carbondata issue #1792: [CARBONDATA-2018][DataLoad] Optimization in reading/...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1792
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/3594/



---

[GitHub] carbondata pull request #1792: [CARBONDATA-2018][DataLoad] Optimization in r...

Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1792#discussion_r164681706
  
    --- Diff: processing/src/main/java/org/apache/carbondata/processing/loading/sort/unsafe/UnsafeCarbonRowPage.java ---
    @@ -80,255 +63,43 @@ public UnsafeCarbonRowPage(boolean[] noDictionaryDimensionMapping,
         this.managerType = MemoryManagerType.UNSAFE_MEMORY_MANAGER;
       }
     
    -  public int addRow(Object[] row) {
    -    int size = addRow(row, dataBlock.getBaseOffset() + lastSize);
    +  public int addRow(Object[] row, ByteBuffer rowBuffer) {
    +    int size = addRow(row, dataBlock.getBaseOffset() + lastSize, rowBuffer);
         buffer.set(lastSize);
         lastSize = lastSize + size;
         return size;
       }
     
    -  private int addRow(Object[] row, long address) {
    -    if (row == null) {
    -      throw new RuntimeException("Row is null ??");
    -    }
    -    int dimCount = 0;
    -    int size = 0;
    -    Object baseObject = dataBlock.getBaseObject();
    -    for (; dimCount < noDictionaryDimensionMapping.length; dimCount++) {
    -      if (noDictionaryDimensionMapping[dimCount]) {
    -        byte[] col = (byte[]) row[dimCount];
    -        CarbonUnsafe.getUnsafe()
    -            .putShort(baseObject, address + size, (short) col.length);
    -        size += 2;
    -        CarbonUnsafe.getUnsafe().copyMemory(col, CarbonUnsafe.BYTE_ARRAY_OFFSET, baseObject,
    -            address + size, col.length);
    -        size += col.length;
    -      } else {
    -        int value = (int) row[dimCount];
    -        CarbonUnsafe.getUnsafe().putInt(baseObject, address + size, value);
    -        size += 4;
    -      }
    -    }
    -
    -    // write complex dimensions here.
    -    for (; dimCount < dimensionSize; dimCount++) {
    -      byte[] col = (byte[]) row[dimCount];
    -      CarbonUnsafe.getUnsafe().putShort(baseObject, address + size, (short) col.length);
    -      size += 2;
    -      CarbonUnsafe.getUnsafe().copyMemory(col, CarbonUnsafe.BYTE_ARRAY_OFFSET, baseObject,
    -          address + size, col.length);
    -      size += col.length;
    -    }
    -    Arrays.fill(nullSetWords, 0);
    -    int nullSetSize = nullSetWords.length * 8;
    -    int nullWordLoc = size;
    -    size += nullSetSize;
    -    for (int mesCount = 0; mesCount < measureSize; mesCount++) {
    -      Object value = row[mesCount + dimensionSize];
    -      if (null != value) {
    -        DataType dataType = measureDataType[mesCount];
    -        if (dataType == DataTypes.BOOLEAN) {
    -          Boolean bval = (Boolean) value;
    -          CarbonUnsafe.getUnsafe().putBoolean(baseObject, address + size, bval);
    -          size += 1;
    -        } else if (dataType == DataTypes.SHORT) {
    -          Short sval = (Short) value;
    -          CarbonUnsafe.getUnsafe().putShort(baseObject, address + size, sval);
    -          size += 2;
    -        } else if (dataType == DataTypes.INT) {
    -          Integer ival = (Integer) value;
    -          CarbonUnsafe.getUnsafe().putInt(baseObject, address + size, ival);
    -          size += 4;
    -        } else if (dataType == DataTypes.LONG) {
    -          Long val = (Long) value;
    -          CarbonUnsafe.getUnsafe().putLong(baseObject, address + size, val);
    -          size += 8;
    -        } else if (dataType == DataTypes.DOUBLE) {
    -          Double doubleVal = (Double) value;
    -          CarbonUnsafe.getUnsafe().putDouble(baseObject, address + size, doubleVal);
    -          size += 8;
    -        } else if (DataTypes.isDecimal(dataType)) {
    -          BigDecimal decimalVal = (BigDecimal) value;
    -          byte[] bigDecimalInBytes = DataTypeUtil.bigDecimalToByte(decimalVal);
    -          CarbonUnsafe.getUnsafe()
    -              .putShort(baseObject, address + size, (short) bigDecimalInBytes.length);
    -          size += 2;
    -          CarbonUnsafe.getUnsafe()
    -              .copyMemory(bigDecimalInBytes, CarbonUnsafe.BYTE_ARRAY_OFFSET, baseObject,
    -                  address + size, bigDecimalInBytes.length);
    -          size += bigDecimalInBytes.length;
    -        } else {
    -          throw new IllegalArgumentException("unsupported data type:" + measureDataType[mesCount]);
    -        }
    -        set(nullSetWords, mesCount);
    -      } else {
    -        unset(nullSetWords, mesCount);
    -      }
    -    }
    -    CarbonUnsafe.getUnsafe().copyMemory(nullSetWords, CarbonUnsafe.LONG_ARRAY_OFFSET, baseObject,
    -        address + nullWordLoc, nullSetSize);
    -    return size;
    +  /**
    +   * add row as 3 parts
    +   * @param row
    +   * @param address
    +   * @return
    +   */
    +  private int addRow(Object[] row, long address, ByteBuffer rowBuffer) {
    +    return sortStepRowHandler.writeRawRowAsIntermediateSortTempRowToUnsafeMemory(row,
    --- End diff --
    
    Can you add a conversion function that converts the row to an IntermediateSortTempRow, and then call sortStepRowHandler?
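
The two-step shape suggested in this comment (first convert the raw row to an intermediate row, then hand that row to the writer) might look roughly like the sketch below. Everything in it is an assumption made for illustration: `TwoStepWriteSketch`, `TempRow`, `convert` and `write` are hypothetical names, the raw row is assumed to hold Integer sort dimensions followed by Long measures, and an on-heap `ByteBuffer` stands in for the unsafe memory page used in the actual diff.

```java
import java.nio.ByteBuffer;

/** Hypothetical stand-ins; names and row layout are assumptions for illustration. */
final class TwoStepWriteSketch {

  /** Simplified intermediate row: int sort dimensions plus pre-packed remaining bytes. */
  static final class TempRow {
    final int[] sortDims;
    final byte[] packedRest;

    TempRow(int[] sortDims, byte[] packedRest) {
      this.sortDims = sortDims;
      this.packedRest = packedRest;
    }
  }

  /**
   * Step 1: convert a raw row into the intermediate form.
   * Assumed raw layout: the first sortDimCount entries are Integer sort dimensions,
   * every later entry is a Long measure.
   */
  static TempRow convert(Object[] raw, int sortDimCount) {
    int[] sortDims = new int[sortDimCount];
    for (int i = 0; i < sortDimCount; i++) {
      sortDims[i] = (Integer) raw[i];
    }
    ByteBuffer rest = ByteBuffer.allocate((raw.length - sortDimCount) * Long.BYTES);
    for (int i = sortDimCount; i < raw.length; i++) {
      rest.putLong((Long) raw[i]);
    }
    return new TempRow(sortDims, rest.array());
  }

  /** Step 2: the writer sees only the intermediate row; returns the bytes written. */
  static int write(TempRow row, ByteBuffer target) {
    int start = target.position();
    for (int dim : row.sortDims) {
      target.putInt(dim);
    }
    target.putInt(row.packedRest.length); // length prefix for the packed part
    target.put(row.packedRest);
    return target.position() - start;
  }
}
```

Splitting the work this way keeps the knowledge of the raw row layout in `convert`, so the writer depends only on the intermediate row type.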


---

[GitHub] carbondata issue #1792: [CARBONDATA-2018][DataLoad] Optimization in reading/...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1792
  
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/2356/



---

[GitHub] carbondata issue #1792: [CARBONDATA-2018][DataLoad] Optimization in reading/...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1792
  
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2938/



---

[GitHub] carbondata issue #1792: [CARBONDATA-2018][DataLoad] Optimization in reading/...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1792
  
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/2448/



---

[GitHub] carbondata issue #1792: [CARBONDATA-2018][DataLoad] Optimization in reading/...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1792
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/3688/



---

[GitHub] carbondata issue #1792: [CARBONDATA-2018][DataLoad] Optimization in reading/...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1792
  
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/3576/



---

[GitHub] carbondata issue #1792: [CARBONDATA-2018][DataLoad] Optimization in reading/...

Posted by xuchuanyin <gi...@git.apache.org>.
Github user xuchuanyin commented on the issue:

    https://github.com/apache/carbondata/pull/1792
  
    retest this please


---

[GitHub] carbondata issue #1792: [CARBONDATA-2018][DataLoad] Optimization in reading/...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1792
  
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/3426/



---

[GitHub] carbondata issue #1792: [CARBONDATA-2018][DataLoad] Optimization in reading/...

Posted by xuchuanyin <gi...@git.apache.org>.
Github user xuchuanyin commented on the issue:

    https://github.com/apache/carbondata/pull/1792
  
    retest this please


---

[GitHub] carbondata issue #1792: [CARBONDATA-2018][DataLoad] Optimization in reading/...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1792
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2773/



---

[GitHub] carbondata issue #1792: [CARBONDATA-2018][DataLoad] Optimization in reading/...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1792
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2930/



---

[GitHub] carbondata issue #1792: [CARBONDATA-2018][DataLoad] Optimization in reading/...

Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on the issue:

    https://github.com/apache/carbondata/pull/1792
  
    LGTM


---

[GitHub] carbondata issue #1792: [CARBONDATA-2018][DataLoad] Optimization in reading/...

Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on the issue:

    https://github.com/apache/carbondata/pull/1792
  
    LGTM


---

[GitHub] carbondata pull request #1792: [CARBONDATA-2018][DataLoad] Optimization in r...

Posted by xuchuanyin <gi...@git.apache.org>.
Github user xuchuanyin closed the pull request at:

    https://github.com/apache/carbondata/pull/1792


---

[GitHub] carbondata issue #1792: [CARBONDATA-2018][DataLoad] Optimization in reading/...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1792
  
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2864/



---

[GitHub] carbondata issue #1792: [CARBONDATA-2018][DataLoad] Optimization in reading/...

Posted by xuchuanyin <gi...@git.apache.org>.
Github user xuchuanyin commented on the issue:

    https://github.com/apache/carbondata/pull/1792
  
    retest this please


---

[GitHub] carbondata issue #1792: [CARBONDATA-2018][DataLoad] Optimization in reading/...

Posted by xuchuanyin <gi...@git.apache.org>.
Github user xuchuanyin commented on the issue:

    https://github.com/apache/carbondata/pull/1792
  
    retest this please


---

[GitHub] carbondata issue #1792: [CARBONDATA-2018][DataLoad] Optimization in reading/...

Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on the issue:

    https://github.com/apache/carbondata/pull/1792
  
    please rebase


---

[GitHub] carbondata issue #1792: [CARBONDATA-2018][DataLoad] Optimization in reading/...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1792
  
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/3261/



---

[GitHub] carbondata issue #1792: [CARBONDATA-2018][DataLoad] Optimization in reading/...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1792
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2915/



---

[GitHub] carbondata issue #1792: [CARBONDATA-2018][DataLoad] Optimization in reading/...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1792
  
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1738/



---

[GitHub] carbondata issue #1792: [CARBONDATA-2018][DataLoad] Optimization in reading/...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1792
  
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2833/



---

[GitHub] carbondata issue #1792: [CARBONDATA-2018][DataLoad] Optimization in reading/...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1792
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2971/



---

[GitHub] carbondata issue #1792: [CARBONDATA-2018][DataLoad] Optimization in reading/...

Posted by xuchuanyin <gi...@git.apache.org>.
Github user xuchuanyin commented on the issue:

    https://github.com/apache/carbondata/pull/1792
  
    retest this please


---

[GitHub] carbondata issue #1792: [CARBONDATA-2018][DataLoad] Optimization in reading/...

Posted by xuchuanyin <gi...@git.apache.org>.
Github user xuchuanyin commented on the issue:

    https://github.com/apache/carbondata/pull/1792
  
    retest this please


---

[GitHub] carbondata issue #1792: [CARBONDATA-2018][DataLoad] Optimization in reading/...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1792
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2826/



---

[GitHub] carbondata pull request #1792: [CARBONDATA-2018][DataLoad] Optimization in r...

Posted by xuchuanyin <gi...@git.apache.org>.
Github user xuchuanyin commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1792#discussion_r164712900
  
    --- Diff: processing/src/main/java/org/apache/carbondata/processing/loading/sort/unsafe/UnsafeCarbonRowPage.java ---
    @@ -80,255 +63,43 @@ public UnsafeCarbonRowPage(boolean[] noDictionaryDimensionMapping,
         this.managerType = MemoryManagerType.UNSAFE_MEMORY_MANAGER;
       }
     
    -  public int addRow(Object[] row) {
    -    int size = addRow(row, dataBlock.getBaseOffset() + lastSize);
    +  public int addRow(Object[] row, ByteBuffer rowBuffer) {
    +    int size = addRow(row, dataBlock.getBaseOffset() + lastSize, rowBuffer);
         buffer.set(lastSize);
         lastSize = lastSize + size;
         return size;
       }
     
    -  private int addRow(Object[] row, long address) {
    -    if (row == null) {
    -      throw new RuntimeException("Row is null ??");
    -    }
    -    int dimCount = 0;
    -    int size = 0;
    -    Object baseObject = dataBlock.getBaseObject();
    -    for (; dimCount < noDictionaryDimensionMapping.length; dimCount++) {
    -      if (noDictionaryDimensionMapping[dimCount]) {
    -        byte[] col = (byte[]) row[dimCount];
    -        CarbonUnsafe.getUnsafe()
    -            .putShort(baseObject, address + size, (short) col.length);
    -        size += 2;
    -        CarbonUnsafe.getUnsafe().copyMemory(col, CarbonUnsafe.BYTE_ARRAY_OFFSET, baseObject,
    -            address + size, col.length);
    -        size += col.length;
    -      } else {
    -        int value = (int) row[dimCount];
    -        CarbonUnsafe.getUnsafe().putInt(baseObject, address + size, value);
    -        size += 4;
    -      }
    -    }
    -
    -    // write complex dimensions here.
    -    for (; dimCount < dimensionSize; dimCount++) {
    -      byte[] col = (byte[]) row[dimCount];
    -      CarbonUnsafe.getUnsafe().putShort(baseObject, address + size, (short) col.length);
    -      size += 2;
    -      CarbonUnsafe.getUnsafe().copyMemory(col, CarbonUnsafe.BYTE_ARRAY_OFFSET, baseObject,
    -          address + size, col.length);
    -      size += col.length;
    -    }
    -    Arrays.fill(nullSetWords, 0);
    -    int nullSetSize = nullSetWords.length * 8;
    -    int nullWordLoc = size;
    -    size += nullSetSize;
    -    for (int mesCount = 0; mesCount < measureSize; mesCount++) {
    -      Object value = row[mesCount + dimensionSize];
    -      if (null != value) {
    -        DataType dataType = measureDataType[mesCount];
    -        if (dataType == DataTypes.BOOLEAN) {
    -          Boolean bval = (Boolean) value;
    -          CarbonUnsafe.getUnsafe().putBoolean(baseObject, address + size, bval);
    -          size += 1;
    -        } else if (dataType == DataTypes.SHORT) {
    -          Short sval = (Short) value;
    -          CarbonUnsafe.getUnsafe().putShort(baseObject, address + size, sval);
    -          size += 2;
    -        } else if (dataType == DataTypes.INT) {
    -          Integer ival = (Integer) value;
    -          CarbonUnsafe.getUnsafe().putInt(baseObject, address + size, ival);
    -          size += 4;
    -        } else if (dataType == DataTypes.LONG) {
    -          Long val = (Long) value;
    -          CarbonUnsafe.getUnsafe().putLong(baseObject, address + size, val);
    -          size += 8;
    -        } else if (dataType == DataTypes.DOUBLE) {
    -          Double doubleVal = (Double) value;
    -          CarbonUnsafe.getUnsafe().putDouble(baseObject, address + size, doubleVal);
    -          size += 8;
    -        } else if (DataTypes.isDecimal(dataType)) {
    -          BigDecimal decimalVal = (BigDecimal) value;
    -          byte[] bigDecimalInBytes = DataTypeUtil.bigDecimalToByte(decimalVal);
    -          CarbonUnsafe.getUnsafe()
    -              .putShort(baseObject, address + size, (short) bigDecimalInBytes.length);
    -          size += 2;
    -          CarbonUnsafe.getUnsafe()
    -              .copyMemory(bigDecimalInBytes, CarbonUnsafe.BYTE_ARRAY_OFFSET, baseObject,
    -                  address + size, bigDecimalInBytes.length);
    -          size += bigDecimalInBytes.length;
    -        } else {
    -          throw new IllegalArgumentException("unsupported data type:" + measureDataType[mesCount]);
    -        }
    -        set(nullSetWords, mesCount);
    -      } else {
    -        unset(nullSetWords, mesCount);
    -      }
    -    }
    -    CarbonUnsafe.getUnsafe().copyMemory(nullSetWords, CarbonUnsafe.LONG_ARRAY_OFFSET, baseObject,
    -        address + nullWordLoc, nullSetSize);
    -    return size;
    +  /**
    +   * add row as 3 parts
    --- End diff --
    
    It is a raw row; I'll fix the comment.


---

[GitHub] carbondata issue #1792: [CARBONDATA-2018][DataLoad] Optimization in reading/...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1792
  
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1540/



---

[GitHub] carbondata issue #1792: [CARBONDATA-2018][DataLoad] Optimization in reading/...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1792
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2713/



---

[GitHub] carbondata issue #1792: [CARBONDATA-2018][DataLoad] Optimization in reading/...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1792
  
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1623/



---

[GitHub] carbondata pull request #1792: [CARBONDATA-2018][DataLoad] Optimization in r...

Posted by xuchuanyin <gi...@git.apache.org>.
GitHub user xuchuanyin reopened a pull request:

    https://github.com/apache/carbondata/pull/1792

    [CARBONDATA-2018][DataLoad] Optimization in reading/writing for sort temp row

    Pick out the no-sort fields in the row, pack them as a byte array, and skip parsing them during merge sort to reduce CPU consumption.
    
    I've tested it in my cluster and seen about an 8% performance gain (74MB/s/Node -> 81MB/s/Node) in data loading. Please note that global_sort will not benefit from this feature since no sort temp files are used in that procedure.
    
    Be sure to do all of the following checklist to help us incorporate 
    your contribution quickly and easily:
    
     - [x] Any interfaces changed?
     `Some internally used interfaces have been changed`
     - [x] Any backward compatibility impacted?
     `No`
     - [x] Document update required?
    `No`
     - [x] Testing done
            Please provide details on 
            - Whether new unit test cases have been added or why no new tests are required?
    `No`
            - How it is tested? Please attach test report.
    `Tested in 3-node cluster with real business data`
            - Is it a performance related change? Please attach the performance test report.
    `Yes, I've tested it in my cluster and seen about an 8% performance gain (74MB/s/Node -> 81MB/s/Node) in data loading.`
            - Any additional information to help reviewers in testing this change.
    `No`
     - [x] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. 
    `Unrelated`
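
The idea in the description above (keep the sort fields parsed, serialize the no-sort fields once into a byte array, and carry that array through merge sort untouched) can be sketched roughly as follows. This is only an illustration under assumptions: `PackedRow`, `pack` and `unpack` are hypothetical names, and the no-sort fields are simplified to `long` values; the classes actually used in the pull request may look quite different.

```java
import java.nio.ByteBuffer;

/** Hypothetical illustration; not a class from the pull request. */
final class PackedRow {
  // Fields that merge sort must compare; kept in parsed form.
  final int[] sortKeys;
  // Fields that merge sort never looks at; serialized once and carried as opaque bytes.
  final byte[] noSortPacked;

  PackedRow(int[] sortKeys, byte[] noSortPacked) {
    this.sortKeys = sortKeys;
    this.noSortPacked = noSortPacked;
  }

  /** Serialize the no-sort values (simplified to longs here) into one byte array. */
  static PackedRow pack(int[] sortKeys, long[] noSortValues) {
    ByteBuffer buf = ByteBuffer.allocate(noSortValues.length * Long.BYTES);
    for (long v : noSortValues) {
      buf.putLong(v);
    }
    return new PackedRow(sortKeys, buf.array());
  }

  /** Parse the packed bytes back into values; only needed after merge sort has finished. */
  static long[] unpack(byte[] packed) {
    ByteBuffer buf = ByteBuffer.wrap(packed);
    long[] out = new long[packed.length / Long.BYTES];
    for (int i = 0; i < out.length; i++) {
      out[i] = buf.getLong();
    }
    return out;
  }
}
```

While rows are written to and read from sort temp files, only `sortKeys` would need field-by-field handling; `noSortPacked` can be copied as a single length-prefixed blob, which is where the CPU saving would be expected to come from.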


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/xuchuanyin/carbondata opt_sort_temp_serializeation

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/1792.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1792
    
----
commit de71999872761008365efc5bb943c77219479d14
Author: xuchuanyin <xu...@...>
Date:   2018-02-08T06:35:14Z

    Optimization in reading/writing for sort temp row
    
    Pick up the no-sort fields in the row and pack them as byte array and
    skip parsing them during merge sort to reduce CPU consumption.

----


---

[GitHub] carbondata issue #1792: [CARBONDATA-2018][DataLoad] Optimization in reading/...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1792
  
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2835/



---

[GitHub] carbondata issue #1792: [CARBONDATA-2018][DataLoad] Optimization in reading/...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1792
  
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1697/



---

[GitHub] carbondata pull request #1792: [CARBONDATA-2018][DataLoad] Optimization in r...

Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1792#discussion_r164682124
  
    --- Diff: processing/src/main/java/org/apache/carbondata/processing/loading/sort/unsafe/UnsafeCarbonRowPage.java ---
    @@ -80,255 +63,43 @@ public UnsafeCarbonRowPage(boolean[] noDictionaryDimensionMapping,
         this.managerType = MemoryManagerType.UNSAFE_MEMORY_MANAGER;
       }
     
    -  public int addRow(Object[] row) {
    -    int size = addRow(row, dataBlock.getBaseOffset() + lastSize);
    +  public int addRow(Object[] row, ByteBuffer rowBuffer) {
    +    int size = addRow(row, dataBlock.getBaseOffset() + lastSize, rowBuffer);
         buffer.set(lastSize);
         lastSize = lastSize + size;
         return size;
       }
     
    -  private int addRow(Object[] row, long address) {
    -    if (row == null) {
    -      throw new RuntimeException("Row is null ??");
    -    }
    -    int dimCount = 0;
    -    int size = 0;
    -    Object baseObject = dataBlock.getBaseObject();
    -    for (; dimCount < noDictionaryDimensionMapping.length; dimCount++) {
    -      if (noDictionaryDimensionMapping[dimCount]) {
    -        byte[] col = (byte[]) row[dimCount];
    -        CarbonUnsafe.getUnsafe()
    -            .putShort(baseObject, address + size, (short) col.length);
    -        size += 2;
    -        CarbonUnsafe.getUnsafe().copyMemory(col, CarbonUnsafe.BYTE_ARRAY_OFFSET, baseObject,
    -            address + size, col.length);
    -        size += col.length;
    -      } else {
    -        int value = (int) row[dimCount];
    -        CarbonUnsafe.getUnsafe().putInt(baseObject, address + size, value);
    -        size += 4;
    -      }
    -    }
    -
    -    // write complex dimensions here.
    -    for (; dimCount < dimensionSize; dimCount++) {
    -      byte[] col = (byte[]) row[dimCount];
    -      CarbonUnsafe.getUnsafe().putShort(baseObject, address + size, (short) col.length);
    -      size += 2;
    -      CarbonUnsafe.getUnsafe().copyMemory(col, CarbonUnsafe.BYTE_ARRAY_OFFSET, baseObject,
    -          address + size, col.length);
    -      size += col.length;
    -    }
    -    Arrays.fill(nullSetWords, 0);
    -    int nullSetSize = nullSetWords.length * 8;
    -    int nullWordLoc = size;
    -    size += nullSetSize;
    -    for (int mesCount = 0; mesCount < measureSize; mesCount++) {
    -      Object value = row[mesCount + dimensionSize];
    -      if (null != value) {
    -        DataType dataType = measureDataType[mesCount];
    -        if (dataType == DataTypes.BOOLEAN) {
    -          Boolean bval = (Boolean) value;
    -          CarbonUnsafe.getUnsafe().putBoolean(baseObject, address + size, bval);
    -          size += 1;
    -        } else if (dataType == DataTypes.SHORT) {
    -          Short sval = (Short) value;
    -          CarbonUnsafe.getUnsafe().putShort(baseObject, address + size, sval);
    -          size += 2;
    -        } else if (dataType == DataTypes.INT) {
    -          Integer ival = (Integer) value;
    -          CarbonUnsafe.getUnsafe().putInt(baseObject, address + size, ival);
    -          size += 4;
    -        } else if (dataType == DataTypes.LONG) {
    -          Long val = (Long) value;
    -          CarbonUnsafe.getUnsafe().putLong(baseObject, address + size, val);
    -          size += 8;
    -        } else if (dataType == DataTypes.DOUBLE) {
    -          Double doubleVal = (Double) value;
    -          CarbonUnsafe.getUnsafe().putDouble(baseObject, address + size, doubleVal);
    -          size += 8;
    -        } else if (DataTypes.isDecimal(dataType)) {
    -          BigDecimal decimalVal = (BigDecimal) value;
    -          byte[] bigDecimalInBytes = DataTypeUtil.bigDecimalToByte(decimalVal);
    -          CarbonUnsafe.getUnsafe()
    -              .putShort(baseObject, address + size, (short) bigDecimalInBytes.length);
    -          size += 2;
    -          CarbonUnsafe.getUnsafe()
    -              .copyMemory(bigDecimalInBytes, CarbonUnsafe.BYTE_ARRAY_OFFSET, baseObject,
    -                  address + size, bigDecimalInBytes.length);
    -          size += bigDecimalInBytes.length;
    -        } else {
    -          throw new IllegalArgumentException("unsupported data type:" + measureDataType[mesCount]);
    -        }
    -        set(nullSetWords, mesCount);
    -      } else {
    -        unset(nullSetWords, mesCount);
    -      }
    -    }
    -    CarbonUnsafe.getUnsafe().copyMemory(nullSetWords, CarbonUnsafe.LONG_ARRAY_OFFSET, baseObject,
    -        address + nullWordLoc, nullSetSize);
    -    return size;
    +  /**
    +   * add row as 3 parts
    --- End diff --
    
    Is the input row a 3-part row? Can you wrap it in a class? It is hard to understand what is inside the Object[].
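
One way to read the suggestion above is a small value class that names the three parts instead of passing a bare `Object[]`. The sketch below is an assumption, not code from the pull request: `ThreePartRow`, its field names, and the `{int[], byte[][], byte[]}` layout of the raw array are all hypothetical.

```java
/** Hypothetical wrapper; names and layout are assumptions, not taken from the pull request. */
final class ThreePartRow {
  private final int[] dictSortDims;           // part 1: dictionary-encoded sort dimensions
  private final byte[][] noDictSortDims;      // part 2: no-dictionary sort dimensions
  private final byte[] noSortDimsAndMeasures; // part 3: everything else, packed as bytes

  ThreePartRow(int[] dictSortDims, byte[][] noDictSortDims, byte[] noSortDimsAndMeasures) {
    this.dictSortDims = dictSortDims;
    this.noDictSortDims = noDictSortDims;
    this.noSortDimsAndMeasures = noSortDimsAndMeasures;
  }

  /** Convert an untyped 3-element array into the typed wrapper, failing fast on a bad shape. */
  static ThreePartRow fromRaw(Object[] raw) {
    if (raw == null || raw.length != 3) {
      throw new IllegalArgumentException("expected a 3-part row");
    }
    return new ThreePartRow((int[]) raw[0], (byte[][]) raw[1], (byte[]) raw[2]);
  }

  int[] getDictSortDims() {
    return dictSortDims;
  }

  byte[][] getNoDictSortDims() {
    return noDictSortDims;
  }

  byte[] getNoSortDimsAndMeasures() {
    return noSortDimsAndMeasures;
  }
}
```

With a wrapper like this, a method signature documents what it expects, and a wrongly shaped row fails at construction time rather than deep inside the write path.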


---

[GitHub] carbondata pull request #1792: [CARBONDATA-2018][DataLoad] Optimization in r...

Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1792#discussion_r164683831
  
    --- Diff: processing/src/main/java/org/apache/carbondata/processing/sort/sortdata/NewIntermediateSortTempRowComparator.java ---
    @@ -0,0 +1,73 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.carbondata.processing.sort.sortdata;
    +
    +import java.util.Comparator;
    +
    +import org.apache.carbondata.core.util.ByteUtil.UnsafeComparer;
    +import org.apache.carbondata.processing.loading.row.IntermediateSortTempRow;
    +
    +/**
    + * This class is used as comparator for comparing intermediate sort temp row
    + */
    +public class NewIntermediateSortTempRowComparator implements Comparator<IntermediateSortTempRow> {
    --- End diff --
    
    Why not call it `IntermediateSortTempRowComparator`?
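
For readers without the full diff, a comparator over such rows could look roughly like the sketch below. It is an assumption rather than the class under review: `SketchRow` and `SketchRowComparator` are hypothetical names, all dictionary sort dimensions are assumed to be compared before the no-dictionary ones, and a plain loop stands in for the `UnsafeComparer` imported in the diff.

```java
import java.util.Comparator;

/** Hypothetical row holding only the fields that take part in sorting. */
final class SketchRow {
  final int[] dictSortDims;
  final byte[][] noDictSortDims;

  SketchRow(int[] dictSortDims, byte[][] noDictSortDims) {
    this.dictSortDims = dictSortDims;
    this.noDictSortDims = noDictSortDims;
  }
}

/** Hypothetical comparator; the packed no-sort bytes are never touched. */
final class SketchRowComparator implements Comparator<SketchRow> {
  @Override
  public int compare(SketchRow a, SketchRow b) {
    // Assumption: dictionary sort dimensions are compared first, as plain ints.
    for (int i = 0; i < a.dictSortDims.length; i++) {
      int c = Integer.compare(a.dictSortDims[i], b.dictSortDims[i]);
      if (c != 0) {
        return c;
      }
    }
    // Then no-dictionary sort dimensions, as unsigned lexicographic byte strings.
    for (int i = 0; i < a.noDictSortDims.length; i++) {
      int c = compareUnsigned(a.noDictSortDims[i], b.noDictSortDims[i]);
      if (c != 0) {
        return c;
      }
    }
    return 0;
  }

  /** Lexicographic comparison treating each byte as a value in 0..255. */
  static int compareUnsigned(byte[] x, byte[] y) {
    int n = Math.min(x.length, y.length);
    for (int i = 0; i < n; i++) {
      int c = Integer.compare(x[i] & 0xFF, y[i] & 0xFF);
      if (c != 0) {
        return c;
      }
    }
    return Integer.compare(x.length, y.length);
  }
}
```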


---