Posted to issues@carbondata.apache.org by GitBox <gi...@apache.org> on 2020/02/09 00:59:51 UTC

[GitHub] [carbondata] marchpure opened a new pull request #3607: [CARBONDATA-3670] Support compress offheap columnpage directly, avoding a copy of data from offhead to heap when compressed.

marchpure opened a new pull request #3607: [CARBONDATA-3670] Support compress offheap columnpage directly, avoding a copy of data from offhead to heap when compressed.
URL: https://github.com/apache/carbondata/pull/3607
 
 
    ### Why is this PR needed?
     During data loading, column pages are stored off-heap by default, and compression is needed to reduce storage cost. Currently, however, the data must be copied from off-heap to heap before it is compressed, which causes heavier GC overhead than compressing the off-heap data directly.
     This PR therefore adds support for compressing off-heap column page data directly, avoiding the copy from off-heap to heap during compression.
    
    ### What changes were proposed in this PR?
     1. Support compressing a direct ByteBuffer in the SNAPPY/ZSTD/GZIP compressors
        Add the interface compressByte(ByteBuffer) to Compressor/SnappyCompressor/ZstdCompressor/GzipCompressor.java
     2. Support compressing off-heap data directly in the column page when the data type is primitive
        2.1 Add the interface getPage to the column page to get the data as a direct ByteBuffer
        2.2 Change compress() in ColumnPage.java: if the data type is primitive and the page is unsafe, compress the direct ByteBuffer returned by getPage() directly.
     3. Support compressing off-heap data directly in the column page in IndexStorageCodec
        3.1 For String/Varchar, RLE and the inverted index need the column page as a two-dimensional byte array in which each byte array represents a row. We add an interface getByteBufferArray()
            to the column page to replace the two-dimensional byte array, so that the inverted index and RLE can work on the direct ByteBuffer directly.
        3.2 If neither RLE nor an inverted index needs to be built, getByteBufferArray() returns the flattened data as a direct ByteBuffer, which can be compressed directly.
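
The idea in item 1 can be illustrated with the JDK alone. The following is a minimal sketch, not the PR's code: it assumes Java 11 or later (for `Deflater.setInput(ByteBuffer)` and `Deflater.deflate(ByteBuffer)`), and the class and method names are hypothetical stand-ins for a `compressByte(ByteBuffer)` style API. The compressor reads straight from the direct buffer, so no intermediate heap `byte[]` copy of the page is made.

```java
import java.nio.ByteBuffer;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical illustration; this is not CarbonData's Compressor interface.
final class DirectBufferCompressSketch {

  // Compresses the remaining bytes of `input` (heap or direct) into a new direct buffer.
  // The caller's position and limit are left untouched because we work on a duplicate.
  static ByteBuffer compress(ByteBuffer input) {
    ByteBuffer src = input.duplicate();
    int n = src.remaining();
    // Generous output bound for this sketch (deflate rarely expands data by much).
    ByteBuffer out = ByteBuffer.allocateDirect(n + (n >> 3) + 64);
    Deflater deflater = new Deflater();
    try {
      deflater.setInput(src); // Java 11+: reads directly from the buffer
      deflater.finish();
      while (!deflater.finished()) {
        deflater.deflate(out);
        if (!deflater.finished() && !out.hasRemaining()) {
          throw new IllegalStateException("output buffer too small");
        }
      }
    } finally {
      deflater.end(); // release native zlib state
    }
    out.flip();
    return out;
  }

  // Inverse of compress(); the caller must know the uncompressed length.
  static ByteBuffer decompress(ByteBuffer compressed, int originalLength) {
    ByteBuffer out = ByteBuffer.allocateDirect(originalLength);
    Inflater inflater = new Inflater();
    try {
      inflater.setInput(compressed.duplicate());
      while (out.hasRemaining() && !inflater.finished()) {
        if (inflater.inflate(out) == 0
            && (inflater.needsInput() || inflater.needsDictionary())) {
          throw new IllegalStateException("compressed data is truncated");
        }
      }
    } catch (DataFormatException e) {
      throw new IllegalStateException("corrupt compressed data", e);
    } finally {
      inflater.end();
    }
    out.flip();
    return out;
  }
}
```

The Snappy and Zstd bindings would need their own ByteBuffer-based entry points; this sketch makes no claim about how the PR wires those in.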
       
    ### Does this PR introduce any user interface change?
    - No
   
    ### Is any new testcase added?
    - Yes
   
       
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [carbondata] CarbonDataQA1 commented on issue #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed.

Posted by GitBox <gi...@apache.org>.
CarbonDataQA1 commented on issue #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed.
URL: https://github.com/apache/carbondata/pull/3607#issuecomment-583839686
 
 
   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1897/
   

----------------------------------------------------------------

[GitHub] [carbondata] CarbonDataQA1 commented on issue #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed.

Posted by GitBox <gi...@apache.org>.
CarbonDataQA1 commented on issue #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed.
URL: https://github.com/apache/carbondata/pull/3607#issuecomment-583839247
 
 
   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1896/
   

----------------------------------------------------------------

[GitHub] [carbondata] CarbonDataQA1 commented on issue #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed.

Posted by GitBox <gi...@apache.org>.
CarbonDataQA1 commented on issue #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed.
URL: https://github.com/apache/carbondata/pull/3607#issuecomment-583840188
 
 
   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1898/
   

----------------------------------------------------------------

[GitHub] [carbondata] CarbonDataQA1 commented on issue #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed.

Posted by GitBox <gi...@apache.org>.
CarbonDataQA1 commented on issue #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed.
URL: https://github.com/apache/carbondata/pull/3607#issuecomment-583835072
 
 
   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1893/
   

----------------------------------------------------------------

[GitHub] [carbondata] CarbonDataQA1 commented on issue #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed.

Posted by GitBox <gi...@apache.org>.
CarbonDataQA1 commented on issue #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed.
URL: https://github.com/apache/carbondata/pull/3607#issuecomment-583839556
 
 
   Build Failed  with Spark 2.4.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/195/
   

----------------------------------------------------------------

[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed.

Posted by GitBox <gi...@apache.org>.
ajantha-bhat commented on a change in pull request #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed.
URL: https://github.com/apache/carbondata/pull/3607#discussion_r376869338
 
 

 ##########
 File path: core/src/main/java/org/apache/carbondata/core/datastore/columnar/BlockIndexerStorageForNoInvertedIndexForShort.java
 ##########
 @@ -79,12 +82,8 @@ private void rleEncodeOnData(List<byte[]> actualDataList) {
     }
   }
 
-  private byte[][] convertToDataPage(List<byte[]> list) {
-    byte[][] shortArray = new byte[list.size()][];
-    for (int i = 0; i < shortArray.length; i++) {
-      shortArray[i] = list.get(i);
-    }
-    return shortArray;
+  private ByteBuffer[] convertToDataPage(List<ByteBuffer> list) {
 
 Review comment:
   Avoid the redundant conversion: use ByteBuffer[] directly everywhere instead of converting to a list.

----------------------------------------------------------------

[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed.

Posted by GitBox <gi...@apache.org>.
ajantha-bhat commented on a change in pull request #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed.
URL: https://github.com/apache/carbondata/pull/3607#discussion_r376873977
 
 

 ##########
 File path: core/src/main/java/org/apache/carbondata/core/datastore/page/encoding/dimension/legacy/DictDimensionIndexCodec.java
 ##########
 @@ -46,18 +47,33 @@ public String getName() {
   public ColumnPageEncoder createEncoder(Map<String, String> parameter) {
     return new IndexStorageEncoder() {
       @Override
-      void encodeIndexStorage(ColumnPage inputPage) {
-        BlockIndexerStorage<byte[][]> indexStorage;
-        byte[][] data = inputPage.getByteArrayPage();
+      void encodeIndexStorage(ColumnPage input) {
+        BlockIndexerStorage<ByteBuffer[]> indexStorage;
+        boolean isDictionary = input.isLocalDictGeneratedPage();
+
+        // To build an inverted index or RLE, the column page must be organized by row,
+        // i.e. we get the data of the column page as an array in which each element
+        // represents a row. If neither an inverted index nor RLE is needed, that adds
+        // extra overhead: the data in the column page is already stored flattened, and
+        // compression also works on flattened data, so organizing the data by row only
+        // adds an "expand" and a "flatten" step with no inverted index or RLE to use it.
+        // isFlattened says whether we take the data flattened: it is false if an inverted
+        // index or RLE has to be built, and true otherwise.
+        boolean isFlattened = !isInvertedIndex && !isDictionary;
+
+        // when isFlattened is true, data[0] is the flattened data of the columnpage.
+        // when isFlattened is false, data[i] is the ith row of the columnpage.
+        ByteBuffer[] data = input.getByteBufferArrayPage(isFlattened);
         if (isInvertedIndex) {
-          indexStorage = new BlockIndexerStorageForShort(data, true, false, isSort);
+          indexStorage = new BlockIndexerStorageForShort(data, isDictionary, !isDictionary, isSort);
 
 Review comment:
   It is not good practice to pass the same flag with complementary values. Hardcoding is OK, or introduce a new variable.
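
For readers following the hunk, the two layouts described in its comments (one flattened buffer versus one buffer per row) can be sketched as below. This is an illustration under assumptions, not the PR's `getByteBufferArrayPage`: it handles only fixed-width rows, and `PageViewSketch`, `rowViews`, and `page` are hypothetical names. The point is that row views over a flat direct buffer can be created without copying any bytes.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration of the "flattened" and "row-wise" views of one page.
final class PageViewSketch {

  // Returns one zero-copy view per fixed-width row of `flat`.
  static ByteBuffer[] rowViews(ByteBuffer flat, int rowWidth) {
    if (rowWidth <= 0 || flat.remaining() % rowWidth != 0) {
      throw new IllegalArgumentException("page size is not a multiple of the row width");
    }
    int rows = flat.remaining() / rowWidth;
    int base = flat.position();
    ByteBuffer[] views = new ByteBuffer[rows];
    for (int i = 0; i < rows; i++) {
      ByteBuffer window = flat.duplicate(); // shares content, independent position/limit
      window.position(base + i * rowWidth);
      window.limit(base + (i + 1) * rowWidth);
      views[i] = window.slice();            // view covering exactly row i
    }
    return views;
  }

  // isFlattened == true:  result[0] is the whole page.
  // isFlattened == false: result[i] is the i-th row.
  static ByteBuffer[] page(ByteBuffer flat, int rowWidth, boolean isFlattened) {
    return isFlattened ? new ByteBuffer[] {flat.duplicate()} : rowViews(flat, rowWidth);
  }
}
```

Variable-length rows (String/Varchar) would additionally need an offset table to locate row boundaries, which this sketch leaves out.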

----------------------------------------------------------------

[GitHub] [carbondata] marchpure commented on issue #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed.

Posted by GitBox <gi...@apache.org>.
marchpure commented on issue #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed.
URL: https://github.com/apache/carbondata/pull/3607#issuecomment-583827606
 
 
   retest this please

----------------------------------------------------------------

[GitHub] [carbondata] marchpure commented on issue #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed.

Posted by GitBox <gi...@apache.org>.
marchpure commented on issue #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed.
URL: https://github.com/apache/carbondata/pull/3607#issuecomment-583840176
 
 
   retest this please

----------------------------------------------------------------

[GitHub] [carbondata] CarbonDataQA1 commented on issue #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed.

Posted by GitBox <gi...@apache.org>.
CarbonDataQA1 commented on issue #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed.
URL: https://github.com/apache/carbondata/pull/3607#issuecomment-583848464
 
 
   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1899/
   

----------------------------------------------------------------

[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed.

Posted by GitBox <gi...@apache.org>.
ajantha-bhat commented on a change in pull request #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed.
URL: https://github.com/apache/carbondata/pull/3607#discussion_r376872535
 
 

 ##########
 File path: core/src/main/java/org/apache/carbondata/core/datastore/page/SafeFixLengthColumnPage.java
 ##########
 @@ -289,6 +298,15 @@ public BigDecimal getDecimal(int rowId) {
     return data;
   }
 
+  @Override
+  public ByteBuffer[] getByteBufferArrayPage(boolean isFlattened) {
 
 Review comment:
   The changes were only for off-heap data, right? So I expected only unsafe pages to change. Why were safe column pages changed as well?

----------------------------------------------------------------

[GitHub] [carbondata] CarbonDataQA1 commented on issue #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed.

Posted by GitBox <gi...@apache.org>.
CarbonDataQA1 commented on issue #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed.
URL: https://github.com/apache/carbondata/pull/3607#issuecomment-583840028
 
 
   Build Failed  with Spark 2.4.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/196/
   

----------------------------------------------------------------

[GitHub] [carbondata] CarbonDataQA1 commented on issue #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed.

Posted by GitBox <gi...@apache.org>.
CarbonDataQA1 commented on issue #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed.
URL: https://github.com/apache/carbondata/pull/3607#issuecomment-583797146
 
 
   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1890/
   

----------------------------------------------------------------

[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed.

Posted by GitBox <gi...@apache.org>.
ajantha-bhat commented on a change in pull request #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed.
URL: https://github.com/apache/carbondata/pull/3607#discussion_r376874268
 
 

 ##########
 File path: core/src/main/java/org/apache/carbondata/core/datastore/page/encoding/dimension/legacy/DirectDictDimensionIndexCodec.java
 ##########
 @@ -46,18 +47,33 @@ public String getName() {
   public ColumnPageEncoder createEncoder(Map<String, String> parameter) {
     return new IndexStorageEncoder() {
       @Override
-      void encodeIndexStorage(ColumnPage inputPage) {
-        BlockIndexerStorage<byte[][]> indexStorage;
-        byte[][] data = inputPage.getByteArrayPage();
+      void encodeIndexStorage(ColumnPage input) {
+        BlockIndexerStorage<ByteBuffer[]> indexStorage;
+        boolean isDictionary = input.isLocalDictGeneratedPage();
+
+        // To build an inverted index or RLE, the column page must be organized by row,
+        // i.e. we get the data of the column page as an array in which each element
+        // represents a row. If neither an inverted index nor RLE is needed, that adds
+        // extra overhead: the data in the column page is already stored flattened, and
+        // compression also works on flattened data, so organizing the data by row only
+        // adds an "expand" and a "flatten" step with no inverted index or RLE to use it.
+        // isFlattened says whether we take the data flattened: it is false if an inverted
+        // index or RLE has to be built, and true otherwise.
+        boolean isFlattened = !isInvertedIndex && !isDictionary;
+
+        // when isFlattened is true, data[0] is the flattened data of the columnpage.
+        // when isFlattened is false, data[i] is the ith row of the columnpage.
+        ByteBuffer[] data = input.getByteBufferArrayPage(isFlattened);
         if (isInvertedIndex) {
-          indexStorage = new BlockIndexerStorageForShort(data, false, false, isSort);
+          indexStorage = new BlockIndexerStorageForShort(data, isDictionary, !isDictionary, isSort);
 
 Review comment:
   same as above

----------------------------------------------------------------

[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed.

Posted by GitBox <gi...@apache.org>.
ajantha-bhat commented on a change in pull request #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed.
URL: https://github.com/apache/carbondata/pull/3607#discussion_r376871815
 
 

 ##########
 File path: core/src/main/java/org/apache/carbondata/core/datastore/page/ColumnPage.java
 ##########
 @@ -747,6 +759,16 @@ public long getPageLengthInBytes() throws IOException {
    */
   public byte[] compress(Compressor compressor) throws IOException {
     DataType dataType = columnPageEncoderMeta.getStoreDataType();
+
+    // If the column page is unsafe (off-heap) and the data type is primitive,
+    // try to compress the off-heap data directly, avoiding a copy from off-heap to heap.
+    if (isUnsafeEnabled() && (dataType == DataTypes.BOOLEAN || dataType == BYTE
+        || dataType == SHORT || dataType == DataTypes.SHORT_INT || dataType == INT
+        || dataType == LONG || dataType == FLOAT || dataType == DOUBLE
+        || DataTypes.isDecimal(dataType))) {
 
 Review comment:
   Is Decimal supported?
   
   Below I see that getByteBufferArrayPage is unsupported in DecimalColumnPage.

----------------------------------------------------------------

[GitHub] [carbondata] CarbonDataQA1 commented on issue #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed.

Posted by GitBox <gi...@apache.org>.
CarbonDataQA1 commented on issue #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed.
URL: https://github.com/apache/carbondata/pull/3607#issuecomment-583842184
 
 
   Build Success with Spark 2.4.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/197/
   

----------------------------------------------------------------

[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed.

Posted by GitBox <gi...@apache.org>.
ajantha-bhat commented on a change in pull request #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed.
URL: https://github.com/apache/carbondata/pull/3607#discussion_r376873364
 
 

 ##########
 File path: core/src/main/java/org/apache/carbondata/core/datastore/page/encoding/dimension/legacy/ComplexDimensionIndexCodec.java
 ##########
 @@ -46,12 +47,13 @@ public ColumnPageEncoder createEncoder(Map<String, String> parameter) {
     return new IndexStorageEncoder() {
       @Override
       void encodeIndexStorage(ColumnPage inputPage) {
-        BlockIndexerStorage<byte[][]> indexStorage =
-            new BlockIndexerStorageForShort(inputPage.getByteArrayPage(), false, false, false);
-        byte[] flattened = ByteUtil.flatten(indexStorage.getDataPage());
+        BlockIndexerStorage<ByteBuffer[]> indexStorage =
+            new BlockIndexerStorageForShort(inputPage.getByteBufferArrayPage(false),
+                false, false, false);
+        ByteBuffer flattened = ByteUtil.flatten(indexStorage.getDataPage());
         Compressor compressor = CompressorFactory.getInstance().getCompressor(
             inputPage.getColumnCompressorName());
-        byte[] compressed = compressor.compressByte(flattened);
+        byte[] compressed = ByteUtil.byteBufferToBytes(compressor.compressByte(flattened));
 
 Review comment:
   I think we converted the 2D byte array to a ByteBuffer to support compression directly on the byte buffer.
   But now we have to convert back to byte[] again! Is it possible to keep the ByteBuffer itself?
   
   So compression GC might have been reduced, but decompression GC might have increased. Please compare performance and memory usage before and after the change.
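
To make the concern concrete: turning a ByteBuffer back into a `byte[]` can only skip the copy when the buffer is heap-backed and spans its array exactly; a direct (off-heap) buffer always has to be copied onto the heap. The sketch below shows that distinction. It is an assumption-laden illustration with a hypothetical helper name, and it says nothing about how the PR's `ByteUtil.byteBufferToBytes` is actually implemented.

```java
import java.nio.ByteBuffer;

// Hypothetical helper; not the PR's ByteUtil.byteBufferToBytes.
final class BufferToBytesSketch {

  // Returns the remaining bytes of `buf` as a byte[] without moving its position.
  static byte[] toBytes(ByteBuffer buf) {
    if (buf.hasArray() && buf.arrayOffset() == 0 && buf.position() == 0
        && buf.remaining() == buf.array().length) {
      return buf.array(); // heap buffer spanning its whole array: no copy needed
    }
    byte[] copy = new byte[buf.remaining()];
    buf.duplicate().get(copy); // direct buffers always end up here: one heap copy
    return copy;
  }
}
```

Keeping the result as a ByteBuffer end to end, as the reviewer suggests, would remove this copy for off-heap output.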

----------------------------------------------------------------

[GitHub] [carbondata] CarbonDataQA1 commented on issue #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed.

Posted by GitBox <gi...@apache.org>.
CarbonDataQA1 commented on issue #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed.
URL: https://github.com/apache/carbondata/pull/3607#issuecomment-583829549
 
 
   Build Success with Spark 2.4.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/191/
   

----------------------------------------------------------------

[GitHub] [carbondata] marchpure commented on issue #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed.

Posted by GitBox <gi...@apache.org>.
marchpure commented on issue #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed.
URL: https://github.com/apache/carbondata/pull/3607#issuecomment-583839095
 
 
   retest this please

----------------------------------------------------------------

[GitHub] [carbondata] marchpure closed pull request #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed.

Posted by GitBox <gi...@apache.org>.
marchpure closed pull request #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed.
URL: https://github.com/apache/carbondata/pull/3607
 
 
   

----------------------------------------------------------------

[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed.

Posted by GitBox <gi...@apache.org>.
ajantha-bhat commented on a change in pull request #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed.
URL: https://github.com/apache/carbondata/pull/3607#discussion_r376868515
 
 

 ##########
 File path: core/src/main/java/org/apache/carbondata/core/datastore/columnar/BlockIndexerStorageForNoInvertedIndexForShort.java
 ##########
 @@ -17,52 +17,55 @@
 
 package org.apache.carbondata.core.datastore.columnar;
 
+import java.nio.ByteBuffer;
 import java.util.ArrayList;
+import java.util.Arrays;
 import java.util.List;
 
 import org.apache.carbondata.core.constants.CarbonCommonConstants;
-import org.apache.carbondata.core.util.ByteUtil;
 
 /**
  * Below class will be used to for no inverted index
  */
-public class BlockIndexerStorageForNoInvertedIndexForShort extends BlockIndexerStorage<byte[][]> {
+public class BlockIndexerStorageForNoInvertedIndexForShort
+    extends BlockIndexerStorage<ByteBuffer[]> {
 
   /**
    * column data
    */
-  private byte[][] dataPage;
+  private ByteBuffer[] dataPage;
 
   private short[] dataRlePage;
 
-  public BlockIndexerStorageForNoInvertedIndexForShort(byte[][] dataPage, boolean applyRLE) {
+  public BlockIndexerStorageForNoInvertedIndexForShort(ByteBuffer[] dataPage, boolean applyRLE) {
     this.dataPage = dataPage;
     if (applyRLE) {
-      List<byte[]> actualDataList = new ArrayList<>();
-      for (int i = 0; i < dataPage.length; i++) {
-        actualDataList.add(dataPage[i]);
-      }
+      List<ByteBuffer> actualDataList = Arrays.asList(dataPage);
 
 Review comment:
   **Can we skip converting the array to a list?** 
   Can we use the array directly? Since it is now a one-dimensional array, we can remove the list and use the array directly in the methods below.
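
As a rough illustration of working on the array directly, the sketch below counts runs of equal adjacent rows in a `ByteBuffer[]` without building a `List`. It is a generic run-length count with hypothetical names, not the PR's `rleEncodeOnData`; it relies on `ByteBuffer.equals`, which compares the remaining bytes of two buffers regardless of whether they are heap or direct.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical illustration; not the PR's RLE implementation.
final class ArrayRleSketch {

  // Returns the length of each run of consecutive equal rows, in order.
  static int[] runLengths(ByteBuffer[] rows) {
    int[] runs = new int[rows.length];
    int count = 0;
    for (int i = 0; i < rows.length; i++) {
      if (i > 0 && rows[i].equals(rows[i - 1])) {
        runs[count - 1]++;  // same content as the previous row: extend the run
      } else {
        runs[count++] = 1;  // content changed (or first row): start a new run
      }
    }
    return Arrays.copyOf(runs, count);
  }
}
```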

----------------------------------------------------------------

[GitHub] [carbondata] CarbonDataQA1 commented on issue #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed.

Posted by GitBox <gi...@apache.org>.
CarbonDataQA1 commented on issue #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed.
URL: https://github.com/apache/carbondata/pull/3607#issuecomment-583794215
 
 
   Build Success with Spark 2.4.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/188/
   

----------------------------------------------------------------

[GitHub] [carbondata] marchpure commented on issue #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed.

Posted by GitBox <gi...@apache.org>.
marchpure commented on issue #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed.
URL: https://github.com/apache/carbondata/pull/3607#issuecomment-583840206
 
 
   retest this please

----------------------------------------------------------------

[GitHub] [carbondata] ajantha-bhat commented on issue #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed.

Posted by GitBox <gi...@apache.org>.
ajantha-bhat commented on issue #3607: [CARBONDATA-3670] Support compress offheap data in columnpage directly, avoding a copy of data from offhead to heap before compressed.
URL: https://github.com/apache/carbondata/pull/3607#issuecomment-609595628
 
 
   I think this was already handled in @jackylk's #3638.
   
   Please check and close the PR if handled.

----------------------------------------------------------------