You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@carbondata.apache.org by manishgupta88 <gi...@git.apache.org> on 2017/08/29 09:40:16 UTC

[GitHub] carbondata pull request #1297: [WIP] Added a value based compression for dec...

GitHub user manishgupta88 opened a pull request:

    https://github.com/apache/carbondata/pull/1297

    [WIP] Added a value based compression for decimal data type when decimal is stored as Int or Long

    Added a value based compression for decimal data type when decimal is stored as Int or Long
    
    1. When decimal precision is <= 9, decimal values are stored in 4 bytes but they are not compressed further based on min and max values as compared with other primitive data type compression. Therefore now based on min and max value decimal data falling in Integer range will be further compressed as byte or short.
    2. When decimal precision is <= 18, decimal values are stored in 8 bytes but they are not compressed further based on min and max values as compared with other primitive data type compression. Therefore now based on min and max value decimal data falling in Long range will be further compressed as byte, short or int.
    
    Advantage: This will reduce the storage space thereby decreasing the IO time while decompressing the data.
    
    TODO: Need to be implemented for unsafe type


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/manishgupta88/carbondata decimal_compression

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/1297.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1297
    
----
commit 04d47802fa139f9a359ca6727f4f79a26402b43a
Author: manishgupta88 <to...@gmail.com>
Date:   2017-08-24T07:13:58Z

    Added a value based compression for decimal data type.
    
    1. When decimal precision is <= 9, decimal values are stored in 4 bytes but they are not compressed further based on min and max values as compared with other primitive data type compression. Therefore now based on min and max value decimal data falling in Integer range will be further compressed as byte or short.
    2. When decimal precision is <= 18, decimal values are stored in 8 bytes but they are not compressed further based on min and max values as compared with other primitive data type compression. Therefore now based on min and max value decimal data falling in Long range will be further compressed as byte, short or int.
    
    Advantage: This will reduce the storage space thereby decreasing the IO time while decompressing the data.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] carbondata issue #1297: [CARBONDATA-1429] Add a value based compression for ...

Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on the issue:

    https://github.com/apache/carbondata/pull/1297
  
    retest this please


---

[GitHub] carbondata pull request #1297: [CARBONDATA-1429] Add a value based compressi...

Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1297#discussion_r138915973
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/datastore/page/DecimalColumnPage.java ---
    @@ -0,0 +1,96 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.carbondata.core.datastore.page;
    +
    +import org.apache.carbondata.core.datastore.TableSpec;
    +import org.apache.carbondata.core.metadata.datatype.DataType;
    +import org.apache.carbondata.core.metadata.datatype.DecimalConverterFactory;
    +
    +/**
    + * Represent a columnar data in one page for one column of decimal data type
    + */
    +public abstract class DecimalColumnPage extends VarLengthColumnPageBase {
    +
    +  /**
    +   * decimal converter instance
    +   */
    +  DecimalConverterFactory.DecimalConverter decimalConverter;
    +
    +  DecimalColumnPage(TableSpec.ColumnSpec columnSpec, DataType dataType, int pageSize) {
    +    super(columnSpec, dataType, pageSize);
    +    decimalConverter = DecimalConverterFactory.INSTANCE
    +        .getDecimalConverter(columnSpec.getPrecision(), columnSpec.getScale());
    +  }
    +
    +  public DecimalConverterFactory.DecimalConverter getDecimalConverter() {
    +    return decimalConverter;
    +  }
    +
    +  @Override public byte[] getBytePage() {
    +    throw new UnsupportedOperationException("invalid data type: " + dataType);
    +  }
    +
    +  @Override public short[] getShortPage() {
    +    throw new UnsupportedOperationException("invalid data type: " + dataType);
    +  }
    +
    +  @Override public byte[] getShortIntPage() {
    +    throw new UnsupportedOperationException("invalid data type: " + dataType);
    +  }
    +
    +  @Override public int[] getIntPage() {
    +    throw new UnsupportedOperationException("invalid data type: " + dataType);
    +  }
    +
    +  @Override public long[] getLongPage() {
    +    throw new UnsupportedOperationException("invalid data type: " + dataType);
    +  }
    +
    +  @Override public float[] getFloatPage() {
    +    throw new UnsupportedOperationException("invalid data type: " + dataType);
    +  }
    +
    +  @Override public double[] getDoublePage() {
    --- End diff --
    
    Can you move all `@Override` to previous line


---

[GitHub] carbondata pull request #1297: [CARBONDATA-1429] Add a value based compressi...

Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1297#discussion_r138916072
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/datastore/page/LazyColumnPage.java ---
    @@ -91,9 +93,24 @@ public float getFloat(int rowId) {
         throw new UnsupportedOperationException("internal error");
       }
     
    -  @Override
    -  public BigDecimal getDecimal(int rowId) {
    -    return columnPage.getDecimal(rowId);
    +  @Override public BigDecimal getDecimal(int rowId) {
    --- End diff --
    
    Can you move all @Override to previous line


---

[GitHub] carbondata issue #1297: [CARBONDATA-1429] Add a value based compression for ...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1297
  
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/793/



---

[GitHub] carbondata issue #1297: [CARBONDATA-1429] Add a value based compression for ...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1297
  
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/779/



---

[GitHub] carbondata pull request #1297: [CARBONDATA-1429] Add a value based compressi...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/carbondata/pull/1297


---

[GitHub] carbondata issue #1297: [CARBONDATA-1429] Add a value based compression for ...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1297
  
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/741/



---

[GitHub] carbondata pull request #1297: [CARBONDATA-1429] Add a value based compressi...

Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1297#discussion_r137270799
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/datastore/page/encoding/DefaultEncodingStrategy.java ---
    @@ -130,7 +146,8 @@ private static DataType fitLongMinMax(long max, long min) {
         }
       }
     
    -  private static DataType fitMinMax(DataType dataType, Object max, Object min) {
    +  private static DataType fitMinMax(DataType dataType, Object max, Object min,
    +      DecimalConverterFactory.DecimalConverterType decimalConverterType) {
    --- End diff --
    
    I think it is not good to pass `decimalConverterType` in many functions. It makes code complex. It is better to think of a way to encapsulate it.


---

[GitHub] carbondata issue #1297: [CARBONDATA-1429] Add a value based compression for ...

Posted by manishgupta88 <gi...@git.apache.org>.
Github user manishgupta88 commented on the issue:

    https://github.com/apache/carbondata/pull/1297
  
    ok to test


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] carbondata pull request #1297: [CARBONDATA-1429] Add a value based compressi...

Posted by manishgupta88 <gi...@git.apache.org>.
Github user manishgupta88 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1297#discussion_r138941702
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/datastore/page/UnsafeDecimalColumnPage.java ---
    @@ -0,0 +1,274 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.carbondata.core.datastore.page;
    +
    +import java.math.BigDecimal;
    +
    +import org.apache.carbondata.core.datastore.TableSpec;
    +import org.apache.carbondata.core.memory.CarbonUnsafe;
    +import org.apache.carbondata.core.memory.MemoryException;
    +import org.apache.carbondata.core.memory.UnsafeMemoryManager;
    +import org.apache.carbondata.core.metadata.datatype.DataType;
    +import org.apache.carbondata.core.util.ByteUtil;
    +
    +/**
    + * Represents a columnar data for decimal data type column for one page
    + */
    +public class UnsafeDecimalColumnPage extends DecimalColumnPage {
    +
    +  UnsafeDecimalColumnPage(TableSpec.ColumnSpec columnSpec, DataType dataType, int pageSize)
    +      throws MemoryException {
    +    super(columnSpec, dataType, pageSize);
    +    capacity = (int) (pageSize * DEFAULT_ROW_SIZE * FACTOR);
    +    initMemory();
    +  }
    +
    +  UnsafeDecimalColumnPage(TableSpec.ColumnSpec columnSpec, DataType dataType, int pageSize,
    +      int capacity) throws MemoryException {
    +    super(columnSpec, dataType, pageSize);
    +    this.capacity = capacity;
    +    initMemory();
    +  }
    +
    +  private void initMemory() throws MemoryException {
    +    switch (dataType) {
    +      case BYTE:
    +      case SHORT:
    +      case INT:
    +      case LONG:
    +        int size = pageSize << dataType.getSizeBits();
    +        memoryBlock = UnsafeMemoryManager.allocateMemoryWithRetry(taskId, size);
    +        baseAddress = memoryBlock.getBaseObject();
    +        baseOffset = memoryBlock.getBaseOffset();
    +        break;
    +      case SHORT_INT:
    +        size = pageSize * 3;
    +        memoryBlock = UnsafeMemoryManager.allocateMemoryWithRetry(taskId, size);
    +        baseAddress = memoryBlock.getBaseObject();
    +        baseOffset = memoryBlock.getBaseOffset();
    +        break;
    +      case DECIMAL:
    +      case STRING:
    --- End diff --
    
    yes it will not be string..I will remove


---

[GitHub] carbondata pull request #1297: [CARBONDATA-1429] Add a value based compressi...

Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1297#discussion_r138936085
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/datastore/page/VarLengthColumnPageBase.java ---
    @@ -22,21 +22,48 @@
     import java.util.List;
     
     import org.apache.carbondata.core.datastore.TableSpec;
    +import org.apache.carbondata.core.memory.CarbonUnsafe;
    +import org.apache.carbondata.core.memory.MemoryBlock;
     import org.apache.carbondata.core.memory.MemoryException;
    +import org.apache.carbondata.core.memory.UnsafeMemoryManager;
     import org.apache.carbondata.core.metadata.datatype.DataType;
     import org.apache.carbondata.core.metadata.datatype.DecimalConverterFactory;
     import org.apache.carbondata.core.util.ByteUtil;
    +import org.apache.carbondata.core.util.ThreadLocalTaskInfo;
     
    +import static org.apache.carbondata.core.metadata.datatype.DataType.BYTE;
     import static org.apache.carbondata.core.metadata.datatype.DataType.DECIMAL;
     
     public abstract class VarLengthColumnPageBase extends ColumnPage {
     
    +  static final int byteBits = BYTE.getSizeBits();
    +  static final int shortBits = DataType.SHORT.getSizeBits();
    +  static final int intBits = DataType.INT.getSizeBits();
    +  static final int longBits = DataType.LONG.getSizeBits();
    +  // default size for each row, grows as needed
    +  static final int DEFAULT_ROW_SIZE = 8;
    +
    +  static final double FACTOR = 1.25;
    +
    +  final long taskId = ThreadLocalTaskInfo.getCarbonTaskInfo().getTaskId();
    +
    +  // memory allocated by Unsafe
    +  MemoryBlock memoryBlock;
    --- End diff --
    
    This is for Unsafe column page only, right


---

[GitHub] carbondata issue #1297: [CARBONDATA-1429] Add a value based compression for ...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1297
  
    SDV Build Failed with Spark 2.1, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/340/



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] carbondata pull request #1297: [CARBONDATA-1429] Add a value based compressi...

Posted by manishgupta88 <gi...@git.apache.org>.
Github user manishgupta88 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1297#discussion_r138941633
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/datastore/page/VarLengthColumnPageBase.java ---
    @@ -22,21 +22,48 @@
     import java.util.List;
     
     import org.apache.carbondata.core.datastore.TableSpec;
    +import org.apache.carbondata.core.memory.CarbonUnsafe;
    +import org.apache.carbondata.core.memory.MemoryBlock;
     import org.apache.carbondata.core.memory.MemoryException;
    +import org.apache.carbondata.core.memory.UnsafeMemoryManager;
     import org.apache.carbondata.core.metadata.datatype.DataType;
     import org.apache.carbondata.core.metadata.datatype.DecimalConverterFactory;
     import org.apache.carbondata.core.util.ByteUtil;
    +import org.apache.carbondata.core.util.ThreadLocalTaskInfo;
     
    +import static org.apache.carbondata.core.metadata.datatype.DataType.BYTE;
     import static org.apache.carbondata.core.metadata.datatype.DataType.DECIMAL;
     
     public abstract class VarLengthColumnPageBase extends ColumnPage {
     
    +  static final int byteBits = BYTE.getSizeBits();
    +  static final int shortBits = DataType.SHORT.getSizeBits();
    +  static final int intBits = DataType.INT.getSizeBits();
    +  static final int longBits = DataType.LONG.getSizeBits();
    +  // default size for each row, grows as needed
    +  static final int DEFAULT_ROW_SIZE = 8;
    +
    +  static final double FACTOR = 1.25;
    +
    +  final long taskId = ThreadLocalTaskInfo.getCarbonTaskInfo().getTaskId();
    +
    +  // memory allocated by Unsafe
    +  MemoryBlock memoryBlock;
    --- End diff --
    
    SafeDecimalColumnPage  will handle both fixed and variable implementation of decimal data based on precsion. So it is also required to be extended from VarLengthColumnPageBase


---

[GitHub] carbondata issue #1297: [CARBONDATA-1429] Add a value based compression for ...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1297
  
    SDV Build Success with Spark 2.1, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/348/



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] carbondata issue #1297: [CARBONDATA-1429] Add a value based compression for ...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1297
  
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/114/



---

[GitHub] carbondata issue #1297: [CARBONDATA-1429] Add a value based compression for ...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1297
  
    SDV Build Failed with Spark 2.1, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/304/



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] carbondata issue #1297: [CARBONDATA-1429] Add a value based compression for ...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1297
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/16/



---

[GitHub] carbondata issue #1297: [CARBONDATA-1429] Add a value based compression for ...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1297
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/3431/



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] carbondata issue #1297: [CARBONDATA-1429] Add a value based compression for ...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1297
  
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/743/



---

[GitHub] carbondata issue #1297: [CARBONDATA-1429] Add a value based compression for ...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1297
  
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/775/



---

[GitHub] carbondata issue #1297: [CARBONDATA-1429] Add a value based compression for ...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1297
  
    Build Success with Spark 1.6, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/43/



---

[GitHub] carbondata pull request #1297: [CARBONDATA-1429] Add a value based compressi...

Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1297#discussion_r137270300
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/datastore/page/encoding/DefaultEncodingStrategy.java ---
    @@ -104,18 +107,31 @@ private ColumnPageEncoder createEncoderForMeasure(ColumnPage columnPage) {
           case SHORT:
           case INT:
           case LONG:
    -        return selectCodecByAlgorithmForIntegral(stats).createEncoder(null);
    +        return selectCodecByAlgorithmForIntegral(stats,
    +            DecimalConverterFactory.DecimalConverterType.DECIMAL_LONG).createEncoder(null);
    +      case DECIMAL:
    +        return createEncoderForDecimalDataTypeMeasure(columnPage);
    --- End diff --
    
    Rename as others, `selectCodecByAlgorithmForDecimal`. 


---

[GitHub] carbondata issue #1297: [CARBONDATA-1429] Add a value based compression for ...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1297
  
    Build Success with Spark 1.6, Please check CI http://144.76.159.231:8080/job/ApacheCarbonPRBuilder1/17/



---

[GitHub] carbondata issue #1297: [CARBONDATA-1429] Add a value based compression for ...

Posted by manishgupta88 <gi...@git.apache.org>.
Github user manishgupta88 commented on the issue:

    https://github.com/apache/carbondata/pull/1297
  
    ok to test


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] carbondata pull request #1297: [CARBONDATA-1429] Add a value based compressi...

Posted by manishgupta88 <gi...@git.apache.org>.
Github user manishgupta88 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1297#discussion_r136260340
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/datastore/page/SafeFixLengthColumnPage.java ---
    @@ -165,9 +165,28 @@ public double getDouble(int rowId) {
         return doubleData[rowId];
       }
     
    -  @Override
    -  public BigDecimal getDecimal(int rowId) {
    -    throw new UnsupportedOperationException("invalid data type: " + dataType);
    +  @Override public BigDecimal getDecimal(int rowId) {
    --- End diff --
    
    Please correct me If I am wrong....From my opinion it will not be any advantage to add DecimalColumnPage as we are not doing anything extra if we introduce DecimalColumnPage.
    Basically we are storing values either as fixed or variable length and decimal values also falls in these 2 categories. So according to current design I believe it may not be required to introduce 1 more type of page until and unless we have pages for all other datatypes separately.
    Please share your suggestions.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] carbondata pull request #1297: [CARBONDATA-1429] Add a value based compressi...

Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1297#discussion_r138931778
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/datastore/page/UnsafeDecimalColumnPage.java ---
    @@ -0,0 +1,274 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.carbondata.core.datastore.page;
    +
    +import java.math.BigDecimal;
    +
    +import org.apache.carbondata.core.datastore.TableSpec;
    +import org.apache.carbondata.core.memory.CarbonUnsafe;
    +import org.apache.carbondata.core.memory.MemoryException;
    +import org.apache.carbondata.core.memory.UnsafeMemoryManager;
    +import org.apache.carbondata.core.metadata.datatype.DataType;
    +import org.apache.carbondata.core.util.ByteUtil;
    +
    +/**
    + * Represents a columnar data for decimal data type column for one page
    + */
    +public class UnsafeDecimalColumnPage extends DecimalColumnPage {
    +
    +  UnsafeDecimalColumnPage(TableSpec.ColumnSpec columnSpec, DataType dataType, int pageSize)
    +      throws MemoryException {
    +    super(columnSpec, dataType, pageSize);
    +    capacity = (int) (pageSize * DEFAULT_ROW_SIZE * FACTOR);
    +    initMemory();
    +  }
    +
    +  UnsafeDecimalColumnPage(TableSpec.ColumnSpec columnSpec, DataType dataType, int pageSize,
    +      int capacity) throws MemoryException {
    +    super(columnSpec, dataType, pageSize);
    +    this.capacity = capacity;
    +    initMemory();
    +  }
    +
    +  private void initMemory() throws MemoryException {
    +    switch (dataType) {
    +      case BYTE:
    +      case SHORT:
    +      case INT:
    +      case LONG:
    +        int size = pageSize << dataType.getSizeBits();
    +        memoryBlock = UnsafeMemoryManager.allocateMemoryWithRetry(taskId, size);
    +        baseAddress = memoryBlock.getBaseObject();
    +        baseOffset = memoryBlock.getBaseOffset();
    +        break;
    +      case SHORT_INT:
    +        size = pageSize * 3;
    +        memoryBlock = UnsafeMemoryManager.allocateMemoryWithRetry(taskId, size);
    +        baseAddress = memoryBlock.getBaseObject();
    +        baseOffset = memoryBlock.getBaseOffset();
    +        break;
    +      case DECIMAL:
    +      case STRING:
    --- End diff --
    
    It can not be String, right?


---

[GitHub] carbondata issue #1297: [CARBONDATA-1429] Add a value based compression for ...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1297
  
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/3432/



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] carbondata issue #1297: [CARBONDATA-1429] Add a value based compression for ...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1297
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/118/



---

[GitHub] carbondata issue #1297: [CARBONDATA-1429] Add a value based compression for ...

Posted by QACarbonData <gi...@git.apache.org>.
Github user QACarbonData commented on the issue:

    https://github.com/apache/carbondata/pull/1297
  
    Build Success with Spark 1.6, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/27/



---

[GitHub] carbondata issue #1297: [CARBONDATA-1429] Add a value based compression for ...

Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on the issue:

    https://github.com/apache/carbondata/pull/1297
  
    LGTM


---

[GitHub] carbondata pull request #1297: [CARBONDATA-1429] Add a value based compressi...

Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1297#discussion_r136248243
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/datastore/page/SafeFixLengthColumnPage.java ---
    @@ -165,9 +165,28 @@ public double getDouble(int rowId) {
         return doubleData[rowId];
       }
     
    -  @Override
    -  public BigDecimal getDecimal(int rowId) {
    -    throw new UnsupportedOperationException("invalid data type: " + dataType);
    +  @Override public BigDecimal getDecimal(int rowId) {
    --- End diff --
    
    I think it is better to add a DecimalColumnPage instead of using FixLengthColumnPage


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] carbondata issue #1297: [CARBONDATA-1429] Add a value based compression for ...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1297
  
    Build Success with Spark 1.6, Please check CI http://144.76.159.231:8080/job/ApacheCarbonPRBuilder/828/



---

[GitHub] carbondata issue #1297: [CARBONDATA-1429] Add a value based compression for ...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1297
  
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/113/



---

[GitHub] carbondata issue #1297: [CARBONDATA-1429] Add a value based compression for ...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1297
  
    Build Success with Spark 1.6, Please check CI http://144.76.159.231:8080/job/ApacheCarbonPRBuilder1/15/



---

[GitHub] carbondata issue #1297: [CARBONDATA-1429] Add a value based compression for ...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1297
  
    SDV Build Failed with Spark 2.1, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/330/



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] carbondata issue #1297: [CARBONDATA-1429] Add a value based compression for ...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1297
  
    SDV Build Failed with Spark 2.1, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/328/



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] carbondata issue #1297: [CARBONDATA-1429] Add a value based compression for ...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1297
  
    SDV Build Failed with Spark 2.1, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/308/



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] carbondata issue #1297: [CARBONDATA-1429] Add a value based compression for ...

Posted by manishgupta88 <gi...@git.apache.org>.
Github user manishgupta88 commented on the issue:

    https://github.com/apache/carbondata/pull/1297
  
    @jackylk ..handled review comments


---

[GitHub] carbondata pull request #1297: [CARBONDATA-1429] Add a value based compressi...

Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1297#discussion_r137273402
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/datastore/page/SafeFixLengthColumnPage.java ---
    @@ -165,9 +165,28 @@ public double getDouble(int rowId) {
         return doubleData[rowId];
       }
     
    -  @Override
    -  public BigDecimal getDecimal(int rowId) {
    -    throw new UnsupportedOperationException("invalid data type: " + dataType);
    +  @Override public BigDecimal getDecimal(int rowId) {
    --- End diff --
    
    I think adding converter inside the column page make it a bit complex. I think there are two mays to make the code clean:
    1. Store BigDecimal array in the ColumnPage and do the conversion to int or long when come to encoding part, by implementing a ColumnPageValueConverter
    2. Create DecimalColumnPage class and add the conversion logic in DecimalColumnPage. This logic is the extra logic for decimal only. If you do like this, we can consider to rename `FixLengthColumnPage` to `PrimitiveColumnPage` and `VarLengthColumnPage` to `StringColumnPage` and the 3rd one is `DecimalColumnPage`. The responsibility of each class will be more clear.


---

[GitHub] carbondata issue #1297: [CARBONDATA-1429] Add a value based compression for ...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1297
  
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/164/



---

[GitHub] carbondata pull request #1297: [CARBONDATA-1429] Add a value based compressi...

Posted by manishgupta88 <gi...@git.apache.org>.
Github user manishgupta88 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1297#discussion_r138941549
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/datastore/page/VarLengthColumnPageBase.java ---
    @@ -22,21 +22,48 @@
     import java.util.List;
     
     import org.apache.carbondata.core.datastore.TableSpec;
    +import org.apache.carbondata.core.memory.CarbonUnsafe;
    +import org.apache.carbondata.core.memory.MemoryBlock;
     import org.apache.carbondata.core.memory.MemoryException;
    +import org.apache.carbondata.core.memory.UnsafeMemoryManager;
     import org.apache.carbondata.core.metadata.datatype.DataType;
     import org.apache.carbondata.core.metadata.datatype.DecimalConverterFactory;
     import org.apache.carbondata.core.util.ByteUtil;
    +import org.apache.carbondata.core.util.ThreadLocalTaskInfo;
     
    +import static org.apache.carbondata.core.metadata.datatype.DataType.BYTE;
     import static org.apache.carbondata.core.metadata.datatype.DataType.DECIMAL;
     
     public abstract class VarLengthColumnPageBase extends ColumnPage {
     
    +  static final int byteBits = BYTE.getSizeBits();
    +  static final int shortBits = DataType.SHORT.getSizeBits();
    +  static final int intBits = DataType.INT.getSizeBits();
    +  static final int longBits = DataType.LONG.getSizeBits();
    +  // default size for each row, grows as needed
    +  static final int DEFAULT_ROW_SIZE = 8;
    +
    +  static final double FACTOR = 1.25;
    +
    +  final long taskId = ThreadLocalTaskInfo.getCarbonTaskInfo().getTaskId();
    +
    +  // memory allocated by Unsafe
    +  MemoryBlock memoryBlock;
    --- End diff --
    
    SafeDecimalColumnPage  will handle both fixed and variable implementation of decimal data based on precsion. So it is also required to be extended from VarLengthColumnPageBase


---

[GitHub] carbondata issue #1297: [CARBONDATA-1429] Add a value based compression for ...

Posted by ashwini-krishnakumar <gi...@git.apache.org>.
Github user ashwini-krishnakumar commented on the issue:

    https://github.com/apache/carbondata/pull/1297
  
    @ravipesala please review 


---