You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@drill.apache.org by vdiravka <gi...@git.apache.org> on 2016/12/15 19:36:33 UTC

[GitHub] drill pull request #697: DRILL-5097: Using store.parquet.reader.int96_as_tim...

GitHub user vdiravka opened a pull request:

    https://github.com/apache/drill/pull/697

    DRILL-5097: Using store.parquet.reader.int96_as_timestamp gives IOOB whereas convert_from works

    When the int96 value is converted into timestamp (long in java) we cut the nanos precision to millis. But need to change dataTypeLengthInBits of columnReader (while converting parquet fixed binary type INT96 into drill TimeStamp) from 12 byte(s) to 8 byte(s) as well.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/vdiravka/drill DRILL-5097

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/drill/pull/697.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #697
    
----
commit ab6bad27469234587a059cccd7b5497279828450
Author: Vitalii Diravka <vi...@gmail.com>
Date:   2016-12-14T16:24:08Z

    DRILL-5097: Using store.parquet.reader.int96_as_timestamp gives IOOB whereas convert_from works

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #697: DRILL-5097: Using store.parquet.reader.int96_as_tim...

Posted by vdiravka <gi...@git.apache.org>.
Github user vdiravka commented on a diff in the pull request:

    https://github.com/apache/drill/pull/697#discussion_r96213729
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/columnreaders/NullableFixedByteAlignedReaders.java ---
    @@ -132,6 +137,9 @@ protected void readField(long recordsToReadInThisPass) {
               valueVec.getMutator().setSafe(valuesReadInCurrentPass + i, getDateTimeValueFromBinary(binaryTimeStampValue));
             }
           }
    +      // The nanos precision is cut to millis. Therefore the length of single timestamp value is 8 bytes(s)
    +      // instead of 12 byte(s).
    +      dataTypeLengthInBits = timestampLengthInBits;
    --- End diff --
    
    Yes, it is. Because when [`PARQUET_READER_INT96_AS_TIMESTAMP` is set to false](
    https://github.com/apache/drill/blob/83513daf0903e0d94fcaad7b1ae4e8ad6272b494/exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/columnreaders/ColumnReaderFactory.java#L246) 
    `NullableFixedBinaryAsTimeStampReader` will not used. Instead of that `NullableFixedBinaryReader` will used.
    



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #697: DRILL-5097: Using store.parquet.reader.int96_as_tim...

Posted by bitblender <gi...@git.apache.org>.
Github user bitblender commented on a diff in the pull request:

    https://github.com/apache/drill/pull/697#discussion_r94350780
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/columnreaders/NullableFixedByteAlignedReaders.java ---
    @@ -110,9 +110,14 @@ protected void readField(long recordsToReadInThisPass) {
     
       /**
        * Class for reading parquet fixed binary type INT96, which is used for storing hive,
    -   * impala timestamp values with nanoseconds precision. So it reads such values as a drill timestamp.
    +   * impala timestamp values with nanoseconds precision (12 bytes). So it reads such values as a drill timestamp (8 bytes).
        */
       static class NullableFixedBinaryAsTimeStampReader extends NullableFixedByteAlignedReader<NullableTimeStampVector> {
    +    /**
    +     * The width of each element of the TimeStampVector is 8 byte(s).
    +     */
    +    private static final int timestampLengthInBits = 64;
    --- End diff --
    
    I would recommend the use of uppercase in naming constants.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #697: DRILL-5097: Using store.parquet.reader.int96_as_tim...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/drill/pull/697


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #697: DRILL-5097: Using store.parquet.reader.int96_as_tim...

Posted by bitblender <gi...@git.apache.org>.
Github user bitblender commented on a diff in the pull request:

    https://github.com/apache/drill/pull/697#discussion_r94350967
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/columnreaders/NullableFixedByteAlignedReaders.java ---
    @@ -132,6 +137,9 @@ protected void readField(long recordsToReadInThisPass) {
               valueVec.getMutator().setSafe(valuesReadInCurrentPass + i, getDateTimeValueFromBinary(binaryTimeStampValue));
             }
           }
    +      // The nanos precision is cut to millis. Therefore the length of single timestamp value is 8 bytes(s)
    +      // instead of 12 byte(s).
    +      dataTypeLengthInBits = timestampLengthInBits;
    --- End diff --
    
    Just trying to understand this better: Is it OK to set the length to 64 bits even if PARQUET_READER_INT96_AS_TIMESTAMP is set to false?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #697: DRILL-5097: Using store.parquet.reader.int96_as_tim...

Posted by vdiravka <gi...@git.apache.org>.
Github user vdiravka commented on a diff in the pull request:

    https://github.com/apache/drill/pull/697#discussion_r96212464
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/columnreaders/NullableFixedByteAlignedReaders.java ---
    @@ -110,9 +110,14 @@ protected void readField(long recordsToReadInThisPass) {
     
       /**
        * Class for reading parquet fixed binary type INT96, which is used for storing hive,
    -   * impala timestamp values with nanoseconds precision. So it reads such values as a drill timestamp.
    +   * impala timestamp values with nanoseconds precision (12 bytes). So it reads such values as a drill timestamp (8 bytes).
        */
       static class NullableFixedBinaryAsTimeStampReader extends NullableFixedByteAlignedReader<NullableTimeStampVector> {
    +    /**
    +     * The width of each element of the TimeStampVector is 8 byte(s).
    +     */
    +    private static final int timestampLengthInBits = 64;
    --- End diff --
    
    Agree. Done.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill issue #697: DRILL-5097: Using store.parquet.reader.int96_as_timestamp ...

Posted by jinfengni <gi...@git.apache.org>.
Github user jinfengni commented on the issue:

    https://github.com/apache/drill/pull/697
  
    +1



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---