Posted to issues@drill.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2016/12/02 01:03:58 UTC

[jira] [Commented] (DRILL-4764) Parquet file with INT_16, etc. logical types not supported by simple SELECT

    [ https://issues.apache.org/jira/browse/DRILL-4764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15713607#comment-15713607 ] 

ASF GitHub Bot commented on DRILL-4764:
---------------------------------------

Github user parthchandra commented on a diff in the pull request:

    https://github.com/apache/drill/pull/673#discussion_r90570343
  
    --- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/store/parquet2/TestDrillParquetReader.java ---
    @@ -84,4 +94,43 @@ public void test4349() throws Exception {
           .sqlBaselineQuery("SELECT columns[0] id, CAST(NULLIF(columns[1], '') AS DOUBLE) val FROM cp.`parquet2/4349.csv.gz` WHERE columns[0] = 'b'")
           .go();
       }
    +
    +  @Test
    +  public void testUnsignedByteShortIntLong() throws Exception {
    +    Path file = new Path(getDfsTestTmpSchemaLocation(), "uint_types.parquet");
    +    Configuration conf = new Configuration();
    +    MessageType schema = parseMessageType(
    +      "message test { "
    +        + "required int32 int8_field (INT_8); "
    +        + "required int32 uint8_field (UINT_8); "
    +        + "required int32 int16_field (INT_16); "
    +        + "required int32 uint16_field (UINT_16); "
    +        + "required int32 uint32_field (UINT_32); "
    +        + "required int64 uint64_field (UINT_64); "
    +        +"} ");
    +    GroupWriteSupport.setSchema(schema, conf);
    +
    +    ParquetWriter<Group> writer = ExampleParquetWriter.builder(file)
    +      .withType(schema)
    +      .withConf(conf)
    +      .build();
    +
    +    SimpleGroupFactory groupFactory = new SimpleGroupFactory(schema);
    +    writer.write(
    +      groupFactory.newGroup()
    +        .append("int8_field", -128)
    +        .append("uint8_field", 127)
    +        .append("int16_field", -32768)
    --- End diff --
    
    Can you add test values for the unsigned types as well? For example 0, -1, 256, 65536, etc.
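
One reading of the values suggested above, as a minimal sketch: unsigned Parquet logical types are stored in signed physical types (int32 or int64), so -1 is the all-ones bit pattern (the maximum for UINT_32 and UINT_64), while 256 and 65536 are one past the UINT_8 and UINT_16 maxima. The class and method names below are hypothetical and are not part of Drill or of the pull request; the masking shown is only an assumption about how a reader might reinterpret the stored bits.

```java
// Illustrative only: reinterpret signed physical values as unsigned ones.
// None of these names come from Drill or the PR under review.
final class UnsignedBoundaries {
    // UINT_8 and UINT_16 live in an int32; keep only the low 8 or 16 bits.
    static int asUint8(int stored) { return stored & 0xFF; }
    static int asUint16(int stored) { return stored & 0xFFFF; }

    // UINT_32 lives in an int32; widen to long to hold the unsigned value.
    static long asUint32(int stored) { return Integer.toUnsignedLong(stored); }

    // UINT_64 lives in an int64; Java has no unsigned long, so render it as text.
    static String asUint64(long stored) { return Long.toUnsignedString(stored); }

    public static void main(String[] args) {
        System.out.println(asUint8(-1));      // 255
        System.out.println(asUint8(256));     // 0 (wraps past the 8-bit maximum)
        System.out.println(asUint16(65536));  // 0 (wraps past the 16-bit maximum)
        System.out.println(asUint32(-1));     // 4294967295
        System.out.println(asUint64(-1L));    // 18446744073709551615
    }
}
```

Values like these exercise the sign bit and the wrap-around points, which is where a reader that treats the fields as signed would be expected to disagree with one that treats them as unsigned.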


> Parquet file with INT_16, etc. logical types not supported by simple SELECT
> ---------------------------------------------------------------------------
>
>                 Key: DRILL-4764
>                 URL: https://issues.apache.org/jira/browse/DRILL-4764
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - Data Types
>    Affects Versions: 1.6.0
>            Reporter: Paul Rogers
>            Assignee: Parth Chandra
>         Attachments: int_16.parquet, int_8.parquet, uint_16.parquet, uint_32.parquet, uint_8.parquet
>
>
> Create a Parquet file with the following schema:
> message int16Data { required int32 index; required int32 value (INT_16); }
> Store it as int_16.parquet in the local file system. Query it with:
> SELECT * from `local`.`root`.`int_16.parquet`;
> The result, in the web UI, is this error:
> org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: UnsupportedOperationException: unsupported type: INT32 INT_16 Fragment 0:0 [Error Id: c63f66b4-e5a9-4a35-9ceb-546b74645dd4 on 172.30.1.28:31010]
> The INT_16 logical (or "original") type simply tells consumers of the file that the data is actually a 16-bit signed int. Presumably, this should tell Drill to use the SmallIntVector (or NullableSmallIntVector) class for storage. Without support for this annotation, even 16-bit integers must be stored as 32 bits within Drill.
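
To illustrate the point about INT_16 in the description above, here is a minimal sketch of narrowing an int32 physical value to 16 bits. It is not Drill's reader code; the class and method names are hypothetical, and rejecting out-of-range values is an assumption made for the example rather than documented Drill behavior.

```java
// Illustrative only: an INT_16 annotation promises that the int32 physical
// value fits in a signed 16-bit range, so it can be held in a short.
final class Int16Narrowing {
    static short narrowInt16(int stored) {
        // Assumed policy: fail loudly if the file violates its own annotation.
        if (stored < Short.MIN_VALUE || stored > Short.MAX_VALUE) {
            throw new IllegalArgumentException("value out of INT_16 range: " + stored);
        }
        return (short) stored;
    }

    public static void main(String[] args) {
        System.out.println(narrowInt16(-32768)); // -32768
        System.out.println(narrowInt16(32767));  // 32767
    }
}
```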



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)