Posted to issues@drill.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2016/12/02 01:03:58 UTC
[jira] [Commented] (DRILL-4764) Parquet file with INT_16, etc. logical types not supported by simple SELECT
[ https://issues.apache.org/jira/browse/DRILL-4764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15713607#comment-15713607 ]
ASF GitHub Bot commented on DRILL-4764:
---------------------------------------
Github user parthchandra commented on a diff in the pull request:
https://github.com/apache/drill/pull/673#discussion_r90570343
--- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/store/parquet2/TestDrillParquetReader.java ---
@@ -84,4 +94,43 @@ public void test4349() throws Exception {
.sqlBaselineQuery("SELECT columns[0] id, CAST(NULLIF(columns[1], '') AS DOUBLE) val FROM cp.`parquet2/4349.csv.gz` WHERE columns[0] = 'b'")
.go();
}
+
+ @Test
+ public void testUnsignedByteShortIntLong() throws Exception {
+ Path file = new Path(getDfsTestTmpSchemaLocation(), "uint_types.parquet");
+ Configuration conf = new Configuration();
+ MessageType schema = parseMessageType(
+ "message test { "
+ + "required int32 int8_field (INT_8); "
+ + "required int32 uint8_field (UINT_8); "
+ + "required int32 int16_field (INT_16); "
+ + "required int32 uint16_field (UINT_16); "
+ + "required int32 uint32_field (UINT_32); "
+ + "required int64 uint64_field (UINT_64); "
+ +"} ");
+ GroupWriteSupport.setSchema(schema, conf);
+
+ ParquetWriter<Group> writer = ExampleParquetWriter.builder(file)
+ .withType(schema)
+ .withConf(conf)
+ .build();
+
+ SimpleGroupFactory groupFactory = new SimpleGroupFactory(schema);
+ writer.write(
+ groupFactory.newGroup()
+ .append("int8_field", -128)
+ .append("uint8_field", 127)
+ .append("int16_field", -32768)
--- End diff --
Can you add test values for the unsigned types as well? For example: 0, -1, 256, 65536, etc.
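To make those boundary values concrete, here is a minimal Java sketch (not part of the pull request) of how values held in Parquet's signed physical types could read back under an unsigned annotation. The class and method names are hypothetical. Only `Integer.toUnsignedLong` and `Long.toUnsignedString` are real JDK calls. It assumes a reader that keeps just the low bits for UINT_8 and UINT_16, which is one possible behaviour and not necessarily what Drill does.

```java
public class UnsignedBoundaries {
    // Hypothetical helper: interpret the low 8 bits of int32 storage as unsigned.
    public static int asUint8(int stored) {
        return stored & 0xFF;
    }

    // Hypothetical helper: interpret the low 16 bits of int32 storage as unsigned.
    public static int asUint16(int stored) {
        return stored & 0xFFFF;
    }

    // All 32 bits reinterpreted as unsigned; needs a long to hold the result.
    public static long asUint32(int stored) {
        return Integer.toUnsignedLong(stored);
    }

    // All 64 bits reinterpreted as unsigned; Java has no wider primitive,
    // so the value is rendered as a decimal string.
    public static String asUint64(long stored) {
        return Long.toUnsignedString(stored);
    }

    public static void main(String[] args) {
        System.out.println(asUint8(-1));     // 255
        System.out.println(asUint8(256));    // 0
        System.out.println(asUint16(-1));    // 65535
        System.out.println(asUint16(65536)); // 0
        System.out.println(asUint32(-1));    // 4294967295
        System.out.println(asUint64(-1L));   // 18446744073709551615
    }
}
```

Values such as -1 and 256 are useful in a test precisely because they fall outside the signed or unsigned range of the narrower type, so they show whether the reader wraps, rejects, or sign-extends them.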
> Parquet file with INT_16, etc. logical types not supported by simple SELECT
> ---------------------------------------------------------------------------
>
> Key: DRILL-4764
> URL: https://issues.apache.org/jira/browse/DRILL-4764
> Project: Apache Drill
> Issue Type: Bug
> Components: Execution - Data Types
> Affects Versions: 1.6.0
> Reporter: Paul Rogers
> Assignee: Parth Chandra
> Attachments: int_16.parquet, int_8.parquet, uint_16.parquet, uint_32.parquet, uint_8.parquet
>
>
> Create a Parquet file with the following schema:
> message int16Data { required int32 index; required int32 value (INT_16); }
> Store it as int_16.parquet in the local file system. Query it with:
> SELECT * from `local`.`root`.`int_16.parquet`;
> The result, in the web UI, is this error:
> org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: UnsupportedOperationException: unsupported type: INT32 INT_16 Fragment 0:0 [Error Id: c63f66b4-e5a9-4a35-9ceb-546b74645dd4 on 172.30.1.28:31010]
> The INT_16 logical (or "original") type simply tells consumers of the file that the data is actually a 16-bit signed int. Presumably, this should tell Drill to use the SmallIntVector (or NullableSmallIntVector) class for storage. Without support for this annotation, even 16-bit integers must be stored as 32 bits within Drill.
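The storage point in the issue description can be illustrated in plain Java, without Drill's vector classes. In a well-formed file, an int32 column annotated INT_16 only holds values that fit in a `short`, so each value can be narrowed to half the width. `Int16Narrowing` and `narrow` are hypothetical names used for illustration. This is a sketch of the idea, not how SmallIntVector is actually populated.

```java
public class Int16Narrowing {
    // Hypothetical helper: copy int32-stored values into 16-bit storage,
    // rejecting anything outside the signed 16-bit range.
    public static short[] narrow(int[] stored) {
        short[] out = new short[stored.length];
        for (int i = 0; i < stored.length; i++) {
            int v = stored[i];
            if (v < Short.MIN_VALUE || v > Short.MAX_VALUE) {
                throw new IllegalArgumentException("value out of INT_16 range: " + v);
            }
            out[i] = (short) v;
        }
        return out;
    }

    public static void main(String[] args) {
        short[] values = narrow(new int[] {-32768, 0, 32767});
        // Three values: 6 bytes as shorts versus 12 bytes as ints.
        System.out.println(values.length * Short.BYTES + " bytes instead of "
                + values.length * Integer.BYTES);
    }
}
```

The range check is a design choice for the sketch: a reader could instead trust the annotation and cast unconditionally, at the cost of silently wrapping values from a malformed file.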
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)