Posted to dev@parquet.apache.org by "Ryan Blue (JIRA)" <ji...@apache.org> on 2014/11/22 02:07:34 UTC
[jira] [Commented] (PARQUET-26) Parquet doesn't recognize the nested Array type in MAP as ArrayWritable.
[ https://issues.apache.org/jira/browse/PARQUET-26?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14221682#comment-14221682 ]
Ryan Blue commented on PARQUET-26:
----------------------------------
This is fixed in HIVE-8909.
> Parquet doesn't recognize the nested Array type in MAP as ArrayWritable.
> ------------------------------------------------------------------------
>
> Key: PARQUET-26
> URL: https://issues.apache.org/jira/browse/PARQUET-26
> Project: Parquet
> Issue Type: Bug
> Reporter: Mala Chikka Kempanna
> Assignee: Ryan Blue
> Attachments: test.dat
>
>
> Inserting Hive data of type MAP<string, array<int>> into a Parquet table throws the following error:
> Caused by: parquet.io.ParquetEncodingException: This should be an ArrayWritable or MapWritable: org.apache.hadoop.hive.ql.io.parquet.writable.BinaryWritable@c644ef1c
> at org.apache.hadoop.hive.ql.io.parquet.write.DataWritableWriter.writeData(DataWritableWriter.java:86)
> The problem is reproducible with the following steps (the relevant test data is attached):
> 1.
> CREATE TABLE test_hive (
> node string,
> stime string,
> stimeutc string,
> swver string,
> moid MAP <string,string>,
> pdfs MAP <string,array<int>>,
> utcdate string,
> motype string)
> ROW FORMAT DELIMITED
> FIELDS TERMINATED BY '|'
> COLLECTION ITEMS TERMINATED BY ','
> MAP KEYS TERMINATED BY '=';
> 2.
> LOAD DATA LOCAL INPATH '/root/38388/test.dat' INTO TABLE test_hive;
> 3.
> CREATE TABLE test_parquet(
> pdfs MAP <string,array<int>>
> )
> STORED AS PARQUET ;
> 4.
> INSERT INTO TABLE test_parquet SELECT pdfs FROM test_hive;
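
On a Hive build that includes the HIVE-8909 change, the result of step 4 can be sanity-checked by reading the nested column back. The queries below are only a sketch: they reuse the table and column names from the steps above, the LIMIT value is arbitrary, and no output is shown because it depends on the contents of test.dat.

```sql
-- Sketch only: read the MAP<string, array<int>> column back from the Parquet table.
-- Table and column names come from the reproduction steps; LIMIT 5 is arbitrary.
SELECT pdfs FROM test_parquet LIMIT 5;

-- The two counts should match if step 4 ran exactly once against a freshly created table.
SELECT COUNT(*) FROM test_hive;
SELECT COUNT(*) FROM test_parquet;

-- size() returns the number of entries in each map; map_keys() lists its keys.
SELECT size(pdfs), map_keys(pdfs) FROM test_parquet LIMIT 5;
```

If the INSERT in step 4 still fails with the ParquetEncodingException quoted above, the build most likely does not contain the fix.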