Posted to dev@drill.apache.org by "Denys Ordynskiy (JIRA)" <ji...@apache.org> on 2018/11/28 16:39:00 UTC

[jira] [Created] (DRILL-6874) CTAS from json to parquet is not working on S3 storage

Denys Ordynskiy created DRILL-6874:
--------------------------------------

             Summary: CTAS from json to parquet is not working on S3 storage
                 Key: DRILL-6874
                 URL: https://issues.apache.org/jira/browse/DRILL-6874
             Project: Apache Drill
          Issue Type: Bug
    Affects Versions: 1.15.0
            Reporter: Denys Ordynskiy
         Attachments: ctasjsontoparquet.zip, drillbit.log, drillbit_queries.json, s3src.json, sqlline.log

The JSON file "s3src.json" was uploaded to S3 storage.
Querying the JSON file works fine:
select * from s3.tmp.`s3src.json`;
| id  |  first_name  |  last_name  |
| 1   | first_name1  | last_name1  |
| 2   | first_name2  | last_name2  |
| 3   | first_name3  | last_name3  |
| 4   | first_name4  | last_name4  |
| 5   | first_name5  | last_name5  |
5 rows selected (2.803 seconds)
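
For anyone reproducing this without the attachment, the five rows above can be approximated with a short script. This is only a sketch: the exact layout of the attached s3src.json is not shown in this message, so the one-object-per-line layout and the integer type of id are assumptions (the Parquet schema quoted in the error output lists id as int64). The helper names build_rows and write_json_records are hypothetical.

```python
import json


def build_rows(count=5):
    """Return rows shaped like the query output: id, first_name, last_name."""
    return [
        {"id": i, "first_name": f"first_name{i}", "last_name": f"last_name{i}"}
        for i in range(1, count + 1)
    ]


def write_json_records(path, rows):
    """Write one JSON object per line (assumed layout, not taken from the attachment)."""
    with open(path, "w", encoding="utf-8") as handle:
        for row in rows:
            handle.write(json.dumps(row) + "\n")


if __name__ == "__main__":
    # Produces a local s3src.json that still has to be uploaded to the S3 bucket by hand.
    write_json_records("s3src.json", build_rows())
```

The generated file still needs to be uploaded to the bucket behind the s3.tmp workspace before running the queries in this report.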

CTAS from this JSON file completes successfully:
create table s3.tmp.`ctasjsontoparquet` as select * from s3.tmp.`s3src.json`;
| Fragment  | Number of records written  |
| 0_0       | 5                          |
1 row selected (9.264 seconds)

*Querying the created parquet table {color:#d04437}throws an error:{color}*
select * from s3.tmp.`ctasjsontoparquet`;

{code:java}
Error: INTERNAL_ERROR ERROR: Error in parquet record reader.
Message: Failure in setting up reader
Parquet Metadata: ParquetMetaData{FileMetaData{schema: message root {
  optional int64 id;
  optional binary first_name (UTF8);
  optional binary last_name (UTF8);
}
, metadata: {drill-writer.version=2, drill.version=1.15.0-SNAPSHOT}}, blocks: [BlockMetaData{5, 360 [ColumnMetaData{UNCOMPRESSED [id] optional int64 id  [BIT_PACKED, RLE, PLAIN], 4}, ColumnMetaData{UNCOMPRESSED [first_name] optional binary first_name (UTF8)  [BIT_PACKED, RLE, PLAIN], 111}, ColumnMetaData{UNCOMPRESSED [last_name] optional binary last_name (UTF8)  [BIT_PACKED, RLE, PLAIN], 241}]}]}

Fragment 0:0

Please, refer to logs for more information.

[Error Id: 885723e4-8385-4fb0-87dd-c08b0570db95 on maprhost:31010] (state=,code=0)
{code}

The same CTAS query works fine on MapRFS and FileSystem storage.

The log files, the JSON file, and the Parquet file created on S3 are attached.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)