Posted to user@flink.apache.org by Egor Ryashin <ri...@gmail.com> on 2021/12/14 18:33:41 UTC

reading gz files

Hey,

I’m using Flink 1.14 and having trouble ingesting data from a gzipped JSON file. I’ve successfully created a table, but the number of records is wrong.
I’m using this SQL:
create table i1(
 line_item_id STRING
) with (
 'connector'='filesystem',
 'path'='/Users/egorryashin/temp/test.json',
 'format' = 'json'
);

create table i2(
 line_item_id STRING
) with (
 'connector'='filesystem',
 'path'='/Users/egorryashin/temp/test.gz',
 'format' = 'json'
);

select count(*) from i1 union all select count(*) from i2;
               EXPR$0
                   65
                  285
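
To see which of the two counts is the expected one, the records can be counted outside Flink. The following is a minimal sketch in Python, assuming each file holds one JSON record per line and that test.gz is a gzip-compressed file; the function name count_records is hypothetical and not part of Flink.

```python
import gzip
import json


def count_records(path):
    """Count non-empty lines that parse as JSON in a plain or gzip file."""
    # Detect gzip by its two magic bytes rather than by file extension.
    with open(path, "rb") as probe:
        is_gzip = probe.read(2) == b"\x1f\x8b"
    opener = gzip.open if is_gzip else open
    count = 0
    with opener(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            if line.strip():
                json.loads(line)  # raises ValueError on a malformed record
                count += 1
    return count
```

Calling count_records on each of the two paths from the SQL above should return the same number if both files hold the same data.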

Thanks

Re: reading gz files

Posted by Caizhi Weng <ts...@gmail.com>.
Hi!

Thanks for raising this issue. This is unfortunately a bug. I've created a
JIRA ticket [1] and you can check the progress of this issue there.

[1] https://issues.apache.org/jira/browse/FLINK-25311
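
Until the ticket is resolved, one possible workaround, not confirmed in this thread, is to decompress the file first and point the table's 'path' at the uncompressed copy, the way table i1 reads a plain JSON file. A minimal sketch in Python; the function name decompress_gzip and both paths are hypothetical.

```python
import gzip
import shutil


def decompress_gzip(src_path, dst_path):
    """Stream-decompress src_path into dst_path without loading it into memory."""
    with gzip.open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return dst_path
```

The output file can then be used as the 'path' of a filesystem table with 'format' = 'json'.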

Egor Ryashin <ri...@gmail.com> wrote on Wed, Dec 15, 2021 at 02:33:

> Hey,
>
> I’m using Flink 1.14 and having trouble ingesting data from a gzipped JSON
> file. I’ve successfully created a table, but the number of records is wrong.
> I’m using this SQL:
> create table i1(
>  line_item_id STRING
> ) with (
>  'connector'='filesystem',
>  'path'='/Users/egorryashin/temp/test.json',
>  'format' = 'json'
> );
>
> create table i2(
>  line_item_id STRING
> ) with (
>  'connector'='filesystem',
>  'path'='/Users/egorryashin/temp/test.gz',
>  'format' = 'json'
> );
>
> select count(*) from i1 union all select count(*) from i2;
>                EXPR$0
>                    65
>                   285
>
> Thanks
>