Posted to issues@flink.apache.org by "Jark Wu (Jira)" <ji...@apache.org> on 2020/12/08 13:29:00 UTC
[jira] [Commented] (FLINK-20538) sink.rolling-policy.file-size does not work in filesystem connector
[ https://issues.apache.org/jira/browse/FLINK-20538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17245886#comment-17245886 ]
Jark Wu commented on FLINK-20538:
---------------------------------
cc [~lzljs3620320] could you have a look?
> sink.rolling-policy.file-size does not work in filesystem connector
> -------------------------------------------------------------------
>
> Key: FLINK-20538
> URL: https://issues.apache.org/jira/browse/FLINK-20538
> Project: Flink
> Issue Type: Bug
> Components: Connectors / FileSystem
> Affects Versions: 1.11.1
> Reporter: zhuxiaoshang
> Priority: Major
>
> When I use the SQL filesystem connector to write data to HDFS with sink.rolling-policy.file-size set to 50MB, the setting does not seem to take effect: files larger than 100MB are still produced.
> My table DDL is:
>
> {code:java}
> CREATE TABLE cpc_bd_recall_log_hdfs (
>   log_timestamp BIGINT,
>   ip STRING,
>   `raw` STRING,
>   `day` STRING,
>   `hour` STRING,
>   `minute` STRING
> ) PARTITIONED BY (`day`, `hour`, `minute`) WITH (
>   'connector' = 'filesystem',
>   'path' = 'hdfs://xxx/test.db/hdfs_test',
>   'format' = 'parquet',
>   'parquet.compression' = 'SNAPPY',
>   'sink.rolling-policy.file-size' = '50MB',
>   'sink.partition-commit.policy.kind' = 'success-file',
>   'sink.partition-commit.delay' = '60s'
> );
> {code}
> The HDFS files are:
>
> {code:java}
> 0 2020-12-04 14:56 hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/_SUCCESS
> -rw-r--r-- 3 hadoop hadoop 31.7 M 2020-12-04 14:55 hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-0-2500
> -rw-r--r-- 3 hadoop hadoop 121.8 M 2020-12-04 14:56 hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-0-2501
> -rw-r--r-- 3 hadoop hadoop 31.9 M 2020-12-04 14:55 hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-1-2499
> -rw-r--r-- 3 hadoop hadoop 122.0 M 2020-12-04 14:56 hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-1-2500
> -rw-r--r-- 3 hadoop hadoop 31.8 M 2020-12-04 14:55 hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-10-2501
> -rw-r--r-- 3 hadoop hadoop 121.8 M 2020-12-04 14:56 hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-10-2502
> -rw-r--r-- 3 hadoop hadoop 31.9 M 2020-12-04 14:55 hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-11-2500
> -rw-r--r-- 3 hadoop hadoop 122.2 M 2020-12-04 14:56 hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-11-2501
> -rw-r--r-- 3 hadoop hadoop 31.9 M 2020-12-04 14:55 hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-12-2500
> -rw-r--r-- 3 hadoop hadoop 122.2 M 2020-12-04 14:56 hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-12-2501
> -rw-r--r-- 3 hadoop hadoop 31.8 M 2020-12-04 14:55 hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-13-2499
> -rw-r--r-- 3 hadoop hadoop 122.0 M 2020-12-04 14:56 hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-13-2500
> -rw-r--r-- 3 hadoop hadoop 31.6 M 2020-12-04 14:55 hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-14-2500
> -rw-r--r-- 3 hadoop hadoop 122.1 M 2020-12-04 14:56 hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-14-2501
> -rw-r--r-- 3 hadoop hadoop 31.9 M 2020-12-04 14:55 hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-15-2498
> -rw-r--r-- 3 hadoop hadoop 121.8 M 2020-12-04 14:56 hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-15-2499
> -rw-r--r-- 3 hadoop hadoop 31.7 M 2020-12-04 14:55 hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-16-2501
> -rw-r--r-- 3 hadoop hadoop 122.0 M 2020-12-04 14:56 hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-16-2502
> -rw-r--r-- 3 hadoop hadoop 31.7 M 2020-12-04 14:55 hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-17-2500
> -rw-r--r-- 3 hadoop hadoop 122.5 M 2020-12-04 14:56 hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-17-2501
> -rw-r--r-- 3 hadoop hadoop 31.8 M 2020-12-04 14:55 hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-18-2500
> -rw-r--r-- 3 hadoop hadoop 121.7 M 2020-12-04 14:56 hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-18-2501
> -rw-r--r-- 3 hadoop hadoop 31.9 M 2020-12-04 14:55 hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-19-2501
> -rw-r--r-- 3 hadoop hadoop 121.7 M 2020-12-04 14:56 hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-19-2502
> -rw-r--r-- 3 hadoop hadoop 31.6 M 2020-12-04 14:55 hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-2-2499
> -rw-r--r-- 3 hadoop hadoop 121.6 M 2020-12-04 14:56 hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-2-2500
> -rw-r--r-- 3 hadoop hadoop 31.8 M 2020-12-04 14:55 hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-3-2500
> -rw-r--r-- 3 hadoop hadoop 121.8 M 2020-12-04 14:56 hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-3-2501
> -rw-r--r-- 3 hadoop hadoop 31.6 M 2020-12-04 14:55 hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-4-2499
> -rw-r--r-- 3 hadoop hadoop 122.1 M 2020-12-04 14:56 hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-4-2500
> -rw-r--r-- 3 hadoop hadoop 31.6 M 2020-12-04 14:55 hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-5-2499
> -rw-r--r-- 3 hadoop hadoop 121.8 M 2020-12-04 14:56 hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-5-2500
> -rw-r--r-- 3 hadoop hadoop 31.8 M 2020-12-04 14:55 hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-6-2499
> -rw-r--r-- 3 hadoop hadoop 121.5 M 2020-12-04 14:56 hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-6-2500
> -rw-r--r-- 3 hadoop hadoop 31.6 M 2020-12-04 14:55 hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-7-2500
> -rw-r--r-- 3 hadoop hadoop 122.0 M 2020-12-04 14:56 hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-7-2501
> -rw-r--r-- 3 hadoop hadoop 31.7 M 2020-12-04 14:55 hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-8-2501
> -rw-r--r-- 3 hadoop hadoop 122.0 M 2020-12-04 14:56 hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-8-2502
> -rw-r--r-- 3 hadoop hadoop 31.9 M 2020-12-04 14:55 hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-9-2501
> -rw-r--r-- 3 hadoop hadoop 121.9 M 2020-12-04 14:56 hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-9-2502
> {code}
>
> However, when I dug into the source code, I found that writing an element to a bucket invokes `shouldRollOnEvent` in TableRollingPolicy, so I would expect the file size to be checked on every write.
> I don't understand how this can happen. Is it a bug, or am I getting something wrong?
>
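For readers following the description above, here is a minimal sketch of what a per-event, size-based roll check looks like. It is not Flink's TableRollingPolicy: the class names `SizeRollingSketch`, `PartFileInfo`, and `SizeRollingPolicy` are hypothetical stand-ins, and only the method name `shouldRollOnEvent` and the 50MB threshold are taken from the report.

```java
// Hypothetical sketch: NOT Flink's TableRollingPolicy, only an illustration
// of a per-event, size-based roll check under the assumptions stated above.
public class SizeRollingSketch {

    /** Minimal stand-in for what a sink knows about an open part file. */
    static final class PartFileInfo {
        private final long sizeInBytes;

        PartFileInfo(long sizeInBytes) {
            this.sizeInBytes = sizeInBytes;
        }

        /** Size as reported by the writer at the time of the check. */
        long getSize() {
            return sizeInBytes;
        }
    }

    /** Asks for a roll once the reported size exceeds the threshold. */
    static final class SizeRollingPolicy {
        private final long maxPartSizeBytes;

        SizeRollingPolicy(long maxPartSizeBytes) {
            this.maxPartSizeBytes = maxPartSizeBytes;
        }

        /** Called for every written element in this sketch. */
        boolean shouldRollOnEvent(PartFileInfo partFile) {
            return partFile.getSize() > maxPartSizeBytes;
        }
    }

    public static void main(String[] args) {
        long fiftyMb = 50L * 1024 * 1024;
        SizeRollingPolicy policy = new SizeRollingPolicy(fiftyMb);

        // A part file of about 31 MB stays below the threshold.
        System.out.println(policy.shouldRollOnEvent(new PartFileInfo(31L * 1024 * 1024)));
        // A part file of about 121 MB is above the threshold.
        System.out.println(policy.shouldRollOnEvent(new PartFileInfo(121L * 1024 * 1024)));
    }
}
```

A check of this shape can only be as accurate as the size value the writer reports when the check runs. Whether that is relevant to the behaviour reported here is an open question that the sketch does not answer.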
--
This message was sent by Atlassian Jira
(v8.3.4#803005)