Posted to issues@flink.apache.org by "Gary Yao (Jira)" <ji...@apache.org> on 2020/02/03 12:18:01 UTC

[jira] [Updated] (FLINK-13034) Improve the performance when checking whether mapstate is empty for RocksDBStateBackend

     [ https://issues.apache.org/jira/browse/FLINK-13034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gary Yao updated FLINK-13034:
-----------------------------
    Release Note: We have added a new method MapState#isEmpty() which enables users to check whether a map state is empty. The new method is 40% faster than mapState.keys().iterator().hasNext() when using the RocksDB state backend.  (was: After FLINK-13034 we added a new isEmpty() interface in MapState and its relative views. Users could use this API to verify whether the map state is empty with better performance when using RocksDBStateBackend. Unbounded RANGE/ROWS window in table API has taken benefit of this improvement.)

> Improve the performance when checking whether mapstate is empty for RocksDBStateBackend
> ---------------------------------------------------------------------------------------
>
>                 Key: FLINK-13034
>                 URL: https://issues.apache.org/jira/browse/FLINK-13034
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / State Backends
>    Affects Versions: 1.6.3, 1.7.2, 1.8.1
>            Reporter: Yun Tang
>            Assignee: Yun Tang
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.10.0
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> Currently, several places in the Flink source code check whether a map state is empty, e.g. [TemporalRowTimeJoinOperator|https://github.com/apache/flink/blob/8315f38e89f897e32cfa0f23990cb3fb44db0d72/flink-table/flink-table-runtime-blink/src/main/java/org/apache/flink/table/runtime/join/temporal/TemporalRowTimeJoinOperator.java#L192] and [AbstractRowTimeUnboundedPrecedingOver|#L160)].
>  Developers typically use the following expression to check whether the map state is empty:
> {code:java}
> boolean noRecordsToProcess = !inputState.keys().iterator().hasNext();
> {code}
> However, with {{RocksDBStateBackend}}, {{inputState.keys().iterator().hasNext()}} actually issues 1 {{seek}} and 128 {{next}} calls in [RocksDBMapState|https://github.com/apache/flink/blob/8315f38e89f897e32cfa0f23990cb3fb44db0d72/flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/RocksDBMapState.java#L483]. The {{next}} calls are redundant when we only need to know whether any entry exists.
> I see two options to improve this:
>  * Revert {{RocksDBMapState}} to its previous design, which first loads one element and then loads more elements in follow-up queries. However, this would affect the performance of other map state methods.
>  * Add an {{isEmpty()}} method to the public evolving interface {{MapState}}, so that we can check whether the map state is empty without any redundant RocksDB actions.
> I prefer the second option.
>  
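
The cost difference described in the issue can be illustrated with a small toy model. This is only a sketch under stated assumptions: it does not use Flink or RocksDB, and the names CountingCursor, hasNextViaBatchedIterator, isEmpty, sampleKeys and BATCH_SIZE are hypothetical, chosen for illustration. It assumes the iterator eagerly loads a batch of up to 128 entries before answering hasNext(), as the issue describes, and counts the simulated seek and next calls on each path.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: this is NOT Flink's RocksDBMapState. All names are hypothetical.
final class MapStateEmptinessSketch {
    // Assumed batch size, taken from the "128 next actions" mentioned in the issue.
    static final int BATCH_SIZE = 128;

    /** Toy cursor over a list of keys that counts simulated seek/next calls. */
    static final class CountingCursor {
        private final List<String> keys;
        private int pos = -1;
        int seeks = 0;
        int nexts = 0;

        CountingCursor(List<String> keys) { this.keys = keys; }

        void seekToFirst() { seeks++; pos = 0; }
        boolean isValid() { return pos >= 0 && pos < keys.size(); }
        String key() { return keys.get(pos); }
        void next() { nexts++; pos++; }
    }

    /** Models keys().iterator().hasNext(): loads a whole batch before answering. */
    static boolean hasNextViaBatchedIterator(CountingCursor cursor) {
        List<String> batch = new ArrayList<>();
        cursor.seekToFirst();
        while (cursor.isValid() && batch.size() < BATCH_SIZE) {
            batch.add(cursor.key());
            cursor.next();
        }
        return !batch.isEmpty();
    }

    /** Models a dedicated isEmpty(): one seek followed by a validity check. */
    static boolean isEmpty(CountingCursor cursor) {
        cursor.seekToFirst();
        return !cursor.isValid();
    }

    /** Builds n zero-padded sample keys. */
    static List<String> sampleKeys(int n) {
        List<String> keys = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            keys.add(String.format("key-%05d", i));
        }
        return keys;
    }

    public static void main(String[] args) {
        List<String> keys = sampleKeys(1000);

        CountingCursor batched = new CountingCursor(keys);
        boolean noRecordsToProcess = !hasNextViaBatchedIterator(batched);
        System.out.println("batched iterator: empty=" + noRecordsToProcess
                + " seeks=" + batched.seeks + " nexts=" + batched.nexts);

        CountingCursor direct = new CountingCursor(keys);
        System.out.println("isEmpty:          empty=" + isEmpty(direct)
                + " seeks=" + direct.seeks + " nexts=" + direct.nexts);
    }
}
```

With 1000 keys, the batched path in this model records 1 seek and 128 next calls, while the isEmpty path records 1 seek and no next calls. The real saving in Flink depends on the RocksDB iterator implementation and the data, so the counts here only show where the redundant work comes from.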



--
This message was sent by Atlassian Jira
(v8.3.4#803005)