You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@flink.apache.org by "Chongchen Chen (Jira)" <ji...@apache.org> on 2019/12/18 06:40:00 UTC

[jira] [Comment Edited] (FLINK-15249) Improve PipelinedRegions calculation with Union Set

    [ https://issues.apache.org/jira/browse/FLINK-15249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16998845#comment-16998845 ] 

Chongchen Chen edited comment on FLINK-15249 at 12/18/19 6:39 AM:
------------------------------------------------------------------

the set union operation is O(N) -> O(1), but set union is only a part of calculation. that's why it's not O(N) -> O(1)  improvement in total.  Disjoint Set Data Structure has very good theoretical guarantee, it can be applied to any kinds of topologies. 

'Did you check why it becomes faster?'.  For Disjoint Set Data Structure, set union only need to set a key value in hashmap, but normal set operation need to copy all elements from one set into another set. [~zhuzh]

 


was (Author: nppoly):
the set union operation is O(N) -> O(1), but set union is only a part of calculation. that's why it's not O(N) -> O(1)  improvement in total.  Disjoint Set Data Structure has very good theoretical guarantee, it can be applied to any kinds of topologies. 

'Did you check why it becomes faster?'.  For Disjoint Set Data Structure, set union only need to set a key value in hashmap, but normal set operation need to copy all elements from one set into another set.

 

> Improve PipelinedRegions calculation with Union Set
> ---------------------------------------------------
>
>                 Key: FLINK-15249
>                 URL: https://issues.apache.org/jira/browse/FLINK-15249
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Coordination
>            Reporter: Chongchen Chen
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: PipelinedRegionComputeUtil.diff
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Union Set's Merge Set cost is O(1). current implementation is O(N). the attachment is patch.
> [Disjoint Set Data Structure|[https://en.wikipedia.org/wiki/Disjoint-set_data_structure]]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)