Posted to issues@flink.apache.org by "Flink Jira Bot (Jira)" <ji...@apache.org> on 2022/01/07 22:40:00 UTC

[jira] [Updated] (FLINK-5127) Reduce the amount of intermediate data in vertex-centric iterations

     [ https://issues.apache.org/jira/browse/FLINK-5127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Flink Jira Bot updated FLINK-5127:
----------------------------------
      Labels: auto-deprioritized-major auto-deprioritized-minor auto-unassigned  (was: auto-deprioritized-major auto-unassigned stale-minor)
    Priority: Not a Priority  (was: Minor)

This issue was labeled "stale-minor" 7 days ago and has not received any updates so it is being deprioritized. If this ticket is actually Minor, please raise the priority and ask a committer to assign you the issue or revive the public discussion.


> Reduce the amount of intermediate data in vertex-centric iterations
> -------------------------------------------------------------------
>
>                 Key: FLINK-5127
>                 URL: https://issues.apache.org/jira/browse/FLINK-5127
>             Project: Flink
>          Issue Type: Improvement
>          Components: Library / Graph Processing (Gelly)
>    Affects Versions: 1.1.0, 1.2.0
>            Reporter: Vasia Kalavri
>            Priority: Not a Priority
>              Labels: auto-deprioritized-major, auto-deprioritized-minor, auto-unassigned
>
> The vertex-centric plan contains a join between the workset (messages) and the solution set (vertices) that outputs <Vertex, Message> tuples. This intermediate dataset is then co-grouped with the edges to provide the Pregel interface directly.
> This issue proposes an improvement to reduce the size of this intermediate dataset. In particular, the vertex state does not have to be attached to all the output tuples of the join. If we replace the join with a coGroup and use an `Either` type, we can attach the vertex state to the first tuple only. The subsequent coGroup can retrieve the vertex state from the first tuple and correctly expose the Pregel interface.
> In my preliminary experiments, I find that this change reduces the intermediate data by 2x for small vertex states and by 4-5x for large vertex states.
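
The idea in the description can be illustrated with a small plain-Java sketch. It is not Gelly's implementation and uses no Flink API: the class `EitherSketch`, its nested `Either` stand-in, and the methods `joinStyle`, `coGroupStyle`, `stateOf` and `messagesOf` are all hypothetical names made up for this illustration, and vertex states and messages are modeled as plain strings. It also assumes that a vertex without messages emits nothing, mirroring an inner join.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java illustration only; none of these names come from Flink or Gelly.
class EitherSketch {

    /** Minimal stand-in for an Either type: a vertex state (left) or a message (right). */
    static final class Either<L, R> {
        private final L left;
        private final R right;
        private final boolean isLeft;

        private Either(L left, R right, boolean isLeft) {
            this.left = left;
            this.right = right;
            this.isLeft = isLeft;
        }

        static <L, R> Either<L, R> left(L value) { return new Either<>(value, null, true); }
        static <L, R> Either<L, R> right(R value) { return new Either<>(null, value, false); }

        boolean isLeft() { return isLeft; }
        L left() { return left; }
        R right() { return right; }
    }

    /** Join-style output: one (state, message) pair per message, so the state is repeated. */
    static List<String[]> joinStyle(String state, List<String> messages) {
        List<String[]> out = new ArrayList<>();
        for (String m : messages) {
            out.add(new String[] {state, m});
        }
        return out;
    }

    /** CoGroup-style output: the state once, as the first tuple, then the bare messages. */
    static List<Either<String, String>> coGroupStyle(String state, List<String> messages) {
        List<Either<String, String>> out = new ArrayList<>();
        if (messages.isEmpty()) {
            return out; // assumption of this sketch: no messages, no output (like an inner join)
        }
        out.add(Either.<String, String>left(state));
        for (String m : messages) {
            out.add(Either.<String, String>right(m));
        }
        return out;
    }

    /** Downstream side: recover the vertex state from the first tuple (null if there is none). */
    static String stateOf(List<Either<String, String>> tuples) {
        if (tuples.isEmpty() || !tuples.get(0).isLeft()) {
            return null;
        }
        return tuples.get(0).left();
    }

    /** Downstream side: collect the messages from every tuple that is not a state. */
    static List<String> messagesOf(List<Either<String, String>> tuples) {
        List<String> out = new ArrayList<>();
        for (Either<String, String> t : tuples) {
            if (!t.isLeft()) {
                out.add(t.right());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> messages = Arrays.asList("m1", "m2", "m3");
        // The join-style output holds one state copy per message; the coGroup-style output holds one.
        System.out.println("join-style tuples: " + joinStyle("state", messages).size());
        System.out.println("coGroup-style tuples: " + coGroupStyle("state", messages).size());
    }
}
```

With n messages for a vertex, the join-style output carries n copies of the vertex state, while the coGroup-style output carries one copy plus n bare messages. The saving therefore grows with the size of the state relative to the messages, which is in line with the larger reduction reported for large vertex states.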


