Posted to issues@flink.apache.org by "Markus Holzemer (JIRA)" <ji...@apache.org> on 2014/07/01 13:49:24 UTC

[jira] [Commented] (FLINK-909) Pitfall due to additional superstep after the iteration has stopped

    [ https://issues.apache.org/jira/browse/FLINK-909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14048778#comment-14048778 ] 

Markus Holzemer commented on FLINK-909:
---------------------------------------

I started working on this issue, but it seems to be more complicated than I thought. When I wait on a barrier inside the IterationPactTasks before calling super.run() (which also calls the open method), I do not receive any channel events.
Does somebody with more experience in the runtime have a suggestion as to why this is the case and what I can do about it?

> Pitfall due to additional superstep after the iteration has stopped
> -------------------------------------------------------------------
>
>                 Key: FLINK-909
>                 URL: https://issues.apache.org/jira/browse/FLINK-909
>             Project: Flink
>          Issue Type: Bug
>            Reporter: GitHub Import
>            Assignee: Markus Holzemer
>              Labels: github-import
>             Fix For: pre-apache
>
>
> Currently, after an iteration has exceeded the maximum number of iterations, all tasks are started again for an additional superstep during which they are stopped. This works if a task only waits for dynamic input. However, if a task, e.g. a coGroup operation, gets both dynamic and static input, the execution is not blocked. This can then lead to erroneous behaviour which the user is not aware of.
> I had this problem implementing ALS. Here one has a loop which gets matrix columns as dynamic input and matrix entries as static input. The columns and the entries are used to construct a matrix which represents a system of linear equations. If the set of columns is empty, then the matrix is singular and the system is not solvable. During the additional superstep the task won't receive any columns but would still try to solve the now singular matrix.
> It would be good to finish the iteration without initiating this additional superstep.
> ---------------- Imported from GitHub ----------------
> Url: https://github.com/stratosphere/stratosphere/issues/909
> Created by: [tillrohrmann|https://github.com/tillrohrmann]
> Labels: 
> Created at: Thu Jun 05 17:50:17 CEST 2014
> State: open
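
The ALS pitfall described in the issue can be reproduced outside of Flink. The following is a minimal Python sketch, not Flink code: `solve` and `update_factor` are hypothetical names chosen for illustration, and the layout of the dynamic input (a dict of columns) and the static input (a list of entries) is an assumption. It shows why an empty dynamic input yields a singular system, and one possible user-side guard that skips the solve when no columns arrive.

```python
from typing import Dict, List, Optional, Tuple

Vector = List[float]


def solve(a: List[Vector], b: Vector, eps: float = 1e-12) -> Vector:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the caller's data is not modified.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < eps:
            raise ValueError("matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def update_factor(
    columns: Dict[int, Vector],        # dynamic input (assumed layout): id -> column
    entries: List[Tuple[int, float]],  # static input (assumed layout): (id, value)
    rank: int,
) -> Optional[Vector]:
    """Build the normal equations from columns and entries, then solve them.

    Returns None when no columns arrive, which is the situation in the
    additional superstep described in the issue.
    """
    if not columns:
        # Guard: without columns the matrix below would stay all zeros.
        return None
    a = [[0.0] * rank for _ in range(rank)]
    b = [0.0] * rank
    for col_id, value in entries:
        col = columns.get(col_id)
        if col is None:
            continue
        for i in range(rank):
            b[i] += value * col[i]
            for j in range(rank):
                a[i][j] += col[i] * col[j]
    return solve(a, b)
```

Without the guard, an empty `columns` input leaves the matrix at all zeros, so `solve` raises `ValueError`. This mirrors the singular-matrix failure in the report. The guard is only a workaround in user code; the issue itself asks for the additional superstep to be avoided altogether.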



--
This message was sent by Atlassian JIRA
(v6.2#6252)