Posted to issues@beam.apache.org by "Ismaël Mejía (Jira)" <ji...@apache.org> on 2020/09/22 06:15:00 UTC

[jira] [Updated] (BEAM-10481) MetricsAccumulator is not registering when resuming from a checkpoint

     [ https://issues.apache.org/jira/browse/BEAM-10481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ismaël Mejía updated BEAM-10481:
--------------------------------
    Status: Open  (was: Triage Needed)

> MetricsAccumulator is not registering when resuming from a checkpoint
> ---------------------------------------------------------------------
>
>                 Key: BEAM-10481
>                 URL: https://issues.apache.org/jira/browse/BEAM-10481
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-spark
>    Affects Versions: 2.22.0
>            Reporter: Mani Kolbe
>            Priority: P2
>         Attachments: image-2020-07-14-10-55-11-855.png
>
>          Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
>  
> I am running a Beam application in streaming mode with jobName and checkPointDir configured. When I recover the application from a planned stop, it fails.
>  
> I investigated and noticed that the accumulator is not registered when the application recovers from a checkpoint.
>  
> Correct me if I am wrong. In the code in the screenshot below, taken from the MetricsAccumulator class in Beam v2.22.0, a new instance of MetricsContainerStepMapAccumulator is registered on line 64. But if a recovered value is present, a new instance is constructed with the recovered value (line 78), and this new accumulator instance is never registered. This causes the Spark driver to throw an exception:  _java.lang.UnsupportedOperationException: Accumulator must be registered before send to executor_
>   !image-2020-07-14-10-55-11-855.png|width=706,height=351!
>  
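
The pattern described in the report can be illustrated with a minimal, self-contained Java sketch. It does not use the Spark or Beam classes: ToyAccumulator, initBuggy, initFixed and failsOnSend are hypothetical stand-ins invented for this illustration, and the "fixed" variant is only one possible approach, not necessarily the change made in Beam. In Spark itself, registration would go through SparkContext.register rather than the toy register() method shown here.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-ins only; these are NOT the Spark or Beam classes.
class AccumulatorRegistrationSketch {

    /** Toy accumulator that mimics the "must be registered" check. */
    static final class ToyAccumulator {
        private final Map<String, Long> value;
        private boolean registered = false;

        ToyAccumulator(Map<String, Long> initial) {
            this.value = new HashMap<>(initial);
        }

        void register() {
            registered = true;
        }

        /** Mimics shipping the accumulator to an executor. */
        Map<String, Long> sendToExecutor() {
            if (!registered) {
                throw new UnsupportedOperationException(
                        "Accumulator must be registered before send to executor");
            }
            return new HashMap<>(value);
        }
    }

    /** Pattern from the report: the recovered instance replaces the registered one. */
    static ToyAccumulator initBuggy(Optional<Map<String, Long>> recovered) {
        ToyAccumulator acc = new ToyAccumulator(new HashMap<>());
        acc.register();
        if (recovered.isPresent()) {
            // A fresh instance is built from the checkpointed value but never registered.
            acc = new ToyAccumulator(recovered.get());
        }
        return acc;
    }

    /** One possible fix (an assumption): register whichever instance is returned. */
    static ToyAccumulator initFixed(Optional<Map<String, Long>> recovered) {
        ToyAccumulator acc = new ToyAccumulator(recovered.orElseGet(HashMap::new));
        acc.register();
        return acc;
    }

    /** Returns true if sending the accumulator to an executor is rejected. */
    static boolean failsOnSend(ToyAccumulator acc) {
        try {
            acc.sendToExecutor();
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Optional<Map<String, Long>> recovered = Optional.of(Map.of("elements", 42L));
        System.out.println("buggy init, recovered value, send fails: "
                + failsOnSend(initBuggy(recovered)));
        System.out.println("fixed init, recovered value, send fails: "
                + failsOnSend(initFixed(recovered)));
    }
}
```

In this sketch the buggy path only fails when a recovered value is present, which matches the report: a fresh start works, while a restart from a checkpoint fails.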



--
This message was sent by Atlassian Jira
(v8.3.4#803005)