You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@beam.apache.org by "Ismaël Mejía (Jira)" <ji...@apache.org> on 2020/09/22 06:16:00 UTC
[jira] [Assigned] (BEAM-10481) MetricsAccumulator is not
registering when resuming from a checkpoint
[ https://issues.apache.org/jira/browse/BEAM-10481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ismaël Mejía reassigned BEAM-10481:
-----------------------------------
Assignee: Tim Clemons
> MetricsAccumulator is not registering when resuming from a checkpoint
> ---------------------------------------------------------------------
>
> Key: BEAM-10481
> URL: https://issues.apache.org/jira/browse/BEAM-10481
> Project: Beam
> Issue Type: Bug
> Components: runner-spark
> Affects Versions: 2.22.0
> Reporter: Mani Kolbe
> Assignee: Tim Clemons
> Priority: P2
> Attachments: image-2020-07-14-10-55-11-855.png
>
> Time Spent: 1h 10m
> Remaining Estimate: 0h
>
>
> I am running in an beam application in streaming mode with jobName and checkPointDir configured. When I recover application from a planned stop, I am getting failure.
>
> I did some investigation and noticed that the accumulator is not getting registered on recovering checkpoint scenario.
>
> Correct me if I am wrong. If you see the code in the screenshot below from MetricsAccumulator class on beam v2.22.0, you can see new instance of MetricsContainerStepMapAccumulator is getting registered on line#64. But if a recovered value is present, it constructs a new instance with the recovered value (Line#78). But this new accumulator instance is not getting registered. This is forcing Spark Driver to throw exception: _java.lang.UnsupportedOperationException: Accumulator must be registered before send to executor_
> !image-2020-07-14-10-55-11-855.png|width=706,height=351!
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)