Posted to commits@beam.apache.org by "Kenneth Knowles (JIRA)" <ji...@apache.org> on 2017/04/03 23:15:41 UTC

[jira] [Commented] (BEAM-1867) Element counts missing on Cloud Dataflow when PCollection has anything other than hardcoded name pattern

    [ https://issues.apache.org/jira/browse/BEAM-1867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15954306#comment-15954306 ] 

Kenneth Knowles commented on BEAM-1867:
---------------------------------------

[~dhalperi@google.com] and [~tgroh]: there might be SDK-side mitigations, but otherwise this seems to be just an internal bug. I'll leave this open for you to agree or disagree.

> Element counts missing on Cloud Dataflow when PCollection has anything other than hardcoded name pattern
> --------------------------------------------------------------------------------------------------------
>
>                 Key: BEAM-1867
>                 URL: https://issues.apache.org/jira/browse/BEAM-1867
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-dataflow
>            Reporter: Kenneth Knowles
>            Priority: Blocker
>             Fix For: First stable release
>
>
> In 0.6.0 and 0.7.0-SNAPSHOT (and possibly all past versions; these are just the ones where it is confirmed), element count and byte metrics are not reported correctly when the output PCollection for a primitive transform is not named {{transformname + ".out" + index}}.
> In 0.7.0-SNAPSHOT, the DataflowRunner uses pipeline surgery to replace the composite {{ParDoSingle}} (which contains a {{ParDoMulti}}) with a Dataflow-specific non-composite {{ParDoSingle}}. As a result, metrics are reported for names like {{"ParDoSingle(MyDoFn).out"}} when they should be reported for {{"ParDoSingle/ParDoMulti(MyDoFn).out"}}, so all single-output ParDo transforms lack these metrics on their outputs.
> In 0.6.0 the same problem occurs if the user ever uses {{PCollection.setName}} to give their collection a meaningful name.
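
To illustrate the failure mode described above, here is a minimal Java sketch. It is not Beam or Dataflow code: the class OutputNameSketch, both helper methods, and the map of counts are hypothetical stand-ins, and it assumes the metric is keyed by the name derived from the pattern transformname + ".out" + index. Under that assumption, any collection whose actual name differs from the derived name has no count to look up.

```java
import java.util.HashMap;
import java.util.Map;

public class OutputNameSketch {

    // Hypothetical helper: builds the name the issue calls the hardcoded
    // pattern, transformname + ".out" + index.
    static String expectedOutputName(String transformName, int index) {
        return transformName + ".out" + index;
    }

    // Hypothetical lookup: a count is found only when the collection's actual
    // name equals the key the metric was reported under.
    static Long elementCountFor(Map<String, Long> reportedCounts, String actualCollectionName) {
        return reportedCounts.get(actualCollectionName);
    }

    public static void main(String[] args) {
        Map<String, Long> reportedCounts = new HashMap<>();
        // Assumption: the metric is keyed by the pattern-derived name.
        reportedCounts.put(expectedOutputName("ParDo(MyDoFn)", 0), 42L);

        // Collection left with its default name: the lookup succeeds.
        System.out.println(elementCountFor(reportedCounts, "ParDo(MyDoFn).out0"));
        // Collection renamed (as with PCollection.setName): the lookup misses.
        System.out.println(elementCountFor(reportedCounts, "MyMeaningfulName"));
    }
}
```

The same mismatch would apply to the 0.7.0-SNAPSHOT case, where the name changes because of the runner's transform replacement rather than a user-chosen name.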



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)