Posted to commits@beam.apache.org by "Kenneth Knowles (JIRA)" <ji...@apache.org> on 2017/04/21 01:24:04 UTC
[jira] [Resolved] (BEAM-1867) Element counts missing on Cloud Dataflow when PCollection has anything other than hardcoded name pattern
[ https://issues.apache.org/jira/browse/BEAM-1867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kenneth Knowles resolved BEAM-1867.
-----------------------------------
Resolution: Fixed
> Element counts missing on Cloud Dataflow when PCollection has anything other than hardcoded name pattern
> --------------------------------------------------------------------------------------------------------
>
> Key: BEAM-1867
> URL: https://issues.apache.org/jira/browse/BEAM-1867
> Project: Beam
> Issue Type: Bug
> Components: runner-dataflow
> Reporter: Kenneth Knowles
> Assignee: Kenneth Knowles
> Priority: Blocker
> Fix For: First stable release
>
>
> In 0.6.0 and 0.7.0-SNAPSHOT (the versions where this is confirmed; earlier versions may also be affected), element count and byte metrics are not reported correctly when the output PCollection of a primitive transform is not named {{transformname + ".out" + index}}.
> In 0.7.0-SNAPSHOT, the DataflowRunner uses pipeline surgery to replace the composite {{ParDoSingle}} (which contains a {{ParDoMulti}}) with a Dataflow-specific non-composite {{ParDoSingle}}. As a result, metrics are reported for names like {{"ParDoSingle(MyDoFn).out"}} when they should be reported for {{"ParDoSingle/ParDoMulti(MyDoFn).out"}}, so all single-output ParDo transforms lack these metrics on their outputs.
> In 0.6.0, the same problem occurs whenever the user calls {{PCollection.setName}} to give a collection a meaningful name.
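The mismatch described above can be sketched roughly as follows. This is a hypothetical illustration, not Beam or Dataflow code: the class `NamePatternSketch`, its methods, and the map of reported counts are all assumptions made for the example. It takes the pattern {{transformname + ".out" + index}} literally from the description (the example names in the description omit the index) and shows that a count recorded under the pattern-derived name cannot be found under any other collection name.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration only; none of these names exist in Beam. */
final class NamePatternSketch {

  /** Builds a name from the pattern quoted in the issue: transformname + ".out" + index. */
  static String hardcodedName(String transformName, int index) {
    return transformName + ".out" + index;
  }

  /**
   * Looks up a count by the collection's actual name.
   * Returns null when nothing was reported under that exact name.
   */
  static Long elementCount(Map<String, Long> reported, String actualName) {
    return reported.get(actualName);
  }

  public static void main(String[] args) {
    // Assumption: counts are keyed by the pattern-derived name.
    Map<String, Long> reported = new HashMap<>();
    reported.put(hardcodedName("ParDoSingle(MyDoFn)", 0), 42L);

    // A collection whose name follows the pattern is found.
    System.out.println(elementCount(reported, "ParDoSingle(MyDoFn).out0"));

    // A collection named after the composite/inner transform is not found.
    System.out.println(elementCount(reported, "ParDoSingle/ParDoMulti(MyDoFn).out0"));

    // A collection renamed by the user (as with PCollection.setName) is not found.
    System.out.println(elementCount(reported, "MyMeaningfulName"));
  }
}
```

Keying the lookup by exact string is what makes both failure modes in the description look the same: any name that differs from the one the metrics were recorded under simply yields no count.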