Posted to issues@beam.apache.org by "Beam JIRA Bot (Jira)" <ji...@apache.org> on 2020/08/10 17:08:23 UTC

[jira] [Commented] (BEAM-6372) Direct Runner should marshal data in a similar way to Dataflow runner

    [ https://issues.apache.org/jira/browse/BEAM-6372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17174788#comment-17174788 ] 

Beam JIRA Bot commented on BEAM-6372:
-------------------------------------

This issue is P2 but has been unassigned without any comment for 60 days so it has been labeled "stale-P2". If this issue is still affecting you, we care! Please comment and remove the label. Otherwise, in 14 days the issue will be moved to P3.

Please see https://beam.apache.org/contribute/jira-priorities/ for a detailed explanation of what these priorities mean.


> Direct Runner should marshal data in a similar way to Dataflow runner
> ---------------------------------------------------------------------
>
>                 Key: BEAM-6372
>                 URL: https://issues.apache.org/jira/browse/BEAM-6372
>             Project: Beam
>          Issue Type: Improvement
>          Components: runner-direct, sdk-go
>            Reporter: Andrew Brampton
>            Priority: P2
>              Labels: stale-P2
>
> I would test my pipeline using the direct runner, and it would happily run on a sample. When I ran it on the Dataflow runner, it would run for an hour, then reach a stage that crashed like so:
>  
> {quote}java.util.concurrent.ExecutionException: java.lang.RuntimeException: Error received from SDK harness for instruction -224: execute failed: panic: reflect: Call using main.HistogramResult as type struct \{ Key string "json:\"key\""; Files []string "json:\"files\""; Histogram palette.ColorHistogram "json:\"histogram,omitempty\""; Palette []struct { R uint8; G uint8; B uint8; A uint8 } "json:\"palette\"" } goroutine 70 [running]:{quote}
> This was because I forgot to register my HistogramResult type.
> It would be useful if the direct runner tried to marshal and unmarshal all types, to help expose issues like this earlier.
> Also, when running on Dataflow, the values of flags and captured variables would be the empty/default value. It would be good if the direct runner also exhibited this behaviour. For example:
> {code}
> prefix := "X"
> s = s.Scope("Prefix " + prefix)
> c = beam.ParDo(s, func(value string) string {
> 	return prefix + value
> }, c)
> {code}
> This will prefix "X" on the Direct runner, but will prefix "" on Dataflow. Subtle behaviour, but I suspect the direct runner could expose this if it marshalled and unmarshalled the func like the Dataflow runner.
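
The effect described in the issue can be sketched with the standard library alone. This is an analogy, not the Beam Go SDK's actual serialization path: it assumes that shipping a function to a remote worker behaves roughly like an encode/decode round trip, and it uses encoding/json for that. The names prefixFn, hiddenFn and roundTrip are hypothetical and do not come from Beam. State held in an exported field survives the round trip, while state held in an unexported field (standing in for a captured variable) comes back as the zero value.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// prefixFn is a hypothetical stand-in for a struct-based DoFn.
// Its state is an exported field, so an encoder can see it.
type prefixFn struct {
	Prefix string `json:"prefix"`
}

func (f *prefixFn) ProcessElement(value string) string {
	return f.Prefix + value
}

// hiddenFn is a hypothetical stand-in for a closure: its state is
// unexported, so encoding/json silently skips it.
type hiddenFn struct {
	prefix string
}

func (f *hiddenFn) ProcessElement(value string) string {
	return f.prefix + value
}

// roundTrip encodes in and decodes the result into out, imitating what
// happens when a value is sent to a worker in another process.
// Assumption: JSON is used here only for illustration.
func roundTrip(in, out interface{}) error {
	data, err := json.Marshal(in)
	if err != nil {
		return fmt.Errorf("marshal: %w", err)
	}
	if err := json.Unmarshal(data, out); err != nil {
		return fmt.Errorf("unmarshal: %w", err)
	}
	return nil
}

func main() {
	var kept prefixFn
	if err := roundTrip(&prefixFn{Prefix: "X"}, &kept); err != nil {
		panic(err)
	}
	fmt.Printf("%q\n", kept.ProcessElement("value")) // prints "Xvalue"

	var lost hiddenFn
	if err := roundTrip(&hiddenFn{prefix: "X"}, &lost); err != nil {
		panic(err)
	}
	fmt.Printf("%q\n", lost.ProcessElement("value")) // prints "value"
}
```

A local runner that performed such a round trip on every function and element type before executing would show the second result on a small sample, rather than an hour into a remote run.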



--
This message was sent by Atlassian Jira
(v8.3.4#803005)