You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@beam.apache.org by "Eric Kolotyluk (Jira)" <ji...@apache.org> on 2022/03/30 15:47:00 UTC
[jira] [Created] (BEAM-14206) Direct Runner Does Not Work As Well As Google Dataflow
Eric Kolotyluk created BEAM-14206:
-------------------------------------
Summary: Direct Runner Does Not Work As Well As Google Dataflow
Key: BEAM-14206
URL: https://issues.apache.org/jira/browse/BEAM-14206
Project: Beam
Issue Type: Bug
Components: runner-direct
Affects Versions: 2.37.0
Reporter: Eric Kolotyluk
*As a Staff Software Engineer* at Forgerock developing Beam applications,
*I want* Direct Runner to work the same as Google Dataflow with Flex Templates,
*So that* Beam applications can be developed and tested locally.
*_This is a high-level report that may need to be broken down into more specific reports._*
{code:java}
var stream = pipeline // Extract - Transform - Load, Lather Rinse Repeat
//.apply("Extract from Source", PubsubIO.readStrings().fromSubscription(inputSubscription).withTimestampAttribute("date"))
.apply("Extract from Source", PubsubIO.readStrings().fromSubscription(inputSubscription))
.apply("Transform to TableRow", ParDo.of(new TransformToTableRow()))
.apply("Define Window", Window.<TableRow>into(FixedWindows.of(Duration.standardMinutes(1))).withAllowedLateness(Duration.standardMinutes(10)).accumulatingFiredPanes())
.apply("Transform to JSON", MapElements.into(TypeDescriptors.strings()).via(GenericJson::toString));
stream.apply("Load to PubSub", PubsubIO.writeStrings().to(sinkTopic));
stream.apply("Load to GCS", TextIO.write().to(sinkUri).withWindowedWrites().withNumShards(1));
{code}
This code work as expected running under Google Dataflow as a Flex Template.
When running locally under Direct Runner there are two problems
# .withTimestampAttribute("date") fails to find the "date" attribute
# No results are written to GCS
** However, results are written to the local file system as temporary files
** Results are written to PubSub correctly
--
This message was sent by Atlassian Jira
(v8.20.1#820001)