You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2020/06/02 17:58:21 UTC

[GitHub] [beam] lukecwik commented on a change in pull request #11862: [BEAM-10141] HL7v2 io timestamps / watermark estimate

lukecwik commented on a change in pull request #11862:
URL: https://github.com/apache/beam/pull/11862#discussion_r434064107



##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/healthcare/HL7v2IO.java
##########
@@ -179,25 +189,51 @@ public static ListHL7v2Messages readAll(List<String> hl7v2Stores) {
     return new ListHL7v2Messages(StaticValueProvider.of(hl7v2Stores), StaticValueProvider.of(null));
   }
 
+  /** Read all HL7v2 Messages from multiple stores as sendTime {@link TimestampedValue}s. */
+  public static ListTimestampedHL7v2Messages readAllWithTimestamps(List<String> hl7v2Stores) {

Review comment:
       Instead of adding static methods for each variant, any reason to not make ListHL7v2Messages be a builder so someone could invoke `withMessageTimestamps`?
   
   This would have also make sense to make `withFilter` a builder like method on ListHL7v2Messages

##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/healthcare/HL7v2IO.java
##########
@@ -397,7 +492,8 @@ public void instantiateHealthcareClient() throws IOException {
         public void processElement(ProcessContext context) {
           String msgId = context.element();
           try {
-            context.output(HL7v2Message.fromModel(fetchMessage(this.client, msgId)));
+            HL7v2Message msg = HL7v2Message.fromModel(fetchMessage(this.client, msgId));
+            context.output(TimestampedValue.of(msg, Instant.parse(msg.getSendTime())));

Review comment:
       You should really be using `context.outputWithTimestamp` and not `TimestampedValue`.
   
   It would make sense to have an object that chooses which timestamp to use (input elements timestamp or messages send time) instead of having a different variant of the DoFn. In both cases you would use the monotonically increasing estimator.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org