You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@beam.apache.org by GitBox <gi...@apache.org> on 2020/03/13 15:53:16 UTC

[GitHub] [beam] mxm commented on a change in pull request #10992: [BEAM-2939] Provide user facing API for reporting the watermark in SplittableDoFns.

mxm commented on a change in pull request #10992: [BEAM-2939] Provide user facing API for reporting the watermark in SplittableDoFns.
URL: https://github.com/apache/beam/pull/10992#discussion_r392275988
 
 

 ##########
 File path: sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/DoFn.java
 ##########
 @@ -1070,6 +1091,120 @@ public Duration getAllowedTimestampSkew() {
   @Experimental(Kind.SPLITTABLE_DO_FN)
   public @interface NewTracker {}
 
+  /**
+   * Annotation for the method that maps an element and restriction to initial watermark estimator
+   * state for a <a href="https://s.apache.org/splittable-do-fn">splittable</a> {@link DoFn}.
+   *
+   * <p>Signature: {@code WatermarkEstimatorStateT getInitialWatermarkState(<arguments>);}
+   *
+   * <p>This method must satisfy the following constraints:
+   *
+   * <ul>
+   *   <li>The return type {@code WatermarkEstimatorStateT} defines the watermark state type used
+   *       within this splittable DoFn. All other methods that use a {@link
+   *       WatermarkEstimatorState @WatermarkEstimatorState} parameter must use the same type that
+   *       is used here. It is suggested to use as narrow of a return type definition as possible
+   *       (for example prefer to use a square type over a shape type as a square is a type of a
+   *       shape).
+   *   <li>If one of its arguments is tagged with the {@link Element} annotation, then it will be
+   *       passed the current element being processed; the argument must be of type {@code InputT}.
+   *       Note that automatic conversion of {@link Row}s and {@link FieldAccess} parameters are
+   *       currently unsupported.
+   *   <li>If one of its arguments is tagged with the {@link Restriction} annotation, then it will
+   *       be passed the current restriction being processed; the argument must be of type {@code
+   *       RestrictionT}.
+   *   <li>If one of its arguments is tagged with the {@link Timestamp} annotation, then it will be
+   *       passed the timestamp of the current element being processed; the argument must be of type
+   *       {@link Instant}.
+   *   <li>If one of its arguments is a subtype of {@link BoundedWindow}, then it will be passed the
+   *       window of the current element. When applied by {@link ParDo} the subtype of {@link
+   *       BoundedWindow} must match the type of windows on the input {@link PCollection}. If the
+   *       window is not accessed a runner may perform additional optimizations.
+   *   <li>If one of its arguments is of type {@link PaneInfo}, then it will be passed information
+   *       about the current triggering pane.
+   *   <li>If one of the parameters is of type {@link PipelineOptions}, then it will be passed the
+   *       options for the current pipeline.
+   * </ul>
+   */
+  @Documented
+  @Retention(RetentionPolicy.RUNTIME)
+  @Target(ElementType.METHOD)
+  @Experimental(Kind.SPLITTABLE_DO_FN)
+  public @interface GetInitialWatermarkEstimatorState {}
 
 Review comment:
   Note that this terminology deviates from the current `getCheckpointMark()`. I like it though.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services