Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2020/04/23 04:31:44 UTC

[GitHub] [beam] boyuanzz commented on a change in pull request #11472: [BEAM-2939] Migrate from HasSize to HasProgress interface for restriction trackers and use the progress value during sizing, splitting and reporting

boyuanzz commented on a change in pull request #11472:
URL: https://github.com/apache/beam/pull/11472#discussion_r413493450



##########
File path: sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/reflect/DoFnInvoker.java
##########
@@ -94,9 +94,13 @@ void invokeOnTimer(
   void invokeSplitRestriction(ArgumentProvider<InputT, OutputT> arguments);
 
   /**
-   * Invoke the {@link DoFn.GetSize} method on the bound {@link DoFn}. Falls back to get the size
-   * from the {@link RestrictionTracker} if it supports {@link Sizes.HasSize}, otherwise returns
-   * 1.0.
+   * Invoke the {@link DoFn.GetSize} method on the bound {@link DoFn}. Falls back to:
+   *
+   * <ol>
+   *   <li>get the work remaining from the {@link RestrictionTracker} if it supports {@link
+   *       HasProgress}.
+   *   <li>returning the constant {@link 1.0}.

Review comment:
       Do we want to highlight that this fallback may affect batch autoscaling if `HasProgress` is not implemented correctly?

##########
File path: sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/splittabledofn/Sizes.java
##########
@@ -1,54 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- *     http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-package org.apache.beam.sdk.transforms.splittabledofn;
-
-import org.apache.beam.sdk.annotations.Experimental;
-import org.apache.beam.sdk.annotations.Experimental.Kind;
-
-/** Definitions and convenience methods for reporting sizes for SplittableDoFns. */
-@Experimental(Kind.SPLITTABLE_DO_FN)
-public final class Sizes {
-  /**
-   * {@link RestrictionTracker}s which can provide a size should implement this interface.
-   * Implementations that do not implement this interface will be assumed to have an equivalent
-   * size.
-   */
-  public interface HasSize {

Review comment:
       I thought we still wanted to keep this and make `HasProgress` a fallback?

##########
File path: sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/reflect/ByteBuddyDoFnInvokerFactory.java
##########
@@ -397,12 +396,14 @@ public WatermarkEstimatorStateT getState() {
   }
 
   public static class DefaultGetSize {
-    /** Uses {@link Sizes.HasSize} to produce the size. */
+    /** Uses {@link HasProgress} to produce the size. */
     @SuppressWarnings("unused")
     public static <InputT, OutputT> double invokeGetSize(
         DoFnInvoker.ArgumentProvider<InputT, OutputT> argumentProvider) {
-      if (argumentProvider.restrictionTracker() instanceof HasSize) {
-        return ((HasSize) argumentProvider.restrictionTracker()).getSize();
+      if (argumentProvider.restrictionTracker() instanceof HasProgress) {

Review comment:
       I'm not very familiar with how this invoker works, but I thought we would check whether `restrictionTracker` implements `HasSize` and, if it does not, fall back to progress.
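
       To make the ordering in question concrete, here is a minimal sketch of a size lookup that prefers `HasSize`, then falls back to progress, then to the constant 1.0. It is an illustration only, not the code in this PR: `SizeFallbackSketch`, `sizeOf`, and the two nested interfaces are hypothetical, simplified stand-ins for the Beam types, and the progress interface is assumed to expose remaining work directly as a `double`.

```java
/** Illustrative only: simplified stand-ins, not the Beam SDK classes. */
public final class SizeFallbackSketch {

  /** Hypothetical stand-in for the removed Sizes.HasSize interface. */
  public interface HasSize {
    double getSize();
  }

  /** Hypothetical stand-in for HasProgress, reduced to the remaining work. */
  public interface HasProgress {
    double getWorkRemaining();
  }

  /** Prefers an explicit size, then remaining work, then the constant 1.0. */
  public static double sizeOf(Object tracker) {
    if (tracker instanceof HasSize) {
      return ((HasSize) tracker).getSize();
    }
    if (tracker instanceof HasProgress) {
      return ((HasProgress) tracker).getWorkRemaining();
    }
    // Neither interface is implemented: every restriction looks equally sized.
    return 1.0;
  }

  public static void main(String[] args) {
    HasSize sized = () -> 42.0;
    HasProgress progressing = () -> 7.0;
    System.out.println(sizeOf(sized));        // explicit size is used
    System.out.println(sizeOf(progressing));  // remaining work is used
    System.out.println(sizeOf(new Object())); // constant fallback
  }
}
```

       With this ordering, a tracker that implements both interfaces reports its explicit size, and the progress value is consulted only when no size is available.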



