Posted to issues@flink.apache.org by GitBox <gi...@apache.org> on 2022/07/22 15:33:45 UTC

[GitHub] [flink] pnowojski commented on a diff in pull request #20245: [FLINK-28487][connectors] Introduce configurable RateLimitingStrategy…

pnowojski commented on code in PR #20245:
URL: https://github.com/apache/flink/pull/20245#discussion_r927747602


##########
flink-connectors/flink-connector-base/src/main/java/org/apache/flink/connector/base/sink/writer/strategy/AIMDScalingStrategy.java:
##########
@@ -0,0 +1,85 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.connector.base.sink.writer.strategy;
+
+import org.apache.flink.annotation.PublicEvolving;
+import org.apache.flink.util.Preconditions;
+
+/**
+ * AIMDScalingStrategy scales up linearly and scales down multiplicatively. See
+ * https://en.wikipedia.org/wiki/Additive_increase/multiplicative_decrease for more details.
+ */
+@PublicEvolving
+public class AIMDScalingStrategy {
+    private final int increaseRate;
+    private final double decreaseFactor;
+    private final int rateThreshold;
+
+    public AIMDScalingStrategy(int increaseRate, double decreaseFactor, int rateThreshold) {
+        Preconditions.checkArgument(increaseRate > 0, "increaseRate must be a positive integer.");
+        Preconditions.checkArgument(
+                decreaseFactor < 1.0 && decreaseFactor > 0.0,
+                "decreaseFactor must be strictly between 0.0 and 1.0.");
+        Preconditions.checkArgument(rateThreshold > 0, "rateThreshold must be a positive integer.");
+        Preconditions.checkArgument(
+                rateThreshold >= increaseRate,
+                "rateThreshold must be greater than or equal to increaseRate.");
+        this.increaseRate = increaseRate;
+        this.decreaseFactor = decreaseFactor;
+        this.rateThreshold = rateThreshold;
+    }
+
+    public int scaleUp(int currentRate) {
+        return Math.min(currentRate + increaseRate, rateThreshold);
+    }
+
+    public int scaleDown(int currentRate) {
+        return Math.max(1, (int) Math.round(currentRate * decreaseFactor));
+    }
+
+    @PublicEvolving
+    public static AIMDScalingStrategyBuilder builder() {
+        return new AIMDScalingStrategyBuilder();
+    }
+
+    /** Builder for {@link AIMDScalingStrategy}. */
+    public static class AIMDScalingStrategyBuilder {
+
+        private int increaseRate = 10;
+        private double decreaseFactor = 0.5;
+        private int rateThreshold;

Review Comment:
   nit: why does it have the default value `0` if later you are doing `checkArgument(rateThreshold > 0)` in the constructor? If it's an obligatory parameter without a good default value, then I think it would be cleaner to pass it through the constructor of the `AIMDScalingStrategyBuilder`:
   
   ```
   AIMDScalingStrategyBuilder
     .builder(myRateThreshold)
     .setOptionalParam1(foo)
     .setOptionalParam2(bar)
   ```
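   A hedged sketch of what this could look like in full (all names here mirror the PR, but the exact builder shape is the reviewer's hypothetical, not the merged code): the obligatory `rateThreshold` moves into the static factory method and is validated eagerly, while the optional knobs keep their sane defaults and fluent setters.

   ```java
   /** Sketch: AIMD strategy whose builder takes the obligatory parameter up front. */
   public class AIMDScalingStrategy {
       private final int increaseRate;
       private final double decreaseFactor;
       private final int rateThreshold;

       private AIMDScalingStrategy(int increaseRate, double decreaseFactor, int rateThreshold) {
           this.increaseRate = increaseRate;
           this.decreaseFactor = decreaseFactor;
           this.rateThreshold = rateThreshold;
       }

       /** Additive increase, capped at the threshold. */
       public int scaleUp(int currentRate) {
           return Math.min(currentRate + increaseRate, rateThreshold);
       }

       /** Multiplicative decrease, never dropping below 1. */
       public int scaleDown(int currentRate) {
           return Math.max(1, (int) Math.round(currentRate * decreaseFactor));
       }

       /** Obligatory parameter is required here; no invalid default exists. */
       public static Builder builder(int rateThreshold) {
           return new Builder(rateThreshold);
       }

       public static class Builder {
           private final int rateThreshold;
           private int increaseRate = 10;       // optional, sane default
           private double decreaseFactor = 0.5; // optional, sane default

           private Builder(int rateThreshold) {
               if (rateThreshold <= 0) {
                   throw new IllegalArgumentException("rateThreshold must be a positive integer.");
               }
               this.rateThreshold = rateThreshold;
           }

           public Builder setIncreaseRate(int increaseRate) {
               this.increaseRate = increaseRate;
               return this;
           }

           public Builder setDecreaseFactor(double decreaseFactor) {
               this.decreaseFactor = decreaseFactor;
               return this;
           }

           public AIMDScalingStrategy build() {
               return new AIMDScalingStrategy(increaseRate, decreaseFactor, rateThreshold);
           }
       }
   }
   ```

   With this shape, `AIMDScalingStrategy.builder(100).build()` can never produce a strategy that fails its own constructor checks.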



##########
flink-connectors/flink-connector-base/src/main/java/org/apache/flink/connector/base/sink/writer/strategy/RequestInfo.java:
##########
@@ -0,0 +1,67 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.connector.base.sink.writer.strategy;
+
+import org.apache.flink.annotation.PublicEvolving;
+
+import java.time.Instant;
+
+/** Dataclass to encapsulate information about starting requests. */
+@PublicEvolving
+public class RequestInfo {
+    private final int batchSize;
+    private final Instant requestStartTime;
+
+    private RequestInfo(int batchSize, Instant requestStartTime) {
+        this.batchSize = batchSize;
+        this.requestStartTime = requestStartTime;
+    }
+
+    @PublicEvolving
+    public static RequestInfoBuilder builder() {
+        return new RequestInfoBuilder();
+    }
+
+    public int getBatchSize() {
+        return batchSize;
+    }
+
+    public Instant getRequestStartTime() {
+        return requestStartTime;
+    }
+
+    /** Builder for {@link RequestInfo} dataclass. */
+    public static class RequestInfoBuilder {
+        private int batchSize;
+        private Instant requestStartTime;
+
+        public RequestInfoBuilder setBatchSize(final int batchSize) {
+            this.batchSize = batchSize;
+            return this;
+        }
+
+        public RequestInfoBuilder setRequestStartTime(final Instant requestStartTime) {
+            this.requestStartTime = requestStartTime;
+            return this;
+        }
+
+        public RequestInfo build() {
+            return new RequestInfo(batchSize, requestStartTime);
+        }
+    }

Review Comment:
   Why do we need a builder for this class? Especially given that both of those parameters seem to be obligatory?



##########
flink-connectors/flink-connector-base/src/main/java/org/apache/flink/connector/base/sink/writer/AsyncSinkWriter.java:
##########
@@ -344,69 +352,69 @@ public void write(InputT element, Context context) throws IOException, Interrupt
      * </ul>
      */
     private void nonBlockingFlush() throws InterruptedException {
-        while (!isInFlightRequestOrMessageLimitExceeded()
+        while (!rateLimitingStrategy.shouldBlock(createRequestInfo())
                 && (bufferedRequestEntries.size() >= getNextBatchSizeLimit()
                         || bufferedRequestEntriesTotalSizeInBytes >= maxBatchSizeInBytes)) {
             flush();
         }
     }
 
-    /**
-     * Determines if the sink should block and complete existing in flight requests before it may
-     * prudently create any new ones. This is exactly determined by if the number of requests
-     * currently in flight exceeds the maximum supported by the sink OR if the number of in flight
-     * messages exceeds the maximum determined to be appropriate by the rate limiting strategy.
-     */
-    private boolean isInFlightRequestOrMessageLimitExceeded() {
-        return inFlightRequestsCount >= maxInFlightRequests
-                || inFlightMessages >= rateLimitingStrategy.getRateLimit();
+    private RequestInfo createRequestInfo() {
+        int batchSize = getNextBatchSize();
+        Instant requestStartTime = Instant.now();
+        return RequestInfo.builder()
+                .setBatchSize(batchSize)
+                .setRequestStartTime(requestStartTime)
+                .build();

Review Comment:
   I have some doubts about whether creating a new `RequestInfo` per every record is a good idea.
   
   1. The sheer fact of creating a new object might increase GC pressure and add a bit of overhead.
   2. I would expect `Instant.now()` to be very costly to invoke. Isn't this a syscall underneath? Why do we even need `requestStartTime` in the `RequestInfo`?
   
   And to me it looks like this object is reconstructed over and over again, at least once per `AsyncSinkWriter#write` call?
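   One hedged alternative along the lines of this concern (the `InFlightLimiter` class and its method names are hypothetical, not part of the PR): keep the hot path allocation-free by passing primitives to the strategy instead of a per-record wrapper object, and reach for a relative clock such as `System.nanoTime()` only on the code paths that actually need timing.

   ```java
   /**
    * Sketch of an allocation-free limiter for the write hot path: the
    * throttling decision takes an int, so no RequestInfo-style wrapper
    * object is created per record.
    */
   public final class InFlightLimiter {
       private final int maxInFlightMessages;
       private int inFlightMessages;

       public InFlightLimiter(int maxInFlightMessages) {
           this.maxInFlightMessages = maxInFlightMessages;
       }

       /** Called once per write; primitive argument, no allocation. */
       public boolean shouldBlock(int nextBatchSize) {
           return inFlightMessages + nextBatchSize > maxInFlightMessages;
       }

       /** Called when a request is dispatched. */
       public void registerInFlight(int batchSize) {
           inFlightMessages += batchSize;
       }

       /**
        * Called on completion. If latency matters here, measure it with the
        * relative System.nanoTime() rather than the wall-clock Instant.now().
        */
       public void registerCompleted(int batchSize) {
           inFlightMessages -= batchSize;
       }
   }
   ```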



##########
flink-connectors/flink-connector-base/src/main/java/org/apache/flink/connector/base/sink/writer/strategy/ResultInfo.java:
##########
@@ -0,0 +1,66 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.connector.base.sink.writer.strategy;
+
+import org.apache.flink.annotation.PublicEvolving;
+
+/** Dataclass to encapsulate results from completed requests. */
+@PublicEvolving
+public class ResultInfo {

Review Comment:
   Same two questions as in `RequestInfo`:
   
   1. Having a builder with optional setter methods but without sane default values doesn't make sense to me at first glance.
   2. We shouldn't be exposing the concrete class and the builder as `@PublicEvolving` API (unless I'm missing something).



##########
flink-connectors/flink-connector-base/src/main/java/org/apache/flink/connector/base/sink/writer/AsyncSinkWriter.java:
##########
@@ -256,14 +243,42 @@ public AsyncSinkWriter(
             long maxTimeInBufferMS,
             long maxRecordSizeInBytes,
             Collection<BufferedRequestState<RequestEntryT>> states) {
+        this(
+                elementConverter,
+                context,
+                maxBatchSize,
+                maxBufferedRequests,
+                maxBatchSizeInBytes,
+                maxTimeInBufferMS,
+                maxRecordSizeInBytes,
+                states,
+                CongestionControlRateLimitingStrategy.builder()
+                        .setMaxInFlightRequests(maxInFlightRequests)
+                        .setInitialMaxInFlightMessages(maxBatchSize)
+                        .setAimdScalingStrategy(
+                                AIMDScalingStrategy.builder()
+                                        .setRateThreshold(maxBatchSize * maxInFlightRequests)
+                                        .build())
+                        .build());
+    }
+
+    public AsyncSinkWriter(
+            ElementConverter<InputT, RequestEntryT> elementConverter,
+            Sink.InitContext context,
+            int maxBatchSize,
+            int maxBufferedRequests,
+            long maxBatchSizeInBytes,
+            long maxTimeInBufferMS,
+            long maxRecordSizeInBytes,
+            Collection<BufferedRequestState<RequestEntryT>> states,
+            RateLimitingStrategy rateLimitingStrategy) {

Review Comment:
   Does the `AsyncSinkWriter` have any documentation? How are users supposed to know that it exists, and especially what kind of `RateLimitingStrategy` they can use?



##########
flink-connectors/flink-connector-base/src/main/java/org/apache/flink/connector/base/sink/writer/strategy/RequestInfo.java:
##########
@@ -0,0 +1,67 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.connector.base.sink.writer.strategy;
+
+import org.apache.flink.annotation.PublicEvolving;
+
+import java.time.Instant;
+
+/** Dataclass to encapsulate information about starting requests. */
+@PublicEvolving
+public class RequestInfo {
+    private final int batchSize;
+    private final Instant requestStartTime;
+
+    private RequestInfo(int batchSize, Instant requestStartTime) {
+        this.batchSize = batchSize;
+        this.requestStartTime = requestStartTime;
+    }
+
+    @PublicEvolving
+    public static RequestInfoBuilder builder() {
+        return new RequestInfoBuilder();
+    }

Review Comment:
   Does the user need to know how to construct this class? Shouldn't only a `RequestInfo` interface be `@PublicEvolving`, while the concrete implementation stays `@Internal` to the `AsyncSinkWriter`?
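   A hedged sketch of the visibility split being described (the `DefaultRequestInfo` name is hypothetical; the Flink stability annotations are shown as comments to keep the sketch dependency-free): the interface carries the public contract, the concrete class stays internal to the writer.

   ```java
   // @PublicEvolving  -- in Flink this would be org.apache.flink.annotation.PublicEvolving
   // The interface is the only type connector authors should program against.
   public interface RequestInfo {
       int getBatchSize();
   }

   // @Internal  -- concrete implementation, constructed only by AsyncSinkWriter
   final class DefaultRequestInfo implements RequestInfo {
       private final int batchSize;

       DefaultRequestInfo(int batchSize) {
           this.batchSize = batchSize;
       }

       @Override
       public int getBatchSize() {
           return batchSize;
       }
   }
   ```

   With this split, the builder and constructor disappear from the public surface entirely, which also resolves the earlier question about obligatory parameters on a public builder.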



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org