Posted to issues@flink.apache.org by GitBox <gi...@apache.org> on 2020/10/14 17:47:06 UTC

[GitHub] [flink] zhuzhurk opened a new pull request #13641: [WIP][FLINK-17760][tests] Rework tests to not rely on legacy scheduling logics in ExecutionGraph anymore

zhuzhurk opened a new pull request #13641:
URL: https://github.com/apache/flink/pull/13641


   ## What is the purpose of the change
   
   Rework the tests so that they no longer rely on the legacy scheduling logic in ExecutionGraph.
   
   
   ## Verifying this change
   
   This change is already covered by existing tests.
   
   ## Does this pull request potentially affect one of the following parts:
   
     - Dependencies (does it add or upgrade a dependency): (yes / **no**)
     - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (yes / **no**)
     - The serializers: (yes / **no** / don't know)
     - The runtime per-record code paths (performance sensitive): (yes / **no** / don't know)
     - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: (yes / **no** / don't know)
     - The S3 file system connector: (yes / **no** / don't know)
   
   ## Documentation
   
     - Does this pull request introduce a new feature? (yes / **no**)
     - If yes, how is the feature documented? (**not applicable** / docs / JavaDocs / not documented)
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [flink] flinkbot edited a comment on pull request #13641: [FLINK-17760][tests] Rework tests to not rely on legacy scheduling codes in ExecutionGraph components

Posted by GitBox <gi...@apache.org>.
flinkbot edited a comment on pull request #13641:
URL: https://github.com/apache/flink/pull/13641#issuecomment-708569491


   ## CI report:
   
   * 09d8deb89416f53dfe8b5c16fb9d723cbd98612c UNKNOWN
   * fe1562c5cda8ecb15f6af1afdf7b6217e6c20c42 UNKNOWN
   * 89bea5233d5efb9db88eacc21b445a617a8c3c27 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10899) 
   
   <details>
   <summary>Bot commands</summary>
     The @flinkbot bot supports the following commands:
   
    - `@flinkbot run travis` re-run the last Travis build
    - `@flinkbot run azure` re-run the last Azure build
   </details>





[GitHub] [flink] azagrebin commented on a change in pull request #13641: [FLINK-17760][tests] Rework tests to not rely on legacy scheduling codes in ExecutionGraph components

Posted by GitBox <gi...@apache.org>.
azagrebin commented on a change in pull request #13641:
URL: https://github.com/apache/flink/pull/13641#discussion_r532696797



##########
File path: flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/ExecutionGraphSchedulingTest.java
##########
@@ -1,637 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- *     http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-
-package org.apache.flink.runtime.executiongraph;
-
-import org.apache.flink.api.common.JobID;
-import org.apache.flink.api.common.JobStatus;
-import org.apache.flink.api.common.time.Time;
-import org.apache.flink.configuration.Configuration;
-import org.apache.flink.metrics.groups.UnregisteredMetricsGroup;
-import org.apache.flink.runtime.blob.VoidBlobWriter;
-import org.apache.flink.runtime.checkpoint.StandaloneCheckpointRecoveryFactory;
-import org.apache.flink.runtime.clusterframework.types.AllocationID;
-import org.apache.flink.runtime.clusterframework.types.ResourceID;
-import org.apache.flink.runtime.concurrent.ComponentMainThreadExecutorServiceAdapter;
-import org.apache.flink.runtime.execution.ExecutionState;
-import org.apache.flink.runtime.executiongraph.restart.NoRestartStrategy;
-import org.apache.flink.runtime.executiongraph.utils.SimpleAckingTaskManagerGateway;
-import org.apache.flink.runtime.instance.SimpleSlotContext;
-import org.apache.flink.runtime.io.network.partition.NoOpJobMasterPartitionTracker;
-import org.apache.flink.runtime.io.network.partition.ResultPartitionType;
-import org.apache.flink.runtime.jobgraph.DistributionPattern;
-import org.apache.flink.runtime.jobgraph.JobGraph;
-import org.apache.flink.runtime.jobgraph.JobVertex;
-import org.apache.flink.runtime.jobgraph.ScheduleMode;
-import org.apache.flink.runtime.jobmanager.scheduler.Locality;
-import org.apache.flink.runtime.jobmanager.slots.DummySlotOwner;
-import org.apache.flink.runtime.jobmanager.slots.TaskManagerGateway;
-import org.apache.flink.runtime.jobmanager.slots.TestingSlotOwner;
-import org.apache.flink.runtime.jobmaster.LogicalSlot;
-import org.apache.flink.runtime.jobmaster.SlotOwner;
-import org.apache.flink.runtime.jobmaster.SlotRequestId;
-import org.apache.flink.runtime.jobmaster.TestingLogicalSlot;
-import org.apache.flink.runtime.jobmaster.TestingLogicalSlotBuilder;
-import org.apache.flink.runtime.jobmaster.slotpool.SingleLogicalSlot;
-import org.apache.flink.runtime.jobmaster.slotpool.SlotProvider;
-import org.apache.flink.runtime.shuffle.NettyShuffleMaster;
-import org.apache.flink.runtime.taskmanager.TaskManagerLocation;
-import org.apache.flink.runtime.testtasks.NoOpInvokable;
-import org.apache.flink.runtime.testutils.DirectScheduledExecutorService;
-import org.apache.flink.util.FlinkException;
-import org.apache.flink.util.TestLogger;
-
-import org.junit.After;
-import org.junit.Test;
-
-import javax.annotation.Nonnull;
-
-import java.net.InetAddress;
-import java.util.Set;
-import java.util.concurrent.ArrayBlockingQueue;
-import java.util.concurrent.BlockingQueue;
-import java.util.concurrent.CompletableFuture;
-import java.util.concurrent.ConcurrentHashMap;
-import java.util.concurrent.ConcurrentMap;
-import java.util.concurrent.ScheduledExecutorService;
-import java.util.concurrent.TimeUnit;
-import java.util.concurrent.TimeoutException;
-
-import static org.hamcrest.Matchers.empty;
-import static org.hamcrest.Matchers.is;
-import static org.junit.Assert.assertEquals;
-import static org.junit.Assert.assertFalse;
-import static org.junit.Assert.assertThat;
-
-/**
- * Tests for the scheduling of the execution graph. This tests that
- * for example the order of deployments is correct and that bulk slot allocation
- * works properly.
- */
-public class ExecutionGraphSchedulingTest extends TestLogger {

Review comment:
       Not sure whether we test how slot release/timeout affects the new scheduling;
   the same applies to `testCancellationOfIncompleteScheduling`.







[GitHub] [flink] flinkbot edited a comment on pull request #13641: [FLINK-17760][tests] Rework tests to not rely on legacy scheduling codes in ExecutionGraph components

Posted by GitBox <gi...@apache.org>.
flinkbot edited a comment on pull request #13641:
URL: https://github.com/apache/flink/pull/13641#issuecomment-708569491


   ## CI report:
   
   * 09d8deb89416f53dfe8b5c16fb9d723cbd98612c UNKNOWN
   * 517b53a68f3e9c8c0897cd7afba90b8a9befaa4f Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8807) 
   * 021cac170ea26cddfd8af0a2bec5fea4e6a76b69 UNKNOWN
   





[GitHub] [flink] flinkbot edited a comment on pull request #13641: [WIP][FLINK-17760][tests] Rework tests to not rely on legacy scheduling logics in ExecutionGraph anymore

Posted by GitBox <gi...@apache.org>.
flinkbot edited a comment on pull request #13641:
URL: https://github.com/apache/flink/pull/13641#issuecomment-708569491


   ## CI report:
   
   * 09d8deb89416f53dfe8b5c16fb9d723cbd98612c UNKNOWN
   * a253ac0cf1f9a7d382cd0e8c59a8c5ba6cb0a636 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8284) 
   * 9ea4cf7454f9b8e0c237ba8f540b224abdf9f7b0 UNKNOWN
   





[GitHub] [flink] zhuzhurk commented on a change in pull request #13641: [FLINK-17760][tests] Rework tests to not rely on legacy scheduling codes in ExecutionGraph components

Posted by GitBox <gi...@apache.org>.
zhuzhurk commented on a change in pull request #13641:
URL: https://github.com/apache/flink/pull/13641#discussion_r533130210



##########
File path: flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/ExecutionGraphNotEnoughResourceTest.java
##########
@@ -113,29 +119,28 @@ public void testRestartWithSlotSharingAndNotEnoughResources() throws Exception {
 			final JobGraph jobGraph = new JobGraph(TEST_JOB_ID, "Test Job", source, sink);
 			jobGraph.setScheduleMode(ScheduleMode.EAGER);
 
-			TestRestartStrategy restartStrategy = new TestRestartStrategy(numRestarts, false);
+			RestartBackoffTimeStrategy restartStrategy = new FixedDelayRestartBackoffTimeStrategy.FixedDelayRestartBackoffTimeStrategyFactory(numRestarts, 0).create();
 
-			final ExecutionGraph eg = TestingExecutionGraphBuilder
-				.newBuilder()
-				.setJobGraph(jobGraph)
-				.setSlotProvider(scheduler)
-				.setRestartStrategy(restartStrategy)
-				.setAllocationTimeout(Time.milliseconds(1L))
+			final SchedulerBase schedulerNG = SchedulerTestingUtils
+				.newSchedulerBuilderWithDefaultSlotAllocator(jobGraph, scheduler, Time.milliseconds(1))
+				.setRestartBackoffTimeStrategy(restartStrategy)
+				.setSchedulingStrategyFactory(new EagerSchedulingStrategy.Factory())
+				.setFailoverStrategyFactory(new RestartAllFailoverStrategy.Factory())
 				.build();
+			final ExecutionGraph eg = schedulerNG.getExecutionGraph();

Review comment:
       Now I think we can remove this test, because it actually tests job failure on slot allocation, which is already covered by `PipelinedRegionSchedulingITCase#testFailsOnInsufficientSlot()`.







[GitHub] [flink] tillrohrmann commented on a change in pull request #13641: [FLINK-17760][tests] Rework tests to not rely on legacy scheduling codes in ExecutionGraph components

Posted by GitBox <gi...@apache.org>.
tillrohrmann commented on a change in pull request #13641:
URL: https://github.com/apache/flink/pull/13641#discussion_r543322114



##########
File path: flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/ExecutionTest.java
##########
@@ -82,202 +82,12 @@
 	private final TestingComponentMainThreadExecutor testMainThreadUtil =
 		EXECUTOR_RESOURCE.getComponentMainThreadTestExecutor();
 
-	/**
-	 * Tests that slots are released if we cannot assign the allocated resource to the
-	 * Execution.
-	 */
-	@Test
-	public void testSlotReleaseOnFailedResourceAssignment() throws Exception {
-		final JobVertex jobVertex = createNoOpJobVertex();
-		final JobVertexID jobVertexId = jobVertex.getID();
-
-		final CompletableFuture<LogicalSlot> slotFuture = new CompletableFuture<>();
-		final ProgrammedSlotProvider slotProvider = new ProgrammedSlotProvider(1);
-		slotProvider.addSlot(jobVertexId, 0, slotFuture);
-
-		ExecutionGraph executionGraph = ExecutionGraphTestUtils.createSimpleTestGraph(
-			slotProvider,
-			new NoRestartStrategy(),
-			jobVertex);
-
-		executionGraph.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-
-		ExecutionJobVertex executionJobVertex = executionGraph.getJobVertex(jobVertexId);
-
-		final Execution execution = executionJobVertex.getTaskVertices()[0].getCurrentExecutionAttempt();
-
-		final SingleSlotTestingSlotOwner slotOwner = new SingleSlotTestingSlotOwner();
-
-		final LogicalSlot slot = createTestingLogicalSlot(slotOwner);
-
-		final LogicalSlot otherSlot = new TestingLogicalSlotBuilder().createTestingLogicalSlot();
-
-		CompletableFuture<Execution> allocationFuture = execution.allocateResourcesForExecution(
-			executionGraph.getSlotProviderStrategy(),
-			LocationPreferenceConstraint.ALL,
-			Collections.emptySet());
-
-		assertFalse(allocationFuture.isDone());
-
-		assertEquals(ExecutionState.SCHEDULED, execution.getState());
-
-		// assign a different resource to the execution
-		assertTrue(execution.tryAssignResource(otherSlot));
-
-		// completing now the future should cause the slot to be released
-		slotFuture.complete(slot);
-
-		assertEquals(slot, slotOwner.getReturnedSlotFuture().get());
-	}
-
 	private TestingLogicalSlot createTestingLogicalSlot(SlotOwner slotOwner) {
 		return new TestingLogicalSlotBuilder()
 			.setSlotOwner(slotOwner)
 			.createTestingLogicalSlot();
 	}
 
-	/**
-	 * Tests that the slot is released in case of a execution cancellation when having
-	 * a slot assigned and being in state SCHEDULED.
-	 */
-	@Test
-	public void testSlotReleaseOnExecutionCancellationInScheduled() throws Exception {

Review comment:
       I think you are right @zhuzhurk. Thanks a lot for the information.







[GitHub] [flink] zhuzhurk commented on pull request #13641: [WIP][FLINK-17760][tests] Rework tests to not rely on legacy scheduling logics in ExecutionGraph anymore

Posted by GitBox <gi...@apache.org>.
zhuzhurk commented on pull request #13641:
URL: https://github.com/apache/flink/pull/13641#issuecomment-719297248


   @flinkbot run azure





[GitHub] [flink] flinkbot edited a comment on pull request #13641: [FLINK-17760][tests] Rework tests to not rely on legacy scheduling codes in ExecutionGraph components

Posted by GitBox <gi...@apache.org>.
flinkbot edited a comment on pull request #13641:
URL: https://github.com/apache/flink/pull/13641#issuecomment-708569491


   ## CI report:
   
   * 09d8deb89416f53dfe8b5c16fb9d723cbd98612c UNKNOWN
   * 021cac170ea26cddfd8af0a2bec5fea4e6a76b69 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=9756) 
   * fe1562c5cda8ecb15f6af1afdf7b6217e6c20c42 UNKNOWN
   * 20de9eefa4d4eb7c8b740cd88fc69f1e764bfa7a Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10331) 
   
   <details>
   <summary>Bot commands</summary>
     The @flinkbot bot supports the following commands:
   
    - `@flinkbot run travis` re-run the last Travis build
    - `@flinkbot run azure` re-run the last Azure build
   </details>





[GitHub] [flink] zhuzhurk commented on a change in pull request #13641: [WIP][FLINK-17760][tests] Rework tests to not rely on legacy scheduling logics in ExecutionGraph anymore

Posted by GitBox <gi...@apache.org>.
zhuzhurk commented on a change in pull request #13641:
URL: https://github.com/apache/flink/pull/13641#discussion_r514119222



##########
File path: flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/ExecutionVertexCancelTest.java
##########
@@ -244,104 +250,6 @@ public void testSendCancelAndReceiveFail() throws Exception {
 		assertEquals(vertices.length - 1, exec.getVertex().getExecutionGraph().getRegisteredExecutions().size());
 	}
 
-	// --------------------------------------------------------------------------------------------
-	//  Actions after a vertex has been canceled or while canceling
-	// --------------------------------------------------------------------------------------------
-
-	@Test
-	public void testScheduleOrDeployAfterCancel() {

Review comment:
       superseded by `DefaultScheduler#scheduleOnlyIfVertexIsCreated()`







[GitHub] [flink] flinkbot edited a comment on pull request #13641: [WIP][FLINK-17760][tests] Rework tests to not rely on legacy scheduling logics in ExecutionGraph anymore

Posted by GitBox <gi...@apache.org>.
flinkbot edited a comment on pull request #13641:
URL: https://github.com/apache/flink/pull/13641#issuecomment-708569491


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "09d8deb89416f53dfe8b5c16fb9d723cbd98612c",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "09d8deb89416f53dfe8b5c16fb9d723cbd98612c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "fef9bff28a988ffab789fc6bb0cbde754273a2e0",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=7630",
       "triggerID" : "fef9bff28a988ffab789fc6bb0cbde754273a2e0",
       "triggerType" : "PUSH"
     }, {
       "hash" : "1e959ffb3e7837247842ae4ade724a999ad7ca3b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=7734",
       "triggerID" : "1e959ffb3e7837247842ae4ade724a999ad7ca3b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a253ac0cf1f9a7d382cd0e8c59a8c5ba6cb0a636",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8284",
       "triggerID" : "a253ac0cf1f9a7d382cd0e8c59a8c5ba6cb0a636",
       "triggerType" : "PUSH"
     }, {
       "hash" : "9ea4cf7454f9b8e0c237ba8f540b224abdf9f7b0",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8279",
       "triggerID" : "9ea4cf7454f9b8e0c237ba8f540b224abdf9f7b0",
       "triggerType" : "PUSH"
     }, {
       "hash" : "659fb7eddb0acfa0ef49f76c5fafca21c389f3c0",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8562",
       "triggerID" : "659fb7eddb0acfa0ef49f76c5fafca21c389f3c0",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f7c0de826fcaa9b28eb8054902d3a01fb93dfcc3",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8620",
       "triggerID" : "f7c0de826fcaa9b28eb8054902d3a01fb93dfcc3",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 09d8deb89416f53dfe8b5c16fb9d723cbd98612c UNKNOWN
   * f7c0de826fcaa9b28eb8054902d3a01fb93dfcc3 Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8620) 
   





[GitHub] [flink] tillrohrmann commented on a change in pull request #13641: [FLINK-17760][tests] Rework tests to not rely on legacy scheduling codes in ExecutionGraph components

Posted by GitBox <gi...@apache.org>.
tillrohrmann commented on a change in pull request #13641:
URL: https://github.com/apache/flink/pull/13641#discussion_r544443243



##########
File path: flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/ExecutionGraphRestartTest.java
##########
@@ -1,866 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- *     http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-
-package org.apache.flink.runtime.executiongraph;
-
-import org.apache.flink.api.common.ExecutionConfig;
-import org.apache.flink.api.common.JobID;
-import org.apache.flink.api.common.JobStatus;
-import org.apache.flink.api.common.restartstrategy.RestartStrategies;
-import org.apache.flink.api.common.time.Time;
-import org.apache.flink.runtime.clusterframework.types.AllocationID;
-import org.apache.flink.runtime.clusterframework.types.ResourceProfile;
-import org.apache.flink.runtime.concurrent.ComponentMainThreadExecutor;
-import org.apache.flink.runtime.concurrent.ComponentMainThreadExecutorServiceAdapter;
-import org.apache.flink.runtime.execution.ExecutionState;
-import org.apache.flink.runtime.execution.SuppressRestartsException;
-import org.apache.flink.runtime.executiongraph.failover.FailoverStrategy;
-import org.apache.flink.runtime.executiongraph.restart.FixedDelayRestartStrategy;
-import org.apache.flink.runtime.executiongraph.restart.InfiniteDelayRestartStrategy;
-import org.apache.flink.runtime.executiongraph.restart.RestartStrategy;
-import org.apache.flink.runtime.executiongraph.utils.NotCancelAckingTaskGateway;
-import org.apache.flink.runtime.executiongraph.utils.SimpleAckingTaskManagerGateway;
-import org.apache.flink.runtime.executiongraph.utils.SimpleSlotProvider;
-import org.apache.flink.runtime.io.network.partition.ResultPartitionType;
-import org.apache.flink.runtime.jobgraph.DistributionPattern;
-import org.apache.flink.runtime.jobgraph.JobGraph;
-import org.apache.flink.runtime.jobgraph.JobVertex;
-import org.apache.flink.runtime.jobgraph.ScheduleMode;
-import org.apache.flink.runtime.jobmanager.scheduler.SlotSharingGroup;
-import org.apache.flink.runtime.jobmanager.slots.TaskManagerGateway;
-import org.apache.flink.runtime.jobmaster.JobMasterId;
-import org.apache.flink.runtime.jobmaster.slotpool.LocationPreferenceSlotSelectionStrategy;
-import org.apache.flink.runtime.jobmaster.slotpool.Scheduler;
-import org.apache.flink.runtime.jobmaster.slotpool.SchedulerImpl;
-import org.apache.flink.runtime.jobmaster.slotpool.SlotPool;
-import org.apache.flink.runtime.jobmaster.slotpool.SlotPoolImpl;
-import org.apache.flink.runtime.jobmaster.slotpool.SlotProvider;
-import org.apache.flink.runtime.jobmaster.slotpool.TestingSlotPoolImpl;
-import org.apache.flink.runtime.resourcemanager.ResourceManagerGateway;
-import org.apache.flink.runtime.resourcemanager.utils.TestingResourceManagerGateway;
-import org.apache.flink.runtime.taskexecutor.slot.SlotOffer;
-import org.apache.flink.runtime.taskmanager.LocalTaskManagerLocation;
-import org.apache.flink.runtime.taskmanager.TaskManagerLocation;
-import org.apache.flink.runtime.testtasks.NoOpInvokable;
-import org.apache.flink.util.FlinkException;
-import org.apache.flink.util.TestLogger;
-
-import org.junit.Test;
-
-import javax.annotation.Nonnull;
-
-import java.io.IOException;
-import java.util.ArrayList;
-import java.util.Iterator;
-import java.util.List;
-import java.util.concurrent.CompletableFuture;
-import java.util.function.Consumer;
-
-import static org.apache.flink.runtime.executiongraph.ExecutionGraphTestUtils.completeCancellingForAllVertices;
-import static org.apache.flink.runtime.executiongraph.ExecutionGraphTestUtils.createNoOpVertex;
-import static org.apache.flink.runtime.executiongraph.ExecutionGraphTestUtils.finishAllVertices;
-import static org.apache.flink.runtime.executiongraph.ExecutionGraphTestUtils.switchToRunning;
-import static org.hamcrest.Matchers.is;
-import static org.hamcrest.Matchers.lessThanOrEqualTo;
-import static org.junit.Assert.assertEquals;
-import static org.junit.Assert.assertNotEquals;
-import static org.junit.Assert.assertNotNull;
-import static org.junit.Assert.assertThat;
-import static org.junit.Assert.assertTrue;
-
-/**
- * Tests the restart behaviour of the {@link ExecutionGraph}.
- */
-public class ExecutionGraphRestartTest extends TestLogger {

Review comment:
       I will take a look at #14405.
   
   +1 for reducing duplication by consolidating test cases.







[GitHub] [flink] flinkbot edited a comment on pull request #13641: [WIP][FLINK-17760][tests] Rework tests to not rely on legacy scheduling logics in ExecutionGraph anymore

Posted by GitBox <gi...@apache.org>.
flinkbot edited a comment on pull request #13641:
URL: https://github.com/apache/flink/pull/13641#issuecomment-708569491


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "09d8deb89416f53dfe8b5c16fb9d723cbd98612c",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "09d8deb89416f53dfe8b5c16fb9d723cbd98612c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "fef9bff28a988ffab789fc6bb0cbde754273a2e0",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=7630",
       "triggerID" : "fef9bff28a988ffab789fc6bb0cbde754273a2e0",
       "triggerType" : "PUSH"
     }, {
       "hash" : "1e959ffb3e7837247842ae4ade724a999ad7ca3b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=7734",
       "triggerID" : "1e959ffb3e7837247842ae4ade724a999ad7ca3b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a253ac0cf1f9a7d382cd0e8c59a8c5ba6cb0a636",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8284",
       "triggerID" : "a253ac0cf1f9a7d382cd0e8c59a8c5ba6cb0a636",
       "triggerType" : "PUSH"
     }, {
       "hash" : "9ea4cf7454f9b8e0c237ba8f540b224abdf9f7b0",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8279",
       "triggerID" : "9ea4cf7454f9b8e0c237ba8f540b224abdf9f7b0",
       "triggerType" : "PUSH"
     }, {
       "hash" : "659fb7eddb0acfa0ef49f76c5fafca21c389f3c0",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8562",
       "triggerID" : "659fb7eddb0acfa0ef49f76c5fafca21c389f3c0",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f7c0de826fcaa9b28eb8054902d3a01fb93dfcc3",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8620",
       "triggerID" : "f7c0de826fcaa9b28eb8054902d3a01fb93dfcc3",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a253ac0cf1f9a7d382cd0e8c59a8c5ba6cb0a636",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "719297248",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "f7c0de826fcaa9b28eb8054902d3a01fb93dfcc3",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8653",
       "triggerID" : "719297248",
       "triggerType" : "MANUAL"
     } ]
   }-->
   ## CI report:
   
   * 09d8deb89416f53dfe8b5c16fb9d723cbd98612c UNKNOWN
   * f7c0de826fcaa9b28eb8054902d3a01fb93dfcc3 Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8620) Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8653) 
   





[GitHub] [flink] zhuzhurk commented on a change in pull request #13641: [FLINK-17760][tests] Rework tests to not rely on legacy scheduling codes in ExecutionGraph components

Posted by GitBox <gi...@apache.org>.
zhuzhurk commented on a change in pull request #13641:
URL: https://github.com/apache/flink/pull/13641#discussion_r544262309



##########
File path: flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/ExecutionGraphSchedulingTest.java
##########
@@ -1,637 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- *     http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-
-package org.apache.flink.runtime.executiongraph;
-
-import org.apache.flink.api.common.JobID;
-import org.apache.flink.api.common.JobStatus;
-import org.apache.flink.api.common.time.Time;
-import org.apache.flink.configuration.Configuration;
-import org.apache.flink.metrics.groups.UnregisteredMetricsGroup;
-import org.apache.flink.runtime.blob.VoidBlobWriter;
-import org.apache.flink.runtime.checkpoint.StandaloneCheckpointRecoveryFactory;
-import org.apache.flink.runtime.clusterframework.types.AllocationID;
-import org.apache.flink.runtime.clusterframework.types.ResourceID;
-import org.apache.flink.runtime.concurrent.ComponentMainThreadExecutorServiceAdapter;
-import org.apache.flink.runtime.execution.ExecutionState;
-import org.apache.flink.runtime.executiongraph.restart.NoRestartStrategy;
-import org.apache.flink.runtime.executiongraph.utils.SimpleAckingTaskManagerGateway;
-import org.apache.flink.runtime.instance.SimpleSlotContext;
-import org.apache.flink.runtime.io.network.partition.NoOpJobMasterPartitionTracker;
-import org.apache.flink.runtime.io.network.partition.ResultPartitionType;
-import org.apache.flink.runtime.jobgraph.DistributionPattern;
-import org.apache.flink.runtime.jobgraph.JobGraph;
-import org.apache.flink.runtime.jobgraph.JobVertex;
-import org.apache.flink.runtime.jobgraph.ScheduleMode;
-import org.apache.flink.runtime.jobmanager.scheduler.Locality;
-import org.apache.flink.runtime.jobmanager.slots.DummySlotOwner;
-import org.apache.flink.runtime.jobmanager.slots.TaskManagerGateway;
-import org.apache.flink.runtime.jobmanager.slots.TestingSlotOwner;
-import org.apache.flink.runtime.jobmaster.LogicalSlot;
-import org.apache.flink.runtime.jobmaster.SlotOwner;
-import org.apache.flink.runtime.jobmaster.SlotRequestId;
-import org.apache.flink.runtime.jobmaster.TestingLogicalSlot;
-import org.apache.flink.runtime.jobmaster.TestingLogicalSlotBuilder;
-import org.apache.flink.runtime.jobmaster.slotpool.SingleLogicalSlot;
-import org.apache.flink.runtime.jobmaster.slotpool.SlotProvider;
-import org.apache.flink.runtime.shuffle.NettyShuffleMaster;
-import org.apache.flink.runtime.taskmanager.TaskManagerLocation;
-import org.apache.flink.runtime.testtasks.NoOpInvokable;
-import org.apache.flink.runtime.testutils.DirectScheduledExecutorService;
-import org.apache.flink.util.FlinkException;
-import org.apache.flink.util.TestLogger;
-
-import org.junit.After;
-import org.junit.Test;
-
-import javax.annotation.Nonnull;
-
-import java.net.InetAddress;
-import java.util.Set;
-import java.util.concurrent.ArrayBlockingQueue;
-import java.util.concurrent.BlockingQueue;
-import java.util.concurrent.CompletableFuture;
-import java.util.concurrent.ConcurrentHashMap;
-import java.util.concurrent.ConcurrentMap;
-import java.util.concurrent.ScheduledExecutorService;
-import java.util.concurrent.TimeUnit;
-import java.util.concurrent.TimeoutException;
-
-import static org.hamcrest.Matchers.empty;
-import static org.hamcrest.Matchers.is;
-import static org.junit.Assert.assertEquals;
-import static org.junit.Assert.assertFalse;
-import static org.junit.Assert.assertThat;
-
-/**
- * Tests for the scheduling of the execution graph. This tests that
- * for example the order of deployments is correct and that bulk slot allocation
- * works properly.
- */
-public class ExecutionGraphSchedulingTest extends TestLogger {
-
-	private final ScheduledExecutorService executor = new DirectScheduledExecutorService();
-
-	@After
-	public void shutdown() {
-		executor.shutdownNow();
-	}
-
-	// ------------------------------------------------------------------------
-	//  Tests
-	// ------------------------------------------------------------------------
-
-	/**
-	 * Tests that with scheduling futures and pipelined deployment, the target vertex will
-	 * not deploy its task before the source vertex does.
-	 */
-	@Test
-	public void testScheduleSourceBeforeTarget() throws Exception {
-
-		//                                            [pipelined]
-		//  we construct a simple graph    (source) ----------------> (target)
-
-		final int parallelism = 1;
-
-		final JobVertex sourceVertex = new JobVertex("source");
-		sourceVertex.setParallelism(parallelism);
-		sourceVertex.setInvokableClass(NoOpInvokable.class);
-
-		final JobVertex targetVertex = new JobVertex("target");
-		targetVertex.setParallelism(parallelism);
-		targetVertex.setInvokableClass(NoOpInvokable.class);
-
-		targetVertex.connectNewDataSetAsInput(sourceVertex, DistributionPattern.ALL_TO_ALL, ResultPartitionType.PIPELINED);
-
-		final JobID jobId = new JobID();
-		final JobGraph jobGraph = new JobGraph(jobId, "test", sourceVertex, targetVertex);
-		jobGraph.setScheduleMode(ScheduleMode.EAGER);
-
-		final CompletableFuture<LogicalSlot> sourceFuture = new CompletableFuture<>();
-		final CompletableFuture<LogicalSlot> targetFuture = new CompletableFuture<>();
-
-		ProgrammedSlotProvider slotProvider = new ProgrammedSlotProvider(parallelism);
-		slotProvider.addSlot(sourceVertex.getID(), 0, sourceFuture);
-		slotProvider.addSlot(targetVertex.getID(), 0, targetFuture);
-
-		final ExecutionGraph eg = createExecutionGraph(jobGraph, slotProvider);
-		eg.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-
-		//  set up two TaskManager gateways and slots
-
-		final InteractionsCountingTaskManagerGateway gatewaySource = createTaskManager();
-		final InteractionsCountingTaskManagerGateway gatewayTarget = createTaskManager();
-
-		final LogicalSlot sourceSlot = createTestingLogicalSlot(gatewaySource);
-		final LogicalSlot targetSlot = createTestingLogicalSlot(gatewayTarget);
-
-		eg.scheduleForExecution();
-
-		// job should be running
-		assertEquals(JobStatus.RUNNING, eg.getState());
-
-		// we fulfill the target slot before the source slot
-		// that should not cause a deployment or deployment related failure
-		targetFuture.complete(targetSlot);
-
-		assertThat(gatewayTarget.getSubmitTaskCount(), is(0));
-		assertEquals(JobStatus.RUNNING, eg.getState());
-
-		// now supply the source slot
-		sourceFuture.complete(sourceSlot);
-
-		// by now, all deployments should have happened
-		assertThat(gatewaySource.getSubmitTaskCount(), is(1));
-		assertThat(gatewayTarget.getSubmitTaskCount(), is(1));
-
-		assertEquals(JobStatus.RUNNING, eg.getState());
-	}
-
-	private TestingLogicalSlot createTestingLogicalSlot(InteractionsCountingTaskManagerGateway gatewaySource) {
-		return new TestingLogicalSlotBuilder()
-			.setTaskManagerGateway(gatewaySource)
-			.createTestingLogicalSlot();
-	}
-
-	/**
-	 * This test verifies that before deploying a pipelined connected component, the
-	 * full set of slots is available, and that not some tasks are deployed, and later the
-	 * system realizes that not enough resources are available.
-	 */
-	@Test
-	public void testDeployPipelinedConnectedComponentsTogether() throws Exception {
-
-		//                                            [pipelined]
-		//  we construct a simple graph    (source) ----------------> (target)
-
-		final int parallelism = 8;
-
-		final JobVertex sourceVertex = new JobVertex("source");
-		sourceVertex.setParallelism(parallelism);
-		sourceVertex.setInvokableClass(NoOpInvokable.class);
-
-		final JobVertex targetVertex = new JobVertex("target");
-		targetVertex.setParallelism(parallelism);
-		targetVertex.setInvokableClass(NoOpInvokable.class);
-
-		targetVertex.connectNewDataSetAsInput(sourceVertex, DistributionPattern.ALL_TO_ALL, ResultPartitionType.PIPELINED);
-
-		final JobID jobId = new JobID();
-		final JobGraph jobGraph = new JobGraph(jobId, "test", sourceVertex, targetVertex);
-		jobGraph.setScheduleMode(ScheduleMode.EAGER);
-
-		@SuppressWarnings({"unchecked", "rawtypes"})
-		final CompletableFuture<LogicalSlot>[] sourceFutures = new CompletableFuture[parallelism];
-		@SuppressWarnings({"unchecked", "rawtypes"})
-		final CompletableFuture<LogicalSlot>[] targetFutures = new CompletableFuture[parallelism];
-
-		//
-		//  Create the slots, futures, and the slot provider
-
-		final InteractionsCountingTaskManagerGateway[] sourceTaskManagers = new InteractionsCountingTaskManagerGateway[parallelism];
-		final InteractionsCountingTaskManagerGateway[] targetTaskManagers = new InteractionsCountingTaskManagerGateway[parallelism];
-
-		final LogicalSlot[] sourceSlots = new LogicalSlot[parallelism];
-		final LogicalSlot[] targetSlots = new LogicalSlot[parallelism];
-
-		for (int i = 0; i < parallelism; i++) {
-			sourceTaskManagers[i] = createTaskManager();
-			targetTaskManagers[i] = createTaskManager();
-
-			sourceSlots[i] = createTestingLogicalSlot(sourceTaskManagers[i]);
-			targetSlots[i] = createTestingLogicalSlot(targetTaskManagers[i]);
-
-			sourceFutures[i] = new CompletableFuture<>();
-			targetFutures[i] = new CompletableFuture<>();
-		}
-
-		ProgrammedSlotProvider slotProvider = new ProgrammedSlotProvider(parallelism);
-		slotProvider.addSlots(sourceVertex.getID(), sourceFutures);
-		slotProvider.addSlots(targetVertex.getID(), targetFutures);
-
-		final ExecutionGraph eg = createExecutionGraph(jobGraph, slotProvider);
-
-		//
-		//  we complete some of the futures
-
-		for (int i = 0; i < parallelism; i += 2) {
-			sourceFutures[i].complete(sourceSlots[i]);
-		}
-
-		//
-		//  kick off the scheduling
-		eg.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-		eg.scheduleForExecution();
-
-		verifyNothingDeployed(eg, sourceTaskManagers);
-
-		//  complete the remaining sources
-		for (int i = 1; i < parallelism; i += 2) {
-			sourceFutures[i].complete(sourceSlots[i]);
-		}
-		verifyNothingDeployed(eg, sourceTaskManagers);
-
-		//  complete the targets except for one
-		for (int i = 1; i < parallelism; i++) {
-			targetFutures[i].complete(targetSlots[i]);
-		}
-		verifyNothingDeployed(eg, targetTaskManagers);
-
-		//  complete the last target slot future
-		targetFutures[0].complete(targetSlots[0]);
-
-		//
-		//  verify that all deployments have happened
-
-		for (InteractionsCountingTaskManagerGateway gateway : sourceTaskManagers) {
-			assertThat(gateway.getSubmitTaskCount(), is(1));
-		}
-		for (InteractionsCountingTaskManagerGateway gateway : targetTaskManagers) {
-			assertThat(gateway.getSubmitTaskCount(), is(1));
-		}
-	}
-
-	/**
-	 * This test verifies that if one slot future fails, the deployment will be aborted.
-	 */
-	@Test
-	public void testOneSlotFailureAbortsDeploy() throws Exception {
-
-		//                                            [pipelined]
-		//  we construct a simple graph    (source) ----------------> (target)
-
-		final int parallelism = 6;
-
-		final JobVertex sourceVertex = new JobVertex("source");
-		sourceVertex.setParallelism(parallelism);
-		sourceVertex.setInvokableClass(NoOpInvokable.class);
-
-		final JobVertex targetVertex = new JobVertex("target");
-		targetVertex.setParallelism(parallelism);
-		targetVertex.setInvokableClass(NoOpInvokable.class);
-
-		targetVertex.connectNewDataSetAsInput(sourceVertex, DistributionPattern.POINTWISE, ResultPartitionType.PIPELINED);
-
-		final JobID jobId = new JobID();
-		final JobGraph jobGraph = new JobGraph(jobId, "test", sourceVertex, targetVertex);
-		jobGraph.setScheduleMode(ScheduleMode.EAGER);
-
-		//
-		//  Create the slots, futures, and the slot provider
-
-		final InteractionsCountingTaskManagerGateway taskManager = createTaskManager();
-		final BlockingQueue<AllocationID> returnedSlots = new ArrayBlockingQueue<>(parallelism);
-		final TestingSlotOwner slotOwner = new TestingSlotOwner();
-		slotOwner.setReturnAllocatedSlotConsumer(
-			(LogicalSlot logicalSlot) -> returnedSlots.offer(logicalSlot.getAllocationId()));
-
-		final LogicalSlot[] sourceSlots = new LogicalSlot[parallelism];
-		final LogicalSlot[] targetSlots = new LogicalSlot[parallelism];
-
-		@SuppressWarnings({"unchecked", "rawtypes"})
-		final CompletableFuture<LogicalSlot>[] sourceFutures = new CompletableFuture[parallelism];
-		@SuppressWarnings({"unchecked", "rawtypes"})
-		final CompletableFuture<LogicalSlot>[] targetFutures = new CompletableFuture[parallelism];
-
-		for (int i = 0; i < parallelism; i++) {
-			sourceSlots[i] = createSingleLogicalSlot(slotOwner, taskManager, new SlotRequestId());
-			targetSlots[i] = createSingleLogicalSlot(slotOwner, taskManager, new SlotRequestId());
-
-			sourceFutures[i] = new CompletableFuture<>();
-			targetFutures[i] = new CompletableFuture<>();
-		}
-
-		ProgrammedSlotProvider slotProvider = new ProgrammedSlotProvider(parallelism);
-		slotProvider.addSlots(sourceVertex.getID(), sourceFutures);
-		slotProvider.addSlots(targetVertex.getID(), targetFutures);
-
-		final ExecutionGraph eg = createExecutionGraph(jobGraph, slotProvider);
-		eg.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-
-		//
-		//  we complete some of the futures
-
-		for (int i = 0; i < parallelism; i += 2) {
-			sourceFutures[i].complete(sourceSlots[i]);
-			targetFutures[i].complete(targetSlots[i]);
-		}
-
-		//  kick off the scheduling
-		eg.scheduleForExecution();
-
-		// fail one slot
-		sourceFutures[1].completeExceptionally(new TestRuntimeException());
-
-		// wait until the job failed as a whole
-		eg.getTerminationFuture().get(2000, TimeUnit.MILLISECONDS);
-
-		// wait until all slots are back
-		for (int i = 0; i < parallelism; i++) {
-			returnedSlots.poll(2000L, TimeUnit.MILLISECONDS);
-		}
-
-		// no deployment calls must have happened
-		assertThat(taskManager.getSubmitTaskCount(), is(0));
-
-		// all completed futures must have been returned
-		for (int i = 0; i < parallelism; i += 2) {
-			assertFalse(sourceSlots[i].isAlive());
-			assertFalse(targetSlots[i].isAlive());
-		}
-	}
-
-	/**
-	 * This test makes sure that with eager scheduling no task is deployed if a single
-	 * slot allocation fails. Moreover we check that allocated slots will be returned.
-	 */
-	@Test
-	public void testEagerSchedulingWithSlotTimeout() throws Exception {
-
-		//  we construct a simple graph:    (task)
-
-		final int parallelism = 3;
-
-		final JobVertex vertex = new JobVertex("task");
-		vertex.setParallelism(parallelism);
-		vertex.setInvokableClass(NoOpInvokable.class);
-
-		final JobID jobId = new JobID();
-		final JobGraph jobGraph = new JobGraph(jobId, "test", vertex);
-		jobGraph.setScheduleMode(ScheduleMode.EAGER);
-
-		final BlockingQueue<AllocationID> returnedSlots = new ArrayBlockingQueue<>(2);
-		final TestingSlotOwner slotOwner = new TestingSlotOwner();
-		slotOwner.setReturnAllocatedSlotConsumer(
-			(LogicalSlot logicalSlot) -> returnedSlots.offer(logicalSlot.getAllocationId()));
-
-		final InteractionsCountingTaskManagerGateway taskManager = createTaskManager();
-		final LogicalSlot[] slots = new LogicalSlot[parallelism];
-		@SuppressWarnings({"unchecked", "rawtypes"})
-		final CompletableFuture<LogicalSlot>[] slotFutures = new CompletableFuture[parallelism];
-
-		for (int i = 0; i < parallelism; i++) {
-			slots[i] = createSingleLogicalSlot(slotOwner, taskManager, new SlotRequestId());
-			slotFutures[i] = new CompletableFuture<>();
-		}
-
-		ProgrammedSlotProvider slotProvider = new ProgrammedSlotProvider(parallelism);
-		slotProvider.addSlots(vertex.getID(), slotFutures);
-
-		final ExecutionGraph eg = createExecutionGraph(jobGraph, slotProvider);
-
-		//  we complete one future
-		slotFutures[1].complete(slots[1]);
-
-		//  kick off the scheduling
-		eg.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-		eg.scheduleForExecution();
-
-		//  we complete another future
-		slotFutures[2].complete(slots[2]);
-
-		// check that the ExecutionGraph is not terminated yet
-		assertThat(eg.getTerminationFuture().isDone(), is(false));
-
-		// time out one of the slot futures
-		slotFutures[0].completeExceptionally(new TimeoutException("Test time out"));
-
-		assertThat(eg.getTerminationFuture().get(), is(JobStatus.FAILED));
-
-		// wait until all slots are back
-		for (int i = 0; i < parallelism - 1; i++) {
-			returnedSlots.poll(2000, TimeUnit.MILLISECONDS);
-		}
-
-		//  verify that no deployments have happened
-		assertThat(taskManager.getSubmitTaskCount(), is(0));
-	}
-
-	/**
-	 * Tests that an ongoing scheduling operation does not fail the {@link ExecutionGraph}
-	 * if it gets concurrently cancelled.
-	 */
-	@Test
-	public void testSchedulingOperationCancellationWhenCancel() throws Exception {

Review comment:
       I think the implementation is a bit different in `DefaultScheduler`: cancelling the job increments the mod version of all vertices, so any further slot assignment and deployment is skipped and the job status is not affected. The version increment is tested by `cancelJobWillIncrementVertexVersions`, while `releaseSlotIfVertexVersionOutdated` and `skipDeploymentIfVertexVersionOutdated` test the skipped slot assignment and deployment.
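
The version guard described in this comment can be sketched as follows. This is a minimal, self-contained illustration and not Flink's code: `VersionedDeploymentSketch`, its nested `Versioner`, and the plain string vertex IDs are hypothetical stand-ins chosen for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of version-guarded deployment; not Flink's actual implementation. */
class VersionedDeploymentSketch {

    /** Tracks a modification version per vertex (keyed by a plain string ID for simplicity). */
    static final class Versioner {
        private final Map<String, Long> versions = new HashMap<>();

        long currentVersion(String vertexId) {
            return versions.getOrDefault(vertexId, 0L);
        }

        /** Bumps the version of every given vertex, e.g. when the job is cancelled. */
        void incrementAll(Iterable<String> vertexIds) {
            for (String id : vertexIds) {
                versions.merge(id, 1L, Long::sum);
            }
        }

        boolean isOutdated(String vertexId, long recordedVersion) {
            return currentVersion(vertexId) != recordedVersion;
        }
    }

    final Versioner versioner = new Versioner();
    final List<String> deployed = new ArrayList<>();
    final List<String> releasedSlots = new ArrayList<>();

    /** Called when a slot arrives for a request that was issued at recordedVersion. */
    void onSlotAssigned(String vertexId, long recordedVersion) {
        if (versioner.isOutdated(vertexId, recordedVersion)) {
            // The vertex was modified (e.g. cancelled) in the meantime:
            // hand the slot back and skip deployment without touching the job status.
            releasedSlots.add(vertexId);
            return;
        }
        deployed.add(vertexId);
    }
}
```

In this sketch a cancel would call `versioner.incrementAll(...)` for all vertices, so any slot that arrives afterwards for an older request is released instead of deployed.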







[GitHub] [flink] flinkbot edited a comment on pull request #13641: [WIP][FLINK-17760][tests] Rework tests to not rely on legacy scheduling logics in ExecutionGraph anymore

Posted by GitBox <gi...@apache.org>.
flinkbot edited a comment on pull request #13641:
URL: https://github.com/apache/flink/pull/13641#issuecomment-708569491


   ## CI report:
   
   * 09d8deb89416f53dfe8b5c16fb9d723cbd98612c UNKNOWN
   * 9ea4cf7454f9b8e0c237ba8f540b224abdf9f7b0 Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8279) 
   * 659fb7eddb0acfa0ef49f76c5fafca21c389f3c0 UNKNOWN
   





[GitHub] [flink] zhuzhurk commented on pull request #13641: [FLINK-17760][tests] Rework tests to not rely on legacy scheduling codes in ExecutionGraph components

Posted by GitBox <gi...@apache.org>.
zhuzhurk commented on pull request #13641:
URL: https://github.com/apache/flink/pull/13641#issuecomment-747422349


   No other changes in this PR. I'm waiting for a green CI before merging it.





[GitHub] [flink] zhuzhurk commented on a change in pull request #13641: [WIP][FLINK-17760][tests] Rework tests to not rely on legacy scheduling logics in ExecutionGraph anymore

Posted by GitBox <gi...@apache.org>.
zhuzhurk commented on a change in pull request #13641:
URL: https://github.com/apache/flink/pull/13641#discussion_r514057853



##########
File path: flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/ExecutionGraphRestartTest.java
##########
@@ -1,866 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- *     http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-
-package org.apache.flink.runtime.executiongraph;
-
-import org.apache.flink.api.common.ExecutionConfig;
-import org.apache.flink.api.common.JobID;
-import org.apache.flink.api.common.JobStatus;
-import org.apache.flink.api.common.restartstrategy.RestartStrategies;
-import org.apache.flink.api.common.time.Time;
-import org.apache.flink.runtime.clusterframework.types.AllocationID;
-import org.apache.flink.runtime.clusterframework.types.ResourceProfile;
-import org.apache.flink.runtime.concurrent.ComponentMainThreadExecutor;
-import org.apache.flink.runtime.concurrent.ComponentMainThreadExecutorServiceAdapter;
-import org.apache.flink.runtime.execution.ExecutionState;
-import org.apache.flink.runtime.execution.SuppressRestartsException;
-import org.apache.flink.runtime.executiongraph.failover.FailoverStrategy;
-import org.apache.flink.runtime.executiongraph.restart.FixedDelayRestartStrategy;
-import org.apache.flink.runtime.executiongraph.restart.InfiniteDelayRestartStrategy;
-import org.apache.flink.runtime.executiongraph.restart.RestartStrategy;
-import org.apache.flink.runtime.executiongraph.utils.NotCancelAckingTaskGateway;
-import org.apache.flink.runtime.executiongraph.utils.SimpleAckingTaskManagerGateway;
-import org.apache.flink.runtime.executiongraph.utils.SimpleSlotProvider;
-import org.apache.flink.runtime.io.network.partition.ResultPartitionType;
-import org.apache.flink.runtime.jobgraph.DistributionPattern;
-import org.apache.flink.runtime.jobgraph.JobGraph;
-import org.apache.flink.runtime.jobgraph.JobVertex;
-import org.apache.flink.runtime.jobgraph.ScheduleMode;
-import org.apache.flink.runtime.jobmanager.scheduler.SlotSharingGroup;
-import org.apache.flink.runtime.jobmanager.slots.TaskManagerGateway;
-import org.apache.flink.runtime.jobmaster.JobMasterId;
-import org.apache.flink.runtime.jobmaster.slotpool.LocationPreferenceSlotSelectionStrategy;
-import org.apache.flink.runtime.jobmaster.slotpool.Scheduler;
-import org.apache.flink.runtime.jobmaster.slotpool.SchedulerImpl;
-import org.apache.flink.runtime.jobmaster.slotpool.SlotPool;
-import org.apache.flink.runtime.jobmaster.slotpool.SlotPoolImpl;
-import org.apache.flink.runtime.jobmaster.slotpool.SlotProvider;
-import org.apache.flink.runtime.jobmaster.slotpool.TestingSlotPoolImpl;
-import org.apache.flink.runtime.resourcemanager.ResourceManagerGateway;
-import org.apache.flink.runtime.resourcemanager.utils.TestingResourceManagerGateway;
-import org.apache.flink.runtime.taskexecutor.slot.SlotOffer;
-import org.apache.flink.runtime.taskmanager.LocalTaskManagerLocation;
-import org.apache.flink.runtime.taskmanager.TaskManagerLocation;
-import org.apache.flink.runtime.testtasks.NoOpInvokable;
-import org.apache.flink.util.FlinkException;
-import org.apache.flink.util.TestLogger;
-
-import org.junit.Test;
-
-import javax.annotation.Nonnull;
-
-import java.io.IOException;
-import java.util.ArrayList;
-import java.util.Iterator;
-import java.util.List;
-import java.util.concurrent.CompletableFuture;
-import java.util.function.Consumer;
-
-import static org.apache.flink.runtime.executiongraph.ExecutionGraphTestUtils.completeCancellingForAllVertices;
-import static org.apache.flink.runtime.executiongraph.ExecutionGraphTestUtils.createNoOpVertex;
-import static org.apache.flink.runtime.executiongraph.ExecutionGraphTestUtils.finishAllVertices;
-import static org.apache.flink.runtime.executiongraph.ExecutionGraphTestUtils.switchToRunning;
-import static org.hamcrest.Matchers.is;
-import static org.hamcrest.Matchers.lessThanOrEqualTo;
-import static org.junit.Assert.assertEquals;
-import static org.junit.Assert.assertNotEquals;
-import static org.junit.Assert.assertNotNull;
-import static org.junit.Assert.assertThat;
-import static org.junit.Assert.assertTrue;
-
-/**
- * Tests the restart behaviour of the {@link ExecutionGraph}.
- */
-public class ExecutionGraphRestartTest extends TestLogger {

Review comment:
       Can be removed because it is closely tied to the legacy restarting process.
   Tests for the latest restarting process are in `DefaultSchedulerTest` and `ExecutionFailureHandlerTest`.







[GitHub] [flink] zhuzhurk commented on a change in pull request #13641: [FLINK-17760][tests] Rework tests to not rely on legacy scheduling codes in ExecutionGraph components

Posted by GitBox <gi...@apache.org>.
zhuzhurk commented on a change in pull request #13641:
URL: https://github.com/apache/flink/pull/13641#discussion_r532534958



##########
File path: flink-runtime/src/main/java/org/apache/flink/runtime/scheduler/DefaultScheduler.java
##########
@@ -188,7 +188,7 @@ public void start(ComponentMainThreadExecutor mainThreadExecutor) {
 	}
 
 	@Override
-	protected long getNumberOfRestarts() {
+	public long getNumberOfRestarts() {

Review comment:
       It's actually `JobManagerJobMetricGroup`, which is different from `JobManagerMetricGroup`.
   All job global metrics are in this group.







[GitHub] [flink] zhuzhurk commented on a change in pull request #13641: [WIP][FLINK-17760][tests] Rework tests to not rely on legacy scheduling logics in ExecutionGraph anymore

Posted by GitBox <gi...@apache.org>.
zhuzhurk commented on a change in pull request #13641:
URL: https://github.com/apache/flink/pull/13641#discussion_r514059286



##########
File path: flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/ExecutionGraphVariousFailuesTest.java
##########
@@ -20,86 +20,31 @@
 
 import org.apache.flink.api.common.JobStatus;
 import org.apache.flink.runtime.concurrent.ComponentMainThreadExecutorServiceAdapter;
-import org.apache.flink.runtime.execution.SuppressRestartsException;
-import org.apache.flink.runtime.executiongraph.restart.InfiniteDelayRestartStrategy;
 import org.apache.flink.runtime.io.network.partition.ResultPartitionID;
 import org.apache.flink.runtime.jobgraph.IntermediateResultPartitionID;
+import org.apache.flink.runtime.jobgraph.JobGraph;
+import org.apache.flink.runtime.scheduler.SchedulerBase;
+import org.apache.flink.runtime.scheduler.SchedulerTestingUtils;
 import org.apache.flink.util.TestLogger;
 
 import org.junit.Test;
 
 import static org.junit.Assert.assertEquals;
 import static org.junit.Assert.fail;
 
-
 public class ExecutionGraphVariousFailuesTest extends TestLogger {
 
-	/**
-	 * Test that failing in state restarting will retrigger the restarting logic. This means that
-	 * it only goes into the state FAILED after the restart strategy says the job is no longer
-	 * restartable.
-	 */
-	@Test
-	public void testFailureWhileRestarting() throws Exception {

Review comment:
       These tests are similar to those in `ExecutionGraphRestartTest` and can be removed.







[GitHub] [flink] azagrebin commented on a change in pull request #13641: [FLINK-17760][tests] Rework tests to not rely on legacy scheduling codes in ExecutionGraph components

Posted by GitBox <gi...@apache.org>.
azagrebin commented on a change in pull request #13641:
URL: https://github.com/apache/flink/pull/13641#discussion_r530185201



##########
File path: flink-runtime/src/main/java/org/apache/flink/runtime/jobmaster/JobMaster.java
##########
@@ -917,15 +917,15 @@ private void resetAndStartScheduler() throws Exception {
 
 		if (schedulerNG.requestJobStatus() == JobStatus.CREATED) {
 			schedulerAssignedFuture = CompletableFuture.completedFuture(null);
-			schedulerNG.setMainThreadExecutor(getMainThreadExecutor());
+			schedulerNG.start(getMainThreadExecutor());

Review comment:
       If it does initialization, maybe it would be better to call it `initialize` or `setup`? We already have `startScheduling`.

##########
File path: flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/ExecutionGraphNotEnoughResourceTest.java
##########
@@ -113,29 +119,28 @@ public void testRestartWithSlotSharingAndNotEnoughResources() throws Exception {
 			final JobGraph jobGraph = new JobGraph(TEST_JOB_ID, "Test Job", source, sink);
 			jobGraph.setScheduleMode(ScheduleMode.EAGER);
 
-			TestRestartStrategy restartStrategy = new TestRestartStrategy(numRestarts, false);
+			RestartBackoffTimeStrategy restartStrategy = new FixedDelayRestartBackoffTimeStrategy.FixedDelayRestartBackoffTimeStrategyFactory(numRestarts, 0).create();
 
-			final ExecutionGraph eg = TestingExecutionGraphBuilder
-				.newBuilder()
-				.setJobGraph(jobGraph)
-				.setSlotProvider(scheduler)
-				.setRestartStrategy(restartStrategy)
-				.setAllocationTimeout(Time.milliseconds(1L))
+			final SchedulerBase schedulerNG = SchedulerTestingUtils
+				.newSchedulerBuilderWithDefaultSlotAllocator(jobGraph, scheduler, Time.milliseconds(1))
+				.setRestartBackoffTimeStrategy(restartStrategy)
+				.setSchedulingStrategyFactory(new EagerSchedulingStrategy.Factory())
+				.setFailoverStrategyFactory(new RestartAllFailoverStrategy.Factory())
 				.build();
+			final ExecutionGraph eg = schedulerNG.getExecutionGraph();

Review comment:
       I am wondering whether we should keep this test depending on EG at all if EG is planned to become more like a topology in the future. Maybe `eg::getState` and `eg::getFailureCause` belong to the scheduler in the long term? `eg::getState` can already be replaced with `schedulerNG::requestJobStatus`. The test could then be renamed to e.g. `EagerSchedulingNotEnoughResourceTest`. Or is the idea that the test will be removed anyway once pipelined region scheduling is stable?

##########
File path: flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/ExecutionTest.java
##########
@@ -361,11 +168,7 @@ public void testTaskRestoreStateIsNulledAfterDeployment() throws Exception {
 		assertThat(execution.getTaskRestore(), is(notNullValue()));
 
 		// schedule the execution vertex and wait for its deployment
-		executionVertex.scheduleForExecution(
-			executionGraph.getSlotProviderStrategy(),
-			LocationPreferenceConstraint.ANY,
-			Collections.emptySet())
-			.get();
+		scheduler.startScheduling();

Review comment:
       `scheduler.startScheduling` does not appear to wait for anything. Would it be easier to call `Execution::deploy`?

##########
File path: flink-runtime/src/main/java/org/apache/flink/runtime/scheduler/DefaultScheduler.java
##########
@@ -188,7 +188,7 @@ public void start(ComponentMainThreadExecutor mainThreadExecutor) {
 	}
 
 	@Override
-	protected long getNumberOfRestarts() {
+	public long getNumberOfRestarts() {

Review comment:
       Maybe we could query this somehow from `JobManagerMetricGroup`, but I am also wondering whether the `JobManagerMetricGroup` registration/query code belongs in the scheduler rather than in a separate component, e.g. `SchedulingMetrics`.

##########
File path: flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/ExecutionGraphVariousFailuesTest.java
##########
@@ -20,86 +20,31 @@
 
 import org.apache.flink.api.common.JobStatus;
 import org.apache.flink.runtime.concurrent.ComponentMainThreadExecutorServiceAdapter;
-import org.apache.flink.runtime.execution.SuppressRestartsException;
-import org.apache.flink.runtime.executiongraph.restart.InfiniteDelayRestartStrategy;
 import org.apache.flink.runtime.io.network.partition.ResultPartitionID;
 import org.apache.flink.runtime.jobgraph.IntermediateResultPartitionID;
+import org.apache.flink.runtime.jobgraph.JobGraph;
+import org.apache.flink.runtime.scheduler.SchedulerBase;
+import org.apache.flink.runtime.scheduler.SchedulerTestingUtils;
 import org.apache.flink.util.TestLogger;
 
 import org.junit.Test;
 
 import static org.junit.Assert.assertEquals;
 import static org.junit.Assert.fail;
 
-
 public class ExecutionGraphVariousFailuesTest extends TestLogger {
 
-	/**
-	 * Test that failing in state restarting will retrigger the restarting logic. This means that
-	 * it only goes into the state FAILED after the restart strategy says the job is no longer
-	 * restartable.
-	 */
-	@Test
-	public void testFailureWhileRestarting() throws Exception {
-		final ExecutionGraph eg = ExecutionGraphTestUtils.createSimpleTestGraph(new InfiniteDelayRestartStrategy(2));
-		eg.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-		eg.scheduleForExecution();
-
-		assertEquals(JobStatus.RUNNING, eg.getState());
-		ExecutionGraphTestUtils.switchAllVerticesToRunning(eg);
-
-		eg.failGlobal(new Exception("Test 1"));
-		assertEquals(JobStatus.FAILING, eg.getState());
-		ExecutionGraphTestUtils.completeCancellingForAllVertices(eg);
-
-		// we should restart since we have two restart attempts left
-		assertEquals(JobStatus.RESTARTING, eg.getState());
-
-		eg.failGlobal(new Exception("Test 2"));
-
-		// we should restart since we have one restart attempts left
-		assertEquals(JobStatus.RESTARTING, eg.getState());
-
-		eg.failGlobal(new Exception("Test 3"));
-
-		// after depleting all our restart attempts we should go into Failed
-		assertEquals(JobStatus.FAILED, eg.getState());
-	}
-
-	/**
-	 * Tests that a {@link SuppressRestartsException} in state RESTARTING stops the restarting
-	 * immediately and sets the execution graph's state to FAILED.
-	 */
-	@Test
-	public void testSuppressRestartFailureWhileRestarting() throws Exception {
-		final ExecutionGraph eg = ExecutionGraphTestUtils.createSimpleTestGraph(new InfiniteDelayRestartStrategy(10));
-		eg.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-		eg.scheduleForExecution();
-
-		assertEquals(JobStatus.RUNNING, eg.getState());
-		ExecutionGraphTestUtils.switchAllVerticesToRunning(eg);
-
-		eg.failGlobal(new Exception("test"));
-		assertEquals(JobStatus.FAILING, eg.getState());
-
-		ExecutionGraphTestUtils.completeCancellingForAllVertices(eg);
-		assertEquals(JobStatus.RESTARTING, eg.getState());
-
-		// suppress a possible restart
-		eg.failGlobal(new SuppressRestartsException(new Exception("Test")));
-
-		assertEquals(JobStatus.FAILED, eg.getState());
-	}
-
 	/**
 	 * Tests that a failing scheduleOrUpdateConsumers call with a non-existing execution attempt
 	 * id, will not fail the execution graph.
 	 */
 	@Test
 	public void testFailingScheduleOrUpdateConsumers() throws Exception {
-		final ExecutionGraph eg = ExecutionGraphTestUtils.createSimpleTestGraph(new InfiniteDelayRestartStrategy(10));
-		eg.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-		eg.scheduleForExecution();
+		final SchedulerBase scheduler = SchedulerTestingUtils.newSchedulerBuilder(new JobGraph()).build();
+		scheduler.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
+		scheduler.startScheduling();
+
+		final ExecutionGraph eg = scheduler.getExecutionGraph();

Review comment:
       Do we need to test `eg.scheduleOrUpdateConsumers` here?

##########
File path: flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/metrics/RestartTimeGauge.java
##########
@@ -39,19 +41,21 @@
 
 	// ------------------------------------------------------------------------
 
-	private final ExecutionGraph eg;
+	private final Supplier<JobStatus> statusSupplier;
+	private final Function<JobStatus, Long> statusTimestampRetriever;
 
-	public RestartTimeGauge(ExecutionGraph executionGraph) {
-		this.eg = checkNotNull(executionGraph);
+	public RestartTimeGauge(Supplier<JobStatus> statusSupplier, Function<JobStatus, Long> statusTimestampRetriever) {

Review comment:
       It looks like other gauges follow the same JobStatus/timestamp pattern. Maybe we can segregate an interface from EG:
   ```
   interface JobStatusProvider {
       JobStatus getJobStatus();
       long getJobStatusTimestamp(JobStatus status);
   }
   ```
   and then add a `TestingJobStatusProvider` for tests.
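
A rough, self-contained sketch of how this suggestion might look. Only the `JobStatusProvider` interface shape comes from the comment. The `JobStatus` enum here is a reduced stand-in for Flink's, `TestingJobStatusProvider.transitionTo` and `RestartTimeSketch` are hypothetical, and the restart-time calculation is a simplified assumption rather than what `RestartTimeGauge` necessarily computes.

```java
import java.util.EnumMap;
import java.util.Map;

/** Reduced stand-in for Flink's JobStatus so the sketch compiles on its own. */
enum JobStatus { CREATED, RUNNING, RESTARTING, FAILED }

/** The interface proposed in the review comment. */
interface JobStatusProvider {
    JobStatus getJobStatus();

    long getJobStatusTimestamp(JobStatus status);
}

/** Hypothetical test double: statuses and timestamps are set directly by the test. */
class TestingJobStatusProvider implements JobStatusProvider {
    private final Map<JobStatus, Long> timestamps = new EnumMap<>(JobStatus.class);
    private JobStatus current = JobStatus.CREATED;

    void transitionTo(JobStatus status, long timestamp) {
        current = status;
        timestamps.put(status, timestamp);
    }

    @Override
    public JobStatus getJobStatus() {
        return current;
    }

    @Override
    public long getJobStatusTimestamp(JobStatus status) {
        // 0 means "this status has not been reached yet" in this sketch.
        return timestamps.getOrDefault(status, 0L);
    }
}

/** Hypothetical gauge that depends only on the narrow interface, not on ExecutionGraph. */
class RestartTimeSketch {
    private final JobStatusProvider provider;

    RestartTimeSketch(JobStatusProvider provider) {
        this.provider = provider;
    }

    /** Simplified assumption: time spent in RESTARTING, measured against the given clock value. */
    long getValue(long now) {
        long restartingAt = provider.getJobStatusTimestamp(JobStatus.RESTARTING);
        if (restartingAt <= 0L) {
            return 0L; // never restarted
        }
        if (provider.getJobStatus() == JobStatus.RESTARTING) {
            return now - restartingAt; // restart still in progress
        }
        long runningAt = provider.getJobStatusTimestamp(JobStatus.RUNNING);
        return Math.max(runningAt - restartingAt, 0L);
    }
}
```

With such an interface, a gauge test would only need to drive `TestingJobStatusProvider` through a few transitions instead of building a full `ExecutionGraph`.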

##########
File path: flink-runtime/src/main/java/org/apache/flink/runtime/taskmanager/TaskManagerLocation.java
##########
@@ -256,7 +256,7 @@ public boolean equals(Object obj) {
 		if (obj == this) {
 			return true;
 		}
-		else if (obj != null && obj.getClass() == TaskManagerLocation.class) {
+		else if (obj != null && obj.getClass() == getClass()) {

Review comment:
       Maybe revert this nit change to keep the git history simpler.
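
For context on what the changed line affects, the following self-contained sketch contrasts the two class checks from the diff. All class names here are hypothetical stand-ins and not Flink types. The behavioural difference only shows up for subclasses, which is an assumption about why the change was made.

```java
import java.util.Objects;

/** Hypothetical class using the original check against a hard-coded class literal. */
class HardCodedLocation {
    final String host;

    HardCodedLocation(String host) {
        this.host = host;
    }

    @Override
    public boolean equals(Object obj) {
        if (obj == this) {
            return true;
        } else if (obj != null && obj.getClass() == HardCodedLocation.class) {
            return host.equals(((HardCodedLocation) obj).host);
        }
        return false;
    }

    @Override
    public int hashCode() {
        return Objects.hash(host);
    }
}

/** Subclass instances never pass the hard-coded check above. */
class LocalHardCodedLocation extends HardCodedLocation {
    LocalHardCodedLocation() {
        super("localhost");
    }
}

/** Hypothetical class using the changed check against the runtime class. */
class RuntimeClassLocation {
    final String host;

    RuntimeClassLocation(String host) {
        this.host = host;
    }

    @Override
    public boolean equals(Object obj) {
        if (obj == this) {
            return true;
        } else if (obj != null && obj.getClass() == getClass()) {
            return host.equals(((RuntimeClassLocation) obj).host);
        }
        return false;
    }

    @Override
    public int hashCode() {
        return Objects.hash(host);
    }
}

/** Two instances of this subclass with the same host compare equal. */
class LocalRuntimeClassLocation extends RuntimeClassLocation {
    LocalRuntimeClassLocation() {
        super("localhost");
    }
}
```

With the hard-coded check, two distinct `LocalHardCodedLocation` instances are never equal to each other. With `getClass()`, two `LocalRuntimeClassLocation` instances with the same host are equal, while a base-class instance and a subclass instance still are not.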

##########
File path: flink-runtime/src/test/java/org/apache/flink/runtime/checkpoint/ExecutionGraphCheckpointCoordinatorTest.java
##########
@@ -159,7 +164,7 @@ private ExecutionGraph createExecutionGraphAndEnableCheckpointing(
 			false,
 			0);
 
-		executionGraph.enableCheckpointing(
+		scheduler.getExecutionGraph().enableCheckpointing(

Review comment:
       Do we want to call `enableCheckpointing` on EG?
   Should we instead use `JobCheckpointingSettings` in `JobGraph`, as in `ArchivedExecutionGraphTest`?







[GitHub] [flink] zhuzhurk commented on a change in pull request #13641: [WIP][FLINK-17760][tests] Rework tests to not rely on legacy scheduling logics in ExecutionGraph anymore

Posted by GitBox <gi...@apache.org>.
zhuzhurk commented on a change in pull request #13641:
URL: https://github.com/apache/flink/pull/13641#discussion_r514415168



##########
File path: flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/ExecutionTest.java
##########
@@ -82,202 +82,12 @@
 	private final TestingComponentMainThreadExecutor testMainThreadUtil =
 		EXECUTOR_RESOURCE.getComponentMainThreadTestExecutor();
 
-	/**
-	 * Tests that slots are released if we cannot assign the allocated resource to the
-	 * Execution.
-	 */
-	@Test
-	public void testSlotReleaseOnFailedResourceAssignment() throws Exception {

Review comment:
       Removed the cases that test legacy scheduling code paths, e.g. `allocateResourcesForExecution()`.
   `ExecutionSlotAllocator` implementations have taken over slot allocation, and there are tests for them.







[GitHub] [flink] zhuzhurk commented on a change in pull request #13641: [FLINK-17760][tests] Rework tests to not rely on legacy scheduling codes in ExecutionGraph components

Posted by GitBox <gi...@apache.org>.
zhuzhurk commented on a change in pull request #13641:
URL: https://github.com/apache/flink/pull/13641#discussion_r542357117



##########
File path: flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/ExecutionGraphVariousFailuesTest.java
##########
@@ -20,86 +20,31 @@
 
 import org.apache.flink.api.common.JobStatus;
 import org.apache.flink.runtime.concurrent.ComponentMainThreadExecutorServiceAdapter;
-import org.apache.flink.runtime.execution.SuppressRestartsException;
-import org.apache.flink.runtime.executiongraph.restart.InfiniteDelayRestartStrategy;
 import org.apache.flink.runtime.io.network.partition.ResultPartitionID;
 import org.apache.flink.runtime.jobgraph.IntermediateResultPartitionID;
+import org.apache.flink.runtime.jobgraph.JobGraph;
+import org.apache.flink.runtime.scheduler.SchedulerBase;
+import org.apache.flink.runtime.scheduler.SchedulerTestingUtils;
 import org.apache.flink.util.TestLogger;
 
 import org.junit.Test;
 
 import static org.junit.Assert.assertEquals;
 import static org.junit.Assert.fail;
 
-
 public class ExecutionGraphVariousFailuesTest extends TestLogger {
 
-	/**
-	 * Test that failing in state restarting will retrigger the restarting logic. This means that
-	 * it only goes into the state FAILED after the restart strategy says the job is no longer
-	 * restartable.
-	 */
-	@Test
-	public void testFailureWhileRestarting() throws Exception {
-		final ExecutionGraph eg = ExecutionGraphTestUtils.createSimpleTestGraph(new InfiniteDelayRestartStrategy(2));
-		eg.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-		eg.scheduleForExecution();
-
-		assertEquals(JobStatus.RUNNING, eg.getState());
-		ExecutionGraphTestUtils.switchAllVerticesToRunning(eg);
-
-		eg.failGlobal(new Exception("Test 1"));
-		assertEquals(JobStatus.FAILING, eg.getState());
-		ExecutionGraphTestUtils.completeCancellingForAllVertices(eg);
-
-		// we should restart since we have two restart attempts left
-		assertEquals(JobStatus.RESTARTING, eg.getState());
-
-		eg.failGlobal(new Exception("Test 2"));
-
-		// we should restart since we have one restart attempts left
-		assertEquals(JobStatus.RESTARTING, eg.getState());
-
-		eg.failGlobal(new Exception("Test 3"));
-
-		// after depleting all our restart attempts we should go into Failed
-		assertEquals(JobStatus.FAILED, eg.getState());
-	}
-
-	/**
-	 * Tests that a {@link SuppressRestartsException} in state RESTARTING stops the restarting
-	 * immediately and sets the execution graph's state to FAILED.
-	 */
-	@Test
-	public void testSuppressRestartFailureWhileRestarting() throws Exception {
-		final ExecutionGraph eg = ExecutionGraphTestUtils.createSimpleTestGraph(new InfiniteDelayRestartStrategy(10));
-		eg.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-		eg.scheduleForExecution();
-
-		assertEquals(JobStatus.RUNNING, eg.getState());
-		ExecutionGraphTestUtils.switchAllVerticesToRunning(eg);
-
-		eg.failGlobal(new Exception("test"));
-		assertEquals(JobStatus.FAILING, eg.getState());
-
-		ExecutionGraphTestUtils.completeCancellingForAllVertices(eg);
-		assertEquals(JobStatus.RESTARTING, eg.getState());
-
-		// suppress a possible restart
-		eg.failGlobal(new SuppressRestartsException(new Exception("Test")));
-
-		assertEquals(JobStatus.FAILED, eg.getState());
-	}
-
 	/**
 	 * Tests that a failing scheduleOrUpdateConsumers call with a non-existing execution attempt
 	 * id, will not fail the execution graph.
 	 */
 	@Test
 	public void testFailingScheduleOrUpdateConsumers() throws Exception {
-		final ExecutionGraph eg = ExecutionGraphTestUtils.createSimpleTestGraph(new InfiniteDelayRestartStrategy(10));
-		eg.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-		eg.scheduleForExecution();
+		final SchedulerBase scheduler = SchedulerTestingUtils.newSchedulerBuilder(new JobGraph()).build();
+		scheduler.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
+		scheduler.startScheduling();
+
+		final ExecutionGraph eg = scheduler.getExecutionGraph();

Review comment:
       Yes, exactly as Till says.







[GitHub] [flink] zhuzhurk commented on a change in pull request #13641: [WIP][FLINK-17760][tests] Rework tests to not rely on legacy scheduling logics in ExecutionGraph anymore

Posted by GitBox <gi...@apache.org>.
zhuzhurk commented on a change in pull request #13641:
URL: https://github.com/apache/flink/pull/13641#discussion_r514126805



##########
File path: flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/GlobalModVersionTest.java
##########
@@ -1,200 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- *     http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-
-package org.apache.flink.runtime.executiongraph;
-
-import org.apache.flink.api.common.JobID;
-import org.apache.flink.api.common.JobStatus;
-import org.apache.flink.runtime.concurrent.ComponentMainThreadExecutorServiceAdapter;
-import org.apache.flink.runtime.execution.ExecutionState;
-import org.apache.flink.runtime.executiongraph.failover.FailoverStrategy;
-import org.apache.flink.runtime.executiongraph.failover.FailoverStrategy.Factory;
-import org.apache.flink.runtime.executiongraph.restart.InfiniteDelayRestartStrategy;
-import org.apache.flink.runtime.executiongraph.utils.SimpleSlotProvider;
-import org.apache.flink.runtime.jobgraph.JobGraph;
-import org.apache.flink.runtime.jobgraph.JobVertex;
-import org.apache.flink.runtime.testtasks.NoOpInvokable;
-import org.apache.flink.util.TestLogger;
-
-import org.junit.Test;
-
-import java.util.Random;
-
-import static org.apache.flink.runtime.executiongraph.ExecutionGraphTestUtils.waitUntilExecutionState;
-import static org.junit.Assert.assertEquals;
-import static org.mockito.Mockito.any;
-import static org.mockito.Mockito.mock;
-import static org.mockito.Mockito.times;
-import static org.mockito.Mockito.verify;
-
-public class GlobalModVersionTest extends TestLogger {

Review comment:
       This test relies on the legacy `ExecutionGraph#failGlobal()` process and is outdated.







[GitHub] [flink] tillrohrmann commented on a change in pull request #13641: [FLINK-17760][tests] Rework tests to not rely on legacy scheduling codes in ExecutionGraph components

Posted by GitBox <gi...@apache.org>.
tillrohrmann commented on a change in pull request #13641:
URL: https://github.com/apache/flink/pull/13641#discussion_r544424483



##########
File path: flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/ExecutionTest.java
##########
@@ -82,202 +82,12 @@
 	private final TestingComponentMainThreadExecutor testMainThreadUtil =
 		EXECUTOR_RESOURCE.getComponentMainThreadTestExecutor();
 
-	/**
-	 * Tests that slots are released if we cannot assign the allocated resource to the
-	 * Execution.
-	 */
-	@Test
-	public void testSlotReleaseOnFailedResourceAssignment() throws Exception {
-		final JobVertex jobVertex = createNoOpJobVertex();
-		final JobVertexID jobVertexId = jobVertex.getID();
-
-		final CompletableFuture<LogicalSlot> slotFuture = new CompletableFuture<>();
-		final ProgrammedSlotProvider slotProvider = new ProgrammedSlotProvider(1);
-		slotProvider.addSlot(jobVertexId, 0, slotFuture);
-
-		ExecutionGraph executionGraph = ExecutionGraphTestUtils.createSimpleTestGraph(
-			slotProvider,
-			new NoRestartStrategy(),
-			jobVertex);
-
-		executionGraph.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-
-		ExecutionJobVertex executionJobVertex = executionGraph.getJobVertex(jobVertexId);
-
-		final Execution execution = executionJobVertex.getTaskVertices()[0].getCurrentExecutionAttempt();
-
-		final SingleSlotTestingSlotOwner slotOwner = new SingleSlotTestingSlotOwner();
-
-		final LogicalSlot slot = createTestingLogicalSlot(slotOwner);
-
-		final LogicalSlot otherSlot = new TestingLogicalSlotBuilder().createTestingLogicalSlot();
-
-		CompletableFuture<Execution> allocationFuture = execution.allocateResourcesForExecution(
-			executionGraph.getSlotProviderStrategy(),
-			LocationPreferenceConstraint.ALL,
-			Collections.emptySet());
-
-		assertFalse(allocationFuture.isDone());
-
-		assertEquals(ExecutionState.SCHEDULED, execution.getState());
-
-		// assign a different resource to the execution
-		assertTrue(execution.tryAssignResource(otherSlot));
-
-		// completing now the future should cause the slot to be released
-		slotFuture.complete(slot);
-
-		assertEquals(slot, slotOwner.getReturnedSlotFuture().get());
-	}
-
 	private TestingLogicalSlot createTestingLogicalSlot(SlotOwner slotOwner) {
 		return new TestingLogicalSlotBuilder()
 			.setSlotOwner(slotOwner)
 			.createTestingLogicalSlot();
 	}
 
-	/**
-	 * Tests that the slot is released in case of a execution cancellation when having
-	 * a slot assigned and being in state SCHEDULED.
-	 */
-	@Test
-	public void testSlotReleaseOnExecutionCancellationInScheduled() throws Exception {
-		final JobVertex jobVertex = createNoOpJobVertex();
-		final JobVertexID jobVertexId = jobVertex.getID();
-
-		final SingleSlotTestingSlotOwner slotOwner = new SingleSlotTestingSlotOwner();
-
-		final LogicalSlot slot = createTestingLogicalSlot(slotOwner);
-
-		final ProgrammedSlotProvider slotProvider = new ProgrammedSlotProvider(1);
-		slotProvider.addSlot(jobVertexId, 0, CompletableFuture.completedFuture(slot));
-
-		ExecutionGraph executionGraph = ExecutionGraphTestUtils.createSimpleTestGraph(
-			slotProvider,
-			new NoRestartStrategy(),
-			jobVertex);
-
-		executionGraph.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-
-		ExecutionJobVertex executionJobVertex = executionGraph.getJobVertex(jobVertexId);
-
-		final Execution execution = executionJobVertex.getTaskVertices()[0].getCurrentExecutionAttempt();
-
-		CompletableFuture<Execution> allocationFuture = execution.allocateResourcesForExecution(
-			executionGraph.getSlotProviderStrategy(),
-			LocationPreferenceConstraint.ALL,
-			Collections.emptySet());
-
-		assertTrue(allocationFuture.isDone());
-
-		assertEquals(ExecutionState.SCHEDULED, execution.getState());
-
-		assertEquals(slot, execution.getAssignedResource());
-
-		// cancelling the execution should move it into state CANCELED
-		execution.cancel();
-		assertEquals(ExecutionState.CANCELED, execution.getState());
-
-		assertEquals(slot, slotOwner.getReturnedSlotFuture().get());
-	}
-
-	/**
-	 * Tests that the slot is released in case of a execution cancellation when being in state
-	 * RUNNING.
-	 */
-	@Test
-	public void testSlotReleaseOnExecutionCancellationInRunning() throws Exception {
-		final JobVertex jobVertex = createNoOpJobVertex();
-		final JobVertexID jobVertexId = jobVertex.getID();
-
-		final SingleSlotTestingSlotOwner slotOwner = new SingleSlotTestingSlotOwner();
-
-		final LogicalSlot slot = createTestingLogicalSlot(slotOwner);
-
-		final ProgrammedSlotProvider slotProvider = new ProgrammedSlotProvider(1);
-		slotProvider.addSlot(jobVertexId, 0, CompletableFuture.completedFuture(slot));
-
-		ExecutionGraph executionGraph = ExecutionGraphTestUtils.createSimpleTestGraph(
-			slotProvider,
-			new NoRestartStrategy(),
-			jobVertex);
-
-		ExecutionJobVertex executionJobVertex = executionGraph.getJobVertex(jobVertexId);
-
-		final Execution execution = executionJobVertex.getTaskVertices()[0].getCurrentExecutionAttempt();
-
-		CompletableFuture<Execution> allocationFuture = execution.allocateResourcesForExecution(
-			executionGraph.getSlotProviderStrategy(),
-			LocationPreferenceConstraint.ALL,
-			Collections.emptySet());
-
-		assertTrue(allocationFuture.isDone());
-
-		assertEquals(ExecutionState.SCHEDULED, execution.getState());
-
-		assertEquals(slot, execution.getAssignedResource());
-
-		execution.deploy();
-
-		execution.switchToRunning();
-
-		// cancelling the execution should move it into state CANCELING
-		execution.cancel();
-		assertEquals(ExecutionState.CANCELING, execution.getState());
-
-		execution.completeCancelling();
-
-		assertEquals(slot, slotOwner.getReturnedSlotFuture().get());
-	}
-
-	/**
-	 * Tests that a slot allocation from a {@link SlotProvider} is cancelled if the
-	 * {@link Execution} is cancelled.
-	 */
-	@Test
-	public void testSlotAllocationCancellationWhenExecutionCancelled() throws Exception {

Review comment:
       Sounds good.







[GitHub] [flink] zhuzhurk commented on a change in pull request #13641: [FLINK-17760][tests] Rework tests to not rely on legacy scheduling codes in ExecutionGraph components

Posted by GitBox <gi...@apache.org>.
zhuzhurk commented on a change in pull request #13641:
URL: https://github.com/apache/flink/pull/13641#discussion_r544263769



##########
File path: flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/ExecutionGraphSchedulingTest.java
##########
@@ -1,637 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- *     http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-
-package org.apache.flink.runtime.executiongraph;
-
-import org.apache.flink.api.common.JobID;
-import org.apache.flink.api.common.JobStatus;
-import org.apache.flink.api.common.time.Time;
-import org.apache.flink.configuration.Configuration;
-import org.apache.flink.metrics.groups.UnregisteredMetricsGroup;
-import org.apache.flink.runtime.blob.VoidBlobWriter;
-import org.apache.flink.runtime.checkpoint.StandaloneCheckpointRecoveryFactory;
-import org.apache.flink.runtime.clusterframework.types.AllocationID;
-import org.apache.flink.runtime.clusterframework.types.ResourceID;
-import org.apache.flink.runtime.concurrent.ComponentMainThreadExecutorServiceAdapter;
-import org.apache.flink.runtime.execution.ExecutionState;
-import org.apache.flink.runtime.executiongraph.restart.NoRestartStrategy;
-import org.apache.flink.runtime.executiongraph.utils.SimpleAckingTaskManagerGateway;
-import org.apache.flink.runtime.instance.SimpleSlotContext;
-import org.apache.flink.runtime.io.network.partition.NoOpJobMasterPartitionTracker;
-import org.apache.flink.runtime.io.network.partition.ResultPartitionType;
-import org.apache.flink.runtime.jobgraph.DistributionPattern;
-import org.apache.flink.runtime.jobgraph.JobGraph;
-import org.apache.flink.runtime.jobgraph.JobVertex;
-import org.apache.flink.runtime.jobgraph.ScheduleMode;
-import org.apache.flink.runtime.jobmanager.scheduler.Locality;
-import org.apache.flink.runtime.jobmanager.slots.DummySlotOwner;
-import org.apache.flink.runtime.jobmanager.slots.TaskManagerGateway;
-import org.apache.flink.runtime.jobmanager.slots.TestingSlotOwner;
-import org.apache.flink.runtime.jobmaster.LogicalSlot;
-import org.apache.flink.runtime.jobmaster.SlotOwner;
-import org.apache.flink.runtime.jobmaster.SlotRequestId;
-import org.apache.flink.runtime.jobmaster.TestingLogicalSlot;
-import org.apache.flink.runtime.jobmaster.TestingLogicalSlotBuilder;
-import org.apache.flink.runtime.jobmaster.slotpool.SingleLogicalSlot;
-import org.apache.flink.runtime.jobmaster.slotpool.SlotProvider;
-import org.apache.flink.runtime.shuffle.NettyShuffleMaster;
-import org.apache.flink.runtime.taskmanager.TaskManagerLocation;
-import org.apache.flink.runtime.testtasks.NoOpInvokable;
-import org.apache.flink.runtime.testutils.DirectScheduledExecutorService;
-import org.apache.flink.util.FlinkException;
-import org.apache.flink.util.TestLogger;
-
-import org.junit.After;
-import org.junit.Test;
-
-import javax.annotation.Nonnull;
-
-import java.net.InetAddress;
-import java.util.Set;
-import java.util.concurrent.ArrayBlockingQueue;
-import java.util.concurrent.BlockingQueue;
-import java.util.concurrent.CompletableFuture;
-import java.util.concurrent.ConcurrentHashMap;
-import java.util.concurrent.ConcurrentMap;
-import java.util.concurrent.ScheduledExecutorService;
-import java.util.concurrent.TimeUnit;
-import java.util.concurrent.TimeoutException;
-
-import static org.hamcrest.Matchers.empty;
-import static org.hamcrest.Matchers.is;
-import static org.junit.Assert.assertEquals;
-import static org.junit.Assert.assertFalse;
-import static org.junit.Assert.assertThat;
-
-/**
- * Tests for the scheduling of the execution graph. This tests that
- * for example the order of deployments is correct and that bulk slot allocation
- * works properly.
- */
-public class ExecutionGraphSchedulingTest extends TestLogger {

Review comment:
       Added tests `restartVerticesOnSlotAllocationTimeout` and `restartVerticesOnAssignedSlotReleased`.
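
       For readers following the thread without the code at hand: the removed `testEagerSchedulingWithSlotTimeout` checked that no task is deployed when a single slot allocation fails, and that slots which did arrive are handed back. Below is a minimal, JDK-only sketch of that all-or-nothing idea. It is not Flink code and not the implementation of the new tests; the class `AllOrNothingDeploymentSketch`, the `Outcome` type, and the string slot names are hypothetical and exist only for illustration. One simplification to note: `CompletableFuture.allOf` waits for every future to finish, so this sketch reacts to a failure only once all requests have settled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeoutException;

// Hypothetical illustration only; none of these names exist in Flink.
public class AllOrNothingDeploymentSketch {

    /** Result of one scheduling attempt: slots that were deployed to, and slots handed back. */
    public static final class Outcome {
        public final List<String> deployed = new ArrayList<>();
        public final List<String> returned = new ArrayList<>();
    }

    /**
     * Deploys only if every slot future completed successfully.
     * If any future failed, nothing is deployed and the slots that did arrive are returned.
     */
    public static CompletableFuture<Outcome> schedule(List<CompletableFuture<String>> slotFutures) {
        CompletableFuture<Void> all =
            CompletableFuture.allOf(slotFutures.toArray(new CompletableFuture<?>[0]));

        return all.handle((ignored, failure) -> {
            Outcome outcome = new Outcome();
            for (CompletableFuture<String> future : slotFutures) {
                // Skip the futures that failed; they never produced a slot.
                if (future.isDone() && !future.isCompletedExceptionally()) {
                    String slot = future.join();
                    (failure == null ? outcome.deployed : outcome.returned).add(slot);
                }
            }
            return outcome;
        });
    }

    public static void main(String[] args) {
        List<CompletableFuture<String>> futures = new ArrayList<>();
        for (int i = 0; i < 3; i++) {
            futures.add(new CompletableFuture<>());
        }
        CompletableFuture<Outcome> result = schedule(futures);

        // Two slots arrive, then the third request times out.
        futures.get(1).complete("slot-1");
        futures.get(2).complete("slot-2");
        futures.get(0).completeExceptionally(new TimeoutException("Test time out"));

        Outcome outcome = result.join();
        System.out.println("deployed=" + outcome.deployed + " returned=" + outcome.returned);
        // prints: deployed=[] returned=[slot-1, slot-2]
    }
}
```

       The order of completion mirrors the removed test: one slot is fulfilled before scheduling would proceed, another afterwards, and the last request fails with a `TimeoutException`.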







[GitHub] [flink] tillrohrmann commented on a change in pull request #13641: [FLINK-17760][tests] Rework tests to not rely on legacy scheduling codes in ExecutionGraph components

Posted by GitBox <gi...@apache.org>.
tillrohrmann commented on a change in pull request #13641:
URL: https://github.com/apache/flink/pull/13641#discussion_r544423669



##########
File path: flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/ExecutionGraphSchedulingTest.java
##########
@@ -1,637 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- *     http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-
-package org.apache.flink.runtime.executiongraph;
-
-import org.apache.flink.api.common.JobID;
-import org.apache.flink.api.common.JobStatus;
-import org.apache.flink.api.common.time.Time;
-import org.apache.flink.configuration.Configuration;
-import org.apache.flink.metrics.groups.UnregisteredMetricsGroup;
-import org.apache.flink.runtime.blob.VoidBlobWriter;
-import org.apache.flink.runtime.checkpoint.StandaloneCheckpointRecoveryFactory;
-import org.apache.flink.runtime.clusterframework.types.AllocationID;
-import org.apache.flink.runtime.clusterframework.types.ResourceID;
-import org.apache.flink.runtime.concurrent.ComponentMainThreadExecutorServiceAdapter;
-import org.apache.flink.runtime.execution.ExecutionState;
-import org.apache.flink.runtime.executiongraph.restart.NoRestartStrategy;
-import org.apache.flink.runtime.executiongraph.utils.SimpleAckingTaskManagerGateway;
-import org.apache.flink.runtime.instance.SimpleSlotContext;
-import org.apache.flink.runtime.io.network.partition.NoOpJobMasterPartitionTracker;
-import org.apache.flink.runtime.io.network.partition.ResultPartitionType;
-import org.apache.flink.runtime.jobgraph.DistributionPattern;
-import org.apache.flink.runtime.jobgraph.JobGraph;
-import org.apache.flink.runtime.jobgraph.JobVertex;
-import org.apache.flink.runtime.jobgraph.ScheduleMode;
-import org.apache.flink.runtime.jobmanager.scheduler.Locality;
-import org.apache.flink.runtime.jobmanager.slots.DummySlotOwner;
-import org.apache.flink.runtime.jobmanager.slots.TaskManagerGateway;
-import org.apache.flink.runtime.jobmanager.slots.TestingSlotOwner;
-import org.apache.flink.runtime.jobmaster.LogicalSlot;
-import org.apache.flink.runtime.jobmaster.SlotOwner;
-import org.apache.flink.runtime.jobmaster.SlotRequestId;
-import org.apache.flink.runtime.jobmaster.TestingLogicalSlot;
-import org.apache.flink.runtime.jobmaster.TestingLogicalSlotBuilder;
-import org.apache.flink.runtime.jobmaster.slotpool.SingleLogicalSlot;
-import org.apache.flink.runtime.jobmaster.slotpool.SlotProvider;
-import org.apache.flink.runtime.shuffle.NettyShuffleMaster;
-import org.apache.flink.runtime.taskmanager.TaskManagerLocation;
-import org.apache.flink.runtime.testtasks.NoOpInvokable;
-import org.apache.flink.runtime.testutils.DirectScheduledExecutorService;
-import org.apache.flink.util.FlinkException;
-import org.apache.flink.util.TestLogger;
-
-import org.junit.After;
-import org.junit.Test;
-
-import javax.annotation.Nonnull;
-
-import java.net.InetAddress;
-import java.util.Set;
-import java.util.concurrent.ArrayBlockingQueue;
-import java.util.concurrent.BlockingQueue;
-import java.util.concurrent.CompletableFuture;
-import java.util.concurrent.ConcurrentHashMap;
-import java.util.concurrent.ConcurrentMap;
-import java.util.concurrent.ScheduledExecutorService;
-import java.util.concurrent.TimeUnit;
-import java.util.concurrent.TimeoutException;
-
-import static org.hamcrest.Matchers.empty;
-import static org.hamcrest.Matchers.is;
-import static org.junit.Assert.assertEquals;
-import static org.junit.Assert.assertFalse;
-import static org.junit.Assert.assertThat;
-
-/**
- * Tests for the scheduling of the execution graph. This tests that
- * for example the order of deployments is correct and that bulk slot allocation
- * works properly.
- */
-public class ExecutionGraphSchedulingTest extends TestLogger {
-
-	private final ScheduledExecutorService executor = new DirectScheduledExecutorService();
-
-	@After
-	public void shutdown() {
-		executor.shutdownNow();
-	}
-
-	// ------------------------------------------------------------------------
-	//  Tests
-	// ------------------------------------------------------------------------
-
-	/**
-	 * Tests that with scheduling futures and pipelined deployment, the target vertex will
-	 * not deploy its task before the source vertex does.
-	 */
-	@Test
-	public void testScheduleSourceBeforeTarget() throws Exception {
-
-		//                                            [pipelined]
-		//  we construct a simple graph    (source) ----------------> (target)
-
-		final int parallelism = 1;
-
-		final JobVertex sourceVertex = new JobVertex("source");
-		sourceVertex.setParallelism(parallelism);
-		sourceVertex.setInvokableClass(NoOpInvokable.class);
-
-		final JobVertex targetVertex = new JobVertex("target");
-		targetVertex.setParallelism(parallelism);
-		targetVertex.setInvokableClass(NoOpInvokable.class);
-
-		targetVertex.connectNewDataSetAsInput(sourceVertex, DistributionPattern.ALL_TO_ALL, ResultPartitionType.PIPELINED);
-
-		final JobID jobId = new JobID();
-		final JobGraph jobGraph = new JobGraph(jobId, "test", sourceVertex, targetVertex);
-		jobGraph.setScheduleMode(ScheduleMode.EAGER);
-
-		final CompletableFuture<LogicalSlot> sourceFuture = new CompletableFuture<>();
-		final CompletableFuture<LogicalSlot> targetFuture = new CompletableFuture<>();
-
-		ProgrammedSlotProvider slotProvider = new ProgrammedSlotProvider(parallelism);
-		slotProvider.addSlot(sourceVertex.getID(), 0, sourceFuture);
-		slotProvider.addSlot(targetVertex.getID(), 0, targetFuture);
-
-		final ExecutionGraph eg = createExecutionGraph(jobGraph, slotProvider);
-		eg.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-
-		//  set up two TaskManager gateways and slots
-
-		final InteractionsCountingTaskManagerGateway gatewaySource = createTaskManager();
-		final InteractionsCountingTaskManagerGateway gatewayTarget = createTaskManager();
-
-		final LogicalSlot sourceSlot = createTestingLogicalSlot(gatewaySource);
-		final LogicalSlot targetSlot = createTestingLogicalSlot(gatewayTarget);
-
-		eg.scheduleForExecution();
-
-		// job should be running
-		assertEquals(JobStatus.RUNNING, eg.getState());
-
-		// we fulfill the target slot before the source slot
-		// that should not cause a deployment or deployment related failure
-		targetFuture.complete(targetSlot);
-
-		assertThat(gatewayTarget.getSubmitTaskCount(), is(0));
-		assertEquals(JobStatus.RUNNING, eg.getState());
-
-		// now supply the source slot
-		sourceFuture.complete(sourceSlot);
-
-		// by now, all deployments should have happened
-		assertThat(gatewaySource.getSubmitTaskCount(), is(1));
-		assertThat(gatewayTarget.getSubmitTaskCount(), is(1));
-
-		assertEquals(JobStatus.RUNNING, eg.getState());
-	}
-
-	private TestingLogicalSlot createTestingLogicalSlot(InteractionsCountingTaskManagerGateway gatewaySource) {
-		return new TestingLogicalSlotBuilder()
-			.setTaskManagerGateway(gatewaySource)
-			.createTestingLogicalSlot();
-	}
-
-	/**
-	 * This test verifies that before deploying a pipelined connected component, the
-	 * full set of slots is available, and that not some tasks are deployed, and later the
-	 * system realizes that not enough resources are available.
-	 */
-	@Test
-	public void testDeployPipelinedConnectedComponentsTogether() throws Exception {
-
-		//                                            [pipelined]
-		//  we construct a simple graph    (source) ----------------> (target)
-
-		final int parallelism = 8;
-
-		final JobVertex sourceVertex = new JobVertex("source");
-		sourceVertex.setParallelism(parallelism);
-		sourceVertex.setInvokableClass(NoOpInvokable.class);
-
-		final JobVertex targetVertex = new JobVertex("target");
-		targetVertex.setParallelism(parallelism);
-		targetVertex.setInvokableClass(NoOpInvokable.class);
-
-		targetVertex.connectNewDataSetAsInput(sourceVertex, DistributionPattern.ALL_TO_ALL, ResultPartitionType.PIPELINED);
-
-		final JobID jobId = new JobID();
-		final JobGraph jobGraph = new JobGraph(jobId, "test", sourceVertex, targetVertex);
-		jobGraph.setScheduleMode(ScheduleMode.EAGER);
-
-		@SuppressWarnings({"unchecked", "rawtypes"})
-		final CompletableFuture<LogicalSlot>[] sourceFutures = new CompletableFuture[parallelism];
-		@SuppressWarnings({"unchecked", "rawtypes"})
-		final CompletableFuture<LogicalSlot>[] targetFutures = new CompletableFuture[parallelism];
-
-		//
-		//  Create the slots, futures, and the slot provider
-
-		final InteractionsCountingTaskManagerGateway[] sourceTaskManagers = new InteractionsCountingTaskManagerGateway[parallelism];
-		final InteractionsCountingTaskManagerGateway[] targetTaskManagers = new InteractionsCountingTaskManagerGateway[parallelism];
-
-		final LogicalSlot[] sourceSlots = new LogicalSlot[parallelism];
-		final LogicalSlot[] targetSlots = new LogicalSlot[parallelism];
-
-		for (int i = 0; i < parallelism; i++) {
-			sourceTaskManagers[i] = createTaskManager();
-			targetTaskManagers[i] = createTaskManager();
-
-			sourceSlots[i] = createTestingLogicalSlot(sourceTaskManagers[i]);
-			targetSlots[i] = createTestingLogicalSlot(targetTaskManagers[i]);
-
-			sourceFutures[i] = new CompletableFuture<>();
-			targetFutures[i] = new CompletableFuture<>();
-		}
-
-		ProgrammedSlotProvider slotProvider = new ProgrammedSlotProvider(parallelism);
-		slotProvider.addSlots(sourceVertex.getID(), sourceFutures);
-		slotProvider.addSlots(targetVertex.getID(), targetFutures);
-
-		final ExecutionGraph eg = createExecutionGraph(jobGraph, slotProvider);
-
-		//
-		//  we complete some of the futures
-
-		for (int i = 0; i < parallelism; i += 2) {
-			sourceFutures[i].complete(sourceSlots[i]);
-		}
-
-		//
-		//  kick off the scheduling
-		eg.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-		eg.scheduleForExecution();
-
-		verifyNothingDeployed(eg, sourceTaskManagers);
-
-		//  complete the remaining sources
-		for (int i = 1; i < parallelism; i += 2) {
-			sourceFutures[i].complete(sourceSlots[i]);
-		}
-		verifyNothingDeployed(eg, sourceTaskManagers);
-
-		//  complete the targets except for one
-		for (int i = 1; i < parallelism; i++) {
-			targetFutures[i].complete(targetSlots[i]);
-		}
-		verifyNothingDeployed(eg, targetTaskManagers);
-
-		//  complete the last target slot future
-		targetFutures[0].complete(targetSlots[0]);
-
-		//
-		//  verify that all deployments have happened
-
-		for (InteractionsCountingTaskManagerGateway gateway : sourceTaskManagers) {
-			assertThat(gateway.getSubmitTaskCount(), is(1));
-		}
-		for (InteractionsCountingTaskManagerGateway gateway : targetTaskManagers) {
-			assertThat(gateway.getSubmitTaskCount(), is(1));
-		}
-	}
-
-	/**
-	 * This test verifies that if one slot future fails, the deployment will be aborted.
-	 */
-	@Test
-	public void testOneSlotFailureAbortsDeploy() throws Exception {
-
-		//                                            [pipelined]
-		//  we construct a simple graph    (source) ----------------> (target)
-
-		final int parallelism = 6;
-
-		final JobVertex sourceVertex = new JobVertex("source");
-		sourceVertex.setParallelism(parallelism);
-		sourceVertex.setInvokableClass(NoOpInvokable.class);
-
-		final JobVertex targetVertex = new JobVertex("target");
-		targetVertex.setParallelism(parallelism);
-		targetVertex.setInvokableClass(NoOpInvokable.class);
-
-		targetVertex.connectNewDataSetAsInput(sourceVertex, DistributionPattern.POINTWISE, ResultPartitionType.PIPELINED);
-
-		final JobID jobId = new JobID();
-		final JobGraph jobGraph = new JobGraph(jobId, "test", sourceVertex, targetVertex);
-		jobGraph.setScheduleMode(ScheduleMode.EAGER);
-
-		//
-		//  Create the slots, futures, and the slot provider
-
-		final InteractionsCountingTaskManagerGateway taskManager = createTaskManager();
-		final BlockingQueue<AllocationID> returnedSlots = new ArrayBlockingQueue<>(parallelism);
-		final TestingSlotOwner slotOwner = new TestingSlotOwner();
-		slotOwner.setReturnAllocatedSlotConsumer(
-			(LogicalSlot logicalSlot) -> returnedSlots.offer(logicalSlot.getAllocationId()));
-
-		final LogicalSlot[] sourceSlots = new LogicalSlot[parallelism];
-		final LogicalSlot[] targetSlots = new LogicalSlot[parallelism];
-
-		@SuppressWarnings({"unchecked", "rawtypes"})
-		final CompletableFuture<LogicalSlot>[] sourceFutures = new CompletableFuture[parallelism];
-		@SuppressWarnings({"unchecked", "rawtypes"})
-		final CompletableFuture<LogicalSlot>[] targetFutures = new CompletableFuture[parallelism];
-
-		for (int i = 0; i < parallelism; i++) {
-			sourceSlots[i] = createSingleLogicalSlot(slotOwner, taskManager, new SlotRequestId());
-			targetSlots[i] = createSingleLogicalSlot(slotOwner, taskManager, new SlotRequestId());
-
-			sourceFutures[i] = new CompletableFuture<>();
-			targetFutures[i] = new CompletableFuture<>();
-		}
-
-		ProgrammedSlotProvider slotProvider = new ProgrammedSlotProvider(parallelism);
-		slotProvider.addSlots(sourceVertex.getID(), sourceFutures);
-		slotProvider.addSlots(targetVertex.getID(), targetFutures);
-
-		final ExecutionGraph eg = createExecutionGraph(jobGraph, slotProvider);
-		eg.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-
-		//
-		//  we complete some of the futures
-
-		for (int i = 0; i < parallelism; i += 2) {
-			sourceFutures[i].complete(sourceSlots[i]);
-			targetFutures[i].complete(targetSlots[i]);
-		}
-
-		//  kick off the scheduling
-		eg.scheduleForExecution();
-
-		// fail one slot
-		sourceFutures[1].completeExceptionally(new TestRuntimeException());
-
-		// wait until the job failed as a whole
-		eg.getTerminationFuture().get(2000, TimeUnit.MILLISECONDS);
-
-		// wait until all slots are back
-		for (int i = 0; i < parallelism; i++) {
-			returnedSlots.poll(2000L, TimeUnit.MILLISECONDS);
-		}
-
-		// no deployment calls must have happened
-		assertThat(taskManager.getSubmitTaskCount(), is(0));
-
-		// all completed futures must have been returns
-		for (int i = 0; i < parallelism; i += 2) {
-			assertFalse(sourceSlots[i].isAlive());
-			assertFalse(targetSlots[i].isAlive());
-		}
-	}
-
-	/**
-	 * This tests makes sure that with eager scheduling no task is deployed if a single
-	 * slot allocation fails. Moreover we check that allocated slots will be returned.
-	 */
-	@Test
-	public void testEagerSchedulingWithSlotTimeout() throws Exception {
-
-		//  we construct a simple graph:    (task)
-
-		final int parallelism = 3;
-
-		final JobVertex vertex = new JobVertex("task");
-		vertex.setParallelism(parallelism);
-		vertex.setInvokableClass(NoOpInvokable.class);
-
-		final JobID jobId = new JobID();
-		final JobGraph jobGraph = new JobGraph(jobId, "test", vertex);
-		jobGraph.setScheduleMode(ScheduleMode.EAGER);
-
-		final BlockingQueue<AllocationID> returnedSlots = new ArrayBlockingQueue<>(2);
-		final TestingSlotOwner slotOwner = new TestingSlotOwner();
-		slotOwner.setReturnAllocatedSlotConsumer(
-			(LogicalSlot logicalSlot) -> returnedSlots.offer(logicalSlot.getAllocationId()));
-
-		final InteractionsCountingTaskManagerGateway taskManager = createTaskManager();
-		final LogicalSlot[] slots = new LogicalSlot[parallelism];
-		@SuppressWarnings({"unchecked", "rawtypes"})
-		final CompletableFuture<LogicalSlot>[] slotFutures = new CompletableFuture[parallelism];
-
-		for (int i = 0; i < parallelism; i++) {
-			slots[i] = createSingleLogicalSlot(slotOwner, taskManager, new SlotRequestId());
-			slotFutures[i] = new CompletableFuture<>();
-		}
-
-		ProgrammedSlotProvider slotProvider = new ProgrammedSlotProvider(parallelism);
-		slotProvider.addSlots(vertex.getID(), slotFutures);
-
-		final ExecutionGraph eg = createExecutionGraph(jobGraph, slotProvider);
-
-		//  we complete one future
-		slotFutures[1].complete(slots[1]);
-
-		//  kick off the scheduling
-		eg.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-		eg.scheduleForExecution();
-
-		//  we complete another future
-		slotFutures[2].complete(slots[2]);
-
-		// check that the ExecutionGraph is not terminated yet
-		assertThat(eg.getTerminationFuture().isDone(), is(false));
-
-		// time out one of the slot futures
-		slotFutures[0].completeExceptionally(new TimeoutException("Test time out"));
-
-		assertThat(eg.getTerminationFuture().get(), is(JobStatus.FAILED));
-
-		// wait until all slots are back
-		for (int i = 0; i < parallelism - 1; i++) {
-			returnedSlots.poll(2000, TimeUnit.MILLISECONDS);
-		}
-
-		//  verify that no deployments have happened
-		assertThat(taskManager.getSubmitTaskCount(), is(0));
-	}
-
-	/**
-	 * Tests that an ongoing scheduling operation does not fail the {@link ExecutionGraph}
-	 * if it gets concurrently cancelled.
-	 */
-	@Test
-	public void testSchedulingOperationCancellationWhenCancel() throws Exception {
-		final JobVertex jobVertex = new JobVertex("NoOp JobVertex");
-		jobVertex.setInvokableClass(NoOpInvokable.class);
-		jobVertex.setParallelism(2);
-		final JobGraph jobGraph = new JobGraph(jobVertex);
-		jobGraph.setScheduleMode(ScheduleMode.EAGER);
-
-		final CompletableFuture<LogicalSlot> slotFuture1 = new CompletableFuture<>();
-		final CompletableFuture<LogicalSlot> slotFuture2 = new CompletableFuture<>();
-		final ProgrammedSlotProvider slotProvider = new ProgrammedSlotProvider(2);
-		slotProvider.addSlots(jobVertex.getID(), new CompletableFuture[]{slotFuture1, slotFuture2});
-		final ExecutionGraph executionGraph = createExecutionGraph(jobGraph, slotProvider);
-
-		executionGraph.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-		executionGraph.scheduleForExecution();
-
-		final TestingLogicalSlot slot = createTestingSlot();
-		final CompletableFuture<?> releaseFuture = slot.getReleaseFuture();
-		slotFuture1.complete(slot);
-
-		// cancel should change the state of all executions to CANCELLED
-		executionGraph.cancel();
-
-		// complete the now CANCELLED execution --> this should cause a failure
-		slotFuture2.complete(new TestingLogicalSlotBuilder().createTestingLogicalSlot());
-
-		Thread.sleep(1L);
-		// release the first slot to finish the cancellation
-		releaseFuture.complete(null);
-
-		// NOTE: This test will only occasionally fail without the fix since there is
-		// a race between the releaseFuture and the slotFuture2
-		assertThat(executionGraph.getTerminationFuture().get(), is(JobStatus.CANCELED));
-	}
-
-	/**
-	 * Tests that a partially completed eager scheduling operation fails if a
-	 * completed slot is released. See FLINK-9099.
-	 */
-	@Test
-	public void testSlotReleasingFailsSchedulingOperation() throws Exception {
-		final int parallelism = 2;
-
-		final JobVertex jobVertex = new JobVertex("Testing job vertex");
-		jobVertex.setInvokableClass(NoOpInvokable.class);
-		jobVertex.setParallelism(parallelism);
-		final JobGraph jobGraph = new JobGraph(jobVertex);
-		jobGraph.setScheduleMode(ScheduleMode.EAGER);
-
-		final ProgrammedSlotProvider slotProvider = new ProgrammedSlotProvider(parallelism);
-
-		final LogicalSlot slot = createSingleLogicalSlot(new DummySlotOwner(), new SimpleAckingTaskManagerGateway(), new SlotRequestId());
-		slotProvider.addSlot(jobVertex.getID(), 0, CompletableFuture.completedFuture(slot));
-
-		final CompletableFuture<LogicalSlot> slotFuture = new CompletableFuture<>();
-		slotProvider.addSlot(jobVertex.getID(), 1, slotFuture);
-
-		final ExecutionGraph executionGraph = createExecutionGraph(jobGraph, slotProvider);
-
-		executionGraph.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-		executionGraph.scheduleForExecution();
-
-		assertThat(executionGraph.getState(), is(JobStatus.RUNNING));
-
-		final ExecutionJobVertex executionJobVertex = executionGraph.getJobVertex(jobVertex.getID());
-		final ExecutionVertex[] taskVertices = executionJobVertex.getTaskVertices();
-		assertThat(taskVertices[0].getExecutionState(), is(ExecutionState.SCHEDULED));
-		assertThat(taskVertices[1].getExecutionState(), is(ExecutionState.SCHEDULED));
-
-		// fail the single allocated slot --> this should fail the scheduling operation
-		slot.releaseSlot(new FlinkException("Test failure"));
-
-		assertThat(executionGraph.getTerminationFuture().get(), is(JobStatus.FAILED));
-	}
-
-	/**
-	 * Tests that all slots are being returned to the {@link SlotOwner} if the
-	 * {@link ExecutionGraph} is being cancelled. See FLINK-9908
-	 */
-	@Test
-	public void testCancellationOfIncompleteScheduling() throws Exception {

Review comment:
       I think it is not strictly needed, but we are relying here on the behaviour of some other components. In general it would be nice if a component were self-contained. I think we don't have to add it now. We should just be aware of it.







[GitHub] [flink] zhuzhurk commented on a change in pull request #13641: [FLINK-17760][tests] Rework tests to not rely on legacy scheduling codes in ExecutionGraph components

Posted by GitBox <gi...@apache.org>.
zhuzhurk commented on a change in pull request #13641:
URL: https://github.com/apache/flink/pull/13641#discussion_r544264987



##########
File path: flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/ExecutionTest.java
##########
@@ -82,202 +82,12 @@
 	private final TestingComponentMainThreadExecutor testMainThreadUtil =
 		EXECUTOR_RESOURCE.getComponentMainThreadTestExecutor();
 
-	/**
-	 * Tests that slots are released if we cannot assign the allocated resource to the
-	 * Execution.
-	 */
-	@Test
-	public void testSlotReleaseOnFailedResourceAssignment() throws Exception {
-		final JobVertex jobVertex = createNoOpJobVertex();
-		final JobVertexID jobVertexId = jobVertex.getID();
-
-		final CompletableFuture<LogicalSlot> slotFuture = new CompletableFuture<>();
-		final ProgrammedSlotProvider slotProvider = new ProgrammedSlotProvider(1);
-		slotProvider.addSlot(jobVertexId, 0, slotFuture);
-
-		ExecutionGraph executionGraph = ExecutionGraphTestUtils.createSimpleTestGraph(
-			slotProvider,
-			new NoRestartStrategy(),
-			jobVertex);
-
-		executionGraph.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-
-		ExecutionJobVertex executionJobVertex = executionGraph.getJobVertex(jobVertexId);
-
-		final Execution execution = executionJobVertex.getTaskVertices()[0].getCurrentExecutionAttempt();
-
-		final SingleSlotTestingSlotOwner slotOwner = new SingleSlotTestingSlotOwner();
-
-		final LogicalSlot slot = createTestingLogicalSlot(slotOwner);
-
-		final LogicalSlot otherSlot = new TestingLogicalSlotBuilder().createTestingLogicalSlot();
-
-		CompletableFuture<Execution> allocationFuture = execution.allocateResourcesForExecution(
-			executionGraph.getSlotProviderStrategy(),
-			LocationPreferenceConstraint.ALL,
-			Collections.emptySet());
-
-		assertFalse(allocationFuture.isDone());
-
-		assertEquals(ExecutionState.SCHEDULED, execution.getState());
-
-		// assign a different resource to the execution
-		assertTrue(execution.tryAssignResource(otherSlot));
-
-		// completing now the future should cause the slot to be released
-		slotFuture.complete(slot);
-
-		assertEquals(slot, slotOwner.getReturnedSlotFuture().get());
-	}
-
 	private TestingLogicalSlot createTestingLogicalSlot(SlotOwner slotOwner) {
 		return new TestingLogicalSlotBuilder()
 			.setSlotOwner(slotOwner)
 			.createTestingLogicalSlot();
 	}
 
-	/**
-	 * Tests that the slot is released in case of a execution cancellation when having
-	 * a slot assigned and being in state SCHEDULED.
-	 */
-	@Test
-	public void testSlotReleaseOnExecutionCancellationInScheduled() throws Exception {
-		final JobVertex jobVertex = createNoOpJobVertex();
-		final JobVertexID jobVertexId = jobVertex.getID();
-
-		final SingleSlotTestingSlotOwner slotOwner = new SingleSlotTestingSlotOwner();
-
-		final LogicalSlot slot = createTestingLogicalSlot(slotOwner);
-
-		final ProgrammedSlotProvider slotProvider = new ProgrammedSlotProvider(1);
-		slotProvider.addSlot(jobVertexId, 0, CompletableFuture.completedFuture(slot));
-
-		ExecutionGraph executionGraph = ExecutionGraphTestUtils.createSimpleTestGraph(
-			slotProvider,
-			new NoRestartStrategy(),
-			jobVertex);
-
-		executionGraph.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-
-		ExecutionJobVertex executionJobVertex = executionGraph.getJobVertex(jobVertexId);
-
-		final Execution execution = executionJobVertex.getTaskVertices()[0].getCurrentExecutionAttempt();
-
-		CompletableFuture<Execution> allocationFuture = execution.allocateResourcesForExecution(
-			executionGraph.getSlotProviderStrategy(),
-			LocationPreferenceConstraint.ALL,
-			Collections.emptySet());
-
-		assertTrue(allocationFuture.isDone());
-
-		assertEquals(ExecutionState.SCHEDULED, execution.getState());
-
-		assertEquals(slot, execution.getAssignedResource());
-
-		// cancelling the execution should move it into state CANCELED
-		execution.cancel();
-		assertEquals(ExecutionState.CANCELED, execution.getState());
-
-		assertEquals(slot, slotOwner.getReturnedSlotFuture().get());
-	}
-
-	/**
-	 * Tests that the slot is released in case of a execution cancellation when being in state
-	 * RUNNING.
-	 */
-	@Test
-	public void testSlotReleaseOnExecutionCancellationInRunning() throws Exception {
-		final JobVertex jobVertex = createNoOpJobVertex();
-		final JobVertexID jobVertexId = jobVertex.getID();
-
-		final SingleSlotTestingSlotOwner slotOwner = new SingleSlotTestingSlotOwner();
-
-		final LogicalSlot slot = createTestingLogicalSlot(slotOwner);
-
-		final ProgrammedSlotProvider slotProvider = new ProgrammedSlotProvider(1);
-		slotProvider.addSlot(jobVertexId, 0, CompletableFuture.completedFuture(slot));
-
-		ExecutionGraph executionGraph = ExecutionGraphTestUtils.createSimpleTestGraph(
-			slotProvider,
-			new NoRestartStrategy(),
-			jobVertex);
-
-		ExecutionJobVertex executionJobVertex = executionGraph.getJobVertex(jobVertexId);
-
-		final Execution execution = executionJobVertex.getTaskVertices()[0].getCurrentExecutionAttempt();
-
-		CompletableFuture<Execution> allocationFuture = execution.allocateResourcesForExecution(
-			executionGraph.getSlotProviderStrategy(),
-			LocationPreferenceConstraint.ALL,
-			Collections.emptySet());
-
-		assertTrue(allocationFuture.isDone());
-
-		assertEquals(ExecutionState.SCHEDULED, execution.getState());
-
-		assertEquals(slot, execution.getAssignedResource());
-
-		execution.deploy();
-
-		execution.switchToRunning();
-
-		// cancelling the execution should move it into state CANCELING
-		execution.cancel();
-		assertEquals(ExecutionState.CANCELING, execution.getState());
-
-		execution.completeCancelling();
-
-		assertEquals(slot, slotOwner.getReturnedSlotFuture().get());
-	}
-
-	/**
-	 * Tests that a slot allocation from a {@link SlotProvider} is cancelled if the
-	 * {@link Execution} is cancelled.
-	 */
-	@Test
-	public void testSlotAllocationCancellationWhenExecutionCancelled() throws Exception {

Review comment:
       Added `DefaultSchedulerTest#allocationIsCanceledWhenVertexIsFailedOrCanceled` and `SlotPoolImpl#testShutdownCancelsAllPendingRequests` to cover this case.







[GitHub] [flink] zhuzhurk commented on a change in pull request #13641: [FLINK-17760][tests] Rework tests to not rely on legacy scheduling codes in ExecutionGraph components

Posted by GitBox <gi...@apache.org>.
zhuzhurk commented on a change in pull request #13641:
URL: https://github.com/apache/flink/pull/13641#discussion_r543306413



##########
File path: flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/ExecutionTest.java
##########
@@ -82,202 +82,12 @@
 	private final TestingComponentMainThreadExecutor testMainThreadUtil =
 		EXECUTOR_RESOURCE.getComponentMainThreadTestExecutor();
 
-	/**
-	 * Tests that slots are released if we cannot assign the allocated resource to the
-	 * Execution.
-	 */
-	@Test
-	public void testSlotReleaseOnFailedResourceAssignment() throws Exception {
-		final JobVertex jobVertex = createNoOpJobVertex();
-		final JobVertexID jobVertexId = jobVertex.getID();
-
-		final CompletableFuture<LogicalSlot> slotFuture = new CompletableFuture<>();
-		final ProgrammedSlotProvider slotProvider = new ProgrammedSlotProvider(1);
-		slotProvider.addSlot(jobVertexId, 0, slotFuture);
-
-		ExecutionGraph executionGraph = ExecutionGraphTestUtils.createSimpleTestGraph(
-			slotProvider,
-			new NoRestartStrategy(),
-			jobVertex);
-
-		executionGraph.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-
-		ExecutionJobVertex executionJobVertex = executionGraph.getJobVertex(jobVertexId);
-
-		final Execution execution = executionJobVertex.getTaskVertices()[0].getCurrentExecutionAttempt();
-
-		final SingleSlotTestingSlotOwner slotOwner = new SingleSlotTestingSlotOwner();
-
-		final LogicalSlot slot = createTestingLogicalSlot(slotOwner);
-
-		final LogicalSlot otherSlot = new TestingLogicalSlotBuilder().createTestingLogicalSlot();
-
-		CompletableFuture<Execution> allocationFuture = execution.allocateResourcesForExecution(
-			executionGraph.getSlotProviderStrategy(),
-			LocationPreferenceConstraint.ALL,
-			Collections.emptySet());
-
-		assertFalse(allocationFuture.isDone());
-
-		assertEquals(ExecutionState.SCHEDULED, execution.getState());
-
-		// assign a different resource to the execution
-		assertTrue(execution.tryAssignResource(otherSlot));
-
-		// completing now the future should cause the slot to be released
-		slotFuture.complete(slot);
-
-		assertEquals(slot, slotOwner.getReturnedSlotFuture().get());
-	}
-
 	private TestingLogicalSlot createTestingLogicalSlot(SlotOwner slotOwner) {
 		return new TestingLogicalSlotBuilder()
 			.setSlotOwner(slotOwner)
 			.createTestingLogicalSlot();
 	}
 
-	/**
-	 * Tests that the slot is released in case of a execution cancellation when having
-	 * a slot assigned and being in state SCHEDULED.
-	 */
-	@Test
-	public void testSlotReleaseOnExecutionCancellationInScheduled() throws Exception {
-		final JobVertex jobVertex = createNoOpJobVertex();
-		final JobVertexID jobVertexId = jobVertex.getID();
-
-		final SingleSlotTestingSlotOwner slotOwner = new SingleSlotTestingSlotOwner();
-
-		final LogicalSlot slot = createTestingLogicalSlot(slotOwner);
-
-		final ProgrammedSlotProvider slotProvider = new ProgrammedSlotProvider(1);
-		slotProvider.addSlot(jobVertexId, 0, CompletableFuture.completedFuture(slot));
-
-		ExecutionGraph executionGraph = ExecutionGraphTestUtils.createSimpleTestGraph(
-			slotProvider,
-			new NoRestartStrategy(),
-			jobVertex);
-
-		executionGraph.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-
-		ExecutionJobVertex executionJobVertex = executionGraph.getJobVertex(jobVertexId);
-
-		final Execution execution = executionJobVertex.getTaskVertices()[0].getCurrentExecutionAttempt();
-
-		CompletableFuture<Execution> allocationFuture = execution.allocateResourcesForExecution(
-			executionGraph.getSlotProviderStrategy(),
-			LocationPreferenceConstraint.ALL,
-			Collections.emptySet());
-
-		assertTrue(allocationFuture.isDone());
-
-		assertEquals(ExecutionState.SCHEDULED, execution.getState());
-
-		assertEquals(slot, execution.getAssignedResource());
-
-		// cancelling the execution should move it into state CANCELED
-		execution.cancel();
-		assertEquals(ExecutionState.CANCELED, execution.getState());
-
-		assertEquals(slot, slotOwner.getReturnedSlotFuture().get());
-	}
-
-	/**
-	 * Tests that the slot is released in case of a execution cancellation when being in state
-	 * RUNNING.
-	 */
-	@Test
-	public void testSlotReleaseOnExecutionCancellationInRunning() throws Exception {

Review comment:
       I think this test is not needed because:
   - the slot releasing check is covered by `ExecutionTest#testCanceledExecutionReturnsSlot`
   - the state transition check is covered in `ExecutionVertexCancelTest#testCancelFromRunning`
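
The two checks named above can be illustrated with a minimal, self-contained sketch. It is not Flink code: `SlotReturnSketch`, `RecordingSlotOwner` and `FakeExecution` are hypothetical stand-ins, and the state machine is simplified to the transitions exercised by the removed test (RUNNING to CANCELING on cancel, then slot return once cancelling completes).

```java
import java.util.concurrent.CompletableFuture;

/** Hypothetical illustration only; none of these types are Flink classes. */
final class SlotReturnSketch {

    enum State { SCHEDULED, RUNNING, CANCELING, CANCELED }

    /** Records the first slot that is handed back to it. */
    static final class RecordingSlotOwner {
        private final CompletableFuture<String> returnedSlot = new CompletableFuture<>();

        void returnSlot(String slotId) {
            returnedSlot.complete(slotId);
        }

        CompletableFuture<String> getReturnedSlotFuture() {
            return returnedSlot;
        }
    }

    /** Simplified execution that owns one slot and returns it when it reaches CANCELED. */
    static final class FakeExecution {
        private final String slotId;
        private final RecordingSlotOwner owner;
        private State state = State.SCHEDULED;

        FakeExecution(String slotId, RecordingSlotOwner owner) {
            this.slotId = slotId;
            this.owner = owner;
        }

        void switchToRunning() {
            state = State.RUNNING;
        }

        void cancel() {
            if (state == State.SCHEDULED) {
                // Nothing is deployed yet, so cancellation finishes immediately.
                state = State.CANCELED;
                owner.returnSlot(slotId);
            } else if (state == State.RUNNING) {
                // A running task must acknowledge the cancellation first.
                state = State.CANCELING;
            }
        }

        void completeCancelling() {
            if (state == State.CANCELING) {
                state = State.CANCELED;
                owner.returnSlot(slotId);
            }
        }

        State getState() {
            return state;
        }
    }

    public static void main(String[] args) {
        RecordingSlotOwner owner = new RecordingSlotOwner();
        FakeExecution execution = new FakeExecution("slot-1", owner);

        execution.switchToRunning();
        execution.cancel();
        System.out.println(execution.getState());

        execution.completeCancelling();
        System.out.println(owner.getReturnedSlotFuture().getNow("not returned"));
    }
}
```

Splitting the assertions this way (one test for the state transition, one for the slot return) is the design choice the comment argues for; the sketch only shows that the two concerns can be checked independently.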







[GitHub] [flink] azagrebin commented on a change in pull request #13641: [FLINK-17760][tests] Rework tests to not rely on legacy scheduling codes in ExecutionGraph components

Posted by GitBox <gi...@apache.org>.
azagrebin commented on a change in pull request #13641:
URL: https://github.com/apache/flink/pull/13641#discussion_r536051606



##########
File path: flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/GlobalModVersionTest.java
##########
@@ -1,200 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- *     http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-
-package org.apache.flink.runtime.executiongraph;
-
-import org.apache.flink.api.common.JobID;
-import org.apache.flink.api.common.JobStatus;
-import org.apache.flink.runtime.concurrent.ComponentMainThreadExecutorServiceAdapter;
-import org.apache.flink.runtime.execution.ExecutionState;
-import org.apache.flink.runtime.executiongraph.failover.FailoverStrategy;
-import org.apache.flink.runtime.executiongraph.failover.FailoverStrategy.Factory;
-import org.apache.flink.runtime.executiongraph.restart.InfiniteDelayRestartStrategy;
-import org.apache.flink.runtime.executiongraph.utils.SimpleSlotProvider;
-import org.apache.flink.runtime.jobgraph.JobGraph;
-import org.apache.flink.runtime.jobgraph.JobVertex;
-import org.apache.flink.runtime.testtasks.NoOpInvokable;
-import org.apache.flink.util.TestLogger;
-
-import org.junit.Test;
-
-import java.util.Random;
-
-import static org.apache.flink.runtime.executiongraph.ExecutionGraphTestUtils.waitUntilExecutionState;
-import static org.junit.Assert.assertEquals;
-import static org.mockito.Mockito.any;
-import static org.mockito.Mockito.mock;
-import static org.mockito.Mockito.times;
-import static org.mockito.Mockito.verify;
-
-public class GlobalModVersionTest extends TestLogger {

Review comment:
       True, we test the new versioning in `DefaultSchedulerTest`, and `globalModVersion` is no longer used.
   We still have global/local failures in `DefaultScheduler` though. My question was more about whether we need tests for global/local/double failures in `DefaultSchedulerTest` to see how they affect job state/versions, similar to what we have in e.g. `GlobalModVersionTest::testNoLocalFailoverWhileFailing`.
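
For readers who have not seen the versioning idea behind this question, here is a minimal sketch of how a version counter can stop a local failover from being applied after a global failure has started. It assumes a single monotonically increasing counter; `VersionedRecovery` and its methods are hypothetical names and do not reflect the actual `ExecutionGraph` or `DefaultScheduler` implementation.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical illustration of version-guarded recovery; not Flink's API. */
final class VersionedRecovery {

    private final AtomicLong globalVersion = new AtomicLong();

    /** A local failover records the version it started under. */
    long currentVersion() {
        return globalVersion.get();
    }

    /** A global failure bumps the version, which invalidates in-flight local recoveries. */
    long onGlobalFailure() {
        return globalVersion.incrementAndGet();
    }

    /** A local recovery may proceed only if no global failure happened since it started. */
    boolean mayApplyLocalRecovery(long versionAtStart) {
        return globalVersion.get() == versionAtStart;
    }

    public static void main(String[] args) {
        VersionedRecovery recovery = new VersionedRecovery();
        long localStart = recovery.currentVersion();

        System.out.println(recovery.mayApplyLocalRecovery(localStart));
        recovery.onGlobalFailure();
        System.out.println(recovery.mayApplyLocalRecovery(localStart));
    }
}
```

A test for the "double failure" case would follow the same shape: start a local recovery, trigger a global failure, and assert that the stale local recovery is rejected.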







[GitHub] [flink] zhuzhurk commented on a change in pull request #13641: [WIP][FLINK-17760][tests] Rework tests to not rely on legacy scheduling logics in ExecutionGraph anymore

Posted by GitBox <gi...@apache.org>.
zhuzhurk commented on a change in pull request #13641:
URL: https://github.com/apache/flink/pull/13641#discussion_r512792107



##########
File path: flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/ExecutionGraphDeploymentTest.java
##########
@@ -566,174 +552,6 @@ public void testSettingIllegalMaxNumberOfCheckpointsToRetain() throws Exception
 			eg.getCheckpointCoordinator().getCheckpointStore().getMaxNumberOfRetainedCheckpoints());
 	}
 
-	/**
-	 * Tests that eager scheduling will wait until all input locations have been set before
-	 * scheduling a task.
-	 */
-	@Test
-	public void testEagerSchedulingWaitsOnAllInputPreferredLocations() throws Exception {
-		final int parallelism = 2;
-		final ProgrammedSlotProvider slotProvider = new ProgrammedSlotProvider(parallelism);
-
-		final Time timeout = Time.hours(1L);
-		final JobVertexID sourceVertexId = new JobVertexID();
-		final JobVertex sourceVertex = new JobVertex("Test source", sourceVertexId);
-		sourceVertex.setInvokableClass(NoOpInvokable.class);
-		sourceVertex.setParallelism(parallelism);
-
-		final JobVertexID sinkVertexId = new JobVertexID();
-		final JobVertex sinkVertex = new JobVertex("Test sink", sinkVertexId);
-		sinkVertex.setInvokableClass(NoOpInvokable.class);
-		sinkVertex.setParallelism(parallelism);
-
-		sinkVertex.connectNewDataSetAsInput(sourceVertex, DistributionPattern.ALL_TO_ALL, ResultPartitionType.PIPELINED);
-
-		final Map<JobVertexID, CompletableFuture<LogicalSlot>[]> slotFutures = new HashMap<>(2);
-
-		for (JobVertexID jobVertexID : Arrays.asList(sourceVertexId, sinkVertexId)) {
-			CompletableFuture<LogicalSlot>[] slotFutureArray = new CompletableFuture[parallelism];
-
-			for (int i = 0; i < parallelism; i++) {
-				slotFutureArray[i] = new CompletableFuture<>();
-			}
-
-			slotFutures.put(jobVertexID, slotFutureArray);
-			slotProvider.addSlots(jobVertexID, slotFutureArray);
-		}
-
-		final ScheduledExecutorService scheduledExecutorService = new ScheduledThreadPoolExecutor(3);
-
-		final JobGraph jobGraph = new JobGraph(sourceVertex, sinkVertex);
-		jobGraph.setScheduleMode(ScheduleMode.EAGER);
-
-		final ExecutionGraph executionGraph = TestingExecutionGraphBuilder
-			.newBuilder()
-			.setJobGraph(jobGraph)
-			.setSlotProvider(slotProvider)
-			.setIoExecutor(scheduledExecutorService)
-			.setFutureExecutor(scheduledExecutorService)
-			.setAllocationTimeout(timeout)
-			.setRpcTimeout(timeout)
-			.build();
-
-		executionGraph.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-		executionGraph.scheduleForExecution();
-
-		// all tasks should be in state SCHEDULED
-		for (ExecutionVertex executionVertex : executionGraph.getAllExecutionVertices()) {
-			assertEquals(ExecutionState.SCHEDULED, executionVertex.getCurrentExecutionAttempt().getState());
-		}
-
-		// wait until the source vertex slots have been requested
-		assertTrue(slotProvider.getSlotRequestedFuture(sourceVertexId, 0).get());
-		assertTrue(slotProvider.getSlotRequestedFuture(sourceVertexId, 1).get());
-
-		// check that the sinks have not requested their slots because they need the location
-		// information of the sources
-		assertFalse(slotProvider.getSlotRequestedFuture(sinkVertexId, 0).isDone());
-		assertFalse(slotProvider.getSlotRequestedFuture(sinkVertexId, 1).isDone());
-
-		final TaskManagerLocation localTaskManagerLocation = new LocalTaskManagerLocation();
-
-		final LogicalSlot sourceSlot1 = createSlot(localTaskManagerLocation, 0);
-		final LogicalSlot sourceSlot2 = createSlot(localTaskManagerLocation, 1);
-
-		final LogicalSlot sinkSlot1 = createSlot(localTaskManagerLocation, 0);
-		final LogicalSlot sinkSlot2 = createSlot(localTaskManagerLocation, 1);
-
-		slotFutures.get(sourceVertexId)[0].complete(sourceSlot1);
-		slotFutures.get(sourceVertexId)[1].complete(sourceSlot2);
-
-		// wait until the sink vertex slots have been requested after we completed the source slots
-		assertTrue(slotProvider.getSlotRequestedFuture(sinkVertexId, 0).get());
-		assertTrue(slotProvider.getSlotRequestedFuture(sinkVertexId, 1).get());
-
-		slotFutures.get(sinkVertexId)[0].complete(sinkSlot1);
-		slotFutures.get(sinkVertexId)[1].complete(sinkSlot2);
-
-		for (ExecutionVertex executionVertex : executionGraph.getAllExecutionVertices()) {
-			ExecutionGraphTestUtils.waitUntilExecutionState(executionVertex.getCurrentExecutionAttempt(), ExecutionState.DEPLOYING, 5000L);
-		}
-	}
-
-	/**
-	 * Tests that the {@link ExecutionGraph} is deployed in topological order.
-	 */
-	@Test
-	public void testExecutionGraphIsDeployedInTopologicalOrder() throws Exception {

Review comment:
       Can be removed because we already have `DefaultSchedulerTest#scheduledVertexOrderFromSchedulingStrategyIsRespected()`
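
As an illustration of what an ordering check of this kind verifies, the sketch below tests whether an observed deployment order places every producer before its consumer. It is a hypothetical helper written for this note, not the logic of `DefaultSchedulerTest`; the vertex names and the `respectsEdges` method are assumptions.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper; not part of Flink's test utilities. */
final class TopologicalOrderCheck {

    /**
     * Returns true if, for every edge {producer, consumer}, the producer appears
     * before the consumer in the observed order. Unknown vertices fail the check.
     */
    static boolean respectsEdges(List<String> observedOrder, String[][] edges) {
        Map<String, Integer> position = new HashMap<>();
        for (int i = 0; i < observedOrder.size(); i++) {
            position.put(observedOrder.get(i), i);
        }
        for (String[] edge : edges) {
            Integer producer = position.get(edge[0]);
            Integer consumer = position.get(edge[1]);
            if (producer == null || consumer == null || producer >= consumer) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String[][] edges = {{"source", "map"}, {"map", "sink"}};
        System.out.println(respectsEdges(Arrays.asList("source", "map", "sink"), edges));
        System.out.println(respectsEdges(Arrays.asList("map", "source", "sink"), edges));
    }
}
```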







[GitHub] [flink] flinkbot edited a comment on pull request #13641: [FLINK-17760][tests] Rework tests to not rely on legacy scheduling codes in ExecutionGraph components

Posted by GitBox <gi...@apache.org>.
flinkbot edited a comment on pull request #13641:
URL: https://github.com/apache/flink/pull/13641#issuecomment-708569491


   ## CI report:
   
   * 09d8deb89416f53dfe8b5c16fb9d723cbd98612c UNKNOWN
   * fe1562c5cda8ecb15f6af1afdf7b6217e6c20c42 UNKNOWN
   * 52eaaffa8fb4cad5cca9a26234b92a33c37c0dfa Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10435) 
   * 89bea5233d5efb9db88eacc21b445a617a8c3c27 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     The @flinkbot bot supports the following commands:
   
    - `@flinkbot run travis` re-run the last Travis build
    - `@flinkbot run azure` re-run the last Azure build
   </details>





[GitHub] [flink] flinkbot edited a comment on pull request #13641: [FLINK-17760][tests] Rework tests to not rely on legacy scheduling codes in ExecutionGraph components

Posted by GitBox <gi...@apache.org>.
flinkbot edited a comment on pull request #13641:
URL: https://github.com/apache/flink/pull/13641#issuecomment-708569491


   ## CI report:
   
   * 09d8deb89416f53dfe8b5c16fb9d723cbd98612c UNKNOWN
   * fe1562c5cda8ecb15f6af1afdf7b6217e6c20c42 UNKNOWN
   * dfd82cd0de7a46432eae70f1f86aafb823ed7ff2 Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10935) 
   
   <details>
   <summary>Bot commands</summary>
     The @flinkbot bot supports the following commands:
   
    - `@flinkbot run travis` re-run the last Travis build
    - `@flinkbot run azure` re-run the last Azure build
   </details>





[GitHub] [flink] tillrohrmann commented on pull request #13641: [FLINK-17760][tests] Rework tests to not rely on legacy scheduling codes in ExecutionGraph components

Posted by GitBox <gi...@apache.org>.
tillrohrmann commented on pull request #13641:
URL: https://github.com/apache/flink/pull/13641#issuecomment-747373482


   Things are looking good to me @zhuzhurk. Any other changes you want to apply?





[GitHub] [flink] flinkbot edited a comment on pull request #13641: [FLINK-17760][tests] Rework tests to not rely on legacy scheduling codes in ExecutionGraph components

Posted by GitBox <gi...@apache.org>.
flinkbot edited a comment on pull request #13641:
URL: https://github.com/apache/flink/pull/13641#issuecomment-708569491


   ## CI report:
   
   * 09d8deb89416f53dfe8b5c16fb9d723cbd98612c UNKNOWN
   * 021cac170ea26cddfd8af0a2bec5fea4e6a76b69 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=9756) 
   





[GitHub] [flink] zhuzhurk commented on a change in pull request #13641: [FLINK-17760][tests] Rework tests to not rely on legacy scheduling codes in ExecutionGraph components

Posted by GitBox <gi...@apache.org>.
zhuzhurk commented on a change in pull request #13641:
URL: https://github.com/apache/flink/pull/13641#discussion_r532552567



##########
File path: flink-runtime/src/test/java/org/apache/flink/runtime/checkpoint/ExecutionGraphCheckpointCoordinatorTest.java
##########
@@ -159,7 +164,7 @@ private ExecutionGraph createExecutionGraphAndEnableCheckpointing(
 			false,
 			0);
 
-		executionGraph.enableCheckpointing(
+		scheduler.getExecutionGraph().enableCheckpointing(

Review comment:
       The `CheckpointIDCounter` and `CompletedCheckpointStore` are supplied by the tests, so there is no easy way to do it the way `ArchivedExecutionGraphTest` does.
   In the future, `enableCheckpointing` could possibly be factored out of the `ExecutionGraph` (EG), maybe into `SchedulerNG`. Once that happens, we will no longer need to invoke it on the EG.







[GitHub] [flink] flinkbot edited a comment on pull request #13641: [FLINK-17760][tests] Rework tests to not rely on legacy scheduling codes in ExecutionGraph components

Posted by GitBox <gi...@apache.org>.
flinkbot edited a comment on pull request #13641:
URL: https://github.com/apache/flink/pull/13641#issuecomment-708569491


   ## CI report:
   
   * 09d8deb89416f53dfe8b5c16fb9d723cbd98612c UNKNOWN
   * fe1562c5cda8ecb15f6af1afdf7b6217e6c20c42 UNKNOWN
   * 89bea5233d5efb9db88eacc21b445a617a8c3c27 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10899) 
   * dfd82cd0de7a46432eae70f1f86aafb823ed7ff2 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10935) 
   





[GitHub] [flink] zhuzhurk commented on a change in pull request #13641: [WIP][FLINK-17760][tests] Rework tests to not rely on legacy scheduling logics in ExecutionGraph anymore

Posted by GitBox <gi...@apache.org>.
zhuzhurk commented on a change in pull request #13641:
URL: https://github.com/apache/flink/pull/13641#discussion_r514046235



##########
File path: flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/ExecutionGraphSchedulingTest.java
##########
@@ -1,637 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- *     http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-
-package org.apache.flink.runtime.executiongraph;
-
-import org.apache.flink.api.common.JobID;
-import org.apache.flink.api.common.JobStatus;
-import org.apache.flink.api.common.time.Time;
-import org.apache.flink.configuration.Configuration;
-import org.apache.flink.metrics.groups.UnregisteredMetricsGroup;
-import org.apache.flink.runtime.blob.VoidBlobWriter;
-import org.apache.flink.runtime.checkpoint.StandaloneCheckpointRecoveryFactory;
-import org.apache.flink.runtime.clusterframework.types.AllocationID;
-import org.apache.flink.runtime.clusterframework.types.ResourceID;
-import org.apache.flink.runtime.concurrent.ComponentMainThreadExecutorServiceAdapter;
-import org.apache.flink.runtime.execution.ExecutionState;
-import org.apache.flink.runtime.executiongraph.restart.NoRestartStrategy;
-import org.apache.flink.runtime.executiongraph.utils.SimpleAckingTaskManagerGateway;
-import org.apache.flink.runtime.instance.SimpleSlotContext;
-import org.apache.flink.runtime.io.network.partition.NoOpJobMasterPartitionTracker;
-import org.apache.flink.runtime.io.network.partition.ResultPartitionType;
-import org.apache.flink.runtime.jobgraph.DistributionPattern;
-import org.apache.flink.runtime.jobgraph.JobGraph;
-import org.apache.flink.runtime.jobgraph.JobVertex;
-import org.apache.flink.runtime.jobgraph.ScheduleMode;
-import org.apache.flink.runtime.jobmanager.scheduler.Locality;
-import org.apache.flink.runtime.jobmanager.slots.DummySlotOwner;
-import org.apache.flink.runtime.jobmanager.slots.TaskManagerGateway;
-import org.apache.flink.runtime.jobmanager.slots.TestingSlotOwner;
-import org.apache.flink.runtime.jobmaster.LogicalSlot;
-import org.apache.flink.runtime.jobmaster.SlotOwner;
-import org.apache.flink.runtime.jobmaster.SlotRequestId;
-import org.apache.flink.runtime.jobmaster.TestingLogicalSlot;
-import org.apache.flink.runtime.jobmaster.TestingLogicalSlotBuilder;
-import org.apache.flink.runtime.jobmaster.slotpool.SingleLogicalSlot;
-import org.apache.flink.runtime.jobmaster.slotpool.SlotProvider;
-import org.apache.flink.runtime.shuffle.NettyShuffleMaster;
-import org.apache.flink.runtime.taskmanager.TaskManagerLocation;
-import org.apache.flink.runtime.testtasks.NoOpInvokable;
-import org.apache.flink.runtime.testutils.DirectScheduledExecutorService;
-import org.apache.flink.util.FlinkException;
-import org.apache.flink.util.TestLogger;
-
-import org.junit.After;
-import org.junit.Test;
-
-import javax.annotation.Nonnull;
-
-import java.net.InetAddress;
-import java.util.Set;
-import java.util.concurrent.ArrayBlockingQueue;
-import java.util.concurrent.BlockingQueue;
-import java.util.concurrent.CompletableFuture;
-import java.util.concurrent.ConcurrentHashMap;
-import java.util.concurrent.ConcurrentMap;
-import java.util.concurrent.ScheduledExecutorService;
-import java.util.concurrent.TimeUnit;
-import java.util.concurrent.TimeoutException;
-
-import static org.hamcrest.Matchers.empty;
-import static org.hamcrest.Matchers.is;
-import static org.junit.Assert.assertEquals;
-import static org.junit.Assert.assertFalse;
-import static org.junit.Assert.assertThat;
-
-/**
- * Tests for the scheduling of the execution graph. This tests that
- * for example the order of deployments is correct and that bulk slot allocation
- * works properly.
- */
-public class ExecutionGraphSchedulingTest extends TestLogger {

Review comment:
       This test can be removed because it is tightly coupled to legacy scheduling.
   The new scheduling is already covered by `DefaultSchedulerTest` and the `SchedulingStrategy` tests.







[GitHub] [flink] zhuzhurk commented on a change in pull request #13641: [FLINK-17760][tests] Rework tests to not rely on legacy scheduling codes in ExecutionGraph components

Posted by GitBox <gi...@apache.org>.
zhuzhurk commented on a change in pull request #13641:
URL: https://github.com/apache/flink/pull/13641#discussion_r544246284



##########
File path: flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/ExecutionGraphRestartTest.java
##########
@@ -1,866 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- *     http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-
-package org.apache.flink.runtime.executiongraph;
-
-import org.apache.flink.api.common.ExecutionConfig;
-import org.apache.flink.api.common.JobID;
-import org.apache.flink.api.common.JobStatus;
-import org.apache.flink.api.common.restartstrategy.RestartStrategies;
-import org.apache.flink.api.common.time.Time;
-import org.apache.flink.runtime.clusterframework.types.AllocationID;
-import org.apache.flink.runtime.clusterframework.types.ResourceProfile;
-import org.apache.flink.runtime.concurrent.ComponentMainThreadExecutor;
-import org.apache.flink.runtime.concurrent.ComponentMainThreadExecutorServiceAdapter;
-import org.apache.flink.runtime.execution.ExecutionState;
-import org.apache.flink.runtime.execution.SuppressRestartsException;
-import org.apache.flink.runtime.executiongraph.failover.FailoverStrategy;
-import org.apache.flink.runtime.executiongraph.restart.FixedDelayRestartStrategy;
-import org.apache.flink.runtime.executiongraph.restart.InfiniteDelayRestartStrategy;
-import org.apache.flink.runtime.executiongraph.restart.RestartStrategy;
-import org.apache.flink.runtime.executiongraph.utils.NotCancelAckingTaskGateway;
-import org.apache.flink.runtime.executiongraph.utils.SimpleAckingTaskManagerGateway;
-import org.apache.flink.runtime.executiongraph.utils.SimpleSlotProvider;
-import org.apache.flink.runtime.io.network.partition.ResultPartitionType;
-import org.apache.flink.runtime.jobgraph.DistributionPattern;
-import org.apache.flink.runtime.jobgraph.JobGraph;
-import org.apache.flink.runtime.jobgraph.JobVertex;
-import org.apache.flink.runtime.jobgraph.ScheduleMode;
-import org.apache.flink.runtime.jobmanager.scheduler.SlotSharingGroup;
-import org.apache.flink.runtime.jobmanager.slots.TaskManagerGateway;
-import org.apache.flink.runtime.jobmaster.JobMasterId;
-import org.apache.flink.runtime.jobmaster.slotpool.LocationPreferenceSlotSelectionStrategy;
-import org.apache.flink.runtime.jobmaster.slotpool.Scheduler;
-import org.apache.flink.runtime.jobmaster.slotpool.SchedulerImpl;
-import org.apache.flink.runtime.jobmaster.slotpool.SlotPool;
-import org.apache.flink.runtime.jobmaster.slotpool.SlotPoolImpl;
-import org.apache.flink.runtime.jobmaster.slotpool.SlotProvider;
-import org.apache.flink.runtime.jobmaster.slotpool.TestingSlotPoolImpl;
-import org.apache.flink.runtime.resourcemanager.ResourceManagerGateway;
-import org.apache.flink.runtime.resourcemanager.utils.TestingResourceManagerGateway;
-import org.apache.flink.runtime.taskexecutor.slot.SlotOffer;
-import org.apache.flink.runtime.taskmanager.LocalTaskManagerLocation;
-import org.apache.flink.runtime.taskmanager.TaskManagerLocation;
-import org.apache.flink.runtime.testtasks.NoOpInvokable;
-import org.apache.flink.util.FlinkException;
-import org.apache.flink.util.TestLogger;
-
-import org.junit.Test;
-
-import javax.annotation.Nonnull;
-
-import java.io.IOException;
-import java.util.ArrayList;
-import java.util.Iterator;
-import java.util.List;
-import java.util.concurrent.CompletableFuture;
-import java.util.function.Consumer;
-
-import static org.apache.flink.runtime.executiongraph.ExecutionGraphTestUtils.completeCancellingForAllVertices;
-import static org.apache.flink.runtime.executiongraph.ExecutionGraphTestUtils.createNoOpVertex;
-import static org.apache.flink.runtime.executiongraph.ExecutionGraphTestUtils.finishAllVertices;
-import static org.apache.flink.runtime.executiongraph.ExecutionGraphTestUtils.switchToRunning;
-import static org.hamcrest.Matchers.is;
-import static org.hamcrest.Matchers.lessThanOrEqualTo;
-import static org.junit.Assert.assertEquals;
-import static org.junit.Assert.assertNotEquals;
-import static org.junit.Assert.assertNotNull;
-import static org.junit.Assert.assertThat;
-import static org.junit.Assert.assertTrue;
-
-/**
- * Tests the restart behaviour of the {@link ExecutionGraph}.
- */
-public class ExecutionGraphRestartTest extends TestLogger {

Review comment:
       I think you are right that we should keep these tests for concurrent canceling/failing/suspending.
   But given that there is another `testSuspendWhileRestarting` in `ExecutionGraphSuspendTest`, I think we can drop this one and slightly improve that one to check whether the job leaves the SUSPENDED state once the task restart is triggered later, so that we do not need to maintain duplicated tests.
   
   Good and bad news: while reworking these tests, I found a bug, "Canceling a job when it is failing will result in job hanging in CANCELING state" (FLINK-20626). PR #14405 has been opened to fix it, and it includes the rework of `ExecutionGraphRestartTest`. Would you take a look?
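
   For illustration only, the kind of check proposed for `testSuspendWhileRestarting` can be sketched with a toy state machine. This is not Flink code: `ToyJob`, its `Status` enum, and every method name below are hypothetical, and the assumption that a delayed restart must be ignored after the job has been suspended is modeled here rather than taken from the actual `ExecutionGraph` implementation.

```java
import java.util.ArrayDeque;
import java.util.Queue;

/** Hypothetical, simplified model of a job; it does not use any Flink classes. */
public class SuspendWhileRestartingSketch {

    enum Status { RUNNING, RESTARTING, SUSPENDED }

    /** Toy job whose restart is deferred until the queued task is run manually. */
    static final class ToyJob {
        private Status status = Status.RUNNING;
        private final Queue<Runnable> delayedTasks = new ArrayDeque<>();

        /** Simulates a failure: the job goes to RESTARTING and a delayed restart is queued. */
        void failAndScheduleRestart() {
            status = Status.RESTARTING;
            // The restart only takes effect if the job is still RESTARTING when it fires.
            delayedTasks.add(() -> {
                if (status == Status.RESTARTING) {
                    status = Status.RUNNING;
                }
            });
        }

        /** Simulates suspending the job, for example on loss of leadership. */
        void suspend() {
            status = Status.SUSPENDED;
        }

        /** Runs all queued delayed tasks, as a manually triggered executor would in a test. */
        void triggerDelayedTasks() {
            Runnable task;
            while ((task = delayedTasks.poll()) != null) {
                task.run();
            }
        }

        Status getStatus() {
            return status;
        }
    }

    /** Suspends a restarting job, fires the pending restart, and returns the final status. */
    static Status suspendThenFireRestart() {
        ToyJob job = new ToyJob();
        job.failAndScheduleRestart();
        job.suspend();
        job.triggerDelayedTasks();
        return job.getStatus();
    }

    public static void main(String[] args) {
        System.out.println(suspendThenFireRestart());
    }
}
```

   The point of the sketch is the ordering: the restart is triggered only after the suspension, and the assertion is made on the status afterwards rather than immediately after `suspend()`.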
   
   
   
   







[GitHub] [flink] zhuzhurk commented on a change in pull request #13641: [FLINK-17760][tests] Rework tests to not rely on legacy scheduling codes in ExecutionGraph components

Posted by GitBox <gi...@apache.org>.
zhuzhurk commented on a change in pull request #13641:
URL: https://github.com/apache/flink/pull/13641#discussion_r532533688



##########
File path: flink-runtime/src/main/java/org/apache/flink/runtime/jobmaster/JobMaster.java
##########
@@ -917,15 +917,15 @@ private void resetAndStartScheduler() throws Exception {
 
 		if (schedulerNG.requestJobStatus() == JobStatus.CREATED) {
 			schedulerAssignedFuture = CompletableFuture.completedFuture(null);
-			schedulerNG.setMainThreadExecutor(getMainThreadExecutor());
+			schedulerNG.start(getMainThreadExecutor());

Review comment:
       done.







[GitHub] [flink] flinkbot edited a comment on pull request #13641: [WIP][FLINK-17760][tests] Rework tests to not rely on legacy scheduling logics in ExecutionGraph anymore

Posted by GitBox <gi...@apache.org>.
flinkbot edited a comment on pull request #13641:
URL: https://github.com/apache/flink/pull/13641#issuecomment-708569491


   ## CI report:
   
   * 09d8deb89416f53dfe8b5c16fb9d723cbd98612c UNKNOWN
   * fef9bff28a988ffab789fc6bb0cbde754273a2e0 Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=7630) 
   * 1e959ffb3e7837247842ae4ade724a999ad7ca3b UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     The @flinkbot bot supports the following commands:
   
    - `@flinkbot run travis` re-run the last Travis build
    - `@flinkbot run azure` re-run the last Azure build
   </details>





[GitHub] [flink] zhuzhurk commented on a change in pull request #13641: [FLINK-17760][tests] Rework tests to not rely on legacy scheduling codes in ExecutionGraph components

Posted by GitBox <gi...@apache.org>.
zhuzhurk commented on a change in pull request #13641:
URL: https://github.com/apache/flink/pull/13641#discussion_r532444044



##########
File path: flink-runtime/src/main/java/org/apache/flink/runtime/jobmaster/JobMaster.java
##########
@@ -917,15 +917,15 @@ private void resetAndStartScheduler() throws Exception {
 
 		if (schedulerNG.requestJobStatus() == JobStatus.CREATED) {
 			schedulerAssignedFuture = CompletableFuture.completedFuture(null);
-			schedulerNG.setMainThreadExecutor(getMainThreadExecutor());
+			schedulerNG.start(getMainThreadExecutor());

Review comment:
       `initialize` sounds good to me
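
The rename discussed here can be illustrated with a minimal sketch. It assumes that the method only hands over the main-thread executor and does not itself begin scheduling; that reading is an assumption, not something the diff states. `ToyScheduler`, `startScheduling` and `isStarted` are hypothetical names for illustration and are not Flink's `SchedulerNG` API.

```java
import java.util.Objects;
import java.util.concurrent.Executor;

public final class InitializeSketch {

    /** Hypothetical stand-in for a scheduler; NOT Flink's SchedulerNG. */
    static final class ToyScheduler {
        private Executor mainThreadExecutor; // null until initialize() is called
        private boolean started;

        /** Wires in the executor only; no scheduling happens here (assumption). */
        void initialize(Executor mainThreadExecutor) {
            if (this.mainThreadExecutor != null) {
                throw new IllegalStateException("already initialized");
            }
            this.mainThreadExecutor = Objects.requireNonNull(mainThreadExecutor);
        }

        /** Begins scheduling on the executor supplied via initialize(). */
        void startScheduling() {
            if (mainThreadExecutor == null) {
                throw new IllegalStateException("initialize() must be called first");
            }
            mainThreadExecutor.execute(() -> started = true);
        }

        boolean isStarted() {
            return started;
        }
    }

    public static void main(String[] args) {
        ToyScheduler scheduler = new ToyScheduler();
        scheduler.initialize(Runnable::run); // direct executor: runs on the calling thread
        scheduler.startScheduling();
        System.out.println(scheduler.isStarted());
    }
}
```

Under that assumption, a name like `initialize` keeps the setup step visibly separate from the call that actually starts scheduling, which a name like `start` could blur.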







[GitHub] [flink] zhuzhurk commented on a change in pull request #13641: [WIP][FLINK-17760][tests] Rework tests to not rely on legacy scheduling logics in ExecutionGraph anymore

Posted by GitBox <gi...@apache.org>.
zhuzhurk commented on a change in pull request #13641:
URL: https://github.com/apache/flink/pull/13641#discussion_r514122257



##########
File path: flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/ExecutionVertexInputConstraintTest.java
##########
@@ -1,276 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- *     http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-
-package org.apache.flink.runtime.executiongraph;
-
-import org.apache.flink.api.common.InputDependencyConstraint;
-import org.apache.flink.api.common.JobStatus;
-import org.apache.flink.runtime.concurrent.ComponentMainThreadExecutor;
-import org.apache.flink.runtime.concurrent.ComponentMainThreadExecutorServiceAdapter;
-import org.apache.flink.runtime.execution.ExecutionState;
-import org.apache.flink.runtime.executiongraph.utils.SimpleSlotProvider;
-import org.apache.flink.runtime.io.network.partition.ResultPartitionID;
-import org.apache.flink.runtime.io.network.partition.ResultPartitionType;
-import org.apache.flink.runtime.jobgraph.DistributionPattern;
-import org.apache.flink.runtime.jobgraph.JobGraph;
-import org.apache.flink.runtime.jobgraph.JobVertex;
-import org.apache.flink.runtime.jobgraph.JobVertexID;
-import org.apache.flink.runtime.jobgraph.tasks.AbstractInvokable;
-import org.apache.flink.runtime.jobmaster.slotpool.SlotProvider;
-import org.apache.flink.runtime.taskmanager.TaskExecutionState;
-import org.apache.flink.util.TestLogger;
-
-import org.junit.Test;
-
-import java.time.Duration;
-import java.util.Arrays;
-import java.util.List;
-
-import static org.apache.flink.runtime.executiongraph.ExecutionGraphTestUtils.isInExecutionState;
-import static org.apache.flink.runtime.executiongraph.ExecutionGraphTestUtils.waitForAllExecutionsPredicate;
-import static org.apache.flink.runtime.executiongraph.ExecutionGraphTestUtils.waitUntilExecutionVertexState;
-import static org.apache.flink.runtime.executiongraph.ExecutionGraphTestUtils.waitUntilJobStatus;
-import static org.hamcrest.Matchers.lessThan;
-import static org.junit.Assert.assertFalse;
-import static org.junit.Assert.assertThat;
-import static org.junit.Assert.assertTrue;
-
-/**
- * Tests for the inputs constraint for {@link ExecutionVertex}.
- */
-public class ExecutionVertexInputConstraintTest extends TestLogger {

Review comment:
       This test is superseded by `InputDependencyConstraintCheckerTest`.
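
For readers unfamiliar with input dependency constraints, the following toy model sketches the kind of rule such tests exercise. The `ANY` and `ALL` names mirror Flink's `InputDependencyConstraint` enum, which the deleted test imports; everything else (`InputConstraintSketch`, `canSchedule`, the list-of-booleans input) is hypothetical and is not the logic of `InputDependencyConstraintChecker`.

```java
import java.util.Arrays;
import java.util.List;

public final class InputConstraintSketch {

    /** Mirrors the two values of Flink's InputDependencyConstraint enum. */
    enum Constraint { ANY, ALL }

    /**
     * Toy rule: decides whether a vertex may be scheduled, given one flag per
     * input saying whether that input is consumable. Hypothetical helper.
     */
    static boolean canSchedule(Constraint constraint, List<Boolean> inputConsumable) {
        if (inputConsumable.isEmpty()) {
            return true; // assumption: a vertex without inputs is always schedulable
        }
        switch (constraint) {
            case ANY:
                return inputConsumable.contains(Boolean.TRUE);   // at least one input ready
            case ALL:
                return !inputConsumable.contains(Boolean.FALSE); // every input ready
            default:
                throw new IllegalArgumentException("unknown constraint: " + constraint);
        }
    }

    public static void main(String[] args) {
        List<Boolean> oneOfTwoReady = Arrays.asList(true, false);
        System.out.println(canSchedule(Constraint.ANY, oneOfTwoReady));
        System.out.println(canSchedule(Constraint.ALL, oneOfTwoReady));
    }
}
```

A dedicated checker test can assert on a pure decision function of this shape directly, without constructing an `ExecutionGraph`.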







[GitHub] [flink] flinkbot edited a comment on pull request #13641: [FLINK-17760][tests] Rework tests to not rely on legacy scheduling codes in ExecutionGraph components

Posted by GitBox <gi...@apache.org>.
flinkbot edited a comment on pull request #13641:
URL: https://github.com/apache/flink/pull/13641#issuecomment-708569491


   ## CI report:
   
   * 09d8deb89416f53dfe8b5c16fb9d723cbd98612c UNKNOWN
   * fe1562c5cda8ecb15f6af1afdf7b6217e6c20c42 UNKNOWN
   * 52eaaffa8fb4cad5cca9a26234b92a33c37c0dfa Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10435) 
   
   <details>
   <summary>Bot commands</summary>
     The @flinkbot bot supports the following commands:
   
    - `@flinkbot run travis` re-run the last Travis build
    - `@flinkbot run azure` re-run the last Azure build
   </details>





[GitHub] [flink] zhuzhurk commented on a change in pull request #13641: [FLINK-17760][tests] Rework tests to not rely on legacy scheduling codes in ExecutionGraph components

Posted by GitBox <gi...@apache.org>.
zhuzhurk commented on a change in pull request #13641:
URL: https://github.com/apache/flink/pull/13641#discussion_r543404219



##########
File path: flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/ExecutionGraphSchedulingTest.java
##########
@@ -1,637 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- *     http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-
-package org.apache.flink.runtime.executiongraph;
-
-import org.apache.flink.api.common.JobID;
-import org.apache.flink.api.common.JobStatus;
-import org.apache.flink.api.common.time.Time;
-import org.apache.flink.configuration.Configuration;
-import org.apache.flink.metrics.groups.UnregisteredMetricsGroup;
-import org.apache.flink.runtime.blob.VoidBlobWriter;
-import org.apache.flink.runtime.checkpoint.StandaloneCheckpointRecoveryFactory;
-import org.apache.flink.runtime.clusterframework.types.AllocationID;
-import org.apache.flink.runtime.clusterframework.types.ResourceID;
-import org.apache.flink.runtime.concurrent.ComponentMainThreadExecutorServiceAdapter;
-import org.apache.flink.runtime.execution.ExecutionState;
-import org.apache.flink.runtime.executiongraph.restart.NoRestartStrategy;
-import org.apache.flink.runtime.executiongraph.utils.SimpleAckingTaskManagerGateway;
-import org.apache.flink.runtime.instance.SimpleSlotContext;
-import org.apache.flink.runtime.io.network.partition.NoOpJobMasterPartitionTracker;
-import org.apache.flink.runtime.io.network.partition.ResultPartitionType;
-import org.apache.flink.runtime.jobgraph.DistributionPattern;
-import org.apache.flink.runtime.jobgraph.JobGraph;
-import org.apache.flink.runtime.jobgraph.JobVertex;
-import org.apache.flink.runtime.jobgraph.ScheduleMode;
-import org.apache.flink.runtime.jobmanager.scheduler.Locality;
-import org.apache.flink.runtime.jobmanager.slots.DummySlotOwner;
-import org.apache.flink.runtime.jobmanager.slots.TaskManagerGateway;
-import org.apache.flink.runtime.jobmanager.slots.TestingSlotOwner;
-import org.apache.flink.runtime.jobmaster.LogicalSlot;
-import org.apache.flink.runtime.jobmaster.SlotOwner;
-import org.apache.flink.runtime.jobmaster.SlotRequestId;
-import org.apache.flink.runtime.jobmaster.TestingLogicalSlot;
-import org.apache.flink.runtime.jobmaster.TestingLogicalSlotBuilder;
-import org.apache.flink.runtime.jobmaster.slotpool.SingleLogicalSlot;
-import org.apache.flink.runtime.jobmaster.slotpool.SlotProvider;
-import org.apache.flink.runtime.shuffle.NettyShuffleMaster;
-import org.apache.flink.runtime.taskmanager.TaskManagerLocation;
-import org.apache.flink.runtime.testtasks.NoOpInvokable;
-import org.apache.flink.runtime.testutils.DirectScheduledExecutorService;
-import org.apache.flink.util.FlinkException;
-import org.apache.flink.util.TestLogger;
-
-import org.junit.After;
-import org.junit.Test;
-
-import javax.annotation.Nonnull;
-
-import java.net.InetAddress;
-import java.util.Set;
-import java.util.concurrent.ArrayBlockingQueue;
-import java.util.concurrent.BlockingQueue;
-import java.util.concurrent.CompletableFuture;
-import java.util.concurrent.ConcurrentHashMap;
-import java.util.concurrent.ConcurrentMap;
-import java.util.concurrent.ScheduledExecutorService;
-import java.util.concurrent.TimeUnit;
-import java.util.concurrent.TimeoutException;
-
-import static org.hamcrest.Matchers.empty;
-import static org.hamcrest.Matchers.is;
-import static org.junit.Assert.assertEquals;
-import static org.junit.Assert.assertFalse;
-import static org.junit.Assert.assertThat;
-
-/**
- * Tests for the scheduling of the execution graph. This tests that
- * for example the order of deployments is correct and that bulk slot allocation
- * works properly.
- */
-public class ExecutionGraphSchedulingTest extends TestLogger {
-
-	private final ScheduledExecutorService executor = new DirectScheduledExecutorService();
-
-	@After
-	public void shutdown() {
-		executor.shutdownNow();
-	}
-
-	// ------------------------------------------------------------------------
-	//  Tests
-	// ------------------------------------------------------------------------
-
-	/**
-	 * Tests that with scheduling futures and pipelined deployment, the target vertex will
-	 * not deploy its task before the source vertex does.
-	 */
-	@Test
-	public void testScheduleSourceBeforeTarget() throws Exception {
-
-		//                                            [pipelined]
-		//  we construct a simple graph    (source) ----------------> (target)
-
-		final int parallelism = 1;
-
-		final JobVertex sourceVertex = new JobVertex("source");
-		sourceVertex.setParallelism(parallelism);
-		sourceVertex.setInvokableClass(NoOpInvokable.class);
-
-		final JobVertex targetVertex = new JobVertex("target");
-		targetVertex.setParallelism(parallelism);
-		targetVertex.setInvokableClass(NoOpInvokable.class);
-
-		targetVertex.connectNewDataSetAsInput(sourceVertex, DistributionPattern.ALL_TO_ALL, ResultPartitionType.PIPELINED);
-
-		final JobID jobId = new JobID();
-		final JobGraph jobGraph = new JobGraph(jobId, "test", sourceVertex, targetVertex);
-		jobGraph.setScheduleMode(ScheduleMode.EAGER);
-
-		final CompletableFuture<LogicalSlot> sourceFuture = new CompletableFuture<>();
-		final CompletableFuture<LogicalSlot> targetFuture = new CompletableFuture<>();
-
-		ProgrammedSlotProvider slotProvider = new ProgrammedSlotProvider(parallelism);
-		slotProvider.addSlot(sourceVertex.getID(), 0, sourceFuture);
-		slotProvider.addSlot(targetVertex.getID(), 0, targetFuture);
-
-		final ExecutionGraph eg = createExecutionGraph(jobGraph, slotProvider);
-		eg.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-
-		//  set up two TaskManager gateways and slots
-
-		final InteractionsCountingTaskManagerGateway gatewaySource = createTaskManager();
-		final InteractionsCountingTaskManagerGateway gatewayTarget = createTaskManager();
-
-		final LogicalSlot sourceSlot = createTestingLogicalSlot(gatewaySource);
-		final LogicalSlot targetSlot = createTestingLogicalSlot(gatewayTarget);
-
-		eg.scheduleForExecution();
-
-		// job should be running
-		assertEquals(JobStatus.RUNNING, eg.getState());
-
-		// we fulfill the target slot before the source slot
-		// that should not cause a deployment or deployment related failure
-		targetFuture.complete(targetSlot);
-
-		assertThat(gatewayTarget.getSubmitTaskCount(), is(0));
-		assertEquals(JobStatus.RUNNING, eg.getState());
-
-		// now supply the source slot
-		sourceFuture.complete(sourceSlot);
-
-		// by now, all deployments should have happened
-		assertThat(gatewaySource.getSubmitTaskCount(), is(1));
-		assertThat(gatewayTarget.getSubmitTaskCount(), is(1));
-
-		assertEquals(JobStatus.RUNNING, eg.getState());
-	}
-
-	private TestingLogicalSlot createTestingLogicalSlot(InteractionsCountingTaskManagerGateway gatewaySource) {
-		return new TestingLogicalSlotBuilder()
-			.setTaskManagerGateway(gatewaySource)
-			.createTestingLogicalSlot();
-	}
-
-	/**
-	 * This test verifies that before deploying a pipelined connected component, the
-	 * full set of slots is available, and that not some tasks are deployed, and later the
-	 * system realizes that not enough resources are available.
-	 */
-	@Test
-	public void testDeployPipelinedConnectedComponentsTogether() throws Exception {
-
-		//                                            [pipelined]
-		//  we construct a simple graph    (source) ----------------> (target)
-
-		final int parallelism = 8;
-
-		final JobVertex sourceVertex = new JobVertex("source");
-		sourceVertex.setParallelism(parallelism);
-		sourceVertex.setInvokableClass(NoOpInvokable.class);
-
-		final JobVertex targetVertex = new JobVertex("target");
-		targetVertex.setParallelism(parallelism);
-		targetVertex.setInvokableClass(NoOpInvokable.class);
-
-		targetVertex.connectNewDataSetAsInput(sourceVertex, DistributionPattern.ALL_TO_ALL, ResultPartitionType.PIPELINED);
-
-		final JobID jobId = new JobID();
-		final JobGraph jobGraph = new JobGraph(jobId, "test", sourceVertex, targetVertex);
-		jobGraph.setScheduleMode(ScheduleMode.EAGER);
-
-		@SuppressWarnings({"unchecked", "rawtypes"})
-		final CompletableFuture<LogicalSlot>[] sourceFutures = new CompletableFuture[parallelism];
-		@SuppressWarnings({"unchecked", "rawtypes"})
-		final CompletableFuture<LogicalSlot>[] targetFutures = new CompletableFuture[parallelism];
-
-		//
-		//  Create the slots, futures, and the slot provider
-
-		final InteractionsCountingTaskManagerGateway[] sourceTaskManagers = new InteractionsCountingTaskManagerGateway[parallelism];
-		final InteractionsCountingTaskManagerGateway[] targetTaskManagers = new InteractionsCountingTaskManagerGateway[parallelism];
-
-		final LogicalSlot[] sourceSlots = new LogicalSlot[parallelism];
-		final LogicalSlot[] targetSlots = new LogicalSlot[parallelism];
-
-		for (int i = 0; i < parallelism; i++) {
-			sourceTaskManagers[i] = createTaskManager();
-			targetTaskManagers[i] = createTaskManager();
-
-			sourceSlots[i] = createTestingLogicalSlot(sourceTaskManagers[i]);
-			targetSlots[i] = createTestingLogicalSlot(targetTaskManagers[i]);
-
-			sourceFutures[i] = new CompletableFuture<>();
-			targetFutures[i] = new CompletableFuture<>();
-		}
-
-		ProgrammedSlotProvider slotProvider = new ProgrammedSlotProvider(parallelism);
-		slotProvider.addSlots(sourceVertex.getID(), sourceFutures);
-		slotProvider.addSlots(targetVertex.getID(), targetFutures);
-
-		final ExecutionGraph eg = createExecutionGraph(jobGraph, slotProvider);
-
-		//
-		//  we complete some of the futures
-
-		for (int i = 0; i < parallelism; i += 2) {
-			sourceFutures[i].complete(sourceSlots[i]);
-		}
-
-		//
-		//  kick off the scheduling
-		eg.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-		eg.scheduleForExecution();
-
-		verifyNothingDeployed(eg, sourceTaskManagers);
-
-		//  complete the remaining sources
-		for (int i = 1; i < parallelism; i += 2) {
-			sourceFutures[i].complete(sourceSlots[i]);
-		}
-		verifyNothingDeployed(eg, sourceTaskManagers);
-
-		//  complete the targets except for one
-		for (int i = 1; i < parallelism; i++) {
-			targetFutures[i].complete(targetSlots[i]);
-		}
-		verifyNothingDeployed(eg, targetTaskManagers);
-
-		//  complete the last target slot future
-		targetFutures[0].complete(targetSlots[0]);
-
-		//
-		//  verify that all deployments have happened
-
-		for (InteractionsCountingTaskManagerGateway gateway : sourceTaskManagers) {
-			assertThat(gateway.getSubmitTaskCount(), is(1));
-		}
-		for (InteractionsCountingTaskManagerGateway gateway : targetTaskManagers) {
-			assertThat(gateway.getSubmitTaskCount(), is(1));
-		}
-	}
-
-	/**
-	 * This test verifies that if one slot future fails, the deployment will be aborted.
-	 */
-	@Test
-	public void testOneSlotFailureAbortsDeploy() throws Exception {
-
-		//                                            [pipelined]
-		//  we construct a simple graph    (source) ----------------> (target)
-
-		final int parallelism = 6;
-
-		final JobVertex sourceVertex = new JobVertex("source");
-		sourceVertex.setParallelism(parallelism);
-		sourceVertex.setInvokableClass(NoOpInvokable.class);
-
-		final JobVertex targetVertex = new JobVertex("target");
-		targetVertex.setParallelism(parallelism);
-		targetVertex.setInvokableClass(NoOpInvokable.class);
-
-		targetVertex.connectNewDataSetAsInput(sourceVertex, DistributionPattern.POINTWISE, ResultPartitionType.PIPELINED);
-
-		final JobID jobId = new JobID();
-		final JobGraph jobGraph = new JobGraph(jobId, "test", sourceVertex, targetVertex);
-		jobGraph.setScheduleMode(ScheduleMode.EAGER);
-
-		//
-		//  Create the slots, futures, and the slot provider
-
-		final InteractionsCountingTaskManagerGateway taskManager = createTaskManager();
-		final BlockingQueue<AllocationID> returnedSlots = new ArrayBlockingQueue<>(parallelism);
-		final TestingSlotOwner slotOwner = new TestingSlotOwner();
-		slotOwner.setReturnAllocatedSlotConsumer(
-			(LogicalSlot logicalSlot) -> returnedSlots.offer(logicalSlot.getAllocationId()));
-
-		final LogicalSlot[] sourceSlots = new LogicalSlot[parallelism];
-		final LogicalSlot[] targetSlots = new LogicalSlot[parallelism];
-
-		@SuppressWarnings({"unchecked", "rawtypes"})
-		final CompletableFuture<LogicalSlot>[] sourceFutures = new CompletableFuture[parallelism];
-		@SuppressWarnings({"unchecked", "rawtypes"})
-		final CompletableFuture<LogicalSlot>[] targetFutures = new CompletableFuture[parallelism];
-
-		for (int i = 0; i < parallelism; i++) {
-			sourceSlots[i] = createSingleLogicalSlot(slotOwner, taskManager, new SlotRequestId());
-			targetSlots[i] = createSingleLogicalSlot(slotOwner, taskManager, new SlotRequestId());
-
-			sourceFutures[i] = new CompletableFuture<>();
-			targetFutures[i] = new CompletableFuture<>();
-		}
-
-		ProgrammedSlotProvider slotProvider = new ProgrammedSlotProvider(parallelism);
-		slotProvider.addSlots(sourceVertex.getID(), sourceFutures);
-		slotProvider.addSlots(targetVertex.getID(), targetFutures);
-
-		final ExecutionGraph eg = createExecutionGraph(jobGraph, slotProvider);
-		eg.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-
-		//
-		//  we complete some of the futures
-
-		for (int i = 0; i < parallelism; i += 2) {
-			sourceFutures[i].complete(sourceSlots[i]);
-			targetFutures[i].complete(targetSlots[i]);
-		}
-
-		//  kick off the scheduling
-		eg.scheduleForExecution();
-
-		// fail one slot
-		sourceFutures[1].completeExceptionally(new TestRuntimeException());
-
-		// wait until the job failed as a whole
-		eg.getTerminationFuture().get(2000, TimeUnit.MILLISECONDS);
-
-		// wait until all slots are back
-		for (int i = 0; i < parallelism; i++) {
-			returnedSlots.poll(2000L, TimeUnit.MILLISECONDS);
-		}
-
-		// no deployment calls must have happened
-		assertThat(taskManager.getSubmitTaskCount(), is(0));
-
-		// all completed futures must have been returns
-		for (int i = 0; i < parallelism; i += 2) {
-			assertFalse(sourceSlots[i].isAlive());
-			assertFalse(targetSlots[i].isAlive());
-		}
-	}
-
-	/**
-	 * This tests makes sure that with eager scheduling no task is deployed if a single
-	 * slot allocation fails. Moreover we check that allocated slots will be returned.
-	 */
-	@Test
-	public void testEagerSchedulingWithSlotTimeout() throws Exception {
-
-		//  we construct a simple graph:    (task)
-
-		final int parallelism = 3;
-
-		final JobVertex vertex = new JobVertex("task");
-		vertex.setParallelism(parallelism);
-		vertex.setInvokableClass(NoOpInvokable.class);
-
-		final JobID jobId = new JobID();
-		final JobGraph jobGraph = new JobGraph(jobId, "test", vertex);
-		jobGraph.setScheduleMode(ScheduleMode.EAGER);
-
-		final BlockingQueue<AllocationID> returnedSlots = new ArrayBlockingQueue<>(2);
-		final TestingSlotOwner slotOwner = new TestingSlotOwner();
-		slotOwner.setReturnAllocatedSlotConsumer(
-			(LogicalSlot logicalSlot) -> returnedSlots.offer(logicalSlot.getAllocationId()));
-
-		final InteractionsCountingTaskManagerGateway taskManager = createTaskManager();
-		final LogicalSlot[] slots = new LogicalSlot[parallelism];
-		@SuppressWarnings({"unchecked", "rawtypes"})
-		final CompletableFuture<LogicalSlot>[] slotFutures = new CompletableFuture[parallelism];
-
-		for (int i = 0; i < parallelism; i++) {
-			slots[i] = createSingleLogicalSlot(slotOwner, taskManager, new SlotRequestId());
-			slotFutures[i] = new CompletableFuture<>();
-		}
-
-		ProgrammedSlotProvider slotProvider = new ProgrammedSlotProvider(parallelism);
-		slotProvider.addSlots(vertex.getID(), slotFutures);
-
-		final ExecutionGraph eg = createExecutionGraph(jobGraph, slotProvider);
-
-		//  we complete one future
-		slotFutures[1].complete(slots[1]);
-
-		//  kick off the scheduling
-		eg.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-		eg.scheduleForExecution();
-
-		//  we complete another future
-		slotFutures[2].complete(slots[2]);
-
-		// check that the ExecutionGraph is not terminated yet
-		assertThat(eg.getTerminationFuture().isDone(), is(false));
-
-		// time out one of the slot futures
-		slotFutures[0].completeExceptionally(new TimeoutException("Test time out"));
-
-		assertThat(eg.getTerminationFuture().get(), is(JobStatus.FAILED));
-
-		// wait until all slots are back
-		for (int i = 0; i < parallelism - 1; i++) {
-			returnedSlots.poll(2000, TimeUnit.MILLISECONDS);
-		}
-
-		//  verify that no deployments have happened
-		assertThat(taskManager.getSubmitTaskCount(), is(0));
-	}
-
-	/**
-	 * Tests that an ongoing scheduling operation does not fail the {@link ExecutionGraph}
-	 * if it gets concurrently cancelled.
-	 */
-	@Test
-	public void testSchedulingOperationCancellationWhenCancel() throws Exception {
-		final JobVertex jobVertex = new JobVertex("NoOp JobVertex");
-		jobVertex.setInvokableClass(NoOpInvokable.class);
-		jobVertex.setParallelism(2);
-		final JobGraph jobGraph = new JobGraph(jobVertex);
-		jobGraph.setScheduleMode(ScheduleMode.EAGER);
-
-		final CompletableFuture<LogicalSlot> slotFuture1 = new CompletableFuture<>();
-		final CompletableFuture<LogicalSlot> slotFuture2 = new CompletableFuture<>();
-		final ProgrammedSlotProvider slotProvider = new ProgrammedSlotProvider(2);
-		slotProvider.addSlots(jobVertex.getID(), new CompletableFuture[]{slotFuture1, slotFuture2});
-		final ExecutionGraph executionGraph = createExecutionGraph(jobGraph, slotProvider);
-
-		executionGraph.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-		executionGraph.scheduleForExecution();
-
-		final TestingLogicalSlot slot = createTestingSlot();
-		final CompletableFuture<?> releaseFuture = slot.getReleaseFuture();
-		slotFuture1.complete(slot);
-
-		// cancel should change the state of all executions to CANCELLED
-		executionGraph.cancel();
-
-		// complete the now CANCELLED execution --> this should cause a failure
-		slotFuture2.complete(new TestingLogicalSlotBuilder().createTestingLogicalSlot());
-
-		Thread.sleep(1L);
-		// release the first slot to finish the cancellation
-		releaseFuture.complete(null);
-
-		// NOTE: This test will only occasionally fail without the fix since there is
-		// a race between the releaseFuture and the slotFuture2
-		assertThat(executionGraph.getTerminationFuture().get(), is(JobStatus.CANCELED));
-	}
-
-	/**
-	 * Tests that a partially completed eager scheduling operation fails if a
-	 * completed slot is released. See FLINK-9099.
-	 */
-	@Test
-	public void testSlotReleasingFailsSchedulingOperation() throws Exception {
-		final int parallelism = 2;
-
-		final JobVertex jobVertex = new JobVertex("Testing job vertex");
-		jobVertex.setInvokableClass(NoOpInvokable.class);
-		jobVertex.setParallelism(parallelism);
-		final JobGraph jobGraph = new JobGraph(jobVertex);
-		jobGraph.setScheduleMode(ScheduleMode.EAGER);
-
-		final ProgrammedSlotProvider slotProvider = new ProgrammedSlotProvider(parallelism);
-
-		final LogicalSlot slot = createSingleLogicalSlot(new DummySlotOwner(), new SimpleAckingTaskManagerGateway(), new SlotRequestId());
-		slotProvider.addSlot(jobVertex.getID(), 0, CompletableFuture.completedFuture(slot));
-
-		final CompletableFuture<LogicalSlot> slotFuture = new CompletableFuture<>();
-		slotProvider.addSlot(jobVertex.getID(), 1, slotFuture);
-
-		final ExecutionGraph executionGraph = createExecutionGraph(jobGraph, slotProvider);
-
-		executionGraph.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-		executionGraph.scheduleForExecution();
-
-		assertThat(executionGraph.getState(), is(JobStatus.RUNNING));
-
-		final ExecutionJobVertex executionJobVertex = executionGraph.getJobVertex(jobVertex.getID());
-		final ExecutionVertex[] taskVertices = executionJobVertex.getTaskVertices();
-		assertThat(taskVertices[0].getExecutionState(), is(ExecutionState.SCHEDULED));
-		assertThat(taskVertices[1].getExecutionState(), is(ExecutionState.SCHEDULED));
-
-		// fail the single allocated slot --> this should fail the scheduling operation
-		slot.releaseSlot(new FlinkException("Test failure"));
-
-		assertThat(executionGraph.getTerminationFuture().get(), is(JobStatus.FAILED));
-	}
-
-	/**
-	 * Tests that all slots are being returned to the {@link SlotOwner} if the
-	 * {@link ExecutionGraph} is being cancelled. See FLINK-9908
-	 */
-	@Test
-	public void testCancellationOfIncompleteScheduling() throws Exception {

Review comment:
       I think that when `DefaultScheduler.cancel()` is invoked, the job will be fully canceled, and the JobMaster will shut down and `close` the SlotPool. `SlotPoolImpl#close()` will release all slots, both allocated and available ones. So it looks to me like the scheduler does not need to take care of slot releasing in this case. Please correct me if I missed anything.
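
The argument above can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToySlotPool` and all of its methods are hypothetical and are not Flink's `SlotPoolImpl`; the only behavior taken from the comment is that closing the pool releases both allocated and available slots.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical toy pool; it is NOT Flink's SlotPoolImpl.
class ToySlotPool implements AutoCloseable {
    private final Set<String> availableSlots = new LinkedHashSet<>();
    private final Set<String> allocatedSlots = new LinkedHashSet<>();
    private final List<String> releasedSlots = new ArrayList<>();

    // A slot offered to the pool starts out as available.
    void offer(String slotId) {
        availableSlots.add(slotId);
    }

    // Moves a slot from "available" to "allocated"; returns false if it was not available.
    boolean allocate(String slotId) {
        if (availableSlots.remove(slotId)) {
            allocatedSlots.add(slotId);
            return true;
        }
        return false;
    }

    // Closing the pool releases every slot it still holds, allocated or not,
    // so a caller that cancels and then closes does not release slots itself.
    @Override
    public void close() {
        releasedSlots.addAll(allocatedSlots);
        releasedSlots.addAll(availableSlots);
        allocatedSlots.clear();
        availableSlots.clear();
    }

    List<String> releasedSlots() {
        return releasedSlots;
    }

    int heldSlots() {
        return availableSlots.size() + allocatedSlots.size();
    }
}
```

In this model, a test that cancels the job and then closes the pool would find every slot in `releasedSlots()` without the scheduler having released anything itself.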







[GitHub] [flink] zhuzhurk commented on a change in pull request #13641: [FLINK-17760][tests] Rework tests to not rely on legacy scheduling codes in ExecutionGraph components

Posted by GitBox <gi...@apache.org>.
zhuzhurk commented on a change in pull request #13641:
URL: https://github.com/apache/flink/pull/13641#discussion_r532555177



##########
File path: flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/ExecutionGraphVariousFailuesTest.java
##########
@@ -20,86 +20,31 @@
 
 import org.apache.flink.api.common.JobStatus;
 import org.apache.flink.runtime.concurrent.ComponentMainThreadExecutorServiceAdapter;
-import org.apache.flink.runtime.execution.SuppressRestartsException;
-import org.apache.flink.runtime.executiongraph.restart.InfiniteDelayRestartStrategy;
 import org.apache.flink.runtime.io.network.partition.ResultPartitionID;
 import org.apache.flink.runtime.jobgraph.IntermediateResultPartitionID;
+import org.apache.flink.runtime.jobgraph.JobGraph;
+import org.apache.flink.runtime.scheduler.SchedulerBase;
+import org.apache.flink.runtime.scheduler.SchedulerTestingUtils;
 import org.apache.flink.util.TestLogger;
 
 import org.junit.Test;
 
 import static org.junit.Assert.assertEquals;
 import static org.junit.Assert.fail;
 
-
 public class ExecutionGraphVariousFailuesTest extends TestLogger {
 
-	/**
-	 * Test that failing in state restarting will retrigger the restarting logic. This means that
-	 * it only goes into the state FAILED after the restart strategy says the job is no longer
-	 * restartable.
-	 */
-	@Test
-	public void testFailureWhileRestarting() throws Exception {
-		final ExecutionGraph eg = ExecutionGraphTestUtils.createSimpleTestGraph(new InfiniteDelayRestartStrategy(2));
-		eg.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-		eg.scheduleForExecution();
-
-		assertEquals(JobStatus.RUNNING, eg.getState());
-		ExecutionGraphTestUtils.switchAllVerticesToRunning(eg);
-
-		eg.failGlobal(new Exception("Test 1"));
-		assertEquals(JobStatus.FAILING, eg.getState());
-		ExecutionGraphTestUtils.completeCancellingForAllVertices(eg);
-
-		// we should restart since we have two restart attempts left
-		assertEquals(JobStatus.RESTARTING, eg.getState());
-
-		eg.failGlobal(new Exception("Test 2"));
-
-		// we should restart since we have one restart attempts left
-		assertEquals(JobStatus.RESTARTING, eg.getState());
-
-		eg.failGlobal(new Exception("Test 3"));
-
-		// after depleting all our restart attempts we should go into Failed
-		assertEquals(JobStatus.FAILED, eg.getState());
-	}
-
-	/**
-	 * Tests that a {@link SuppressRestartsException} in state RESTARTING stops the restarting
-	 * immediately and sets the execution graph's state to FAILED.
-	 */
-	@Test
-	public void testSuppressRestartFailureWhileRestarting() throws Exception {
-		final ExecutionGraph eg = ExecutionGraphTestUtils.createSimpleTestGraph(new InfiniteDelayRestartStrategy(10));
-		eg.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-		eg.scheduleForExecution();
-
-		assertEquals(JobStatus.RUNNING, eg.getState());
-		ExecutionGraphTestUtils.switchAllVerticesToRunning(eg);
-
-		eg.failGlobal(new Exception("test"));
-		assertEquals(JobStatus.FAILING, eg.getState());
-
-		ExecutionGraphTestUtils.completeCancellingForAllVertices(eg);
-		assertEquals(JobStatus.RESTARTING, eg.getState());
-
-		// suppress a possible restart
-		eg.failGlobal(new SuppressRestartsException(new Exception("Test")));
-
-		assertEquals(JobStatus.FAILED, eg.getState());
-	}
-
 	/**
 	 * Tests that a failing scheduleOrUpdateConsumers call with a non-existing execution attempt
 	 * id, will not fail the execution graph.
 	 */
 	@Test
 	public void testFailingScheduleOrUpdateConsumers() throws Exception {
-		final ExecutionGraph eg = ExecutionGraphTestUtils.createSimpleTestGraph(new InfiniteDelayRestartStrategy(10));
-		eg.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-		eg.scheduleForExecution();
+		final SchedulerBase scheduler = SchedulerTestingUtils.newSchedulerBuilder(new JobGraph()).build();
+		scheduler.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
+		scheduler.startScheduling();
+
+		final ExecutionGraph eg = scheduler.getExecutionGraph();

Review comment:
       @tillrohrmann Sure, we can do that. Maybe `@TestLegacyScheduling`?
   I will go through the modified tests and try to identify them.
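
For illustration only, a marker annotation of this kind might look like the sketch below. The thread does not show how `@TestLegacyScheduling` would be defined or which tests it would mark, so the retention, the targets, `SampleTests`, and `LegacyTestFinder` are all assumptions made for this example.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

// Hypothetical marker annotation; name taken from the comment, definition assumed.
@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.METHOD, ElementType.TYPE})
@interface TestLegacyScheduling {}

// Hypothetical test class used only to demonstrate the lookup below.
class SampleTests {
    @TestLegacyScheduling
    public void testFailureWhileRestarting() {}

    public void testFailingScheduleOrUpdateConsumers() {}
}

// Hypothetical helper that lists the annotated methods of a class via reflection.
class LegacyTestFinder {
    static List<String> findLegacyTests(Class<?> testClass) {
        List<String> names = new ArrayList<>();
        for (Method method : testClass.getDeclaredMethods()) {
            if (method.isAnnotationPresent(TestLegacyScheduling.class)) {
                names.add(method.getName());
            }
        }
        return names;
    }
}
```

Runtime retention is chosen here so that the marked tests can be found by reflection; a source-only marker would also work if the annotation is meant purely as documentation.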







[GitHub] [flink] zhuzhurk commented on a change in pull request #13641: [FLINK-17760][tests] Rework tests to not rely on legacy scheduling codes in ExecutionGraph components

Posted by GitBox <gi...@apache.org>.
zhuzhurk commented on a change in pull request #13641:
URL: https://github.com/apache/flink/pull/13641#discussion_r542362945



##########
File path: flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/ExecutionVertexInputConstraintTest.java
##########
@@ -1,276 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- *     http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-
-package org.apache.flink.runtime.executiongraph;
-
-import org.apache.flink.api.common.InputDependencyConstraint;
-import org.apache.flink.api.common.JobStatus;
-import org.apache.flink.runtime.concurrent.ComponentMainThreadExecutor;
-import org.apache.flink.runtime.concurrent.ComponentMainThreadExecutorServiceAdapter;
-import org.apache.flink.runtime.execution.ExecutionState;
-import org.apache.flink.runtime.executiongraph.utils.SimpleSlotProvider;
-import org.apache.flink.runtime.io.network.partition.ResultPartitionID;
-import org.apache.flink.runtime.io.network.partition.ResultPartitionType;
-import org.apache.flink.runtime.jobgraph.DistributionPattern;
-import org.apache.flink.runtime.jobgraph.JobGraph;
-import org.apache.flink.runtime.jobgraph.JobVertex;
-import org.apache.flink.runtime.jobgraph.JobVertexID;
-import org.apache.flink.runtime.jobgraph.tasks.AbstractInvokable;
-import org.apache.flink.runtime.jobmaster.slotpool.SlotProvider;
-import org.apache.flink.runtime.taskmanager.TaskExecutionState;
-import org.apache.flink.util.TestLogger;
-
-import org.junit.Test;
-
-import java.time.Duration;
-import java.util.Arrays;
-import java.util.List;
-
-import static org.apache.flink.runtime.executiongraph.ExecutionGraphTestUtils.isInExecutionState;
-import static org.apache.flink.runtime.executiongraph.ExecutionGraphTestUtils.waitForAllExecutionsPredicate;
-import static org.apache.flink.runtime.executiongraph.ExecutionGraphTestUtils.waitUntilExecutionVertexState;
-import static org.apache.flink.runtime.executiongraph.ExecutionGraphTestUtils.waitUntilJobStatus;
-import static org.hamcrest.Matchers.lessThan;
-import static org.junit.Assert.assertFalse;
-import static org.junit.Assert.assertThat;
-import static org.junit.Assert.assertTrue;
-
-/**
- * Tests for the inputs constraint for {@link ExecutionVertex}.
- */
-public class ExecutionVertexInputConstraintTest extends TestLogger {

Review comment:
       The perf test was for legacy scheduling; for SchedulerNG, the equivalent is `DefaultSchedulerTest#testInputConstraintALLPerf`.







[GitHub] [flink] zhuzhurk commented on a change in pull request #13641: [FLINK-17760][tests] Rework tests to not rely on legacy scheduling codes in ExecutionGraph components

Posted by GitBox <gi...@apache.org>.
zhuzhurk commented on a change in pull request #13641:
URL: https://github.com/apache/flink/pull/13641#discussion_r532450806



##########
File path: flink-runtime/src/main/java/org/apache/flink/runtime/taskmanager/TaskManagerLocation.java
##########
@@ -256,7 +256,7 @@ public boolean equals(Object obj) {
 		if (obj == this) {
 			return true;
 		}
-		else if (obj != null && obj.getClass() == TaskManagerLocation.class) {
+		else if (obj != null && obj.getClass() == getClass()) {

Review comment:
       This is needed because otherwise this `equals` will not work for `LocalTaskManagerLocation`, which extends `TaskManagerLocation`.
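
The difference between the two checks can be reproduced in isolation. The sketch below uses hypothetical stand-ins (`Location` for `TaskManagerLocation`, `LocalLocation` for `LocalTaskManagerLocation`); the fields and constructor are assumptions, and only the two class comparisons mirror the diff.

```java
import java.util.Objects;

// Hypothetical stand-in for TaskManagerLocation.
class Location {
    private final String host;
    private final int port;

    Location(String host, int port) {
        this.host = host;
        this.port = port;
    }

    // Old check: only objects whose runtime class is exactly Location can be equal.
    boolean equalsWithClassLiteral(Object obj) {
        if (obj == this) {
            return true;
        }
        if (obj != null && obj.getClass() == Location.class) {
            Location that = (Location) obj;
            return host.equals(that.host) && port == that.port;
        }
        return false;
    }

    // New check: two objects of the same runtime class can be equal, subclasses included.
    @Override
    public boolean equals(Object obj) {
        if (obj == this) {
            return true;
        }
        if (obj != null && obj.getClass() == getClass()) {
            Location that = (Location) obj;
            return host.equals(that.host) && port == that.port;
        }
        return false;
    }

    @Override
    public int hashCode() {
        return Objects.hash(host, port);
    }
}

// Hypothetical stand-in for LocalTaskManagerLocation.
class LocalLocation extends Location {
    LocalLocation() {
        super("localhost", 42);
    }
}
```

With the class-literal check, two distinct `LocalLocation` instances never compare equal, because their runtime class is not `Location`. With `getClass()`, they compare equal to each other, while a `LocalLocation` still does not equal a plain `Location` holding the same values.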







[GitHub] [flink] zhuzhurk commented on a change in pull request #13641: [FLINK-17760][tests] Rework tests to not rely on legacy scheduling codes in ExecutionGraph components

Posted by GitBox <gi...@apache.org>.
zhuzhurk commented on a change in pull request #13641:
URL: https://github.com/apache/flink/pull/13641#discussion_r543306037



##########
File path: flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/ExecutionTest.java
##########
@@ -82,202 +82,12 @@
 	private final TestingComponentMainThreadExecutor testMainThreadUtil =
 		EXECUTOR_RESOURCE.getComponentMainThreadTestExecutor();
 
-	/**
-	 * Tests that slots are released if we cannot assign the allocated resource to the
-	 * Execution.
-	 */
-	@Test
-	public void testSlotReleaseOnFailedResourceAssignment() throws Exception {
-		final JobVertex jobVertex = createNoOpJobVertex();
-		final JobVertexID jobVertexId = jobVertex.getID();
-
-		final CompletableFuture<LogicalSlot> slotFuture = new CompletableFuture<>();
-		final ProgrammedSlotProvider slotProvider = new ProgrammedSlotProvider(1);
-		slotProvider.addSlot(jobVertexId, 0, slotFuture);
-
-		ExecutionGraph executionGraph = ExecutionGraphTestUtils.createSimpleTestGraph(
-			slotProvider,
-			new NoRestartStrategy(),
-			jobVertex);
-
-		executionGraph.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-
-		ExecutionJobVertex executionJobVertex = executionGraph.getJobVertex(jobVertexId);
-
-		final Execution execution = executionJobVertex.getTaskVertices()[0].getCurrentExecutionAttempt();
-
-		final SingleSlotTestingSlotOwner slotOwner = new SingleSlotTestingSlotOwner();
-
-		final LogicalSlot slot = createTestingLogicalSlot(slotOwner);
-
-		final LogicalSlot otherSlot = new TestingLogicalSlotBuilder().createTestingLogicalSlot();
-
-		CompletableFuture<Execution> allocationFuture = execution.allocateResourcesForExecution(
-			executionGraph.getSlotProviderStrategy(),
-			LocationPreferenceConstraint.ALL,
-			Collections.emptySet());
-
-		assertFalse(allocationFuture.isDone());
-
-		assertEquals(ExecutionState.SCHEDULED, execution.getState());
-
-		// assign a different resource to the execution
-		assertTrue(execution.tryAssignResource(otherSlot));
-
-		// completing now the future should cause the slot to be released
-		slotFuture.complete(slot);
-
-		assertEquals(slot, slotOwner.getReturnedSlotFuture().get());
-	}
-
 	private TestingLogicalSlot createTestingLogicalSlot(SlotOwner slotOwner) {
 		return new TestingLogicalSlotBuilder()
 			.setSlotOwner(slotOwner)
 			.createTestingLogicalSlot();
 	}
 
-	/**
-	 * Tests that the slot is released in case of a execution cancellation when having
-	 * a slot assigned and being in state SCHEDULED.
-	 */
-	@Test
-	public void testSlotReleaseOnExecutionCancellationInScheduled() throws Exception {

Review comment:
       I think this test is not needed because:
   - the slot releasing check is covered by `ExecutionTest#testCanceledExecutionReturnsSlot` (renamed from `testEagerSchedulingFailureReturnsSlot`).
   - the state transition check is covered by `ExecutionVertexCancelTest#testCancelFromScheduled`.







[GitHub] [flink] flinkbot edited a comment on pull request #13641: [FLINK-17760][tests] Rework tests to not rely on legacy scheduling codes in ExecutionGraph components

Posted by GitBox <gi...@apache.org>.
flinkbot edited a comment on pull request #13641:
URL: https://github.com/apache/flink/pull/13641#issuecomment-708569491


   ## CI report:
   
   * 09d8deb89416f53dfe8b5c16fb9d723cbd98612c UNKNOWN
   * fe1562c5cda8ecb15f6af1afdf7b6217e6c20c42 UNKNOWN
   * 20de9eefa4d4eb7c8b740cd88fc69f1e764bfa7a Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10331) 
   
   <details>
   <summary>Bot commands</summary>
     The @flinkbot bot supports the following commands:
   
    - `@flinkbot run travis` re-run the last Travis build
    - `@flinkbot run azure` re-run the last Azure build
   </details>





[GitHub] [flink] zhuzhurk commented on a change in pull request #13641: [WIP][FLINK-17760][tests] Rework tests to not rely on legacy scheduling logics in ExecutionGraph anymore

Posted by GitBox <gi...@apache.org>.
zhuzhurk commented on a change in pull request #13641:
URL: https://github.com/apache/flink/pull/13641#discussion_r514115275



##########
File path: flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/ExecutionTest.java
##########
@@ -1,570 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- *     http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-
-package org.apache.flink.runtime.executiongraph;
-
-import org.apache.flink.runtime.checkpoint.JobManagerTaskRestore;
-import org.apache.flink.runtime.checkpoint.TaskStateSnapshot;
-import org.apache.flink.runtime.concurrent.ComponentMainThreadExecutorServiceAdapter;
-import org.apache.flink.runtime.execution.ExecutionState;
-import org.apache.flink.runtime.executiongraph.restart.NoRestartStrategy;
-import org.apache.flink.runtime.executiongraph.utils.SimpleAckingTaskManagerGateway;
-import org.apache.flink.runtime.jobgraph.JobGraph;
-import org.apache.flink.runtime.jobgraph.JobVertex;
-import org.apache.flink.runtime.jobgraph.JobVertexID;
-import org.apache.flink.runtime.jobgraph.tasks.AbstractInvokable;
-import org.apache.flink.runtime.jobmanager.scheduler.LocationPreferenceConstraint;
-import org.apache.flink.runtime.jobmaster.LogicalSlot;
-import org.apache.flink.runtime.jobmaster.SlotOwner;
-import org.apache.flink.runtime.jobmaster.SlotRequestId;
-import org.apache.flink.runtime.jobmaster.TestingLogicalSlot;
-import org.apache.flink.runtime.jobmaster.TestingLogicalSlotBuilder;
-import org.apache.flink.runtime.jobmaster.slotpool.SingleLogicalSlot;
-import org.apache.flink.runtime.jobmaster.slotpool.SlotProvider;
-import org.apache.flink.runtime.shuffle.PartitionDescriptor;
-import org.apache.flink.runtime.shuffle.ProducerDescriptor;
-import org.apache.flink.runtime.shuffle.ShuffleDescriptor;
-import org.apache.flink.runtime.shuffle.ShuffleMaster;
-import org.apache.flink.runtime.taskmanager.LocalTaskManagerLocation;
-import org.apache.flink.runtime.testtasks.NoOpInvokable;
-import org.apache.flink.util.FlinkException;
-import org.apache.flink.util.TestLogger;
-
-import org.junit.ClassRule;
-import org.junit.Test;
-
-import javax.annotation.Nonnull;
-
-import java.util.Collection;
-import java.util.Collections;
-import java.util.Set;
-import java.util.concurrent.CompletableFuture;
-import java.util.concurrent.CountDownLatch;
-import java.util.concurrent.ExecutionException;
-
-import static org.apache.flink.runtime.io.network.partition.ResultPartitionType.PIPELINED;
-import static org.apache.flink.runtime.jobgraph.DistributionPattern.POINTWISE;
-import static org.hamcrest.Matchers.equalTo;
-import static org.hamcrest.Matchers.hasSize;
-import static org.hamcrest.Matchers.is;
-import static org.hamcrest.Matchers.notNullValue;
-import static org.hamcrest.Matchers.nullValue;
-import static org.hamcrest.Matchers.sameInstance;
-import static org.junit.Assert.assertEquals;
-import static org.junit.Assert.assertFalse;
-import static org.junit.Assert.assertThat;
-import static org.junit.Assert.assertTrue;
-
-/**
- * Tests for the {@link Execution}.
- */
-public class ExecutionTest extends TestLogger {

Review comment:
       Removed cases that test the legacy scheduling code paths, e.g. `allocateResourcesForExecution()`.
   `ExecutionSlotAllocator`s now take over slot allocation, and there are tests for them.







[GitHub] [flink] flinkbot edited a comment on pull request #13641: [FLINK-17760][tests] Rework tests to not rely on legacy scheduling codes in ExecutionGraph components

Posted by GitBox <gi...@apache.org>.
flinkbot edited a comment on pull request #13641:
URL: https://github.com/apache/flink/pull/13641#issuecomment-708569491


   ## CI report:
   
   * 09d8deb89416f53dfe8b5c16fb9d723cbd98612c UNKNOWN
   * fe1562c5cda8ecb15f6af1afdf7b6217e6c20c42 UNKNOWN
   * e7941f905ec697ad09a3f1010f90a2a69a512ce0 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10374) 
   * 52eaaffa8fb4cad5cca9a26234b92a33c37c0dfa UNKNOWN
   





[GitHub] [flink] flinkbot edited a comment on pull request #13641: [FLINK-17760][tests] Rework tests to not rely on legacy scheduling codes in ExecutionGraph components

Posted by GitBox <gi...@apache.org>.
flinkbot edited a comment on pull request #13641:
URL: https://github.com/apache/flink/pull/13641#issuecomment-708569491


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "09d8deb89416f53dfe8b5c16fb9d723cbd98612c",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "09d8deb89416f53dfe8b5c16fb9d723cbd98612c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "fef9bff28a988ffab789fc6bb0cbde754273a2e0",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=7630",
       "triggerID" : "fef9bff28a988ffab789fc6bb0cbde754273a2e0",
       "triggerType" : "PUSH"
     }, {
       "hash" : "1e959ffb3e7837247842ae4ade724a999ad7ca3b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=7734",
       "triggerID" : "1e959ffb3e7837247842ae4ade724a999ad7ca3b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a253ac0cf1f9a7d382cd0e8c59a8c5ba6cb0a636",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8284",
       "triggerID" : "a253ac0cf1f9a7d382cd0e8c59a8c5ba6cb0a636",
       "triggerType" : "PUSH"
     }, {
       "hash" : "9ea4cf7454f9b8e0c237ba8f540b224abdf9f7b0",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8279",
       "triggerID" : "9ea4cf7454f9b8e0c237ba8f540b224abdf9f7b0",
       "triggerType" : "PUSH"
     }, {
       "hash" : "659fb7eddb0acfa0ef49f76c5fafca21c389f3c0",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8562",
       "triggerID" : "659fb7eddb0acfa0ef49f76c5fafca21c389f3c0",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f7c0de826fcaa9b28eb8054902d3a01fb93dfcc3",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8620",
       "triggerID" : "f7c0de826fcaa9b28eb8054902d3a01fb93dfcc3",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f7c0de826fcaa9b28eb8054902d3a01fb93dfcc3",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8653",
       "triggerID" : "719297248",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "a253ac0cf1f9a7d382cd0e8c59a8c5ba6cb0a636",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8325",
       "triggerID" : "719297248",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "f7c0de826fcaa9b28eb8054902d3a01fb93dfcc3",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8620",
       "triggerID" : "719297248",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "150fae9485d03151f4622c8f026c760da4eea3e2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8784",
       "triggerID" : "150fae9485d03151f4622c8f026c760da4eea3e2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "517b53a68f3e9c8c0897cd7afba90b8a9befaa4f",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8807",
       "triggerID" : "517b53a68f3e9c8c0897cd7afba90b8a9befaa4f",
       "triggerType" : "PUSH"
     }, {
       "hash" : "021cac170ea26cddfd8af0a2bec5fea4e6a76b69",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=9756",
       "triggerID" : "021cac170ea26cddfd8af0a2bec5fea4e6a76b69",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 09d8deb89416f53dfe8b5c16fb9d723cbd98612c UNKNOWN
   * 517b53a68f3e9c8c0897cd7afba90b8a9befaa4f Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8807) 
   * 021cac170ea26cddfd8af0a2bec5fea4e6a76b69 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=9756) 
   
   <details>
   <summary>Bot commands</summary>
     The @flinkbot bot supports the following commands:
   
    - `@flinkbot run travis` re-run the last Travis build
    - `@flinkbot run azure` re-run the last Azure build
   </details>





[GitHub] [flink] zhuzhurk commented on a change in pull request #13641: [FLINK-17760][tests] Rework tests to not rely on legacy scheduling codes in ExecutionGraph components

Posted by GitBox <gi...@apache.org>.
zhuzhurk commented on a change in pull request #13641:
URL: https://github.com/apache/flink/pull/13641#discussion_r543395440



##########
File path: flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/ExecutionTest.java
##########
@@ -82,202 +82,12 @@
 	private final TestingComponentMainThreadExecutor testMainThreadUtil =
 		EXECUTOR_RESOURCE.getComponentMainThreadTestExecutor();
 
-	/**
-	 * Tests that slots are released if we cannot assign the allocated resource to the
-	 * Execution.
-	 */
-	@Test
-	public void testSlotReleaseOnFailedResourceAssignment() throws Exception {
-		final JobVertex jobVertex = createNoOpJobVertex();
-		final JobVertexID jobVertexId = jobVertex.getID();
-
-		final CompletableFuture<LogicalSlot> slotFuture = new CompletableFuture<>();
-		final ProgrammedSlotProvider slotProvider = new ProgrammedSlotProvider(1);
-		slotProvider.addSlot(jobVertexId, 0, slotFuture);
-
-		ExecutionGraph executionGraph = ExecutionGraphTestUtils.createSimpleTestGraph(
-			slotProvider,
-			new NoRestartStrategy(),
-			jobVertex);
-
-		executionGraph.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-
-		ExecutionJobVertex executionJobVertex = executionGraph.getJobVertex(jobVertexId);
-
-		final Execution execution = executionJobVertex.getTaskVertices()[0].getCurrentExecutionAttempt();
-
-		final SingleSlotTestingSlotOwner slotOwner = new SingleSlotTestingSlotOwner();
-
-		final LogicalSlot slot = createTestingLogicalSlot(slotOwner);
-
-		final LogicalSlot otherSlot = new TestingLogicalSlotBuilder().createTestingLogicalSlot();
-
-		CompletableFuture<Execution> allocationFuture = execution.allocateResourcesForExecution(
-			executionGraph.getSlotProviderStrategy(),
-			LocationPreferenceConstraint.ALL,
-			Collections.emptySet());
-
-		assertFalse(allocationFuture.isDone());
-
-		assertEquals(ExecutionState.SCHEDULED, execution.getState());
-
-		// assign a different resource to the execution
-		assertTrue(execution.tryAssignResource(otherSlot));
-
-		// completing now the future should cause the slot to be released
-		slotFuture.complete(slot);
-
-		assertEquals(slot, slotOwner.getReturnedSlotFuture().get());
-	}
-
 	private TestingLogicalSlot createTestingLogicalSlot(SlotOwner slotOwner) {
 		return new TestingLogicalSlotBuilder()
 			.setSlotOwner(slotOwner)
 			.createTestingLogicalSlot();
 	}
 
-	/**
-	 * Tests that the slot is released in case of a execution cancellation when having
-	 * a slot assigned and being in state SCHEDULED.
-	 */
-	@Test
-	public void testSlotReleaseOnExecutionCancellationInScheduled() throws Exception {
-		final JobVertex jobVertex = createNoOpJobVertex();
-		final JobVertexID jobVertexId = jobVertex.getID();
-
-		final SingleSlotTestingSlotOwner slotOwner = new SingleSlotTestingSlotOwner();
-
-		final LogicalSlot slot = createTestingLogicalSlot(slotOwner);
-
-		final ProgrammedSlotProvider slotProvider = new ProgrammedSlotProvider(1);
-		slotProvider.addSlot(jobVertexId, 0, CompletableFuture.completedFuture(slot));
-
-		ExecutionGraph executionGraph = ExecutionGraphTestUtils.createSimpleTestGraph(
-			slotProvider,
-			new NoRestartStrategy(),
-			jobVertex);
-
-		executionGraph.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-
-		ExecutionJobVertex executionJobVertex = executionGraph.getJobVertex(jobVertexId);
-
-		final Execution execution = executionJobVertex.getTaskVertices()[0].getCurrentExecutionAttempt();
-
-		CompletableFuture<Execution> allocationFuture = execution.allocateResourcesForExecution(
-			executionGraph.getSlotProviderStrategy(),
-			LocationPreferenceConstraint.ALL,
-			Collections.emptySet());
-
-		assertTrue(allocationFuture.isDone());
-
-		assertEquals(ExecutionState.SCHEDULED, execution.getState());
-
-		assertEquals(slot, execution.getAssignedResource());
-
-		// cancelling the execution should move it into state CANCELED
-		execution.cancel();
-		assertEquals(ExecutionState.CANCELED, execution.getState());
-
-		assertEquals(slot, slotOwner.getReturnedSlotFuture().get());
-	}
-
-	/**
-	 * Tests that the slot is released in case of a execution cancellation when being in state
-	 * RUNNING.
-	 */
-	@Test
-	public void testSlotReleaseOnExecutionCancellationInRunning() throws Exception {
-		final JobVertex jobVertex = createNoOpJobVertex();
-		final JobVertexID jobVertexId = jobVertex.getID();
-
-		final SingleSlotTestingSlotOwner slotOwner = new SingleSlotTestingSlotOwner();
-
-		final LogicalSlot slot = createTestingLogicalSlot(slotOwner);
-
-		final ProgrammedSlotProvider slotProvider = new ProgrammedSlotProvider(1);
-		slotProvider.addSlot(jobVertexId, 0, CompletableFuture.completedFuture(slot));
-
-		ExecutionGraph executionGraph = ExecutionGraphTestUtils.createSimpleTestGraph(
-			slotProvider,
-			new NoRestartStrategy(),
-			jobVertex);
-
-		ExecutionJobVertex executionJobVertex = executionGraph.getJobVertex(jobVertexId);
-
-		final Execution execution = executionJobVertex.getTaskVertices()[0].getCurrentExecutionAttempt();
-
-		CompletableFuture<Execution> allocationFuture = execution.allocateResourcesForExecution(
-			executionGraph.getSlotProviderStrategy(),
-			LocationPreferenceConstraint.ALL,
-			Collections.emptySet());
-
-		assertTrue(allocationFuture.isDone());
-
-		assertEquals(ExecutionState.SCHEDULED, execution.getState());
-
-		assertEquals(slot, execution.getAssignedResource());
-
-		execution.deploy();
-
-		execution.switchToRunning();
-
-		// cancelling the execution should move it into state CANCELING
-		execution.cancel();
-		assertEquals(ExecutionState.CANCELING, execution.getState());
-
-		execution.completeCancelling();
-
-		assertEquals(slot, slotOwner.getReturnedSlotFuture().get());
-	}
-
-	/**
-	 * Tests that a slot allocation from a {@link SlotProvider} is cancelled if the
-	 * {@link Execution} is cancelled.
-	 */
-	@Test
-	public void testSlotAllocationCancellationWhenExecutionCancelled() throws Exception {

Review comment:
       This test is outdated because it tests the implementation of the legacy method `allocateResourcesForExecution()`, while the DefaultScheduler has taken over the responsibility of allocating a slot for an execution. The cancellation of pending requests on vertex cancellation is now located in `DefaultScheduler#cancelExecutionVertex()`.
   But I think you are right that we now lack a test to verify that `ExecutionSlotAllocator#cancel()` is invoked when an execution vertex is canceled. I will add one in `DefaultSchedulerTest`.
   
   Regarding `SchedulerNG.cancel()`, canceling a job currently does not directly cancel all pending requests in the SlotPool. After the entire job is canceled and the JobMaster is shutting down, all pending requests as well as allocated slots will be released when the SlotPool is closed. So I think the test needed is `SlotPoolImplTest#testShutdownCancelsAllPendingRequests()`, which is missing at the moment. I will add it as well.
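
To illustrate the kind of check described in the first paragraph above, here is a minimal, self-contained sketch. It does not use Flink classes: `SlotAllocatorSketch`, `RecordingSlotAllocator` and `SchedulerSketch` are hypothetical stand-ins invented for this example, and the real test in `DefaultSchedulerTest` may look quite different. The idea is only that a recording test double lets a test assert that the allocator's `cancel` was called for the canceled vertex.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a slot allocator; NOT Flink's ExecutionSlotAllocator.
interface SlotAllocatorSketch {
    void cancel(String vertexId);
}

// Test double that records every vertex whose pending request was canceled.
final class RecordingSlotAllocator implements SlotAllocatorSketch {
    final List<String> canceledVertices = new ArrayList<>();

    @Override
    public void cancel(String vertexId) {
        canceledVertices.add(vertexId);
    }
}

// Hypothetical stand-in for a scheduler that owns slot allocation.
final class SchedulerSketch {
    private final SlotAllocatorSketch allocator;

    SchedulerSketch(SlotAllocatorSketch allocator) {
        this.allocator = allocator;
    }

    // Assumed behavior: canceling a vertex also cancels its pending slot request.
    void cancelExecutionVertex(String vertexId) {
        allocator.cancel(vertexId);
    }
}

final class CancelVertexSketch {
    // Cancels one vertex and returns what the recording allocator observed.
    static List<String> run() {
        RecordingSlotAllocator allocator = new RecordingSlotAllocator();
        new SchedulerSketch(allocator).cancelExecutionVertex("vertex-0");
        return allocator.canceledVertices;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

A recording double is used here instead of a mocking library so the sketch has no dependencies; a real test could just as well count invocations through whatever testing allocator the code base already provides.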







[GitHub] [flink] zhuzhurk commented on a change in pull request #13641: [WIP][FLINK-17760][tests] Rework tests to not rely on legacy scheduling logics in ExecutionGraph anymore

Posted by GitBox <gi...@apache.org>.
zhuzhurk commented on a change in pull request #13641:
URL: https://github.com/apache/flink/pull/13641#discussion_r514120224



##########
File path: flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/ExecutionVertexCancelTest.java
##########
@@ -244,104 +250,6 @@ public void testSendCancelAndReceiveFail() throws Exception {
 		assertEquals(vertices.length - 1, exec.getVertex().getExecutionGraph().getRegisteredExecutions().size());
 	}
 
-	// --------------------------------------------------------------------------------------------
-	//  Actions after a vertex has been canceled or while canceling
-	// --------------------------------------------------------------------------------------------
-
-	@Test
-	public void testScheduleOrDeployAfterCancel() {
-		try {
-			final ExecutionVertex vertex = getExecutionVertex();
-			setVertexState(vertex, ExecutionState.CANCELED);
-
-			assertEquals(ExecutionState.CANCELED, vertex.getExecutionState());
-
-			// 1)
-			// scheduling after being canceled should be tolerated (no exception) because
-			// it can occur as the result of races
-			{
-				vertex.scheduleForExecution(
-					TestingSlotProviderStrategy.from(new ProgrammedSlotProvider(1)),
-					LocationPreferenceConstraint.ALL,
-					Collections.emptySet());
-
-				assertEquals(ExecutionState.CANCELED, vertex.getExecutionState());
-			}
-
-			// 2)
-			// deploying after canceling from CREATED needs to raise an exception, because
-			// the scheduler (or any caller) needs to know that the slot should be released
-			try {
-
-				final LogicalSlot slot = new TestingLogicalSlotBuilder().createTestingLogicalSlot();
-
-				vertex.deployToSlot(slot);
-				fail("Method should throw an exception");
-			}
-			catch (IllegalStateException e) {
-				assertEquals(ExecutionState.CANCELED, vertex.getExecutionState());
-			}
-		}
-		catch (Exception e) {
-			e.printStackTrace();
-			fail(e.getMessage());
-		}
-	}
-
-	@Test
-	public void testActionsWhileCancelling() {

Review comment:
       superseded by `DefaultSchedulerTest#skipDeploymentIfVertexVersionOutdated()`
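
For context, the test name suggests a version check that skips a deployment once the vertex has been modified (for example canceled) in the meantime. The following is only a sketch of that idea under stated assumptions: `VertexVersionerSketch` and `DeploymentSketch` are hypothetical names made up for this example and are not Flink's classes or the actual test.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical version tracker; every modification of a vertex bumps its version.
final class VertexVersionerSketch {
    private final Map<String, Long> versions = new HashMap<>();

    // Records a modification and returns the new version of the vertex.
    long recordModification(String vertexId) {
        return versions.merge(vertexId, 1L, Long::sum);
    }

    // True if the vertex has been modified since the given version was observed.
    boolean isModified(String vertexId, long seenVersion) {
        return versions.getOrDefault(vertexId, 0L) != seenVersion;
    }
}

final class DeploymentSketch {
    // Returns true if the deployment would proceed, false if it is skipped
    // because the version captured at scheduling time is outdated.
    static boolean deployIfCurrent(VertexVersionerSketch versioner, String vertexId, long seenVersion) {
        return !versioner.isModified(vertexId, seenVersion);
    }

    public static void main(String[] args) {
        VertexVersionerSketch versioner = new VertexVersionerSketch();
        long scheduledVersion = versioner.recordModification("vertex-0");
        versioner.recordModification("vertex-0"); // e.g. the vertex gets canceled
        System.out.println(deployIfCurrent(versioner, "vertex-0", scheduledVersion));
    }
}
```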







[GitHub] [flink] flinkbot commented on pull request #13641: [WIP][FLINK-17760][tests] Rework tests to not rely on legacy scheduling logics in ExecutionGraph anymore

Posted by GitBox <gi...@apache.org>.
flinkbot commented on pull request #13641:
URL: https://github.com/apache/flink/pull/13641#issuecomment-708569491


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "09d8deb89416f53dfe8b5c16fb9d723cbd98612c",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "09d8deb89416f53dfe8b5c16fb9d723cbd98612c",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 09d8deb89416f53dfe8b5c16fb9d723cbd98612c UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     The @flinkbot bot supports the following commands:
   
    - `@flinkbot run travis` re-run the last Travis build
    - `@flinkbot run azure` re-run the last Azure build
   </details>





[GitHub] [flink] flinkbot edited a comment on pull request #13641: [FLINK-17760][tests] Rework tests to not rely on legacy scheduling codes in ExecutionGraph components

Posted by GitBox <gi...@apache.org>.
flinkbot edited a comment on pull request #13641:
URL: https://github.com/apache/flink/pull/13641#issuecomment-708569491


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "09d8deb89416f53dfe8b5c16fb9d723cbd98612c",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "09d8deb89416f53dfe8b5c16fb9d723cbd98612c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "fef9bff28a988ffab789fc6bb0cbde754273a2e0",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=7630",
       "triggerID" : "fef9bff28a988ffab789fc6bb0cbde754273a2e0",
       "triggerType" : "PUSH"
     }, {
       "hash" : "1e959ffb3e7837247842ae4ade724a999ad7ca3b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=7734",
       "triggerID" : "1e959ffb3e7837247842ae4ade724a999ad7ca3b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a253ac0cf1f9a7d382cd0e8c59a8c5ba6cb0a636",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8284",
       "triggerID" : "a253ac0cf1f9a7d382cd0e8c59a8c5ba6cb0a636",
       "triggerType" : "PUSH"
     }, {
       "hash" : "9ea4cf7454f9b8e0c237ba8f540b224abdf9f7b0",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8279",
       "triggerID" : "9ea4cf7454f9b8e0c237ba8f540b224abdf9f7b0",
       "triggerType" : "PUSH"
     }, {
       "hash" : "659fb7eddb0acfa0ef49f76c5fafca21c389f3c0",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8562",
       "triggerID" : "659fb7eddb0acfa0ef49f76c5fafca21c389f3c0",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f7c0de826fcaa9b28eb8054902d3a01fb93dfcc3",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8620",
       "triggerID" : "f7c0de826fcaa9b28eb8054902d3a01fb93dfcc3",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f7c0de826fcaa9b28eb8054902d3a01fb93dfcc3",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8653",
       "triggerID" : "719297248",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "a253ac0cf1f9a7d382cd0e8c59a8c5ba6cb0a636",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8325",
       "triggerID" : "719297248",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "f7c0de826fcaa9b28eb8054902d3a01fb93dfcc3",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8620",
       "triggerID" : "719297248",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "150fae9485d03151f4622c8f026c760da4eea3e2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8784",
       "triggerID" : "150fae9485d03151f4622c8f026c760da4eea3e2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "517b53a68f3e9c8c0897cd7afba90b8a9befaa4f",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8807",
       "triggerID" : "517b53a68f3e9c8c0897cd7afba90b8a9befaa4f",
       "triggerType" : "PUSH"
     }, {
       "hash" : "021cac170ea26cddfd8af0a2bec5fea4e6a76b69",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=9756",
       "triggerID" : "021cac170ea26cddfd8af0a2bec5fea4e6a76b69",
       "triggerType" : "PUSH"
     }, {
       "hash" : "fe1562c5cda8ecb15f6af1afdf7b6217e6c20c42",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "fe1562c5cda8ecb15f6af1afdf7b6217e6c20c42",
       "triggerType" : "PUSH"
     }, {
       "hash" : "20de9eefa4d4eb7c8b740cd88fc69f1e764bfa7a",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10331",
       "triggerID" : "20de9eefa4d4eb7c8b740cd88fc69f1e764bfa7a",
       "triggerType" : "PUSH"
     }, {
       "hash" : "e7941f905ec697ad09a3f1010f90a2a69a512ce0",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10374",
       "triggerID" : "e7941f905ec697ad09a3f1010f90a2a69a512ce0",
       "triggerType" : "PUSH"
     }, {
       "hash" : "52eaaffa8fb4cad5cca9a26234b92a33c37c0dfa",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10435",
       "triggerID" : "52eaaffa8fb4cad5cca9a26234b92a33c37c0dfa",
       "triggerType" : "PUSH"
     }, {
       "hash" : "89bea5233d5efb9db88eacc21b445a617a8c3c27",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10899",
       "triggerID" : "89bea5233d5efb9db88eacc21b445a617a8c3c27",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 09d8deb89416f53dfe8b5c16fb9d723cbd98612c UNKNOWN
   * fe1562c5cda8ecb15f6af1afdf7b6217e6c20c42 UNKNOWN
   * 52eaaffa8fb4cad5cca9a26234b92a33c37c0dfa Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10435) 
   * 89bea5233d5efb9db88eacc21b445a617a8c3c27 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10899) 
   
   <details>
   <summary>Bot commands</summary>
     The @flinkbot bot supports the following commands:
   
    - `@flinkbot run travis` re-run the last Travis build
    - `@flinkbot run azure` re-run the last Azure build
   </details>





[GitHub] [flink] flinkbot edited a comment on pull request #13641: [WIP][FLINK-17760][tests] Rework tests to not rely on legacy scheduling logics in ExecutionGraph anymore

Posted by GitBox <gi...@apache.org>.
flinkbot edited a comment on pull request #13641:
URL: https://github.com/apache/flink/pull/13641#issuecomment-708569491


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "09d8deb89416f53dfe8b5c16fb9d723cbd98612c",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "09d8deb89416f53dfe8b5c16fb9d723cbd98612c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "fef9bff28a988ffab789fc6bb0cbde754273a2e0",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "fef9bff28a988ffab789fc6bb0cbde754273a2e0",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 09d8deb89416f53dfe8b5c16fb9d723cbd98612c UNKNOWN
   * fef9bff28a988ffab789fc6bb0cbde754273a2e0 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     The @flinkbot bot supports the following commands:
   
    - `@flinkbot run travis` re-run the last Travis build
    - `@flinkbot run azure` re-run the last Azure build
   </details>





[GitHub] [flink] zhuzhurk commented on a change in pull request #13641: [FLINK-17760][tests] Rework tests to not rely on legacy scheduling codes in ExecutionGraph components

Posted by GitBox <gi...@apache.org>.
zhuzhurk commented on a change in pull request #13641:
URL: https://github.com/apache/flink/pull/13641#discussion_r543395440



##########
File path: flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/ExecutionTest.java
##########
@@ -82,202 +82,12 @@
 	private final TestingComponentMainThreadExecutor testMainThreadUtil =
 		EXECUTOR_RESOURCE.getComponentMainThreadTestExecutor();
 
-	/**
-	 * Tests that slots are released if we cannot assign the allocated resource to the
-	 * Execution.
-	 */
-	@Test
-	public void testSlotReleaseOnFailedResourceAssignment() throws Exception {
-		final JobVertex jobVertex = createNoOpJobVertex();
-		final JobVertexID jobVertexId = jobVertex.getID();
-
-		final CompletableFuture<LogicalSlot> slotFuture = new CompletableFuture<>();
-		final ProgrammedSlotProvider slotProvider = new ProgrammedSlotProvider(1);
-		slotProvider.addSlot(jobVertexId, 0, slotFuture);
-
-		ExecutionGraph executionGraph = ExecutionGraphTestUtils.createSimpleTestGraph(
-			slotProvider,
-			new NoRestartStrategy(),
-			jobVertex);
-
-		executionGraph.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-
-		ExecutionJobVertex executionJobVertex = executionGraph.getJobVertex(jobVertexId);
-
-		final Execution execution = executionJobVertex.getTaskVertices()[0].getCurrentExecutionAttempt();
-
-		final SingleSlotTestingSlotOwner slotOwner = new SingleSlotTestingSlotOwner();
-
-		final LogicalSlot slot = createTestingLogicalSlot(slotOwner);
-
-		final LogicalSlot otherSlot = new TestingLogicalSlotBuilder().createTestingLogicalSlot();
-
-		CompletableFuture<Execution> allocationFuture = execution.allocateResourcesForExecution(
-			executionGraph.getSlotProviderStrategy(),
-			LocationPreferenceConstraint.ALL,
-			Collections.emptySet());
-
-		assertFalse(allocationFuture.isDone());
-
-		assertEquals(ExecutionState.SCHEDULED, execution.getState());
-
-		// assign a different resource to the execution
-		assertTrue(execution.tryAssignResource(otherSlot));
-
-		// completing now the future should cause the slot to be released
-		slotFuture.complete(slot);
-
-		assertEquals(slot, slotOwner.getReturnedSlotFuture().get());
-	}
-
 	private TestingLogicalSlot createTestingLogicalSlot(SlotOwner slotOwner) {
 		return new TestingLogicalSlotBuilder()
 			.setSlotOwner(slotOwner)
 			.createTestingLogicalSlot();
 	}
 
-	/**
-	 * Tests that the slot is released in case of a execution cancellation when having
-	 * a slot assigned and being in state SCHEDULED.
-	 */
-	@Test
-	public void testSlotReleaseOnExecutionCancellationInScheduled() throws Exception {
-		final JobVertex jobVertex = createNoOpJobVertex();
-		final JobVertexID jobVertexId = jobVertex.getID();
-
-		final SingleSlotTestingSlotOwner slotOwner = new SingleSlotTestingSlotOwner();
-
-		final LogicalSlot slot = createTestingLogicalSlot(slotOwner);
-
-		final ProgrammedSlotProvider slotProvider = new ProgrammedSlotProvider(1);
-		slotProvider.addSlot(jobVertexId, 0, CompletableFuture.completedFuture(slot));
-
-		ExecutionGraph executionGraph = ExecutionGraphTestUtils.createSimpleTestGraph(
-			slotProvider,
-			new NoRestartStrategy(),
-			jobVertex);
-
-		executionGraph.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-
-		ExecutionJobVertex executionJobVertex = executionGraph.getJobVertex(jobVertexId);
-
-		final Execution execution = executionJobVertex.getTaskVertices()[0].getCurrentExecutionAttempt();
-
-		CompletableFuture<Execution> allocationFuture = execution.allocateResourcesForExecution(
-			executionGraph.getSlotProviderStrategy(),
-			LocationPreferenceConstraint.ALL,
-			Collections.emptySet());
-
-		assertTrue(allocationFuture.isDone());
-
-		assertEquals(ExecutionState.SCHEDULED, execution.getState());
-
-		assertEquals(slot, execution.getAssignedResource());
-
-		// cancelling the execution should move it into state CANCELED
-		execution.cancel();
-		assertEquals(ExecutionState.CANCELED, execution.getState());
-
-		assertEquals(slot, slotOwner.getReturnedSlotFuture().get());
-	}
-
-	/**
-	 * Tests that the slot is released in case of a execution cancellation when being in state
-	 * RUNNING.
-	 */
-	@Test
-	public void testSlotReleaseOnExecutionCancellationInRunning() throws Exception {
-		final JobVertex jobVertex = createNoOpJobVertex();
-		final JobVertexID jobVertexId = jobVertex.getID();
-
-		final SingleSlotTestingSlotOwner slotOwner = new SingleSlotTestingSlotOwner();
-
-		final LogicalSlot slot = createTestingLogicalSlot(slotOwner);
-
-		final ProgrammedSlotProvider slotProvider = new ProgrammedSlotProvider(1);
-		slotProvider.addSlot(jobVertexId, 0, CompletableFuture.completedFuture(slot));
-
-		ExecutionGraph executionGraph = ExecutionGraphTestUtils.createSimpleTestGraph(
-			slotProvider,
-			new NoRestartStrategy(),
-			jobVertex);
-
-		ExecutionJobVertex executionJobVertex = executionGraph.getJobVertex(jobVertexId);
-
-		final Execution execution = executionJobVertex.getTaskVertices()[0].getCurrentExecutionAttempt();
-
-		CompletableFuture<Execution> allocationFuture = execution.allocateResourcesForExecution(
-			executionGraph.getSlotProviderStrategy(),
-			LocationPreferenceConstraint.ALL,
-			Collections.emptySet());
-
-		assertTrue(allocationFuture.isDone());
-
-		assertEquals(ExecutionState.SCHEDULED, execution.getState());
-
-		assertEquals(slot, execution.getAssignedResource());
-
-		execution.deploy();
-
-		execution.switchToRunning();
-
-		// cancelling the execution should move it into state CANCELING
-		execution.cancel();
-		assertEquals(ExecutionState.CANCELING, execution.getState());
-
-		execution.completeCancelling();
-
-		assertEquals(slot, slotOwner.getReturnedSlotFuture().get());
-	}
-
-	/**
-	 * Tests that a slot allocation from a {@link SlotProvider} is cancelled if the
-	 * {@link Execution} is cancelled.
-	 */
-	@Test
-	public void testSlotAllocationCancellationWhenExecutionCancelled() throws Exception {

Review comment:
       This test is outdated because it exercises the legacy method `allocateResourcesForExecution()`, whereas the DefaultScheduler has taken over responsibility for allocating a slot for an execution. The cancellation of pending requests on vertex cancellation now lives in `DefaultScheduler#cancelExecutionVertex()`.
   But I think you are right that we now lack a test verifying that `ExecutionSlotAllocator#cancel()` is invoked when an execution vertex is canceled. I will add one in `DefaultSchedulerTest`.
   
   Regarding `SchedulerNG.cancel()`: currently, canceling a job does not directly cancel all pending requests in the SlotPool. Instead, after the entire job is canceled and the JobMaster is shutting down, all pending requests as well as allocated slots are released when the SlotPool is closed. So I think the test we need is `SlotPoolImplTest#testShutdownCancelsAllPendingRequests()`, which is missing at the moment. I will add it as well.
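
For illustration only, here is a minimal sketch of the behavior such a shutdown test would check: pending requests stay open until the pool is closed, and closing cancels all of them. `PendingRequestPoolSketch`, `requestSlot()`, and `numPending()` are hypothetical stand-ins written for this sketch; they are not Flink's `SlotPoolImpl` API, and the real test would be written against the actual Flink classes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

/** Hypothetical stand-in for a slot pool; NOT Flink's SlotPoolImpl. */
class PendingRequestPoolSketch {

    // Requests that have been issued but not yet fulfilled.
    private final List<CompletableFuture<String>> pendingRequests = new ArrayList<>();
    private boolean closed;

    /** Registers a pending request and returns its (still incomplete) future. */
    CompletableFuture<String> requestSlot() {
        if (closed) {
            throw new IllegalStateException("pool is closed");
        }
        CompletableFuture<String> future = new CompletableFuture<>();
        pendingRequests.add(future);
        return future;
    }

    /** Closing the pool cancels every request that is still pending. */
    void close() {
        closed = true;
        for (CompletableFuture<String> future : pendingRequests) {
            future.cancel(false);
        }
        pendingRequests.clear();
    }

    int numPending() {
        return pendingRequests.size();
    }

    public static void main(String[] args) {
        PendingRequestPoolSketch pool = new PendingRequestPoolSketch();
        CompletableFuture<String> first = pool.requestSlot();
        CompletableFuture<String> second = pool.requestSlot();

        // Nothing is cancelled while the pool is still open.
        System.out.println("cancelled before close: " + first.isCancelled());

        pool.close();

        // After shutdown, both pending requests have been cancelled.
        System.out.println("cancelled after close: "
            + (first.isCancelled() && second.isCancelled()));
        System.out.println("pending after close: " + pool.numPending());
    }
}
```

The sketch ties cancellation to `close()` rather than to a per-job cancel call, mirroring the point above that pending requests are only released when the pool shuts down.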




----------------------------------------------------------------



[GitHub] [flink] zhuzhurk closed pull request #13641: [FLINK-17760][tests] Rework tests to not rely on legacy scheduling codes in ExecutionGraph components

Posted by GitBox <gi...@apache.org>.
zhuzhurk closed pull request #13641:
URL: https://github.com/apache/flink/pull/13641


   


----------------------------------------------------------------



[GitHub] [flink] zhuzhurk commented on a change in pull request #13641: [FLINK-17760][tests] Rework tests to not rely on legacy scheduling codes in ExecutionGraph components

Posted by GitBox <gi...@apache.org>.
zhuzhurk commented on a change in pull request #13641:
URL: https://github.com/apache/flink/pull/13641#discussion_r543925697



##########
File path: flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/metrics/RestartTimeGaugeTest.java
##########
@@ -0,0 +1,83 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.runtime.executiongraph.metrics;
+
+import org.apache.flink.api.common.JobStatus;
+import org.apache.flink.runtime.executiongraph.TestingJobStatusProvider;
+import org.apache.flink.util.TestLogger;
+
+import org.junit.Test;
+
+import java.util.HashMap;
+import java.util.Map;
+
+import static org.hamcrest.Matchers.greaterThan;
+import static org.hamcrest.Matchers.is;
+import static org.junit.Assert.assertThat;
+
+/**
+ * Tests for {@link RestartTimeGauge}.
+ */
+public class RestartTimeGaugeTest extends TestLogger {
+
+	@Test
+	public void testNotRestarted() {
+		final RestartTimeGauge gauge = new RestartTimeGauge(new TestingJobStatusProvider(JobStatus.RUNNING, -1));
+		assertThat(gauge.getValue(), is(0L));
+	}
+
+	@Test
+	public void testInRestarting() {
+		final Map<JobStatus, Long> statusTimestampMap = new HashMap<>();
+		statusTimestampMap.put(JobStatus.RESTARTING, 1L);
+
+		final RestartTimeGauge gauge = new RestartTimeGauge(
+			new TestingJobStatusProvider(
+				JobStatus.RESTARTING,
+				status -> statusTimestampMap.getOrDefault(status, -1L)));
+		// System.currentTimeMillis() is surely to be larger than 123L

Review comment:
       removed.




----------------------------------------------------------------



[GitHub] [flink] zhuzhurk commented on a change in pull request #13641: [WIP][FLINK-17760][tests] Rework tests to not rely on legacy scheduling logics in ExecutionGraph anymore

Posted by GitBox <gi...@apache.org>.
zhuzhurk commented on a change in pull request #13641:
URL: https://github.com/apache/flink/pull/13641#discussion_r514126805



##########
File path: flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/GlobalModVersionTest.java
##########
@@ -1,200 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- *     http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-
-package org.apache.flink.runtime.executiongraph;
-
-import org.apache.flink.api.common.JobID;
-import org.apache.flink.api.common.JobStatus;
-import org.apache.flink.runtime.concurrent.ComponentMainThreadExecutorServiceAdapter;
-import org.apache.flink.runtime.execution.ExecutionState;
-import org.apache.flink.runtime.executiongraph.failover.FailoverStrategy;
-import org.apache.flink.runtime.executiongraph.failover.FailoverStrategy.Factory;
-import org.apache.flink.runtime.executiongraph.restart.InfiniteDelayRestartStrategy;
-import org.apache.flink.runtime.executiongraph.utils.SimpleSlotProvider;
-import org.apache.flink.runtime.jobgraph.JobGraph;
-import org.apache.flink.runtime.jobgraph.JobVertex;
-import org.apache.flink.runtime.testtasks.NoOpInvokable;
-import org.apache.flink.util.TestLogger;
-
-import org.junit.Test;
-
-import java.util.Random;
-
-import static org.apache.flink.runtime.executiongraph.ExecutionGraphTestUtils.waitUntilExecutionState;
-import static org.junit.Assert.assertEquals;
-import static org.mockito.Mockito.any;
-import static org.mockito.Mockito.mock;
-import static org.mockito.Mockito.times;
-import static org.mockito.Mockito.verify;
-
-public class GlobalModVersionTest extends TestLogger {

Review comment:
       This test relies on the legacy failGlobal process and is outdated.




----------------------------------------------------------------



[GitHub] [flink] flinkbot edited a comment on pull request #13641: [WIP][FLINK-17760][tests] Rework tests to not rely on legacy scheduling logics in ExecutionGraph anymore

Posted by GitBox <gi...@apache.org>.
flinkbot edited a comment on pull request #13641:
URL: https://github.com/apache/flink/pull/13641#issuecomment-708569491


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "09d8deb89416f53dfe8b5c16fb9d723cbd98612c",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "09d8deb89416f53dfe8b5c16fb9d723cbd98612c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "fef9bff28a988ffab789fc6bb0cbde754273a2e0",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=7630",
       "triggerID" : "fef9bff28a988ffab789fc6bb0cbde754273a2e0",
       "triggerType" : "PUSH"
     }, {
       "hash" : "1e959ffb3e7837247842ae4ade724a999ad7ca3b",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=7734",
       "triggerID" : "1e959ffb3e7837247842ae4ade724a999ad7ca3b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a253ac0cf1f9a7d382cd0e8c59a8c5ba6cb0a636",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "a253ac0cf1f9a7d382cd0e8c59a8c5ba6cb0a636",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 09d8deb89416f53dfe8b5c16fb9d723cbd98612c UNKNOWN
   * 1e959ffb3e7837247842ae4ade724a999ad7ca3b Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=7734) 
   * a253ac0cf1f9a7d382cd0e8c59a8c5ba6cb0a636 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     The @flinkbot bot supports the following commands:
   
    - `@flinkbot run travis` re-run the last Travis build
    - `@flinkbot run azure` re-run the last Azure build
   </details>


----------------------------------------------------------------



[GitHub] [flink] zhuzhurk commented on a change in pull request #13641: [FLINK-17760][tests] Rework tests to not rely on legacy scheduling codes in ExecutionGraph components

Posted by GitBox <gi...@apache.org>.
zhuzhurk commented on a change in pull request #13641:
URL: https://github.com/apache/flink/pull/13641#discussion_r533275538



##########
File path: flink-runtime/src/main/java/org/apache/flink/runtime/scheduler/DefaultScheduler.java
##########
@@ -188,7 +188,7 @@ public void start(ComponentMainThreadExecutor mainThreadExecutor) {
 	}
 
 	@Override
-	protected long getNumberOfRestarts() {
+	public long getNumberOfRestarts() {

Review comment:
       `ExecutionGraphNotEnoughResourceTest` has been removed, so there is no need to do this now.




----------------------------------------------------------------



[GitHub] [flink] flinkbot edited a comment on pull request #13641: [WIP][FLINK-17760][tests] Rework tests to not rely on legacy scheduling logics in ExecutionGraph anymore

Posted by GitBox <gi...@apache.org>.
flinkbot edited a comment on pull request #13641:
URL: https://github.com/apache/flink/pull/13641#issuecomment-708569491


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "09d8deb89416f53dfe8b5c16fb9d723cbd98612c",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "09d8deb89416f53dfe8b5c16fb9d723cbd98612c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "fef9bff28a988ffab789fc6bb0cbde754273a2e0",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=7630",
       "triggerID" : "fef9bff28a988ffab789fc6bb0cbde754273a2e0",
       "triggerType" : "PUSH"
     }, {
       "hash" : "1e959ffb3e7837247842ae4ade724a999ad7ca3b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=7734",
       "triggerID" : "1e959ffb3e7837247842ae4ade724a999ad7ca3b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a253ac0cf1f9a7d382cd0e8c59a8c5ba6cb0a636",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8284",
       "triggerID" : "a253ac0cf1f9a7d382cd0e8c59a8c5ba6cb0a636",
       "triggerType" : "PUSH"
     }, {
       "hash" : "9ea4cf7454f9b8e0c237ba8f540b224abdf9f7b0",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8279",
       "triggerID" : "9ea4cf7454f9b8e0c237ba8f540b224abdf9f7b0",
       "triggerType" : "PUSH"
     }, {
       "hash" : "659fb7eddb0acfa0ef49f76c5fafca21c389f3c0",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8562",
       "triggerID" : "659fb7eddb0acfa0ef49f76c5fafca21c389f3c0",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 09d8deb89416f53dfe8b5c16fb9d723cbd98612c UNKNOWN
   * 659fb7eddb0acfa0ef49f76c5fafca21c389f3c0 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8562) 
   


----------------------------------------------------------------



[GitHub] [flink] flinkbot edited a comment on pull request #13641: [WIP][FLINK-17760][tests] Rework tests to not rely on legacy scheduling logics in ExecutionGraph anymore

Posted by GitBox <gi...@apache.org>.
flinkbot edited a comment on pull request #13641:
URL: https://github.com/apache/flink/pull/13641#issuecomment-708569491


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "09d8deb89416f53dfe8b5c16fb9d723cbd98612c",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "09d8deb89416f53dfe8b5c16fb9d723cbd98612c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "fef9bff28a988ffab789fc6bb0cbde754273a2e0",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=7630",
       "triggerID" : "fef9bff28a988ffab789fc6bb0cbde754273a2e0",
       "triggerType" : "PUSH"
     }, {
       "hash" : "1e959ffb3e7837247842ae4ade724a999ad7ca3b",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=7734",
       "triggerID" : "1e959ffb3e7837247842ae4ade724a999ad7ca3b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a253ac0cf1f9a7d382cd0e8c59a8c5ba6cb0a636",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8284",
       "triggerID" : "a253ac0cf1f9a7d382cd0e8c59a8c5ba6cb0a636",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 09d8deb89416f53dfe8b5c16fb9d723cbd98612c UNKNOWN
   * 1e959ffb3e7837247842ae4ade724a999ad7ca3b Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=7734) 
   * a253ac0cf1f9a7d382cd0e8c59a8c5ba6cb0a636 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8284) 
   


----------------------------------------------------------------



[GitHub] [flink] flinkbot edited a comment on pull request #13641: [FLINK-17760][tests] Rework tests to not rely on legacy scheduling codes in ExecutionGraph components

Posted by GitBox <gi...@apache.org>.
flinkbot edited a comment on pull request #13641:
URL: https://github.com/apache/flink/pull/13641#issuecomment-708569491


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "09d8deb89416f53dfe8b5c16fb9d723cbd98612c",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "09d8deb89416f53dfe8b5c16fb9d723cbd98612c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "fef9bff28a988ffab789fc6bb0cbde754273a2e0",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=7630",
       "triggerID" : "fef9bff28a988ffab789fc6bb0cbde754273a2e0",
       "triggerType" : "PUSH"
     }, {
       "hash" : "1e959ffb3e7837247842ae4ade724a999ad7ca3b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=7734",
       "triggerID" : "1e959ffb3e7837247842ae4ade724a999ad7ca3b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a253ac0cf1f9a7d382cd0e8c59a8c5ba6cb0a636",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8284",
       "triggerID" : "a253ac0cf1f9a7d382cd0e8c59a8c5ba6cb0a636",
       "triggerType" : "PUSH"
     }, {
       "hash" : "9ea4cf7454f9b8e0c237ba8f540b224abdf9f7b0",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8279",
       "triggerID" : "9ea4cf7454f9b8e0c237ba8f540b224abdf9f7b0",
       "triggerType" : "PUSH"
     }, {
       "hash" : "659fb7eddb0acfa0ef49f76c5fafca21c389f3c0",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8562",
       "triggerID" : "659fb7eddb0acfa0ef49f76c5fafca21c389f3c0",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f7c0de826fcaa9b28eb8054902d3a01fb93dfcc3",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8620",
       "triggerID" : "f7c0de826fcaa9b28eb8054902d3a01fb93dfcc3",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f7c0de826fcaa9b28eb8054902d3a01fb93dfcc3",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8653",
       "triggerID" : "719297248",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "a253ac0cf1f9a7d382cd0e8c59a8c5ba6cb0a636",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8325",
       "triggerID" : "719297248",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "f7c0de826fcaa9b28eb8054902d3a01fb93dfcc3",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8620",
       "triggerID" : "719297248",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "150fae9485d03151f4622c8f026c760da4eea3e2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8784",
       "triggerID" : "150fae9485d03151f4622c8f026c760da4eea3e2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "517b53a68f3e9c8c0897cd7afba90b8a9befaa4f",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8807",
       "triggerID" : "517b53a68f3e9c8c0897cd7afba90b8a9befaa4f",
       "triggerType" : "PUSH"
     }, {
       "hash" : "021cac170ea26cddfd8af0a2bec5fea4e6a76b69",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=9756",
       "triggerID" : "021cac170ea26cddfd8af0a2bec5fea4e6a76b69",
       "triggerType" : "PUSH"
     }, {
       "hash" : "fe1562c5cda8ecb15f6af1afdf7b6217e6c20c42",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "fe1562c5cda8ecb15f6af1afdf7b6217e6c20c42",
       "triggerType" : "PUSH"
     }, {
       "hash" : "20de9eefa4d4eb7c8b740cd88fc69f1e764bfa7a",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10331",
       "triggerID" : "20de9eefa4d4eb7c8b740cd88fc69f1e764bfa7a",
       "triggerType" : "PUSH"
     }, {
       "hash" : "e7941f905ec697ad09a3f1010f90a2a69a512ce0",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10374",
       "triggerID" : "e7941f905ec697ad09a3f1010f90a2a69a512ce0",
       "triggerType" : "PUSH"
     }, {
       "hash" : "52eaaffa8fb4cad5cca9a26234b92a33c37c0dfa",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10435",
       "triggerID" : "52eaaffa8fb4cad5cca9a26234b92a33c37c0dfa",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 09d8deb89416f53dfe8b5c16fb9d723cbd98612c UNKNOWN
   * fe1562c5cda8ecb15f6af1afdf7b6217e6c20c42 UNKNOWN
   * e7941f905ec697ad09a3f1010f90a2a69a512ce0 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10374) 
   * 52eaaffa8fb4cad5cca9a26234b92a33c37c0dfa Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10435) 
   


----------------------------------------------------------------



[GitHub] [flink] flinkbot edited a comment on pull request #13641: [WIP][FLINK-17760][tests] Rework tests to not rely on legacy scheduling logics in ExecutionGraph anymore

Posted by GitBox <gi...@apache.org>.
flinkbot edited a comment on pull request #13641:
URL: https://github.com/apache/flink/pull/13641#issuecomment-708569491


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "09d8deb89416f53dfe8b5c16fb9d723cbd98612c",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "09d8deb89416f53dfe8b5c16fb9d723cbd98612c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "fef9bff28a988ffab789fc6bb0cbde754273a2e0",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=7630",
       "triggerID" : "fef9bff28a988ffab789fc6bb0cbde754273a2e0",
       "triggerType" : "PUSH"
     }, {
       "hash" : "1e959ffb3e7837247842ae4ade724a999ad7ca3b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=7734",
       "triggerID" : "1e959ffb3e7837247842ae4ade724a999ad7ca3b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a253ac0cf1f9a7d382cd0e8c59a8c5ba6cb0a636",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8284",
       "triggerID" : "a253ac0cf1f9a7d382cd0e8c59a8c5ba6cb0a636",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 09d8deb89416f53dfe8b5c16fb9d723cbd98612c UNKNOWN
   * a253ac0cf1f9a7d382cd0e8c59a8c5ba6cb0a636 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8284) 
   


----------------------------------------------------------------



[GitHub] [flink] zhuzhurk commented on a change in pull request #13641: [FLINK-17760][tests] Rework tests to not rely on legacy scheduling codes in ExecutionGraph components

Posted by GitBox <gi...@apache.org>.
zhuzhurk commented on a change in pull request #13641:
URL: https://github.com/apache/flink/pull/13641#discussion_r544247385



##########
File path: flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/GlobalModVersionTest.java
##########
@@ -1,200 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- *     http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-
-package org.apache.flink.runtime.executiongraph;
-
-import org.apache.flink.api.common.JobID;
-import org.apache.flink.api.common.JobStatus;
-import org.apache.flink.runtime.concurrent.ComponentMainThreadExecutorServiceAdapter;
-import org.apache.flink.runtime.execution.ExecutionState;
-import org.apache.flink.runtime.executiongraph.failover.FailoverStrategy;
-import org.apache.flink.runtime.executiongraph.failover.FailoverStrategy.Factory;
-import org.apache.flink.runtime.executiongraph.restart.InfiniteDelayRestartStrategy;
-import org.apache.flink.runtime.executiongraph.utils.SimpleSlotProvider;
-import org.apache.flink.runtime.jobgraph.JobGraph;
-import org.apache.flink.runtime.jobgraph.JobVertex;
-import org.apache.flink.runtime.testtasks.NoOpInvokable;
-import org.apache.flink.util.TestLogger;
-
-import org.junit.Test;
-
-import java.util.Random;
-
-import static org.apache.flink.runtime.executiongraph.ExecutionGraphTestUtils.waitUntilExecutionState;
-import static org.junit.Assert.assertEquals;
-import static org.mockito.Mockito.any;
-import static org.mockito.Mockito.mock;
-import static org.mockito.Mockito.times;
-import static org.mockito.Mockito.verify;
-
-public class GlobalModVersionTest extends TestLogger {

Review comment:
       Global, local, and double failures are tested in `DefaultSchedulerTest` via `skipDeploymentIfVertexVersionOutdated`.
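
As a rough illustration of the idea behind skipping a deployment when the vertex version is outdated, the sketch below records a version each time a vertex is modified and refuses to deploy with a stale one. `VertexVersionerSketch`, `isOutdated`, and `deployIfCurrent` are hypothetical names invented for this example; it is a simplified model under those assumptions, not the code in `DefaultScheduler`.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical, simplified model of per-vertex versioning; NOT Flink's implementation. */
class VertexVersionerSketch {

    // Latest version recorded for each vertex id.
    private final Map<String, Long> currentVersions = new HashMap<>();

    /** Bumps the version of a vertex (e.g. on a restart) and returns the new version. */
    long recordModification(String vertexId) {
        return currentVersions.merge(vertexId, 1L, Long::sum);
    }

    /** A version is outdated once the vertex has been modified again. */
    boolean isOutdated(String vertexId, long version) {
        return currentVersions.getOrDefault(vertexId, 0L) != version;
    }

    /** Returns true if deployment proceeds, false if it is skipped as outdated. */
    boolean deployIfCurrent(String vertexId, long version) {
        if (isOutdated(vertexId, version)) {
            return false; // a newer failure or restart superseded this deployment
        }
        return true;
    }

    public static void main(String[] args) {
        VertexVersionerSketch versioner = new VertexVersionerSketch();

        // First scheduling attempt records version 1.
        long firstAttempt = versioner.recordModification("vertex-0");
        // A failure arrives before deployment and records version 2.
        long secondAttempt = versioner.recordModification("vertex-0");

        // The first attempt is stale and is skipped; the second one deploys.
        System.out.println("first attempt deployed: "
            + versioner.deployIfCurrent("vertex-0", firstAttempt));
        System.out.println("second attempt deployed: "
            + versioner.deployIfCurrent("vertex-0", secondAttempt));
    }
}
```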




----------------------------------------------------------------



[GitHub] [flink] flinkbot edited a comment on pull request #13641: [WIP][FLINK-17760][tests] Rework tests to not rely on legacy scheduling logics in ExecutionGraph anymore

Posted by GitBox <gi...@apache.org>.
flinkbot edited a comment on pull request #13641:
URL: https://github.com/apache/flink/pull/13641#issuecomment-708569491


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "09d8deb89416f53dfe8b5c16fb9d723cbd98612c",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "09d8deb89416f53dfe8b5c16fb9d723cbd98612c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "fef9bff28a988ffab789fc6bb0cbde754273a2e0",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=7630",
       "triggerID" : "fef9bff28a988ffab789fc6bb0cbde754273a2e0",
       "triggerType" : "PUSH"
     }, {
       "hash" : "1e959ffb3e7837247842ae4ade724a999ad7ca3b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=7734",
       "triggerID" : "1e959ffb3e7837247842ae4ade724a999ad7ca3b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a253ac0cf1f9a7d382cd0e8c59a8c5ba6cb0a636",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8284",
       "triggerID" : "a253ac0cf1f9a7d382cd0e8c59a8c5ba6cb0a636",
       "triggerType" : "PUSH"
     }, {
       "hash" : "9ea4cf7454f9b8e0c237ba8f540b224abdf9f7b0",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8279",
       "triggerID" : "9ea4cf7454f9b8e0c237ba8f540b224abdf9f7b0",
       "triggerType" : "PUSH"
     }, {
       "hash" : "659fb7eddb0acfa0ef49f76c5fafca21c389f3c0",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8562",
       "triggerID" : "659fb7eddb0acfa0ef49f76c5fafca21c389f3c0",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f7c0de826fcaa9b28eb8054902d3a01fb93dfcc3",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8620",
       "triggerID" : "f7c0de826fcaa9b28eb8054902d3a01fb93dfcc3",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 09d8deb89416f53dfe8b5c16fb9d723cbd98612c UNKNOWN
   * 659fb7eddb0acfa0ef49f76c5fafca21c389f3c0 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8562) 
   * f7c0de826fcaa9b28eb8054902d3a01fb93dfcc3 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8620) 
   


----------------------------------------------------------------



[GitHub] [flink] flinkbot edited a comment on pull request #13641: [FLINK-17760][tests] Rework tests to not rely on legacy scheduling codes in ExecutionGraph components

Posted by GitBox <gi...@apache.org>.
flinkbot edited a comment on pull request #13641:
URL: https://github.com/apache/flink/pull/13641#issuecomment-708569491


   ## CI report:
   
   * 09d8deb89416f53dfe8b5c16fb9d723cbd98612c UNKNOWN
   * fe1562c5cda8ecb15f6af1afdf7b6217e6c20c42 UNKNOWN
   * 20de9eefa4d4eb7c8b740cd88fc69f1e764bfa7a Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10331) 
   * e7941f905ec697ad09a3f1010f90a2a69a512ce0 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10374) 
   





[GitHub] [flink] flinkbot edited a comment on pull request #13641: [FLINK-17760][tests] Rework tests to not rely on legacy scheduling codes in ExecutionGraph components

Posted by GitBox <gi...@apache.org>.
flinkbot edited a comment on pull request #13641:
URL: https://github.com/apache/flink/pull/13641#issuecomment-708569491


   ## CI report:
   
   * 09d8deb89416f53dfe8b5c16fb9d723cbd98612c UNKNOWN
   * 021cac170ea26cddfd8af0a2bec5fea4e6a76b69 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=9756) 
   * fe1562c5cda8ecb15f6af1afdf7b6217e6c20c42 UNKNOWN
   





[GitHub] [flink] flinkbot edited a comment on pull request #13641: [FLINK-17760][tests] Rework tests to not rely on legacy scheduling codes in ExecutionGraph components

Posted by GitBox <gi...@apache.org>.
flinkbot edited a comment on pull request #13641:
URL: https://github.com/apache/flink/pull/13641#issuecomment-708569491


   ## CI report:
   
   * 09d8deb89416f53dfe8b5c16fb9d723cbd98612c UNKNOWN
   * fe1562c5cda8ecb15f6af1afdf7b6217e6c20c42 UNKNOWN
   * cf675a9aff300f1d9e7af38bbae24a44377d404d Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10957) 
   





[GitHub] [flink] zhuzhurk commented on a change in pull request #13641: [FLINK-17760][tests] Rework tests to not rely on legacy scheduling codes in ExecutionGraph components

Posted by GitBox <gi...@apache.org>.
zhuzhurk commented on a change in pull request #13641:
URL: https://github.com/apache/flink/pull/13641#discussion_r532534958



##########
File path: flink-runtime/src/main/java/org/apache/flink/runtime/scheduler/DefaultScheduler.java
##########
@@ -188,7 +188,7 @@ public void start(ComponentMainThreadExecutor mainThreadExecutor) {
 	}
 
 	@Override
-	protected long getNumberOfRestarts() {
+	public long getNumberOfRestarts() {

Review comment:
       >> maybe we could query this somehow from JobManagerMetricGroup
   Good idea!
   
   It's actually `JobManagerJobMetricGroup`, which is different from `JobManagerMetricGroup`.
   All job-global metrics are registered in this group.
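
For illustration only, here is a minimal sketch of the idea discussed in this thread: a test reads the restart count through a gauge registered on a job-scoped metric group, so the scheduler's getter does not have to be made public. Every type below (`JobMetricGroupSketch`, `SchedulerSketch`) is a hypothetical stand-in and not a Flink class, and the metric name `numRestarts` is assumed rather than taken from this pull request.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Supplier;

/** Hypothetical stand-ins only; none of these types are Flink classes. */
public class RestartMetricSketch {

    /** Minimal registry playing the role of a job-scoped metric group. */
    static class JobMetricGroupSketch {
        private final Map<String, Supplier<Long>> gauges = new HashMap<>();

        /** Registers a gauge under the given name, replacing any previous one. */
        void gauge(String name, Supplier<Long> gauge) {
            gauges.put(name, gauge);
        }

        /** Reads the current value of a registered gauge. */
        long read(String name) {
            Supplier<Long> gauge = gauges.get(name);
            if (gauge == null) {
                throw new IllegalArgumentException("No gauge registered under " + name);
            }
            return gauge.get();
        }
    }

    /** Scheduler stand-in that keeps its restart counter non-public. */
    static class SchedulerSketch {
        private final AtomicLong numberOfRestarts = new AtomicLong();

        SchedulerSketch(JobMetricGroupSketch metrics) {
            // The metric name is an assumption made for this sketch.
            metrics.gauge("numRestarts", this::getNumberOfRestarts);
        }

        /** Stays protected; callers outside the hierarchy go through the gauge. */
        protected long getNumberOfRestarts() {
            return numberOfRestarts.get();
        }

        /** Simulates one restart of the job. */
        void restart() {
            numberOfRestarts.incrementAndGet();
        }
    }

    public static void main(String[] args) {
        JobMetricGroupSketch metrics = new JobMetricGroupSketch();
        SchedulerSketch scheduler = new SchedulerSketch(metrics);
        scheduler.restart();
        scheduler.restart();
        // The test-side read goes through the metric group, not the scheduler.
        System.out.println(metrics.read("numRestarts"));
    }
}
```

The trade-off in this sketch is that the test depends on the metric name instead of on the visibility of a scheduler method.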

##########
File path: flink-runtime/src/main/java/org/apache/flink/runtime/scheduler/DefaultScheduler.java
##########
@@ -188,7 +188,7 @@ public void start(ComponentMainThreadExecutor mainThreadExecutor) {
 	}
 
 	@Override
-	protected long getNumberOfRestarts() {
+	public long getNumberOfRestarts() {

Review comment:
       done







[GitHub] [flink] flinkbot edited a comment on pull request #13641: [WIP][FLINK-17760][tests] Rework tests to not rely on legacy scheduling logics in ExecutionGraph anymore

Posted by GitBox <gi...@apache.org>.
flinkbot edited a comment on pull request #13641:
URL: https://github.com/apache/flink/pull/13641#issuecomment-708569491


   ## CI report:
   
   * 09d8deb89416f53dfe8b5c16fb9d723cbd98612c UNKNOWN
   * f7c0de826fcaa9b28eb8054902d3a01fb93dfcc3 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8653) Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8620) 
   





[GitHub] [flink] tillrohrmann commented on a change in pull request #13641: [FLINK-17760][tests] Rework tests to not rely on legacy scheduling codes in ExecutionGraph components

Posted by GitBox <gi...@apache.org>.
tillrohrmann commented on a change in pull request #13641:
URL: https://github.com/apache/flink/pull/13641#discussion_r540934025



##########
File path: flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/metrics/RestartTimeGaugeTest.java
##########
@@ -0,0 +1,83 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.runtime.executiongraph.metrics;
+
+import org.apache.flink.api.common.JobStatus;
+import org.apache.flink.runtime.executiongraph.TestingJobStatusProvider;
+import org.apache.flink.util.TestLogger;
+
+import org.junit.Test;
+
+import java.util.HashMap;
+import java.util.Map;
+
+import static org.hamcrest.Matchers.greaterThan;
+import static org.hamcrest.Matchers.is;
+import static org.junit.Assert.assertThat;
+
+/**
+ * Tests for {@link RestartTimeGauge}.
+ */
+public class RestartTimeGaugeTest extends TestLogger {
+
+	@Test
+	public void testNotRestarted() {
+		final RestartTimeGauge gauge = new RestartTimeGauge(new TestingJobStatusProvider(JobStatus.RUNNING, -1));
+		assertThat(gauge.getValue(), is(0L));
+	}
+
+	@Test
+	public void testInRestarting() {
+		final Map<JobStatus, Long> statusTimestampMap = new HashMap<>();
+		statusTimestampMap.put(JobStatus.RESTARTING, 1L);
+
+		final RestartTimeGauge gauge = new RestartTimeGauge(
+			new TestingJobStatusProvider(
+				JobStatus.RESTARTING,
+				status -> statusTimestampMap.getOrDefault(status, -1L)));
+		// System.currentTimeMillis() is surely to be larger than 123L

Review comment:
       The comment seems outdated.

##########
File path: flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/ExecutionTest.java
##########
@@ -82,202 +82,12 @@
 	private final TestingComponentMainThreadExecutor testMainThreadUtil =
 		EXECUTOR_RESOURCE.getComponentMainThreadTestExecutor();
 
-	/**
-	 * Tests that slots are released if we cannot assign the allocated resource to the
-	 * Execution.
-	 */
-	@Test
-	public void testSlotReleaseOnFailedResourceAssignment() throws Exception {
-		final JobVertex jobVertex = createNoOpJobVertex();
-		final JobVertexID jobVertexId = jobVertex.getID();
-
-		final CompletableFuture<LogicalSlot> slotFuture = new CompletableFuture<>();
-		final ProgrammedSlotProvider slotProvider = new ProgrammedSlotProvider(1);
-		slotProvider.addSlot(jobVertexId, 0, slotFuture);
-
-		ExecutionGraph executionGraph = ExecutionGraphTestUtils.createSimpleTestGraph(
-			slotProvider,
-			new NoRestartStrategy(),
-			jobVertex);
-
-		executionGraph.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-
-		ExecutionJobVertex executionJobVertex = executionGraph.getJobVertex(jobVertexId);
-
-		final Execution execution = executionJobVertex.getTaskVertices()[0].getCurrentExecutionAttempt();
-
-		final SingleSlotTestingSlotOwner slotOwner = new SingleSlotTestingSlotOwner();
-
-		final LogicalSlot slot = createTestingLogicalSlot(slotOwner);
-
-		final LogicalSlot otherSlot = new TestingLogicalSlotBuilder().createTestingLogicalSlot();
-
-		CompletableFuture<Execution> allocationFuture = execution.allocateResourcesForExecution(
-			executionGraph.getSlotProviderStrategy(),
-			LocationPreferenceConstraint.ALL,
-			Collections.emptySet());
-
-		assertFalse(allocationFuture.isDone());
-
-		assertEquals(ExecutionState.SCHEDULED, execution.getState());
-
-		// assign a different resource to the execution
-		assertTrue(execution.tryAssignResource(otherSlot));
-
-		// completing now the future should cause the slot to be released
-		slotFuture.complete(slot);
-
-		assertEquals(slot, slotOwner.getReturnedSlotFuture().get());
-	}
-
 	private TestingLogicalSlot createTestingLogicalSlot(SlotOwner slotOwner) {
 		return new TestingLogicalSlotBuilder()
 			.setSlotOwner(slotOwner)
 			.createTestingLogicalSlot();
 	}
 
-	/**
-	 * Tests that the slot is released in case of a execution cancellation when having
-	 * a slot assigned and being in state SCHEDULED.
-	 */
-	@Test
-	public void testSlotReleaseOnExecutionCancellationInScheduled() throws Exception {

Review comment:
       I think this test is still valid because we want to return the assigned slots if the `Execution` gets cancelled.

##########
File path: flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/ExecutionTest.java
##########
@@ -82,202 +82,12 @@
 	private final TestingComponentMainThreadExecutor testMainThreadUtil =
 		EXECUTOR_RESOURCE.getComponentMainThreadTestExecutor();
 
-	/**
-	 * Tests that slots are released if we cannot assign the allocated resource to the
-	 * Execution.
-	 */
-	@Test
-	public void testSlotReleaseOnFailedResourceAssignment() throws Exception {
-		final JobVertex jobVertex = createNoOpJobVertex();
-		final JobVertexID jobVertexId = jobVertex.getID();
-
-		final CompletableFuture<LogicalSlot> slotFuture = new CompletableFuture<>();
-		final ProgrammedSlotProvider slotProvider = new ProgrammedSlotProvider(1);
-		slotProvider.addSlot(jobVertexId, 0, slotFuture);
-
-		ExecutionGraph executionGraph = ExecutionGraphTestUtils.createSimpleTestGraph(
-			slotProvider,
-			new NoRestartStrategy(),
-			jobVertex);
-
-		executionGraph.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-
-		ExecutionJobVertex executionJobVertex = executionGraph.getJobVertex(jobVertexId);
-
-		final Execution execution = executionJobVertex.getTaskVertices()[0].getCurrentExecutionAttempt();
-
-		final SingleSlotTestingSlotOwner slotOwner = new SingleSlotTestingSlotOwner();
-
-		final LogicalSlot slot = createTestingLogicalSlot(slotOwner);
-
-		final LogicalSlot otherSlot = new TestingLogicalSlotBuilder().createTestingLogicalSlot();
-
-		CompletableFuture<Execution> allocationFuture = execution.allocateResourcesForExecution(
-			executionGraph.getSlotProviderStrategy(),
-			LocationPreferenceConstraint.ALL,
-			Collections.emptySet());
-
-		assertFalse(allocationFuture.isDone());
-
-		assertEquals(ExecutionState.SCHEDULED, execution.getState());
-
-		// assign a different resource to the execution
-		assertTrue(execution.tryAssignResource(otherSlot));
-
-		// completing now the future should cause the slot to be released
-		slotFuture.complete(slot);
-
-		assertEquals(slot, slotOwner.getReturnedSlotFuture().get());
-	}
-
 	private TestingLogicalSlot createTestingLogicalSlot(SlotOwner slotOwner) {
 		return new TestingLogicalSlotBuilder()
 			.setSlotOwner(slotOwner)
 			.createTestingLogicalSlot();
 	}
 
-	/**
-	 * Tests that the slot is released in case of a execution cancellation when having
-	 * a slot assigned and being in state SCHEDULED.
-	 */
-	@Test
-	public void testSlotReleaseOnExecutionCancellationInScheduled() throws Exception {
-		final JobVertex jobVertex = createNoOpJobVertex();
-		final JobVertexID jobVertexId = jobVertex.getID();
-
-		final SingleSlotTestingSlotOwner slotOwner = new SingleSlotTestingSlotOwner();
-
-		final LogicalSlot slot = createTestingLogicalSlot(slotOwner);
-
-		final ProgrammedSlotProvider slotProvider = new ProgrammedSlotProvider(1);
-		slotProvider.addSlot(jobVertexId, 0, CompletableFuture.completedFuture(slot));
-
-		ExecutionGraph executionGraph = ExecutionGraphTestUtils.createSimpleTestGraph(
-			slotProvider,
-			new NoRestartStrategy(),
-			jobVertex);
-
-		executionGraph.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-
-		ExecutionJobVertex executionJobVertex = executionGraph.getJobVertex(jobVertexId);
-
-		final Execution execution = executionJobVertex.getTaskVertices()[0].getCurrentExecutionAttempt();
-
-		CompletableFuture<Execution> allocationFuture = execution.allocateResourcesForExecution(
-			executionGraph.getSlotProviderStrategy(),
-			LocationPreferenceConstraint.ALL,
-			Collections.emptySet());
-
-		assertTrue(allocationFuture.isDone());
-
-		assertEquals(ExecutionState.SCHEDULED, execution.getState());
-
-		assertEquals(slot, execution.getAssignedResource());
-
-		// cancelling the execution should move it into state CANCELED
-		execution.cancel();
-		assertEquals(ExecutionState.CANCELED, execution.getState());
-
-		assertEquals(slot, slotOwner.getReturnedSlotFuture().get());
-	}
-
-	/**
-	 * Tests that the slot is released in case of a execution cancellation when being in state
-	 * RUNNING.
-	 */
-	@Test
-	public void testSlotReleaseOnExecutionCancellationInRunning() throws Exception {
-		final JobVertex jobVertex = createNoOpJobVertex();
-		final JobVertexID jobVertexId = jobVertex.getID();
-
-		final SingleSlotTestingSlotOwner slotOwner = new SingleSlotTestingSlotOwner();
-
-		final LogicalSlot slot = createTestingLogicalSlot(slotOwner);
-
-		final ProgrammedSlotProvider slotProvider = new ProgrammedSlotProvider(1);
-		slotProvider.addSlot(jobVertexId, 0, CompletableFuture.completedFuture(slot));
-
-		ExecutionGraph executionGraph = ExecutionGraphTestUtils.createSimpleTestGraph(
-			slotProvider,
-			new NoRestartStrategy(),
-			jobVertex);
-
-		ExecutionJobVertex executionJobVertex = executionGraph.getJobVertex(jobVertexId);
-
-		final Execution execution = executionJobVertex.getTaskVertices()[0].getCurrentExecutionAttempt();
-
-		CompletableFuture<Execution> allocationFuture = execution.allocateResourcesForExecution(
-			executionGraph.getSlotProviderStrategy(),
-			LocationPreferenceConstraint.ALL,
-			Collections.emptySet());
-
-		assertTrue(allocationFuture.isDone());
-
-		assertEquals(ExecutionState.SCHEDULED, execution.getState());
-
-		assertEquals(slot, execution.getAssignedResource());
-
-		execution.deploy();
-
-		execution.switchToRunning();
-
-		// cancelling the execution should move it into state CANCELING
-		execution.cancel();
-		assertEquals(ExecutionState.CANCELING, execution.getState());
-
-		execution.completeCancelling();
-
-		assertEquals(slot, slotOwner.getReturnedSlotFuture().get());
-	}
-
-	/**
-	 * Tests that a slot allocation from a {@link SlotProvider} is cancelled if the
-	 * {@link Execution} is cancelled.
-	 */
-	@Test
-	public void testSlotAllocationCancellationWhenExecutionCancelled() throws Exception {

Review comment:
       I think this test is also still valid. At least I couldn't find a test which ensures that we cancel our slot allocations when we cancel the execution. I think this currently only happens if a task fails. But maybe the test needs to make sure that the `DefaultScheduler` cancels all pending slot requests if `SchedulerNG.cancel` is called.
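
   To illustrate the behavior such a test would pin down, here is a minimal, self-contained sketch. `SketchScheduler` and its methods are hypothetical stand-ins built on plain `CompletableFuture`s; they are not Flink's `DefaultScheduler` or `SchedulerNG`, and slots are modeled as strings purely for brevity.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

/** Illustrative only: SketchScheduler is a hypothetical stand-in, not Flink's DefaultScheduler. */
public class PendingSlotRequestSketch {

    static final class SketchScheduler {
        // Every slot request handed out so far (hypothetical bookkeeping).
        private final List<CompletableFuture<String>> slotRequests = new ArrayList<>();

        /** Registers a slot request that an external slot provider may fulfill later. */
        CompletableFuture<String> requestSlot() {
            CompletableFuture<String> request = new CompletableFuture<>();
            slotRequests.add(request);
            return request;
        }

        /** Cancels every tracked request; cancel() has no effect on already completed futures. */
        void cancel() {
            for (CompletableFuture<String> request : slotRequests) {
                request.cancel(false);
            }
            slotRequests.clear();
        }
    }

    public static void main(String[] args) {
        SketchScheduler scheduler = new SketchScheduler();
        CompletableFuture<String> fulfilled = scheduler.requestSlot();
        CompletableFuture<String> pending = scheduler.requestSlot();
        fulfilled.complete("slot-0");

        scheduler.cancel();

        System.out.println("pending cancelled: " + pending.isCancelled());
        System.out.println("fulfilled cancelled: " + fulfilled.isCancelled());
    }
}
```

   A real test would presumably assert the same thing against the slot provider handed to the scheduler: after cancel, no slot request is left pending.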

##########
File path: flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/ExecutionGraphSchedulingTest.java
##########
@@ -1,637 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- *     http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-
-package org.apache.flink.runtime.executiongraph;
-
-import org.apache.flink.api.common.JobID;
-import org.apache.flink.api.common.JobStatus;
-import org.apache.flink.api.common.time.Time;
-import org.apache.flink.configuration.Configuration;
-import org.apache.flink.metrics.groups.UnregisteredMetricsGroup;
-import org.apache.flink.runtime.blob.VoidBlobWriter;
-import org.apache.flink.runtime.checkpoint.StandaloneCheckpointRecoveryFactory;
-import org.apache.flink.runtime.clusterframework.types.AllocationID;
-import org.apache.flink.runtime.clusterframework.types.ResourceID;
-import org.apache.flink.runtime.concurrent.ComponentMainThreadExecutorServiceAdapter;
-import org.apache.flink.runtime.execution.ExecutionState;
-import org.apache.flink.runtime.executiongraph.restart.NoRestartStrategy;
-import org.apache.flink.runtime.executiongraph.utils.SimpleAckingTaskManagerGateway;
-import org.apache.flink.runtime.instance.SimpleSlotContext;
-import org.apache.flink.runtime.io.network.partition.NoOpJobMasterPartitionTracker;
-import org.apache.flink.runtime.io.network.partition.ResultPartitionType;
-import org.apache.flink.runtime.jobgraph.DistributionPattern;
-import org.apache.flink.runtime.jobgraph.JobGraph;
-import org.apache.flink.runtime.jobgraph.JobVertex;
-import org.apache.flink.runtime.jobgraph.ScheduleMode;
-import org.apache.flink.runtime.jobmanager.scheduler.Locality;
-import org.apache.flink.runtime.jobmanager.slots.DummySlotOwner;
-import org.apache.flink.runtime.jobmanager.slots.TaskManagerGateway;
-import org.apache.flink.runtime.jobmanager.slots.TestingSlotOwner;
-import org.apache.flink.runtime.jobmaster.LogicalSlot;
-import org.apache.flink.runtime.jobmaster.SlotOwner;
-import org.apache.flink.runtime.jobmaster.SlotRequestId;
-import org.apache.flink.runtime.jobmaster.TestingLogicalSlot;
-import org.apache.flink.runtime.jobmaster.TestingLogicalSlotBuilder;
-import org.apache.flink.runtime.jobmaster.slotpool.SingleLogicalSlot;
-import org.apache.flink.runtime.jobmaster.slotpool.SlotProvider;
-import org.apache.flink.runtime.shuffle.NettyShuffleMaster;
-import org.apache.flink.runtime.taskmanager.TaskManagerLocation;
-import org.apache.flink.runtime.testtasks.NoOpInvokable;
-import org.apache.flink.runtime.testutils.DirectScheduledExecutorService;
-import org.apache.flink.util.FlinkException;
-import org.apache.flink.util.TestLogger;
-
-import org.junit.After;
-import org.junit.Test;
-
-import javax.annotation.Nonnull;
-
-import java.net.InetAddress;
-import java.util.Set;
-import java.util.concurrent.ArrayBlockingQueue;
-import java.util.concurrent.BlockingQueue;
-import java.util.concurrent.CompletableFuture;
-import java.util.concurrent.ConcurrentHashMap;
-import java.util.concurrent.ConcurrentMap;
-import java.util.concurrent.ScheduledExecutorService;
-import java.util.concurrent.TimeUnit;
-import java.util.concurrent.TimeoutException;
-
-import static org.hamcrest.Matchers.empty;
-import static org.hamcrest.Matchers.is;
-import static org.junit.Assert.assertEquals;
-import static org.junit.Assert.assertFalse;
-import static org.junit.Assert.assertThat;
-
-/**
- * Tests for the scheduling of the execution graph. This tests that
- * for example the order of deployments is correct and that bulk slot allocation
- * works properly.
- */
-public class ExecutionGraphSchedulingTest extends TestLogger {
-
-	private final ScheduledExecutorService executor = new DirectScheduledExecutorService();
-
-	@After
-	public void shutdown() {
-		executor.shutdownNow();
-	}
-
-	// ------------------------------------------------------------------------
-	//  Tests
-	// ------------------------------------------------------------------------
-
-	/**
-	 * Tests that with scheduling futures and pipelined deployment, the target vertex will
-	 * not deploy its task before the source vertex does.
-	 */
-	@Test
-	public void testScheduleSourceBeforeTarget() throws Exception {
-
-		//                                            [pipelined]
-		//  we construct a simple graph    (source) ----------------> (target)
-
-		final int parallelism = 1;
-
-		final JobVertex sourceVertex = new JobVertex("source");
-		sourceVertex.setParallelism(parallelism);
-		sourceVertex.setInvokableClass(NoOpInvokable.class);
-
-		final JobVertex targetVertex = new JobVertex("target");
-		targetVertex.setParallelism(parallelism);
-		targetVertex.setInvokableClass(NoOpInvokable.class);
-
-		targetVertex.connectNewDataSetAsInput(sourceVertex, DistributionPattern.ALL_TO_ALL, ResultPartitionType.PIPELINED);
-
-		final JobID jobId = new JobID();
-		final JobGraph jobGraph = new JobGraph(jobId, "test", sourceVertex, targetVertex);
-		jobGraph.setScheduleMode(ScheduleMode.EAGER);
-
-		final CompletableFuture<LogicalSlot> sourceFuture = new CompletableFuture<>();
-		final CompletableFuture<LogicalSlot> targetFuture = new CompletableFuture<>();
-
-		ProgrammedSlotProvider slotProvider = new ProgrammedSlotProvider(parallelism);
-		slotProvider.addSlot(sourceVertex.getID(), 0, sourceFuture);
-		slotProvider.addSlot(targetVertex.getID(), 0, targetFuture);
-
-		final ExecutionGraph eg = createExecutionGraph(jobGraph, slotProvider);
-		eg.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-
-		//  set up two TaskManager gateways and slots
-
-		final InteractionsCountingTaskManagerGateway gatewaySource = createTaskManager();
-		final InteractionsCountingTaskManagerGateway gatewayTarget = createTaskManager();
-
-		final LogicalSlot sourceSlot = createTestingLogicalSlot(gatewaySource);
-		final LogicalSlot targetSlot = createTestingLogicalSlot(gatewayTarget);
-
-		eg.scheduleForExecution();
-
-		// job should be running
-		assertEquals(JobStatus.RUNNING, eg.getState());
-
-		// we fulfill the target slot before the source slot
-		// that should not cause a deployment or deployment related failure
-		targetFuture.complete(targetSlot);
-
-		assertThat(gatewayTarget.getSubmitTaskCount(), is(0));
-		assertEquals(JobStatus.RUNNING, eg.getState());
-
-		// now supply the source slot
-		sourceFuture.complete(sourceSlot);
-
-		// by now, all deployments should have happened
-		assertThat(gatewaySource.getSubmitTaskCount(), is(1));
-		assertThat(gatewayTarget.getSubmitTaskCount(), is(1));
-
-		assertEquals(JobStatus.RUNNING, eg.getState());
-	}
-
-	private TestingLogicalSlot createTestingLogicalSlot(InteractionsCountingTaskManagerGateway gatewaySource) {
-		return new TestingLogicalSlotBuilder()
-			.setTaskManagerGateway(gatewaySource)
-			.createTestingLogicalSlot();
-	}
-
-	/**
-	 * This test verifies that before deploying a pipelined connected component, the
-	 * full set of slots is available, and that not some tasks are deployed, and later the
-	 * system realizes that not enough resources are available.
-	 */
-	@Test
-	public void testDeployPipelinedConnectedComponentsTogether() throws Exception {
-
-		//                                            [pipelined]
-		//  we construct a simple graph    (source) ----------------> (target)
-
-		final int parallelism = 8;
-
-		final JobVertex sourceVertex = new JobVertex("source");
-		sourceVertex.setParallelism(parallelism);
-		sourceVertex.setInvokableClass(NoOpInvokable.class);
-
-		final JobVertex targetVertex = new JobVertex("target");
-		targetVertex.setParallelism(parallelism);
-		targetVertex.setInvokableClass(NoOpInvokable.class);
-
-		targetVertex.connectNewDataSetAsInput(sourceVertex, DistributionPattern.ALL_TO_ALL, ResultPartitionType.PIPELINED);
-
-		final JobID jobId = new JobID();
-		final JobGraph jobGraph = new JobGraph(jobId, "test", sourceVertex, targetVertex);
-		jobGraph.setScheduleMode(ScheduleMode.EAGER);
-
-		@SuppressWarnings({"unchecked", "rawtypes"})
-		final CompletableFuture<LogicalSlot>[] sourceFutures = new CompletableFuture[parallelism];
-		@SuppressWarnings({"unchecked", "rawtypes"})
-		final CompletableFuture<LogicalSlot>[] targetFutures = new CompletableFuture[parallelism];
-
-		//
-		//  Create the slots, futures, and the slot provider
-
-		final InteractionsCountingTaskManagerGateway[] sourceTaskManagers = new InteractionsCountingTaskManagerGateway[parallelism];
-		final InteractionsCountingTaskManagerGateway[] targetTaskManagers = new InteractionsCountingTaskManagerGateway[parallelism];
-
-		final LogicalSlot[] sourceSlots = new LogicalSlot[parallelism];
-		final LogicalSlot[] targetSlots = new LogicalSlot[parallelism];
-
-		for (int i = 0; i < parallelism; i++) {
-			sourceTaskManagers[i] = createTaskManager();
-			targetTaskManagers[i] = createTaskManager();
-
-			sourceSlots[i] = createTestingLogicalSlot(sourceTaskManagers[i]);
-			targetSlots[i] = createTestingLogicalSlot(targetTaskManagers[i]);
-
-			sourceFutures[i] = new CompletableFuture<>();
-			targetFutures[i] = new CompletableFuture<>();
-		}
-
-		ProgrammedSlotProvider slotProvider = new ProgrammedSlotProvider(parallelism);
-		slotProvider.addSlots(sourceVertex.getID(), sourceFutures);
-		slotProvider.addSlots(targetVertex.getID(), targetFutures);
-
-		final ExecutionGraph eg = createExecutionGraph(jobGraph, slotProvider);
-
-		//
-		//  we complete some of the futures
-
-		for (int i = 0; i < parallelism; i += 2) {
-			sourceFutures[i].complete(sourceSlots[i]);
-		}
-
-		//
-		//  kick off the scheduling
-		eg.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-		eg.scheduleForExecution();
-
-		verifyNothingDeployed(eg, sourceTaskManagers);
-
-		//  complete the remaining sources
-		for (int i = 1; i < parallelism; i += 2) {
-			sourceFutures[i].complete(sourceSlots[i]);
-		}
-		verifyNothingDeployed(eg, sourceTaskManagers);
-
-		//  complete the targets except for one
-		for (int i = 1; i < parallelism; i++) {
-			targetFutures[i].complete(targetSlots[i]);
-		}
-		verifyNothingDeployed(eg, targetTaskManagers);
-
-		//  complete the last target slot future
-		targetFutures[0].complete(targetSlots[0]);
-
-		//
-		//  verify that all deployments have happened
-
-		for (InteractionsCountingTaskManagerGateway gateway : sourceTaskManagers) {
-			assertThat(gateway.getSubmitTaskCount(), is(1));
-		}
-		for (InteractionsCountingTaskManagerGateway gateway : targetTaskManagers) {
-			assertThat(gateway.getSubmitTaskCount(), is(1));
-		}
-	}
-
-	/**
-	 * This test verifies that if one slot future fails, the deployment will be aborted.
-	 */
-	@Test
-	public void testOneSlotFailureAbortsDeploy() throws Exception {
-
-		//                                            [pipelined]
-		//  we construct a simple graph    (source) ----------------> (target)
-
-		final int parallelism = 6;
-
-		final JobVertex sourceVertex = new JobVertex("source");
-		sourceVertex.setParallelism(parallelism);
-		sourceVertex.setInvokableClass(NoOpInvokable.class);
-
-		final JobVertex targetVertex = new JobVertex("target");
-		targetVertex.setParallelism(parallelism);
-		targetVertex.setInvokableClass(NoOpInvokable.class);
-
-		targetVertex.connectNewDataSetAsInput(sourceVertex, DistributionPattern.POINTWISE, ResultPartitionType.PIPELINED);
-
-		final JobID jobId = new JobID();
-		final JobGraph jobGraph = new JobGraph(jobId, "test", sourceVertex, targetVertex);
-		jobGraph.setScheduleMode(ScheduleMode.EAGER);
-
-		//
-		//  Create the slots, futures, and the slot provider
-
-		final InteractionsCountingTaskManagerGateway taskManager = createTaskManager();
-		final BlockingQueue<AllocationID> returnedSlots = new ArrayBlockingQueue<>(parallelism);
-		final TestingSlotOwner slotOwner = new TestingSlotOwner();
-		slotOwner.setReturnAllocatedSlotConsumer(
-			(LogicalSlot logicalSlot) -> returnedSlots.offer(logicalSlot.getAllocationId()));
-
-		final LogicalSlot[] sourceSlots = new LogicalSlot[parallelism];
-		final LogicalSlot[] targetSlots = new LogicalSlot[parallelism];
-
-		@SuppressWarnings({"unchecked", "rawtypes"})
-		final CompletableFuture<LogicalSlot>[] sourceFutures = new CompletableFuture[parallelism];
-		@SuppressWarnings({"unchecked", "rawtypes"})
-		final CompletableFuture<LogicalSlot>[] targetFutures = new CompletableFuture[parallelism];
-
-		for (int i = 0; i < parallelism; i++) {
-			sourceSlots[i] = createSingleLogicalSlot(slotOwner, taskManager, new SlotRequestId());
-			targetSlots[i] = createSingleLogicalSlot(slotOwner, taskManager, new SlotRequestId());
-
-			sourceFutures[i] = new CompletableFuture<>();
-			targetFutures[i] = new CompletableFuture<>();
-		}
-
-		ProgrammedSlotProvider slotProvider = new ProgrammedSlotProvider(parallelism);
-		slotProvider.addSlots(sourceVertex.getID(), sourceFutures);
-		slotProvider.addSlots(targetVertex.getID(), targetFutures);
-
-		final ExecutionGraph eg = createExecutionGraph(jobGraph, slotProvider);
-		eg.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-
-		//
-		//  we complete some of the futures
-
-		for (int i = 0; i < parallelism; i += 2) {
-			sourceFutures[i].complete(sourceSlots[i]);
-			targetFutures[i].complete(targetSlots[i]);
-		}
-
-		//  kick off the scheduling
-		eg.scheduleForExecution();
-
-		// fail one slot
-		sourceFutures[1].completeExceptionally(new TestRuntimeException());
-
-		// wait until the job failed as a whole
-		eg.getTerminationFuture().get(2000, TimeUnit.MILLISECONDS);
-
-		// wait until all slots are back
-		for (int i = 0; i < parallelism; i++) {
-			returnedSlots.poll(2000L, TimeUnit.MILLISECONDS);
-		}
-
-		// no deployment calls must have happened
-		assertThat(taskManager.getSubmitTaskCount(), is(0));
-
-		// all completed futures must have been returns
-		for (int i = 0; i < parallelism; i += 2) {
-			assertFalse(sourceSlots[i].isAlive());
-			assertFalse(targetSlots[i].isAlive());
-		}
-	}
-
-	/**
-	 * This tests makes sure that with eager scheduling no task is deployed if a single
-	 * slot allocation fails. Moreover we check that allocated slots will be returned.
-	 */
-	@Test
-	public void testEagerSchedulingWithSlotTimeout() throws Exception {
-
-		//  we construct a simple graph:    (task)
-
-		final int parallelism = 3;
-
-		final JobVertex vertex = new JobVertex("task");
-		vertex.setParallelism(parallelism);
-		vertex.setInvokableClass(NoOpInvokable.class);
-
-		final JobID jobId = new JobID();
-		final JobGraph jobGraph = new JobGraph(jobId, "test", vertex);
-		jobGraph.setScheduleMode(ScheduleMode.EAGER);
-
-		final BlockingQueue<AllocationID> returnedSlots = new ArrayBlockingQueue<>(2);
-		final TestingSlotOwner slotOwner = new TestingSlotOwner();
-		slotOwner.setReturnAllocatedSlotConsumer(
-			(LogicalSlot logicalSlot) -> returnedSlots.offer(logicalSlot.getAllocationId()));
-
-		final InteractionsCountingTaskManagerGateway taskManager = createTaskManager();
-		final LogicalSlot[] slots = new LogicalSlot[parallelism];
-		@SuppressWarnings({"unchecked", "rawtypes"})
-		final CompletableFuture<LogicalSlot>[] slotFutures = new CompletableFuture[parallelism];
-
-		for (int i = 0; i < parallelism; i++) {
-			slots[i] = createSingleLogicalSlot(slotOwner, taskManager, new SlotRequestId());
-			slotFutures[i] = new CompletableFuture<>();
-		}
-
-		ProgrammedSlotProvider slotProvider = new ProgrammedSlotProvider(parallelism);
-		slotProvider.addSlots(vertex.getID(), slotFutures);
-
-		final ExecutionGraph eg = createExecutionGraph(jobGraph, slotProvider);
-
-		//  we complete one future
-		slotFutures[1].complete(slots[1]);
-
-		//  kick off the scheduling
-		eg.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-		eg.scheduleForExecution();
-
-		//  we complete another future
-		slotFutures[2].complete(slots[2]);
-
-		// check that the ExecutionGraph is not terminated yet
-		assertThat(eg.getTerminationFuture().isDone(), is(false));
-
-		// time out one of the slot futures
-		slotFutures[0].completeExceptionally(new TimeoutException("Test time out"));
-
-		assertThat(eg.getTerminationFuture().get(), is(JobStatus.FAILED));
-
-		// wait until all slots are back
-		for (int i = 0; i < parallelism - 1; i++) {
-			returnedSlots.poll(2000, TimeUnit.MILLISECONDS);
-		}
-
-		//  verify that no deployments have happened
-		assertThat(taskManager.getSubmitTaskCount(), is(0));
-	}
-
-	/**
-	 * Tests that an ongoing scheduling operation does not fail the {@link ExecutionGraph}
-	 * if it gets concurrently cancelled.
-	 */
-	@Test
-	public void testSchedulingOperationCancellationWhenCancel() throws Exception {

Review comment:
       I guess this test is similar to `ExecutionGraphRestartTest.testFailWhileCanceling`, which is also missing coverage. If the state is `CANCELLING`, then we should not transition into `FAILING`.
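
   The invariant can be sketched as a tiny state machine. The `JobState` enum and `onTaskFailure` below are hypothetical names for illustration only, not Flink's `JobStatus` or any `ExecutionGraph` method.

```java
/** Illustrative only: a hypothetical job state machine, not Flink's ExecutionGraph. */
public class CancelVsFailSketch {

    enum JobState { RUNNING, CANCELLING, CANCELED, FAILING, FAILED }

    /**
     * Reacts to a task failure. Only a running job starts failing; a job that is
     * already cancelling (or in any other state) keeps its current state.
     */
    static JobState onTaskFailure(JobState current) {
        return current == JobState.RUNNING ? JobState.FAILING : current;
    }

    public static void main(String[] args) {
        System.out.println(onTaskFailure(JobState.RUNNING));
        System.out.println(onTaskFailure(JobState.CANCELLING));
    }
}
```

   A replacement test would drive the job into `CANCELLING`, report a task failure, and assert that the state is unchanged.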

##########
File path: flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/ExecutionTest.java
##########
@@ -82,202 +82,12 @@
 	private final TestingComponentMainThreadExecutor testMainThreadUtil =
 		EXECUTOR_RESOURCE.getComponentMainThreadTestExecutor();
 
-	/**
-	 * Tests that slots are released if we cannot assign the allocated resource to the
-	 * Execution.
-	 */
-	@Test
-	public void testSlotReleaseOnFailedResourceAssignment() throws Exception {
-		final JobVertex jobVertex = createNoOpJobVertex();
-		final JobVertexID jobVertexId = jobVertex.getID();
-
-		final CompletableFuture<LogicalSlot> slotFuture = new CompletableFuture<>();
-		final ProgrammedSlotProvider slotProvider = new ProgrammedSlotProvider(1);
-		slotProvider.addSlot(jobVertexId, 0, slotFuture);
-
-		ExecutionGraph executionGraph = ExecutionGraphTestUtils.createSimpleTestGraph(
-			slotProvider,
-			new NoRestartStrategy(),
-			jobVertex);
-
-		executionGraph.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-
-		ExecutionJobVertex executionJobVertex = executionGraph.getJobVertex(jobVertexId);
-
-		final Execution execution = executionJobVertex.getTaskVertices()[0].getCurrentExecutionAttempt();
-
-		final SingleSlotTestingSlotOwner slotOwner = new SingleSlotTestingSlotOwner();
-
-		final LogicalSlot slot = createTestingLogicalSlot(slotOwner);
-
-		final LogicalSlot otherSlot = new TestingLogicalSlotBuilder().createTestingLogicalSlot();
-
-		CompletableFuture<Execution> allocationFuture = execution.allocateResourcesForExecution(
-			executionGraph.getSlotProviderStrategy(),
-			LocationPreferenceConstraint.ALL,
-			Collections.emptySet());
-
-		assertFalse(allocationFuture.isDone());
-
-		assertEquals(ExecutionState.SCHEDULED, execution.getState());
-
-		// assign a different resource to the execution
-		assertTrue(execution.tryAssignResource(otherSlot));
-
-		// completing now the future should cause the slot to be released
-		slotFuture.complete(slot);
-
-		assertEquals(slot, slotOwner.getReturnedSlotFuture().get());
-	}
-
 	private TestingLogicalSlot createTestingLogicalSlot(SlotOwner slotOwner) {
 		return new TestingLogicalSlotBuilder()
 			.setSlotOwner(slotOwner)
 			.createTestingLogicalSlot();
 	}
 
-	/**
-	 * Tests that the slot is released in case of a execution cancellation when having
-	 * a slot assigned and being in state SCHEDULED.
-	 */
-	@Test
-	public void testSlotReleaseOnExecutionCancellationInScheduled() throws Exception {
-		final JobVertex jobVertex = createNoOpJobVertex();
-		final JobVertexID jobVertexId = jobVertex.getID();
-
-		final SingleSlotTestingSlotOwner slotOwner = new SingleSlotTestingSlotOwner();
-
-		final LogicalSlot slot = createTestingLogicalSlot(slotOwner);
-
-		final ProgrammedSlotProvider slotProvider = new ProgrammedSlotProvider(1);
-		slotProvider.addSlot(jobVertexId, 0, CompletableFuture.completedFuture(slot));
-
-		ExecutionGraph executionGraph = ExecutionGraphTestUtils.createSimpleTestGraph(
-			slotProvider,
-			new NoRestartStrategy(),
-			jobVertex);
-
-		executionGraph.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-
-		ExecutionJobVertex executionJobVertex = executionGraph.getJobVertex(jobVertexId);
-
-		final Execution execution = executionJobVertex.getTaskVertices()[0].getCurrentExecutionAttempt();
-
-		CompletableFuture<Execution> allocationFuture = execution.allocateResourcesForExecution(
-			executionGraph.getSlotProviderStrategy(),
-			LocationPreferenceConstraint.ALL,
-			Collections.emptySet());
-
-		assertTrue(allocationFuture.isDone());
-
-		assertEquals(ExecutionState.SCHEDULED, execution.getState());
-
-		assertEquals(slot, execution.getAssignedResource());
-
-		// cancelling the execution should move it into state CANCELED
-		execution.cancel();
-		assertEquals(ExecutionState.CANCELED, execution.getState());
-
-		assertEquals(slot, slotOwner.getReturnedSlotFuture().get());
-	}
-
-	/**
-	 * Tests that the slot is released in case of a execution cancellation when being in state
-	 * RUNNING.
-	 */
-	@Test
-	public void testSlotReleaseOnExecutionCancellationInRunning() throws Exception {

Review comment:
       I think this test is also still valid because we want to return the logical slot to its owner if the `Execution` is cancelled.
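
   As a reminder of what the two removed cancellation tests check, here is a self-contained sketch. `SketchExecution` is a hypothetical stand-in that models slots as strings; it is not Flink's `Execution`, and only mirrors the two paths exercised by the removed tests (cancel from SCHEDULED, and cancel from RUNNING followed by `completeCancelling`).

```java
import java.util.concurrent.CompletableFuture;

/** Illustrative only: SketchExecution is a hypothetical stand-in, not Flink's Execution. */
public class SlotReturnSketch {

    enum State { SCHEDULED, RUNNING, CANCELING, CANCELED }

    static final class SketchExecution {
        private final String slot;
        // Completed with the slot once it has been handed back to its owner.
        private final CompletableFuture<String> returnedSlot = new CompletableFuture<>();
        private State state;

        SketchExecution(String slot, State initialState) {
            this.slot = slot;
            this.state = initialState;
        }

        /** SCHEDULED cancels immediately; RUNNING waits for the task to acknowledge. */
        void cancel() {
            if (state == State.SCHEDULED) {
                state = State.CANCELED;
                returnedSlot.complete(slot);
            } else if (state == State.RUNNING) {
                state = State.CANCELING;
            }
        }

        /** Called once the task has confirmed the cancellation. */
        void completeCancelling() {
            if (state == State.CANCELING) {
                state = State.CANCELED;
                returnedSlot.complete(slot);
            }
        }

        State getState() {
            return state;
        }

        CompletableFuture<String> getReturnedSlotFuture() {
            return returnedSlot;
        }
    }

    public static void main(String[] args) {
        SketchExecution running = new SketchExecution("slot-0", State.RUNNING);
        running.cancel();
        System.out.println(running.getState() + ", returned: " + running.getReturnedSlotFuture().isDone());
        running.completeCancelling();
        System.out.println(running.getState() + ", returned: " + running.getReturnedSlotFuture().isDone());
    }
}
```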

##########
File path: flink-runtime/src/test/java/org/apache/flink/runtime/checkpoint/ExecutionGraphCheckpointCoordinatorTest.java
##########
@@ -159,7 +164,7 @@ private ExecutionGraph createExecutionGraphAndEnableCheckpointing(
 			false,
 			0);
 
-		executionGraph.enableCheckpointing(
+		scheduler.getExecutionGraph().enableCheckpointing(

Review comment:
       I guess now we would have to specify a specialized `CheckpointRecoveryFactory` which we pass to the `SchedulerBuilder`.

##########
File path: flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/ExecutionGraphRestartTest.java
##########
@@ -1,866 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- *     http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-
-package org.apache.flink.runtime.executiongraph;
-
-import org.apache.flink.api.common.ExecutionConfig;
-import org.apache.flink.api.common.JobID;
-import org.apache.flink.api.common.JobStatus;
-import org.apache.flink.api.common.restartstrategy.RestartStrategies;
-import org.apache.flink.api.common.time.Time;
-import org.apache.flink.runtime.clusterframework.types.AllocationID;
-import org.apache.flink.runtime.clusterframework.types.ResourceProfile;
-import org.apache.flink.runtime.concurrent.ComponentMainThreadExecutor;
-import org.apache.flink.runtime.concurrent.ComponentMainThreadExecutorServiceAdapter;
-import org.apache.flink.runtime.execution.ExecutionState;
-import org.apache.flink.runtime.execution.SuppressRestartsException;
-import org.apache.flink.runtime.executiongraph.failover.FailoverStrategy;
-import org.apache.flink.runtime.executiongraph.restart.FixedDelayRestartStrategy;
-import org.apache.flink.runtime.executiongraph.restart.InfiniteDelayRestartStrategy;
-import org.apache.flink.runtime.executiongraph.restart.RestartStrategy;
-import org.apache.flink.runtime.executiongraph.utils.NotCancelAckingTaskGateway;
-import org.apache.flink.runtime.executiongraph.utils.SimpleAckingTaskManagerGateway;
-import org.apache.flink.runtime.executiongraph.utils.SimpleSlotProvider;
-import org.apache.flink.runtime.io.network.partition.ResultPartitionType;
-import org.apache.flink.runtime.jobgraph.DistributionPattern;
-import org.apache.flink.runtime.jobgraph.JobGraph;
-import org.apache.flink.runtime.jobgraph.JobVertex;
-import org.apache.flink.runtime.jobgraph.ScheduleMode;
-import org.apache.flink.runtime.jobmanager.scheduler.SlotSharingGroup;
-import org.apache.flink.runtime.jobmanager.slots.TaskManagerGateway;
-import org.apache.flink.runtime.jobmaster.JobMasterId;
-import org.apache.flink.runtime.jobmaster.slotpool.LocationPreferenceSlotSelectionStrategy;
-import org.apache.flink.runtime.jobmaster.slotpool.Scheduler;
-import org.apache.flink.runtime.jobmaster.slotpool.SchedulerImpl;
-import org.apache.flink.runtime.jobmaster.slotpool.SlotPool;
-import org.apache.flink.runtime.jobmaster.slotpool.SlotPoolImpl;
-import org.apache.flink.runtime.jobmaster.slotpool.SlotProvider;
-import org.apache.flink.runtime.jobmaster.slotpool.TestingSlotPoolImpl;
-import org.apache.flink.runtime.resourcemanager.ResourceManagerGateway;
-import org.apache.flink.runtime.resourcemanager.utils.TestingResourceManagerGateway;
-import org.apache.flink.runtime.taskexecutor.slot.SlotOffer;
-import org.apache.flink.runtime.taskmanager.LocalTaskManagerLocation;
-import org.apache.flink.runtime.taskmanager.TaskManagerLocation;
-import org.apache.flink.runtime.testtasks.NoOpInvokable;
-import org.apache.flink.util.FlinkException;
-import org.apache.flink.util.TestLogger;
-
-import org.junit.Test;
-
-import javax.annotation.Nonnull;
-
-import java.io.IOException;
-import java.util.ArrayList;
-import java.util.Iterator;
-import java.util.List;
-import java.util.concurrent.CompletableFuture;
-import java.util.function.Consumer;
-
-import static org.apache.flink.runtime.executiongraph.ExecutionGraphTestUtils.completeCancellingForAllVertices;
-import static org.apache.flink.runtime.executiongraph.ExecutionGraphTestUtils.createNoOpVertex;
-import static org.apache.flink.runtime.executiongraph.ExecutionGraphTestUtils.finishAllVertices;
-import static org.apache.flink.runtime.executiongraph.ExecutionGraphTestUtils.switchToRunning;
-import static org.hamcrest.Matchers.is;
-import static org.hamcrest.Matchers.lessThanOrEqualTo;
-import static org.junit.Assert.assertEquals;
-import static org.junit.Assert.assertNotEquals;
-import static org.junit.Assert.assertNotNull;
-import static org.junit.Assert.assertThat;
-import static org.junit.Assert.assertTrue;
-
-/**
- * Tests the restart behaviour of the {@link ExecutionGraph}.
- */
-public class ExecutionGraphRestartTest extends TestLogger {

Review comment:
       I think `testTaskFailingWhileGlobalFailing` is irrelevant now.
   
   `testFailWhileRestarting` should be covered by `DefaultSchedulerTest.failJobIfCannotRestart`.
   
   For the following tests I couldn't find test coverage:
   
   ```
   testCancelWhileRestarting
   testCancelWhileFailing
   testFailWhileCanceling
   testFailingExecutionAfterRestart
   testFailExecutionAfterCancel
   testSuspendWhileRestarting
   ```

##########
File path: flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/ExecutionVertexCancelTest.java
##########
@@ -244,104 +250,6 @@ public void testSendCancelAndReceiveFail() throws Exception {
 		assertEquals(vertices.length - 1, exec.getVertex().getExecutionGraph().getRegisteredExecutions().size());
 	}
 
-	// --------------------------------------------------------------------------------------------
-	//  Actions after a vertex has been canceled or while canceling
-	// --------------------------------------------------------------------------------------------
-
-	@Test
-	public void testScheduleOrDeployAfterCancel() {
-		try {
-			final ExecutionVertex vertex = getExecutionVertex();
-			setVertexState(vertex, ExecutionState.CANCELED);
-
-			assertEquals(ExecutionState.CANCELED, vertex.getExecutionState());
-
-			// 1)
-			// scheduling after being canceled should be tolerated (no exception) because
-			// it can occur as the result of races
-			{
-				vertex.scheduleForExecution(
-					TestingSlotProviderStrategy.from(new ProgrammedSlotProvider(1)),
-					LocationPreferenceConstraint.ALL,
-					Collections.emptySet());
-
-				assertEquals(ExecutionState.CANCELED, vertex.getExecutionState());
-			}
-
-			// 2)
-			// deploying after canceling from CREATED needs to raise an exception, because
-			// the scheduler (or any caller) needs to know that the slot should be released
-			try {
-
-				final LogicalSlot slot = new TestingLogicalSlotBuilder().createTestingLogicalSlot();
-
-				vertex.deployToSlot(slot);
-				fail("Method should throw an exception");
-			}
-			catch (IllegalStateException e) {
-				assertEquals(ExecutionState.CANCELED, vertex.getExecutionState());
-			}
-		}
-		catch (Exception e) {
-			e.printStackTrace();
-			fail(e.getMessage());
-		}
-	}
-
-	@Test
-	public void testActionsWhileCancelling() {

Review comment:
       I think `DefaultSchedulerTest#skipDeploymentIfVertexVersionOutdated()` is fine.

##########
File path: flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/ExecutionGraphVariousFailuesTest.java
##########
@@ -20,86 +20,31 @@
 
 import org.apache.flink.api.common.JobStatus;
 import org.apache.flink.runtime.concurrent.ComponentMainThreadExecutorServiceAdapter;
-import org.apache.flink.runtime.execution.SuppressRestartsException;
-import org.apache.flink.runtime.executiongraph.restart.InfiniteDelayRestartStrategy;
 import org.apache.flink.runtime.io.network.partition.ResultPartitionID;
 import org.apache.flink.runtime.jobgraph.IntermediateResultPartitionID;
+import org.apache.flink.runtime.jobgraph.JobGraph;
+import org.apache.flink.runtime.scheduler.SchedulerBase;
+import org.apache.flink.runtime.scheduler.SchedulerTestingUtils;
 import org.apache.flink.util.TestLogger;
 
 import org.junit.Test;
 
 import static org.junit.Assert.assertEquals;
 import static org.junit.Assert.fail;
 
-
 public class ExecutionGraphVariousFailuesTest extends TestLogger {
 
-	/**
-	 * Test that failing in state restarting will retrigger the restarting logic. This means that
-	 * it only goes into the state FAILED after the restart strategy says the job is no longer
-	 * restartable.
-	 */
-	@Test
-	public void testFailureWhileRestarting() throws Exception {
-		final ExecutionGraph eg = ExecutionGraphTestUtils.createSimpleTestGraph(new InfiniteDelayRestartStrategy(2));
-		eg.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-		eg.scheduleForExecution();
-
-		assertEquals(JobStatus.RUNNING, eg.getState());
-		ExecutionGraphTestUtils.switchAllVerticesToRunning(eg);
-
-		eg.failGlobal(new Exception("Test 1"));
-		assertEquals(JobStatus.FAILING, eg.getState());
-		ExecutionGraphTestUtils.completeCancellingForAllVertices(eg);
-
-		// we should restart since we have two restart attempts left
-		assertEquals(JobStatus.RESTARTING, eg.getState());
-
-		eg.failGlobal(new Exception("Test 2"));
-
-		// we should restart since we have one restart attempts left
-		assertEquals(JobStatus.RESTARTING, eg.getState());
-
-		eg.failGlobal(new Exception("Test 3"));
-
-		// after depleting all our restart attempts we should go into Failed
-		assertEquals(JobStatus.FAILED, eg.getState());
-	}
-
-	/**
-	 * Tests that a {@link SuppressRestartsException} in state RESTARTING stops the restarting
-	 * immediately and sets the execution graph's state to FAILED.
-	 */
-	@Test
-	public void testSuppressRestartFailureWhileRestarting() throws Exception {
-		final ExecutionGraph eg = ExecutionGraphTestUtils.createSimpleTestGraph(new InfiniteDelayRestartStrategy(10));
-		eg.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-		eg.scheduleForExecution();
-
-		assertEquals(JobStatus.RUNNING, eg.getState());
-		ExecutionGraphTestUtils.switchAllVerticesToRunning(eg);
-
-		eg.failGlobal(new Exception("test"));
-		assertEquals(JobStatus.FAILING, eg.getState());
-
-		ExecutionGraphTestUtils.completeCancellingForAllVertices(eg);
-		assertEquals(JobStatus.RESTARTING, eg.getState());
-
-		// suppress a possible restart
-		eg.failGlobal(new SuppressRestartsException(new Exception("Test")));
-
-		assertEquals(JobStatus.FAILED, eg.getState());
-	}
-
 	/**
 	 * Tests that a failing scheduleOrUpdateConsumers call with a non-existing execution attempt
 	 * id, will not fail the execution graph.
 	 */
 	@Test
 	public void testFailingScheduleOrUpdateConsumers() throws Exception {
-		final ExecutionGraph eg = ExecutionGraphTestUtils.createSimpleTestGraph(new InfiniteDelayRestartStrategy(10));
-		eg.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-		eg.scheduleForExecution();
+		final SchedulerBase scheduler = SchedulerTestingUtils.newSchedulerBuilder(new JobGraph()).build();
+		scheduler.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
+		scheduler.startScheduling();
+
+		final ExecutionGraph eg = scheduler.getExecutionGraph();

Review comment:
       I think `DefaultSchedulerTest#deployTasksOnlyWhenAllSlotRequestsAreFulfilled` replaces `ExecutionGraphDeploymentTest#testEagerSchedulingWaitsOnAllInputPreferredLocations()`, because what we are interested in is that the scheduler deploys the whole pipelined region. There is no longer any need for pipelined consumers to be scheduled after their producers, because they are scheduled together.
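
   To illustrate the "deploy the whole region together" behaviour described in this comment, here is a minimal sketch using only `java.util.concurrent`. It is not Flink code: `RegionDeploymentSketch`, `deployWhenAllSlotsReady` and the string slot names are hypothetical, and the real scheduler may well be structured differently. It only shows the property the test should check: nothing is deployed until every slot request of the region is fulfilled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Hypothetical illustration only; none of these names exist in Flink.
public class RegionDeploymentSketch {

    /**
     * Records every slot in {@code deployed} only after all slot futures
     * of the region have completed, so no task is deployed on its own.
     */
    static <S> CompletableFuture<Void> deployWhenAllSlotsReady(
            List<CompletableFuture<S>> slotFutures,
            List<S> deployed) {
        return CompletableFuture
            .allOf(slotFutures.toArray(new CompletableFuture<?>[0]))
            .thenRun(() -> {
                // All futures are complete here, so join() does not block.
                for (CompletableFuture<S> slotFuture : slotFutures) {
                    deployed.add(slotFuture.join());
                }
            });
    }

    public static void main(String[] args) {
        CompletableFuture<String> sourceSlot = new CompletableFuture<>();
        CompletableFuture<String> targetSlot = new CompletableFuture<>();
        List<String> deployed = new ArrayList<>();

        deployWhenAllSlotsReady(Arrays.asList(sourceSlot, targetSlot), deployed);

        // Fulfilling the consumer's slot first must not trigger a deployment.
        targetSlot.complete("target-slot");
        System.out.println("after target slot: " + deployed);

        // Once the last slot arrives, producer and consumer deploy together.
        sourceSlot.complete("source-slot");
        System.out.println("after source slot: " + deployed);
    }
}
```

   With this shape, the order in which the individual slot futures complete no longer matters, which is why a test like `testScheduleSourceBeforeTarget` has nothing left to assert.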

##########
File path: flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/ExecutionTest.java
##########
@@ -82,202 +82,12 @@
 	private final TestingComponentMainThreadExecutor testMainThreadUtil =
 		EXECUTOR_RESOURCE.getComponentMainThreadTestExecutor();
 
-	/**
-	 * Tests that slots are released if we cannot assign the allocated resource to the
-	 * Execution.
-	 */
-	@Test
-	public void testSlotReleaseOnFailedResourceAssignment() throws Exception {

Review comment:
       `testSlotReleaseOnFailedResourceAssignment` should indeed be obsolete.

##########
File path: flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/ExecutionGraphSchedulingTest.java
##########
@@ -1,637 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- *     http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-
-package org.apache.flink.runtime.executiongraph;
-
-import org.apache.flink.api.common.JobID;
-import org.apache.flink.api.common.JobStatus;
-import org.apache.flink.api.common.time.Time;
-import org.apache.flink.configuration.Configuration;
-import org.apache.flink.metrics.groups.UnregisteredMetricsGroup;
-import org.apache.flink.runtime.blob.VoidBlobWriter;
-import org.apache.flink.runtime.checkpoint.StandaloneCheckpointRecoveryFactory;
-import org.apache.flink.runtime.clusterframework.types.AllocationID;
-import org.apache.flink.runtime.clusterframework.types.ResourceID;
-import org.apache.flink.runtime.concurrent.ComponentMainThreadExecutorServiceAdapter;
-import org.apache.flink.runtime.execution.ExecutionState;
-import org.apache.flink.runtime.executiongraph.restart.NoRestartStrategy;
-import org.apache.flink.runtime.executiongraph.utils.SimpleAckingTaskManagerGateway;
-import org.apache.flink.runtime.instance.SimpleSlotContext;
-import org.apache.flink.runtime.io.network.partition.NoOpJobMasterPartitionTracker;
-import org.apache.flink.runtime.io.network.partition.ResultPartitionType;
-import org.apache.flink.runtime.jobgraph.DistributionPattern;
-import org.apache.flink.runtime.jobgraph.JobGraph;
-import org.apache.flink.runtime.jobgraph.JobVertex;
-import org.apache.flink.runtime.jobgraph.ScheduleMode;
-import org.apache.flink.runtime.jobmanager.scheduler.Locality;
-import org.apache.flink.runtime.jobmanager.slots.DummySlotOwner;
-import org.apache.flink.runtime.jobmanager.slots.TaskManagerGateway;
-import org.apache.flink.runtime.jobmanager.slots.TestingSlotOwner;
-import org.apache.flink.runtime.jobmaster.LogicalSlot;
-import org.apache.flink.runtime.jobmaster.SlotOwner;
-import org.apache.flink.runtime.jobmaster.SlotRequestId;
-import org.apache.flink.runtime.jobmaster.TestingLogicalSlot;
-import org.apache.flink.runtime.jobmaster.TestingLogicalSlotBuilder;
-import org.apache.flink.runtime.jobmaster.slotpool.SingleLogicalSlot;
-import org.apache.flink.runtime.jobmaster.slotpool.SlotProvider;
-import org.apache.flink.runtime.shuffle.NettyShuffleMaster;
-import org.apache.flink.runtime.taskmanager.TaskManagerLocation;
-import org.apache.flink.runtime.testtasks.NoOpInvokable;
-import org.apache.flink.runtime.testutils.DirectScheduledExecutorService;
-import org.apache.flink.util.FlinkException;
-import org.apache.flink.util.TestLogger;
-
-import org.junit.After;
-import org.junit.Test;
-
-import javax.annotation.Nonnull;
-
-import java.net.InetAddress;
-import java.util.Set;
-import java.util.concurrent.ArrayBlockingQueue;
-import java.util.concurrent.BlockingQueue;
-import java.util.concurrent.CompletableFuture;
-import java.util.concurrent.ConcurrentHashMap;
-import java.util.concurrent.ConcurrentMap;
-import java.util.concurrent.ScheduledExecutorService;
-import java.util.concurrent.TimeUnit;
-import java.util.concurrent.TimeoutException;
-
-import static org.hamcrest.Matchers.empty;
-import static org.hamcrest.Matchers.is;
-import static org.junit.Assert.assertEquals;
-import static org.junit.Assert.assertFalse;
-import static org.junit.Assert.assertThat;
-
-/**
- * Tests for the scheduling of the execution graph. This tests that
- * for example the order of deployments is correct and that bulk slot allocation
- * works properly.
- */
-public class ExecutionGraphSchedulingTest extends TestLogger {
-
-	private final ScheduledExecutorService executor = new DirectScheduledExecutorService();
-
-	@After
-	public void shutdown() {
-		executor.shutdownNow();
-	}
-
-	// ------------------------------------------------------------------------
-	//  Tests
-	// ------------------------------------------------------------------------
-
-	/**
-	 * Tests that with scheduling futures and pipelined deployment, the target vertex will
-	 * not deploy its task before the source vertex does.
-	 */
-	@Test
-	public void testScheduleSourceBeforeTarget() throws Exception {
-
-		//                                            [pipelined]
-		//  we construct a simple graph    (source) ----------------> (target)
-
-		final int parallelism = 1;
-
-		final JobVertex sourceVertex = new JobVertex("source");
-		sourceVertex.setParallelism(parallelism);
-		sourceVertex.setInvokableClass(NoOpInvokable.class);
-
-		final JobVertex targetVertex = new JobVertex("target");
-		targetVertex.setParallelism(parallelism);
-		targetVertex.setInvokableClass(NoOpInvokable.class);
-
-		targetVertex.connectNewDataSetAsInput(sourceVertex, DistributionPattern.ALL_TO_ALL, ResultPartitionType.PIPELINED);
-
-		final JobID jobId = new JobID();
-		final JobGraph jobGraph = new JobGraph(jobId, "test", sourceVertex, targetVertex);
-		jobGraph.setScheduleMode(ScheduleMode.EAGER);
-
-		final CompletableFuture<LogicalSlot> sourceFuture = new CompletableFuture<>();
-		final CompletableFuture<LogicalSlot> targetFuture = new CompletableFuture<>();
-
-		ProgrammedSlotProvider slotProvider = new ProgrammedSlotProvider(parallelism);
-		slotProvider.addSlot(sourceVertex.getID(), 0, sourceFuture);
-		slotProvider.addSlot(targetVertex.getID(), 0, targetFuture);
-
-		final ExecutionGraph eg = createExecutionGraph(jobGraph, slotProvider);
-		eg.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-
-		//  set up two TaskManager gateways and slots
-
-		final InteractionsCountingTaskManagerGateway gatewaySource = createTaskManager();
-		final InteractionsCountingTaskManagerGateway gatewayTarget = createTaskManager();
-
-		final LogicalSlot sourceSlot = createTestingLogicalSlot(gatewaySource);
-		final LogicalSlot targetSlot = createTestingLogicalSlot(gatewayTarget);
-
-		eg.scheduleForExecution();
-
-		// job should be running
-		assertEquals(JobStatus.RUNNING, eg.getState());
-
-		// we fulfill the target slot before the source slot
-		// that should not cause a deployment or deployment related failure
-		targetFuture.complete(targetSlot);
-
-		assertThat(gatewayTarget.getSubmitTaskCount(), is(0));
-		assertEquals(JobStatus.RUNNING, eg.getState());
-
-		// now supply the source slot
-		sourceFuture.complete(sourceSlot);
-
-		// by now, all deployments should have happened
-		assertThat(gatewaySource.getSubmitTaskCount(), is(1));
-		assertThat(gatewayTarget.getSubmitTaskCount(), is(1));
-
-		assertEquals(JobStatus.RUNNING, eg.getState());
-	}
-
-	private TestingLogicalSlot createTestingLogicalSlot(InteractionsCountingTaskManagerGateway gatewaySource) {
-		return new TestingLogicalSlotBuilder()
-			.setTaskManagerGateway(gatewaySource)
-			.createTestingLogicalSlot();
-	}
-
-	/**
-	 * This test verifies that before deploying a pipelined connected component, the
-	 * full set of slots is available, and that not some tasks are deployed, and later the
-	 * system realizes that not enough resources are available.
-	 */
-	@Test
-	public void testDeployPipelinedConnectedComponentsTogether() throws Exception {
-
-		//                                            [pipelined]
-		//  we construct a simple graph    (source) ----------------> (target)
-
-		final int parallelism = 8;
-
-		final JobVertex sourceVertex = new JobVertex("source");
-		sourceVertex.setParallelism(parallelism);
-		sourceVertex.setInvokableClass(NoOpInvokable.class);
-
-		final JobVertex targetVertex = new JobVertex("target");
-		targetVertex.setParallelism(parallelism);
-		targetVertex.setInvokableClass(NoOpInvokable.class);
-
-		targetVertex.connectNewDataSetAsInput(sourceVertex, DistributionPattern.ALL_TO_ALL, ResultPartitionType.PIPELINED);
-
-		final JobID jobId = new JobID();
-		final JobGraph jobGraph = new JobGraph(jobId, "test", sourceVertex, targetVertex);
-		jobGraph.setScheduleMode(ScheduleMode.EAGER);
-
-		@SuppressWarnings({"unchecked", "rawtypes"})
-		final CompletableFuture<LogicalSlot>[] sourceFutures = new CompletableFuture[parallelism];
-		@SuppressWarnings({"unchecked", "rawtypes"})
-		final CompletableFuture<LogicalSlot>[] targetFutures = new CompletableFuture[parallelism];
-
-		//
-		//  Create the slots, futures, and the slot provider
-
-		final InteractionsCountingTaskManagerGateway[] sourceTaskManagers = new InteractionsCountingTaskManagerGateway[parallelism];
-		final InteractionsCountingTaskManagerGateway[] targetTaskManagers = new InteractionsCountingTaskManagerGateway[parallelism];
-
-		final LogicalSlot[] sourceSlots = new LogicalSlot[parallelism];
-		final LogicalSlot[] targetSlots = new LogicalSlot[parallelism];
-
-		for (int i = 0; i < parallelism; i++) {
-			sourceTaskManagers[i] = createTaskManager();
-			targetTaskManagers[i] = createTaskManager();
-
-			sourceSlots[i] = createTestingLogicalSlot(sourceTaskManagers[i]);
-			targetSlots[i] = createTestingLogicalSlot(targetTaskManagers[i]);
-
-			sourceFutures[i] = new CompletableFuture<>();
-			targetFutures[i] = new CompletableFuture<>();
-		}
-
-		ProgrammedSlotProvider slotProvider = new ProgrammedSlotProvider(parallelism);
-		slotProvider.addSlots(sourceVertex.getID(), sourceFutures);
-		slotProvider.addSlots(targetVertex.getID(), targetFutures);
-
-		final ExecutionGraph eg = createExecutionGraph(jobGraph, slotProvider);
-
-		//
-		//  we complete some of the futures
-
-		for (int i = 0; i < parallelism; i += 2) {
-			sourceFutures[i].complete(sourceSlots[i]);
-		}
-
-		//
-		//  kick off the scheduling
-		eg.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-		eg.scheduleForExecution();
-
-		verifyNothingDeployed(eg, sourceTaskManagers);
-
-		//  complete the remaining sources
-		for (int i = 1; i < parallelism; i += 2) {
-			sourceFutures[i].complete(sourceSlots[i]);
-		}
-		verifyNothingDeployed(eg, sourceTaskManagers);
-
-		//  complete the targets except for one
-		for (int i = 1; i < parallelism; i++) {
-			targetFutures[i].complete(targetSlots[i]);
-		}
-		verifyNothingDeployed(eg, targetTaskManagers);
-
-		//  complete the last target slot future
-		targetFutures[0].complete(targetSlots[0]);
-
-		//
-		//  verify that all deployments have happened
-
-		for (InteractionsCountingTaskManagerGateway gateway : sourceTaskManagers) {
-			assertThat(gateway.getSubmitTaskCount(), is(1));
-		}
-		for (InteractionsCountingTaskManagerGateway gateway : targetTaskManagers) {
-			assertThat(gateway.getSubmitTaskCount(), is(1));
-		}
-	}
-
-	/**
-	 * This test verifies that if one slot future fails, the deployment will be aborted.
-	 */
-	@Test
-	public void testOneSlotFailureAbortsDeploy() throws Exception {
-
-		//                                            [pipelined]
-		//  we construct a simple graph    (source) ----------------> (target)
-
-		final int parallelism = 6;
-
-		final JobVertex sourceVertex = new JobVertex("source");
-		sourceVertex.setParallelism(parallelism);
-		sourceVertex.setInvokableClass(NoOpInvokable.class);
-
-		final JobVertex targetVertex = new JobVertex("target");
-		targetVertex.setParallelism(parallelism);
-		targetVertex.setInvokableClass(NoOpInvokable.class);
-
-		targetVertex.connectNewDataSetAsInput(sourceVertex, DistributionPattern.POINTWISE, ResultPartitionType.PIPELINED);
-
-		final JobID jobId = new JobID();
-		final JobGraph jobGraph = new JobGraph(jobId, "test", sourceVertex, targetVertex);
-		jobGraph.setScheduleMode(ScheduleMode.EAGER);
-
-		//
-		//  Create the slots, futures, and the slot provider
-
-		final InteractionsCountingTaskManagerGateway taskManager = createTaskManager();
-		final BlockingQueue<AllocationID> returnedSlots = new ArrayBlockingQueue<>(parallelism);
-		final TestingSlotOwner slotOwner = new TestingSlotOwner();
-		slotOwner.setReturnAllocatedSlotConsumer(
-			(LogicalSlot logicalSlot) -> returnedSlots.offer(logicalSlot.getAllocationId()));
-
-		final LogicalSlot[] sourceSlots = new LogicalSlot[parallelism];
-		final LogicalSlot[] targetSlots = new LogicalSlot[parallelism];
-
-		@SuppressWarnings({"unchecked", "rawtypes"})
-		final CompletableFuture<LogicalSlot>[] sourceFutures = new CompletableFuture[parallelism];
-		@SuppressWarnings({"unchecked", "rawtypes"})
-		final CompletableFuture<LogicalSlot>[] targetFutures = new CompletableFuture[parallelism];
-
-		for (int i = 0; i < parallelism; i++) {
-			sourceSlots[i] = createSingleLogicalSlot(slotOwner, taskManager, new SlotRequestId());
-			targetSlots[i] = createSingleLogicalSlot(slotOwner, taskManager, new SlotRequestId());
-
-			sourceFutures[i] = new CompletableFuture<>();
-			targetFutures[i] = new CompletableFuture<>();
-		}
-
-		ProgrammedSlotProvider slotProvider = new ProgrammedSlotProvider(parallelism);
-		slotProvider.addSlots(sourceVertex.getID(), sourceFutures);
-		slotProvider.addSlots(targetVertex.getID(), targetFutures);
-
-		final ExecutionGraph eg = createExecutionGraph(jobGraph, slotProvider);
-		eg.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-
-		//
-		//  we complete some of the futures
-
-		for (int i = 0; i < parallelism; i += 2) {
-			sourceFutures[i].complete(sourceSlots[i]);
-			targetFutures[i].complete(targetSlots[i]);
-		}
-
-		//  kick off the scheduling
-		eg.scheduleForExecution();
-
-		// fail one slot
-		sourceFutures[1].completeExceptionally(new TestRuntimeException());
-
-		// wait until the job failed as a whole
-		eg.getTerminationFuture().get(2000, TimeUnit.MILLISECONDS);
-
-		// wait until all slots are back
-		for (int i = 0; i < parallelism; i++) {
-			returnedSlots.poll(2000L, TimeUnit.MILLISECONDS);
-		}
-
-		// no deployment calls must have happened
-		assertThat(taskManager.getSubmitTaskCount(), is(0));
-
-		// all completed futures must have been returns
-		for (int i = 0; i < parallelism; i += 2) {
-			assertFalse(sourceSlots[i].isAlive());
-			assertFalse(targetSlots[i].isAlive());
-		}
-	}
-
-	/**
-	 * This tests makes sure that with eager scheduling no task is deployed if a single
-	 * slot allocation fails. Moreover we check that allocated slots will be returned.
-	 */
-	@Test
-	public void testEagerSchedulingWithSlotTimeout() throws Exception {
-
-		//  we construct a simple graph:    (task)
-
-		final int parallelism = 3;
-
-		final JobVertex vertex = new JobVertex("task");
-		vertex.setParallelism(parallelism);
-		vertex.setInvokableClass(NoOpInvokable.class);
-
-		final JobID jobId = new JobID();
-		final JobGraph jobGraph = new JobGraph(jobId, "test", vertex);
-		jobGraph.setScheduleMode(ScheduleMode.EAGER);
-
-		final BlockingQueue<AllocationID> returnedSlots = new ArrayBlockingQueue<>(2);
-		final TestingSlotOwner slotOwner = new TestingSlotOwner();
-		slotOwner.setReturnAllocatedSlotConsumer(
-			(LogicalSlot logicalSlot) -> returnedSlots.offer(logicalSlot.getAllocationId()));
-
-		final InteractionsCountingTaskManagerGateway taskManager = createTaskManager();
-		final LogicalSlot[] slots = new LogicalSlot[parallelism];
-		@SuppressWarnings({"unchecked", "rawtypes"})
-		final CompletableFuture<LogicalSlot>[] slotFutures = new CompletableFuture[parallelism];
-
-		for (int i = 0; i < parallelism; i++) {
-			slots[i] = createSingleLogicalSlot(slotOwner, taskManager, new SlotRequestId());
-			slotFutures[i] = new CompletableFuture<>();
-		}
-
-		ProgrammedSlotProvider slotProvider = new ProgrammedSlotProvider(parallelism);
-		slotProvider.addSlots(vertex.getID(), slotFutures);
-
-		final ExecutionGraph eg = createExecutionGraph(jobGraph, slotProvider);
-
-		//  we complete one future
-		slotFutures[1].complete(slots[1]);
-
-		//  kick off the scheduling
-		eg.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-		eg.scheduleForExecution();
-
-		//  we complete another future
-		slotFutures[2].complete(slots[2]);
-
-		// check that the ExecutionGraph is not terminated yet
-		assertThat(eg.getTerminationFuture().isDone(), is(false));
-
-		// time out one of the slot futures
-		slotFutures[0].completeExceptionally(new TimeoutException("Test time out"));
-
-		assertThat(eg.getTerminationFuture().get(), is(JobStatus.FAILED));
-
-		// wait until all slots are back
-		for (int i = 0; i < parallelism - 1; i++) {
-			returnedSlots.poll(2000, TimeUnit.MILLISECONDS);
-		}
-
-		//  verify that no deployments have happened
-		assertThat(taskManager.getSubmitTaskCount(), is(0));
-	}
-
-	/**
-	 * Tests that an ongoing scheduling operation does not fail the {@link ExecutionGraph}
-	 * if it gets concurrently cancelled.
-	 */
-	@Test
-	public void testSchedulingOperationCancellationWhenCancel() throws Exception {
-		final JobVertex jobVertex = new JobVertex("NoOp JobVertex");
-		jobVertex.setInvokableClass(NoOpInvokable.class);
-		jobVertex.setParallelism(2);
-		final JobGraph jobGraph = new JobGraph(jobVertex);
-		jobGraph.setScheduleMode(ScheduleMode.EAGER);
-
-		final CompletableFuture<LogicalSlot> slotFuture1 = new CompletableFuture<>();
-		final CompletableFuture<LogicalSlot> slotFuture2 = new CompletableFuture<>();
-		final ProgrammedSlotProvider slotProvider = new ProgrammedSlotProvider(2);
-		slotProvider.addSlots(jobVertex.getID(), new CompletableFuture[]{slotFuture1, slotFuture2});
-		final ExecutionGraph executionGraph = createExecutionGraph(jobGraph, slotProvider);
-
-		executionGraph.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-		executionGraph.scheduleForExecution();
-
-		final TestingLogicalSlot slot = createTestingSlot();
-		final CompletableFuture<?> releaseFuture = slot.getReleaseFuture();
-		slotFuture1.complete(slot);
-
-		// cancel should change the state of all executions to CANCELLED
-		executionGraph.cancel();
-
-		// complete the now CANCELLED execution --> this should cause a failure
-		slotFuture2.complete(new TestingLogicalSlotBuilder().createTestingLogicalSlot());
-
-		Thread.sleep(1L);
-		// release the first slot to finish the cancellation
-		releaseFuture.complete(null);
-
-		// NOTE: This test will only occasionally fail without the fix since there is
-		// a race between the releaseFuture and the slotFuture2
-		assertThat(executionGraph.getTerminationFuture().get(), is(JobStatus.CANCELED));
-	}
-
-	/**
-	 * Tests that a partially completed eager scheduling operation fails if a
-	 * completed slot is released. See FLINK-9099.
-	 */
-	@Test
-	public void testSlotReleasingFailsSchedulingOperation() throws Exception {
-		final int parallelism = 2;
-
-		final JobVertex jobVertex = new JobVertex("Testing job vertex");
-		jobVertex.setInvokableClass(NoOpInvokable.class);
-		jobVertex.setParallelism(parallelism);
-		final JobGraph jobGraph = new JobGraph(jobVertex);
-		jobGraph.setScheduleMode(ScheduleMode.EAGER);
-
-		final ProgrammedSlotProvider slotProvider = new ProgrammedSlotProvider(parallelism);
-
-		final LogicalSlot slot = createSingleLogicalSlot(new DummySlotOwner(), new SimpleAckingTaskManagerGateway(), new SlotRequestId());
-		slotProvider.addSlot(jobVertex.getID(), 0, CompletableFuture.completedFuture(slot));
-
-		final CompletableFuture<LogicalSlot> slotFuture = new CompletableFuture<>();
-		slotProvider.addSlot(jobVertex.getID(), 1, slotFuture);
-
-		final ExecutionGraph executionGraph = createExecutionGraph(jobGraph, slotProvider);
-
-		executionGraph.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-		executionGraph.scheduleForExecution();
-
-		assertThat(executionGraph.getState(), is(JobStatus.RUNNING));
-
-		final ExecutionJobVertex executionJobVertex = executionGraph.getJobVertex(jobVertex.getID());
-		final ExecutionVertex[] taskVertices = executionJobVertex.getTaskVertices();
-		assertThat(taskVertices[0].getExecutionState(), is(ExecutionState.SCHEDULED));
-		assertThat(taskVertices[1].getExecutionState(), is(ExecutionState.SCHEDULED));
-
-		// fail the single allocated slot --> this should fail the scheduling operation
-		slot.releaseSlot(new FlinkException("Test failure"));
-
-		assertThat(executionGraph.getTerminationFuture().get(), is(JobStatus.FAILED));
-	}
-
-	/**
-	 * Tests that all slots are being returned to the {@link SlotOwner} if the
-	 * {@link ExecutionGraph} is being cancelled. See FLINK-9908
-	 */
-	@Test
-	public void testCancellationOfIncompleteScheduling() throws Exception {

Review comment:
       How does the `DefaultScheduler` cancel pending slot requests if the user calls `DefaultScheduler.cancel()`?







[GitHub] [flink] tillrohrmann commented on a change in pull request #13641: [FLINK-17760][tests] Rework tests to not rely on legacy scheduling codes in ExecutionGraph components

Posted by GitBox <gi...@apache.org>.
tillrohrmann commented on a change in pull request #13641:
URL: https://github.com/apache/flink/pull/13641#discussion_r544440450



##########
File path: flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/ExecutionGraphSchedulingTest.java
##########
@@ -1,637 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- *     http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-
-package org.apache.flink.runtime.executiongraph;
-
-import org.apache.flink.api.common.JobID;
-import org.apache.flink.api.common.JobStatus;
-import org.apache.flink.api.common.time.Time;
-import org.apache.flink.configuration.Configuration;
-import org.apache.flink.metrics.groups.UnregisteredMetricsGroup;
-import org.apache.flink.runtime.blob.VoidBlobWriter;
-import org.apache.flink.runtime.checkpoint.StandaloneCheckpointRecoveryFactory;
-import org.apache.flink.runtime.clusterframework.types.AllocationID;
-import org.apache.flink.runtime.clusterframework.types.ResourceID;
-import org.apache.flink.runtime.concurrent.ComponentMainThreadExecutorServiceAdapter;
-import org.apache.flink.runtime.execution.ExecutionState;
-import org.apache.flink.runtime.executiongraph.restart.NoRestartStrategy;
-import org.apache.flink.runtime.executiongraph.utils.SimpleAckingTaskManagerGateway;
-import org.apache.flink.runtime.instance.SimpleSlotContext;
-import org.apache.flink.runtime.io.network.partition.NoOpJobMasterPartitionTracker;
-import org.apache.flink.runtime.io.network.partition.ResultPartitionType;
-import org.apache.flink.runtime.jobgraph.DistributionPattern;
-import org.apache.flink.runtime.jobgraph.JobGraph;
-import org.apache.flink.runtime.jobgraph.JobVertex;
-import org.apache.flink.runtime.jobgraph.ScheduleMode;
-import org.apache.flink.runtime.jobmanager.scheduler.Locality;
-import org.apache.flink.runtime.jobmanager.slots.DummySlotOwner;
-import org.apache.flink.runtime.jobmanager.slots.TaskManagerGateway;
-import org.apache.flink.runtime.jobmanager.slots.TestingSlotOwner;
-import org.apache.flink.runtime.jobmaster.LogicalSlot;
-import org.apache.flink.runtime.jobmaster.SlotOwner;
-import org.apache.flink.runtime.jobmaster.SlotRequestId;
-import org.apache.flink.runtime.jobmaster.TestingLogicalSlot;
-import org.apache.flink.runtime.jobmaster.TestingLogicalSlotBuilder;
-import org.apache.flink.runtime.jobmaster.slotpool.SingleLogicalSlot;
-import org.apache.flink.runtime.jobmaster.slotpool.SlotProvider;
-import org.apache.flink.runtime.shuffle.NettyShuffleMaster;
-import org.apache.flink.runtime.taskmanager.TaskManagerLocation;
-import org.apache.flink.runtime.testtasks.NoOpInvokable;
-import org.apache.flink.runtime.testutils.DirectScheduledExecutorService;
-import org.apache.flink.util.FlinkException;
-import org.apache.flink.util.TestLogger;
-
-import org.junit.After;
-import org.junit.Test;
-
-import javax.annotation.Nonnull;
-
-import java.net.InetAddress;
-import java.util.Set;
-import java.util.concurrent.ArrayBlockingQueue;
-import java.util.concurrent.BlockingQueue;
-import java.util.concurrent.CompletableFuture;
-import java.util.concurrent.ConcurrentHashMap;
-import java.util.concurrent.ConcurrentMap;
-import java.util.concurrent.ScheduledExecutorService;
-import java.util.concurrent.TimeUnit;
-import java.util.concurrent.TimeoutException;
-
-import static org.hamcrest.Matchers.empty;
-import static org.hamcrest.Matchers.is;
-import static org.junit.Assert.assertEquals;
-import static org.junit.Assert.assertFalse;
-import static org.junit.Assert.assertThat;
-
-/**
- * Tests for the scheduling of the execution graph. This tests that
- * for example the order of deployments is correct and that bulk slot allocation
- * works properly.
- */
-public class ExecutionGraphSchedulingTest extends TestLogger {
-
-	private final ScheduledExecutorService executor = new DirectScheduledExecutorService();
-
-	@After
-	public void shutdown() {
-		executor.shutdownNow();
-	}
-
-	// ------------------------------------------------------------------------
-	//  Tests
-	// ------------------------------------------------------------------------
-
-	/**
-	 * Tests that with scheduling futures and pipelined deployment, the target vertex will
-	 * not deploy its task before the source vertex does.
-	 */
-	@Test
-	public void testScheduleSourceBeforeTarget() throws Exception {
-
-		//                                            [pipelined]
-		//  we construct a simple graph    (source) ----------------> (target)
-
-		final int parallelism = 1;
-
-		final JobVertex sourceVertex = new JobVertex("source");
-		sourceVertex.setParallelism(parallelism);
-		sourceVertex.setInvokableClass(NoOpInvokable.class);
-
-		final JobVertex targetVertex = new JobVertex("target");
-		targetVertex.setParallelism(parallelism);
-		targetVertex.setInvokableClass(NoOpInvokable.class);
-
-		targetVertex.connectNewDataSetAsInput(sourceVertex, DistributionPattern.ALL_TO_ALL, ResultPartitionType.PIPELINED);
-
-		final JobID jobId = new JobID();
-		final JobGraph jobGraph = new JobGraph(jobId, "test", sourceVertex, targetVertex);
-		jobGraph.setScheduleMode(ScheduleMode.EAGER);
-
-		final CompletableFuture<LogicalSlot> sourceFuture = new CompletableFuture<>();
-		final CompletableFuture<LogicalSlot> targetFuture = new CompletableFuture<>();
-
-		ProgrammedSlotProvider slotProvider = new ProgrammedSlotProvider(parallelism);
-		slotProvider.addSlot(sourceVertex.getID(), 0, sourceFuture);
-		slotProvider.addSlot(targetVertex.getID(), 0, targetFuture);
-
-		final ExecutionGraph eg = createExecutionGraph(jobGraph, slotProvider);
-		eg.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-
-		//  set up two TaskManager gateways and slots
-
-		final InteractionsCountingTaskManagerGateway gatewaySource = createTaskManager();
-		final InteractionsCountingTaskManagerGateway gatewayTarget = createTaskManager();
-
-		final LogicalSlot sourceSlot = createTestingLogicalSlot(gatewaySource);
-		final LogicalSlot targetSlot = createTestingLogicalSlot(gatewayTarget);
-
-		eg.scheduleForExecution();
-
-		// job should be running
-		assertEquals(JobStatus.RUNNING, eg.getState());
-
-		// we fulfill the target slot before the source slot
-		// that should not cause a deployment or deployment related failure
-		targetFuture.complete(targetSlot);
-
-		assertThat(gatewayTarget.getSubmitTaskCount(), is(0));
-		assertEquals(JobStatus.RUNNING, eg.getState());
-
-		// now supply the source slot
-		sourceFuture.complete(sourceSlot);
-
-		// by now, all deployments should have happened
-		assertThat(gatewaySource.getSubmitTaskCount(), is(1));
-		assertThat(gatewayTarget.getSubmitTaskCount(), is(1));
-
-		assertEquals(JobStatus.RUNNING, eg.getState());
-	}
-
-	private TestingLogicalSlot createTestingLogicalSlot(InteractionsCountingTaskManagerGateway gatewaySource) {
-		return new TestingLogicalSlotBuilder()
-			.setTaskManagerGateway(gatewaySource)
-			.createTestingLogicalSlot();
-	}
-
-	/**
-	 * This test verifies that before deploying a pipelined connected component, the
-	 * full set of slots is available, so that no subset of tasks is deployed only for the
-	 * system to realize later that not enough resources are available.
-	 */
-	@Test
-	public void testDeployPipelinedConnectedComponentsTogether() throws Exception {
-
-		//                                            [pipelined]
-		//  we construct a simple graph    (source) ----------------> (target)
-
-		final int parallelism = 8;
-
-		final JobVertex sourceVertex = new JobVertex("source");
-		sourceVertex.setParallelism(parallelism);
-		sourceVertex.setInvokableClass(NoOpInvokable.class);
-
-		final JobVertex targetVertex = new JobVertex("target");
-		targetVertex.setParallelism(parallelism);
-		targetVertex.setInvokableClass(NoOpInvokable.class);
-
-		targetVertex.connectNewDataSetAsInput(sourceVertex, DistributionPattern.ALL_TO_ALL, ResultPartitionType.PIPELINED);
-
-		final JobID jobId = new JobID();
-		final JobGraph jobGraph = new JobGraph(jobId, "test", sourceVertex, targetVertex);
-		jobGraph.setScheduleMode(ScheduleMode.EAGER);
-
-		@SuppressWarnings({"unchecked", "rawtypes"})
-		final CompletableFuture<LogicalSlot>[] sourceFutures = new CompletableFuture[parallelism];
-		@SuppressWarnings({"unchecked", "rawtypes"})
-		final CompletableFuture<LogicalSlot>[] targetFutures = new CompletableFuture[parallelism];
-
-		//
-		//  Create the slots, futures, and the slot provider
-
-		final InteractionsCountingTaskManagerGateway[] sourceTaskManagers = new InteractionsCountingTaskManagerGateway[parallelism];
-		final InteractionsCountingTaskManagerGateway[] targetTaskManagers = new InteractionsCountingTaskManagerGateway[parallelism];
-
-		final LogicalSlot[] sourceSlots = new LogicalSlot[parallelism];
-		final LogicalSlot[] targetSlots = new LogicalSlot[parallelism];
-
-		for (int i = 0; i < parallelism; i++) {
-			sourceTaskManagers[i] = createTaskManager();
-			targetTaskManagers[i] = createTaskManager();
-
-			sourceSlots[i] = createTestingLogicalSlot(sourceTaskManagers[i]);
-			targetSlots[i] = createTestingLogicalSlot(targetTaskManagers[i]);
-
-			sourceFutures[i] = new CompletableFuture<>();
-			targetFutures[i] = new CompletableFuture<>();
-		}
-
-		ProgrammedSlotProvider slotProvider = new ProgrammedSlotProvider(parallelism);
-		slotProvider.addSlots(sourceVertex.getID(), sourceFutures);
-		slotProvider.addSlots(targetVertex.getID(), targetFutures);
-
-		final ExecutionGraph eg = createExecutionGraph(jobGraph, slotProvider);
-
-		//
-		//  we complete some of the futures
-
-		for (int i = 0; i < parallelism; i += 2) {
-			sourceFutures[i].complete(sourceSlots[i]);
-		}
-
-		//
-		//  kick off the scheduling
-		eg.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-		eg.scheduleForExecution();
-
-		verifyNothingDeployed(eg, sourceTaskManagers);
-
-		//  complete the remaining sources
-		for (int i = 1; i < parallelism; i += 2) {
-			sourceFutures[i].complete(sourceSlots[i]);
-		}
-		verifyNothingDeployed(eg, sourceTaskManagers);
-
-		//  complete the targets except for one
-		for (int i = 1; i < parallelism; i++) {
-			targetFutures[i].complete(targetSlots[i]);
-		}
-		verifyNothingDeployed(eg, targetTaskManagers);
-
-		//  complete the last target slot future
-		targetFutures[0].complete(targetSlots[0]);
-
-		//
-		//  verify that all deployments have happened
-
-		for (InteractionsCountingTaskManagerGateway gateway : sourceTaskManagers) {
-			assertThat(gateway.getSubmitTaskCount(), is(1));
-		}
-		for (InteractionsCountingTaskManagerGateway gateway : targetTaskManagers) {
-			assertThat(gateway.getSubmitTaskCount(), is(1));
-		}
-	}
-
-	/**
-	 * This test verifies that if one slot future fails, the deployment will be aborted.
-	 */
-	@Test
-	public void testOneSlotFailureAbortsDeploy() throws Exception {
-
-		//                                            [pipelined]
-		//  we construct a simple graph    (source) ----------------> (target)
-
-		final int parallelism = 6;
-
-		final JobVertex sourceVertex = new JobVertex("source");
-		sourceVertex.setParallelism(parallelism);
-		sourceVertex.setInvokableClass(NoOpInvokable.class);
-
-		final JobVertex targetVertex = new JobVertex("target");
-		targetVertex.setParallelism(parallelism);
-		targetVertex.setInvokableClass(NoOpInvokable.class);
-
-		targetVertex.connectNewDataSetAsInput(sourceVertex, DistributionPattern.POINTWISE, ResultPartitionType.PIPELINED);
-
-		final JobID jobId = new JobID();
-		final JobGraph jobGraph = new JobGraph(jobId, "test", sourceVertex, targetVertex);
-		jobGraph.setScheduleMode(ScheduleMode.EAGER);
-
-		//
-		//  Create the slots, futures, and the slot provider
-
-		final InteractionsCountingTaskManagerGateway taskManager = createTaskManager();
-		final BlockingQueue<AllocationID> returnedSlots = new ArrayBlockingQueue<>(parallelism);
-		final TestingSlotOwner slotOwner = new TestingSlotOwner();
-		slotOwner.setReturnAllocatedSlotConsumer(
-			(LogicalSlot logicalSlot) -> returnedSlots.offer(logicalSlot.getAllocationId()));
-
-		final LogicalSlot[] sourceSlots = new LogicalSlot[parallelism];
-		final LogicalSlot[] targetSlots = new LogicalSlot[parallelism];
-
-		@SuppressWarnings({"unchecked", "rawtypes"})
-		final CompletableFuture<LogicalSlot>[] sourceFutures = new CompletableFuture[parallelism];
-		@SuppressWarnings({"unchecked", "rawtypes"})
-		final CompletableFuture<LogicalSlot>[] targetFutures = new CompletableFuture[parallelism];
-
-		for (int i = 0; i < parallelism; i++) {
-			sourceSlots[i] = createSingleLogicalSlot(slotOwner, taskManager, new SlotRequestId());
-			targetSlots[i] = createSingleLogicalSlot(slotOwner, taskManager, new SlotRequestId());
-
-			sourceFutures[i] = new CompletableFuture<>();
-			targetFutures[i] = new CompletableFuture<>();
-		}
-
-		ProgrammedSlotProvider slotProvider = new ProgrammedSlotProvider(parallelism);
-		slotProvider.addSlots(sourceVertex.getID(), sourceFutures);
-		slotProvider.addSlots(targetVertex.getID(), targetFutures);
-
-		final ExecutionGraph eg = createExecutionGraph(jobGraph, slotProvider);
-		eg.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-
-		//
-		//  we complete some of the futures
-
-		for (int i = 0; i < parallelism; i += 2) {
-			sourceFutures[i].complete(sourceSlots[i]);
-			targetFutures[i].complete(targetSlots[i]);
-		}
-
-		//  kick off the scheduling
-		eg.scheduleForExecution();
-
-		// fail one slot
-		sourceFutures[1].completeExceptionally(new TestRuntimeException());
-
-		// wait until the job failed as a whole
-		eg.getTerminationFuture().get(2000, TimeUnit.MILLISECONDS);
-
-		// wait until all slots are back
-		for (int i = 0; i < parallelism; i++) {
-			returnedSlots.poll(2000L, TimeUnit.MILLISECONDS);
-		}
-
-		// no deployment calls must have happened
-		assertThat(taskManager.getSubmitTaskCount(), is(0));
-
-		// all completed futures must have been returned
-		for (int i = 0; i < parallelism; i += 2) {
-			assertFalse(sourceSlots[i].isAlive());
-			assertFalse(targetSlots[i].isAlive());
-		}
-	}
-
-	/**
-	 * This test makes sure that with eager scheduling no task is deployed if a single
-	 * slot allocation fails. Moreover, we check that allocated slots will be returned.
-	 */
-	@Test
-	public void testEagerSchedulingWithSlotTimeout() throws Exception {
-
-		//  we construct a simple graph:    (task)
-
-		final int parallelism = 3;
-
-		final JobVertex vertex = new JobVertex("task");
-		vertex.setParallelism(parallelism);
-		vertex.setInvokableClass(NoOpInvokable.class);
-
-		final JobID jobId = new JobID();
-		final JobGraph jobGraph = new JobGraph(jobId, "test", vertex);
-		jobGraph.setScheduleMode(ScheduleMode.EAGER);
-
-		final BlockingQueue<AllocationID> returnedSlots = new ArrayBlockingQueue<>(2);
-		final TestingSlotOwner slotOwner = new TestingSlotOwner();
-		slotOwner.setReturnAllocatedSlotConsumer(
-			(LogicalSlot logicalSlot) -> returnedSlots.offer(logicalSlot.getAllocationId()));
-
-		final InteractionsCountingTaskManagerGateway taskManager = createTaskManager();
-		final LogicalSlot[] slots = new LogicalSlot[parallelism];
-		@SuppressWarnings({"unchecked", "rawtypes"})
-		final CompletableFuture<LogicalSlot>[] slotFutures = new CompletableFuture[parallelism];
-
-		for (int i = 0; i < parallelism; i++) {
-			slots[i] = createSingleLogicalSlot(slotOwner, taskManager, new SlotRequestId());
-			slotFutures[i] = new CompletableFuture<>();
-		}
-
-		ProgrammedSlotProvider slotProvider = new ProgrammedSlotProvider(parallelism);
-		slotProvider.addSlots(vertex.getID(), slotFutures);
-
-		final ExecutionGraph eg = createExecutionGraph(jobGraph, slotProvider);
-
-		//  we complete one future
-		slotFutures[1].complete(slots[1]);
-
-		//  kick off the scheduling
-		eg.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-		eg.scheduleForExecution();
-
-		//  we complete another future
-		slotFutures[2].complete(slots[2]);
-
-		// check that the ExecutionGraph is not terminated yet
-		assertThat(eg.getTerminationFuture().isDone(), is(false));
-
-		// time out one of the slot futures
-		slotFutures[0].completeExceptionally(new TimeoutException("Test time out"));
-
-		assertThat(eg.getTerminationFuture().get(), is(JobStatus.FAILED));
-
-		// wait until all slots are back
-		for (int i = 0; i < parallelism - 1; i++) {
-			returnedSlots.poll(2000, TimeUnit.MILLISECONDS);
-		}
-
-		//  verify that no deployments have happened
-		assertThat(taskManager.getSubmitTaskCount(), is(0));
-	}
-
-	/**
-	 * Tests that an ongoing scheduling operation does not fail the {@link ExecutionGraph}
-	 * if it gets concurrently cancelled.
-	 */
-	@Test
-	public void testSchedulingOperationCancellationWhenCancel() throws Exception {

Review comment:
       I think you are right. Marking this conversation as resolved.
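
For readers without the Flink test utilities at hand, the situation the quoted test targets (a slot future completing after the job has already been cancelled) can be sketched with plain `CompletableFuture`s. This is only an illustration under stated assumptions: the class `CancelDuringSchedulingSketch`, the method `runScenario`, and the `cancelled` flag are hypothetical stand-ins and are not part of `ExecutionGraph` or any Flink API.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch, not Flink code: strings stand in for LogicalSlot,
// and a counter stands in for task deployment.
public class CancelDuringSchedulingSketch {

    /** Runs one scenario and returns the number of "deployments" that happened. */
    public static int runScenario(boolean cancelBeforeLastSlot) {
        final AtomicBoolean cancelled = new AtomicBoolean(false);
        final AtomicInteger deployments = new AtomicInteger();

        final CompletableFuture<String> slot1 = new CompletableFuture<>();
        final CompletableFuture<String> slot2 = new CompletableFuture<>();

        // Deploy only once both slots are available, and only if the job
        // has not been cancelled in the meantime.
        final CompletableFuture<Void> scheduling = CompletableFuture
            .allOf(slot1, slot2)
            .thenRun(() -> {
                if (!cancelled.get()) {
                    deployments.incrementAndGet();
                }
            });

        slot1.complete("slot-1");
        if (cancelBeforeLastSlot) {
            cancelled.set(true);
        }
        // The late slot arrives after the (optional) cancellation.
        slot2.complete("slot-2");

        // A late slot must not turn the scheduling operation into a failure.
        if (scheduling.isCompletedExceptionally()) {
            throw new IllegalStateException("scheduling must not fail");
        }
        return deployments.get();
    }

    public static void main(String[] args) {
        System.out.println(runScenario(true));
        System.out.println(runScenario(false));
    }
}
```

Everything here runs on the calling thread, so unlike the quoted test there is no race and no need for `Thread.sleep`; the sketch only shows the intended outcome, namely that a late slot after cancellation leads to no deployment and no failure.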







[GitHub] [flink] flinkbot edited a comment on pull request #13641: [WIP][FLINK-17760][tests] Rework tests to not rely on legacy scheduling logics in ExecutionGraph anymore

Posted by GitBox <gi...@apache.org>.
flinkbot edited a comment on pull request #13641:
URL: https://github.com/apache/flink/pull/13641#issuecomment-708569491


   ## CI report:
   
   * 09d8deb89416f53dfe8b5c16fb9d723cbd98612c UNKNOWN
   * fef9bff28a988ffab789fc6bb0cbde754273a2e0 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=7630) 
   





[GitHub] [flink] zhuzhurk commented on a change in pull request #13641: [FLINK-17760][tests] Rework tests to not rely on legacy scheduling codes in ExecutionGraph components

Posted by GitBox <gi...@apache.org>.
zhuzhurk commented on a change in pull request #13641:
URL: https://github.com/apache/flink/pull/13641#discussion_r544264487



##########
File path: flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/ExecutionGraphSchedulingTest.java
##########
@@ -1,637 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- *     http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-
-package org.apache.flink.runtime.executiongraph;
-
-import org.apache.flink.api.common.JobID;
-import org.apache.flink.api.common.JobStatus;
-import org.apache.flink.api.common.time.Time;
-import org.apache.flink.configuration.Configuration;
-import org.apache.flink.metrics.groups.UnregisteredMetricsGroup;
-import org.apache.flink.runtime.blob.VoidBlobWriter;
-import org.apache.flink.runtime.checkpoint.StandaloneCheckpointRecoveryFactory;
-import org.apache.flink.runtime.clusterframework.types.AllocationID;
-import org.apache.flink.runtime.clusterframework.types.ResourceID;
-import org.apache.flink.runtime.concurrent.ComponentMainThreadExecutorServiceAdapter;
-import org.apache.flink.runtime.execution.ExecutionState;
-import org.apache.flink.runtime.executiongraph.restart.NoRestartStrategy;
-import org.apache.flink.runtime.executiongraph.utils.SimpleAckingTaskManagerGateway;
-import org.apache.flink.runtime.instance.SimpleSlotContext;
-import org.apache.flink.runtime.io.network.partition.NoOpJobMasterPartitionTracker;
-import org.apache.flink.runtime.io.network.partition.ResultPartitionType;
-import org.apache.flink.runtime.jobgraph.DistributionPattern;
-import org.apache.flink.runtime.jobgraph.JobGraph;
-import org.apache.flink.runtime.jobgraph.JobVertex;
-import org.apache.flink.runtime.jobgraph.ScheduleMode;
-import org.apache.flink.runtime.jobmanager.scheduler.Locality;
-import org.apache.flink.runtime.jobmanager.slots.DummySlotOwner;
-import org.apache.flink.runtime.jobmanager.slots.TaskManagerGateway;
-import org.apache.flink.runtime.jobmanager.slots.TestingSlotOwner;
-import org.apache.flink.runtime.jobmaster.LogicalSlot;
-import org.apache.flink.runtime.jobmaster.SlotOwner;
-import org.apache.flink.runtime.jobmaster.SlotRequestId;
-import org.apache.flink.runtime.jobmaster.TestingLogicalSlot;
-import org.apache.flink.runtime.jobmaster.TestingLogicalSlotBuilder;
-import org.apache.flink.runtime.jobmaster.slotpool.SingleLogicalSlot;
-import org.apache.flink.runtime.jobmaster.slotpool.SlotProvider;
-import org.apache.flink.runtime.shuffle.NettyShuffleMaster;
-import org.apache.flink.runtime.taskmanager.TaskManagerLocation;
-import org.apache.flink.runtime.testtasks.NoOpInvokable;
-import org.apache.flink.runtime.testutils.DirectScheduledExecutorService;
-import org.apache.flink.util.FlinkException;
-import org.apache.flink.util.TestLogger;
-
-import org.junit.After;
-import org.junit.Test;
-
-import javax.annotation.Nonnull;
-
-import java.net.InetAddress;
-import java.util.Set;
-import java.util.concurrent.ArrayBlockingQueue;
-import java.util.concurrent.BlockingQueue;
-import java.util.concurrent.CompletableFuture;
-import java.util.concurrent.ConcurrentHashMap;
-import java.util.concurrent.ConcurrentMap;
-import java.util.concurrent.ScheduledExecutorService;
-import java.util.concurrent.TimeUnit;
-import java.util.concurrent.TimeoutException;
-
-import static org.hamcrest.Matchers.empty;
-import static org.hamcrest.Matchers.is;
-import static org.junit.Assert.assertEquals;
-import static org.junit.Assert.assertFalse;
-import static org.junit.Assert.assertThat;
-
-/**
- * Tests for the scheduling of the execution graph. This tests that
- * for example the order of deployments is correct and that bulk slot allocation
- * works properly.
- */
-public class ExecutionGraphSchedulingTest extends TestLogger {
-
-	private final ScheduledExecutorService executor = new DirectScheduledExecutorService();
-
-	@After
-	public void shutdown() {
-		executor.shutdownNow();
-	}
-
-	// ------------------------------------------------------------------------
-	//  Tests
-	// ------------------------------------------------------------------------
-
-	/**
-	 * Tests that with scheduling futures and pipelined deployment, the target vertex will
-	 * not deploy its task before the source vertex does.
-	 */
-	@Test
-	public void testScheduleSourceBeforeTarget() throws Exception {
-
-		//                                            [pipelined]
-		//  we construct a simple graph    (source) ----------------> (target)
-
-		final int parallelism = 1;
-
-		final JobVertex sourceVertex = new JobVertex("source");
-		sourceVertex.setParallelism(parallelism);
-		sourceVertex.setInvokableClass(NoOpInvokable.class);
-
-		final JobVertex targetVertex = new JobVertex("target");
-		targetVertex.setParallelism(parallelism);
-		targetVertex.setInvokableClass(NoOpInvokable.class);
-
-		targetVertex.connectNewDataSetAsInput(sourceVertex, DistributionPattern.ALL_TO_ALL, ResultPartitionType.PIPELINED);
-
-		final JobID jobId = new JobID();
-		final JobGraph jobGraph = new JobGraph(jobId, "test", sourceVertex, targetVertex);
-		jobGraph.setScheduleMode(ScheduleMode.EAGER);
-
-		final CompletableFuture<LogicalSlot> sourceFuture = new CompletableFuture<>();
-		final CompletableFuture<LogicalSlot> targetFuture = new CompletableFuture<>();
-
-		ProgrammedSlotProvider slotProvider = new ProgrammedSlotProvider(parallelism);
-		slotProvider.addSlot(sourceVertex.getID(), 0, sourceFuture);
-		slotProvider.addSlot(targetVertex.getID(), 0, targetFuture);
-
-		final ExecutionGraph eg = createExecutionGraph(jobGraph, slotProvider);
-		eg.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-
-		//  set up two TaskManager gateways and slots
-
-		final InteractionsCountingTaskManagerGateway gatewaySource = createTaskManager();
-		final InteractionsCountingTaskManagerGateway gatewayTarget = createTaskManager();
-
-		final LogicalSlot sourceSlot = createTestingLogicalSlot(gatewaySource);
-		final LogicalSlot targetSlot = createTestingLogicalSlot(gatewayTarget);
-
-		eg.scheduleForExecution();
-
-		// job should be running
-		assertEquals(JobStatus.RUNNING, eg.getState());
-
-		// we fulfill the target slot before the source slot
-		// that should not cause a deployment or deployment related failure
-		targetFuture.complete(targetSlot);
-
-		assertThat(gatewayTarget.getSubmitTaskCount(), is(0));
-		assertEquals(JobStatus.RUNNING, eg.getState());
-
-		// now supply the source slot
-		sourceFuture.complete(sourceSlot);
-
-		// by now, all deployments should have happened
-		assertThat(gatewaySource.getSubmitTaskCount(), is(1));
-		assertThat(gatewayTarget.getSubmitTaskCount(), is(1));
-
-		assertEquals(JobStatus.RUNNING, eg.getState());
-	}
-
-	private TestingLogicalSlot createTestingLogicalSlot(InteractionsCountingTaskManagerGateway gatewaySource) {
-		return new TestingLogicalSlotBuilder()
-			.setTaskManagerGateway(gatewaySource)
-			.createTestingLogicalSlot();
-	}
-
-	/**
-	 * This test verifies that before deploying a pipelined connected component, the
-	 * full set of slots is available, so that no subset of tasks is deployed only for the
-	 * system to realize later that not enough resources are available.
-	 */
-	@Test
-	public void testDeployPipelinedConnectedComponentsTogether() throws Exception {
-
-		//                                            [pipelined]
-		//  we construct a simple graph    (source) ----------------> (target)
-
-		final int parallelism = 8;
-
-		final JobVertex sourceVertex = new JobVertex("source");
-		sourceVertex.setParallelism(parallelism);
-		sourceVertex.setInvokableClass(NoOpInvokable.class);
-
-		final JobVertex targetVertex = new JobVertex("target");
-		targetVertex.setParallelism(parallelism);
-		targetVertex.setInvokableClass(NoOpInvokable.class);
-
-		targetVertex.connectNewDataSetAsInput(sourceVertex, DistributionPattern.ALL_TO_ALL, ResultPartitionType.PIPELINED);
-
-		final JobID jobId = new JobID();
-		final JobGraph jobGraph = new JobGraph(jobId, "test", sourceVertex, targetVertex);
-		jobGraph.setScheduleMode(ScheduleMode.EAGER);
-
-		@SuppressWarnings({"unchecked", "rawtypes"})
-		final CompletableFuture<LogicalSlot>[] sourceFutures = new CompletableFuture[parallelism];
-		@SuppressWarnings({"unchecked", "rawtypes"})
-		final CompletableFuture<LogicalSlot>[] targetFutures = new CompletableFuture[parallelism];
-
-		//
-		//  Create the slots, futures, and the slot provider
-
-		final InteractionsCountingTaskManagerGateway[] sourceTaskManagers = new InteractionsCountingTaskManagerGateway[parallelism];
-		final InteractionsCountingTaskManagerGateway[] targetTaskManagers = new InteractionsCountingTaskManagerGateway[parallelism];
-
-		final LogicalSlot[] sourceSlots = new LogicalSlot[parallelism];
-		final LogicalSlot[] targetSlots = new LogicalSlot[parallelism];
-
-		for (int i = 0; i < parallelism; i++) {
-			sourceTaskManagers[i] = createTaskManager();
-			targetTaskManagers[i] = createTaskManager();
-
-			sourceSlots[i] = createTestingLogicalSlot(sourceTaskManagers[i]);
-			targetSlots[i] = createTestingLogicalSlot(targetTaskManagers[i]);
-
-			sourceFutures[i] = new CompletableFuture<>();
-			targetFutures[i] = new CompletableFuture<>();
-		}
-
-		ProgrammedSlotProvider slotProvider = new ProgrammedSlotProvider(parallelism);
-		slotProvider.addSlots(sourceVertex.getID(), sourceFutures);
-		slotProvider.addSlots(targetVertex.getID(), targetFutures);
-
-		final ExecutionGraph eg = createExecutionGraph(jobGraph, slotProvider);
-
-		//
-		//  we complete some of the futures
-
-		for (int i = 0; i < parallelism; i += 2) {
-			sourceFutures[i].complete(sourceSlots[i]);
-		}
-
-		//
-		//  kick off the scheduling
-		eg.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-		eg.scheduleForExecution();
-
-		verifyNothingDeployed(eg, sourceTaskManagers);
-
-		//  complete the remaining sources
-		for (int i = 1; i < parallelism; i += 2) {
-			sourceFutures[i].complete(sourceSlots[i]);
-		}
-		verifyNothingDeployed(eg, sourceTaskManagers);
-
-		//  complete the targets except for one
-		for (int i = 1; i < parallelism; i++) {
-			targetFutures[i].complete(targetSlots[i]);
-		}
-		verifyNothingDeployed(eg, targetTaskManagers);
-
-		//  complete the last target slot future
-		targetFutures[0].complete(targetSlots[0]);
-
-		//
-		//  verify that all deployments have happened
-
-		for (InteractionsCountingTaskManagerGateway gateway : sourceTaskManagers) {
-			assertThat(gateway.getSubmitTaskCount(), is(1));
-		}
-		for (InteractionsCountingTaskManagerGateway gateway : targetTaskManagers) {
-			assertThat(gateway.getSubmitTaskCount(), is(1));
-		}
-	}
-
-	/**
-	 * This test verifies that if one slot future fails, the deployment will be aborted.
-	 */
-	@Test
-	public void testOneSlotFailureAbortsDeploy() throws Exception {
-
-		//                                            [pipelined]
-		//  we construct a simple graph    (source) ----------------> (target)
-
-		final int parallelism = 6;
-
-		final JobVertex sourceVertex = new JobVertex("source");
-		sourceVertex.setParallelism(parallelism);
-		sourceVertex.setInvokableClass(NoOpInvokable.class);
-
-		final JobVertex targetVertex = new JobVertex("target");
-		targetVertex.setParallelism(parallelism);
-		targetVertex.setInvokableClass(NoOpInvokable.class);
-
-		targetVertex.connectNewDataSetAsInput(sourceVertex, DistributionPattern.POINTWISE, ResultPartitionType.PIPELINED);
-
-		final JobID jobId = new JobID();
-		final JobGraph jobGraph = new JobGraph(jobId, "test", sourceVertex, targetVertex);
-		jobGraph.setScheduleMode(ScheduleMode.EAGER);
-
-		//
-		//  Create the slots, futures, and the slot provider
-
-		final InteractionsCountingTaskManagerGateway taskManager = createTaskManager();
-		final BlockingQueue<AllocationID> returnedSlots = new ArrayBlockingQueue<>(parallelism);
-		final TestingSlotOwner slotOwner = new TestingSlotOwner();
-		slotOwner.setReturnAllocatedSlotConsumer(
-			(LogicalSlot logicalSlot) -> returnedSlots.offer(logicalSlot.getAllocationId()));
-
-		final LogicalSlot[] sourceSlots = new LogicalSlot[parallelism];
-		final LogicalSlot[] targetSlots = new LogicalSlot[parallelism];
-
-		@SuppressWarnings({"unchecked", "rawtypes"})
-		final CompletableFuture<LogicalSlot>[] sourceFutures = new CompletableFuture[parallelism];
-		@SuppressWarnings({"unchecked", "rawtypes"})
-		final CompletableFuture<LogicalSlot>[] targetFutures = new CompletableFuture[parallelism];
-
-		for (int i = 0; i < parallelism; i++) {
-			sourceSlots[i] = createSingleLogicalSlot(slotOwner, taskManager, new SlotRequestId());
-			targetSlots[i] = createSingleLogicalSlot(slotOwner, taskManager, new SlotRequestId());
-
-			sourceFutures[i] = new CompletableFuture<>();
-			targetFutures[i] = new CompletableFuture<>();
-		}
-
-		ProgrammedSlotProvider slotProvider = new ProgrammedSlotProvider(parallelism);
-		slotProvider.addSlots(sourceVertex.getID(), sourceFutures);
-		slotProvider.addSlots(targetVertex.getID(), targetFutures);
-
-		final ExecutionGraph eg = createExecutionGraph(jobGraph, slotProvider);
-		eg.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-
-		//
-		//  we complete some of the futures
-
-		for (int i = 0; i < parallelism; i += 2) {
-			sourceFutures[i].complete(sourceSlots[i]);
-			targetFutures[i].complete(targetSlots[i]);
-		}
-
-		//  kick off the scheduling
-		eg.scheduleForExecution();
-
-		// fail one slot
-		sourceFutures[1].completeExceptionally(new TestRuntimeException());
-
-		// wait until the job failed as a whole
-		eg.getTerminationFuture().get(2000, TimeUnit.MILLISECONDS);
-
-		// wait until all slots are back
-		for (int i = 0; i < parallelism; i++) {
-			returnedSlots.poll(2000L, TimeUnit.MILLISECONDS);
-		}
-
-		// no deployment calls must have happened
-		assertThat(taskManager.getSubmitTaskCount(), is(0));
-
-		// all completed futures must have been returned
-		for (int i = 0; i < parallelism; i += 2) {
-			assertFalse(sourceSlots[i].isAlive());
-			assertFalse(targetSlots[i].isAlive());
-		}
-	}
-
-	/**
-	 * This test makes sure that with eager scheduling no task is deployed if a single
-	 * slot allocation fails. Moreover, we check that allocated slots will be returned.
-	 */
-	@Test
-	public void testEagerSchedulingWithSlotTimeout() throws Exception {
-
-		//  we construct a simple graph:    (task)
-
-		final int parallelism = 3;
-
-		final JobVertex vertex = new JobVertex("task");
-		vertex.setParallelism(parallelism);
-		vertex.setInvokableClass(NoOpInvokable.class);
-
-		final JobID jobId = new JobID();
-		final JobGraph jobGraph = new JobGraph(jobId, "test", vertex);
-		jobGraph.setScheduleMode(ScheduleMode.EAGER);
-
-		final BlockingQueue<AllocationID> returnedSlots = new ArrayBlockingQueue<>(2);
-		final TestingSlotOwner slotOwner = new TestingSlotOwner();
-		slotOwner.setReturnAllocatedSlotConsumer(
-			(LogicalSlot logicalSlot) -> returnedSlots.offer(logicalSlot.getAllocationId()));
-
-		final InteractionsCountingTaskManagerGateway taskManager = createTaskManager();
-		final LogicalSlot[] slots = new LogicalSlot[parallelism];
-		@SuppressWarnings({"unchecked", "rawtypes"})
-		final CompletableFuture<LogicalSlot>[] slotFutures = new CompletableFuture[parallelism];
-
-		for (int i = 0; i < parallelism; i++) {
-			slots[i] = createSingleLogicalSlot(slotOwner, taskManager, new SlotRequestId());
-			slotFutures[i] = new CompletableFuture<>();
-		}
-
-		ProgrammedSlotProvider slotProvider = new ProgrammedSlotProvider(parallelism);
-		slotProvider.addSlots(vertex.getID(), slotFutures);
-
-		final ExecutionGraph eg = createExecutionGraph(jobGraph, slotProvider);
-
-		//  we complete one future
-		slotFutures[1].complete(slots[1]);
-
-		//  kick off the scheduling
-		eg.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-		eg.scheduleForExecution();
-
-		//  we complete another future
-		slotFutures[2].complete(slots[2]);
-
-		// check that the ExecutionGraph is not terminated yet
-		assertThat(eg.getTerminationFuture().isDone(), is(false));
-
-		// time out one of the slot futures
-		slotFutures[0].completeExceptionally(new TimeoutException("Test time out"));
-
-		assertThat(eg.getTerminationFuture().get(), is(JobStatus.FAILED));
-
-		// wait until all slots are back
-		for (int i = 0; i < parallelism - 1; i++) {
-			returnedSlots.poll(2000, TimeUnit.MILLISECONDS);
-		}
-
-		//  verify that no deployments have happened
-		assertThat(taskManager.getSubmitTaskCount(), is(0));
-	}
-
-	/**
-	 * Tests that an ongoing scheduling operation does not fail the {@link ExecutionGraph}
-	 * if it gets concurrently cancelled.
-	 */
-	@Test
-	public void testSchedulingOperationCancellationWhenCancel() throws Exception {
-		final JobVertex jobVertex = new JobVertex("NoOp JobVertex");
-		jobVertex.setInvokableClass(NoOpInvokable.class);
-		jobVertex.setParallelism(2);
-		final JobGraph jobGraph = new JobGraph(jobVertex);
-		jobGraph.setScheduleMode(ScheduleMode.EAGER);
-
-		final CompletableFuture<LogicalSlot> slotFuture1 = new CompletableFuture<>();
-		final CompletableFuture<LogicalSlot> slotFuture2 = new CompletableFuture<>();
-		final ProgrammedSlotProvider slotProvider = new ProgrammedSlotProvider(2);
-		slotProvider.addSlots(jobVertex.getID(), new CompletableFuture[]{slotFuture1, slotFuture2});
-		final ExecutionGraph executionGraph = createExecutionGraph(jobGraph, slotProvider);
-
-		executionGraph.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-		executionGraph.scheduleForExecution();
-
-		final TestingLogicalSlot slot = createTestingSlot();
-		final CompletableFuture<?> releaseFuture = slot.getReleaseFuture();
-		slotFuture1.complete(slot);
-
-		// cancel should change the state of all executions to CANCELLED
-		executionGraph.cancel();
-
-		// complete the now CANCELLED execution --> this should cause a failure
-		slotFuture2.complete(new TestingLogicalSlotBuilder().createTestingLogicalSlot());
-
-		Thread.sleep(1L);
-		// release the first slot to finish the cancellation
-		releaseFuture.complete(null);
-
-		// NOTE: This test will only occasionally fail without the fix since there is
-		// a race between the releaseFuture and the slotFuture2
-		assertThat(executionGraph.getTerminationFuture().get(), is(JobStatus.CANCELED));
-	}
-
-	/**
-	 * Tests that a partially completed eager scheduling operation fails if a
-	 * completed slot is released. See FLINK-9099.
-	 */
-	@Test
-	public void testSlotReleasingFailsSchedulingOperation() throws Exception {
-		final int parallelism = 2;
-
-		final JobVertex jobVertex = new JobVertex("Testing job vertex");
-		jobVertex.setInvokableClass(NoOpInvokable.class);
-		jobVertex.setParallelism(parallelism);
-		final JobGraph jobGraph = new JobGraph(jobVertex);
-		jobGraph.setScheduleMode(ScheduleMode.EAGER);
-
-		final ProgrammedSlotProvider slotProvider = new ProgrammedSlotProvider(parallelism);
-
-		final LogicalSlot slot = createSingleLogicalSlot(new DummySlotOwner(), new SimpleAckingTaskManagerGateway(), new SlotRequestId());
-		slotProvider.addSlot(jobVertex.getID(), 0, CompletableFuture.completedFuture(slot));
-
-		final CompletableFuture<LogicalSlot> slotFuture = new CompletableFuture<>();
-		slotProvider.addSlot(jobVertex.getID(), 1, slotFuture);
-
-		final ExecutionGraph executionGraph = createExecutionGraph(jobGraph, slotProvider);
-
-		executionGraph.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-		executionGraph.scheduleForExecution();
-
-		assertThat(executionGraph.getState(), is(JobStatus.RUNNING));
-
-		final ExecutionJobVertex executionJobVertex = executionGraph.getJobVertex(jobVertex.getID());
-		final ExecutionVertex[] taskVertices = executionJobVertex.getTaskVertices();
-		assertThat(taskVertices[0].getExecutionState(), is(ExecutionState.SCHEDULED));
-		assertThat(taskVertices[1].getExecutionState(), is(ExecutionState.SCHEDULED));
-
-		// fail the single allocated slot --> this should fail the scheduling operation
-		slot.releaseSlot(new FlinkException("Test failure"));
-
-		assertThat(executionGraph.getTerminationFuture().get(), is(JobStatus.FAILED));
-	}
-
-	/**
-	 * Tests that all slots are being returned to the {@link SlotOwner} if the
-	 * {@link ExecutionGraph} is being cancelled. See FLINK-9908
-	 */
-	@Test
-	public void testCancellationOfIncompleteScheduling() throws Exception {

Review comment:
       Added `SlotPoolImpl#testShutdownCancelsAllPendingRequests`.







[GitHub] [flink] zhuzhurk removed a comment on pull request #13641: [FLINK-17760][tests] Rework tests to not rely on legacy scheduling codes in ExecutionGraph components

Posted by GitBox <gi...@apache.org>.
zhuzhurk removed a comment on pull request #13641:
URL: https://github.com/apache/flink/pull/13641#issuecomment-719297248


   @flinkbot run azure





[GitHub] [flink] zhuzhurk commented on a change in pull request #13641: [FLINK-17760][tests] Rework tests to not rely on legacy scheduling codes in ExecutionGraph components

Posted by GitBox <gi...@apache.org>.
zhuzhurk commented on a change in pull request #13641:
URL: https://github.com/apache/flink/pull/13641#discussion_r532534958



##########
File path: flink-runtime/src/main/java/org/apache/flink/runtime/scheduler/DefaultScheduler.java
##########
@@ -188,7 +188,7 @@ public void start(ComponentMainThreadExecutor mainThreadExecutor) {
 	}
 
 	@Override
-	protected long getNumberOfRestarts() {
+	public long getNumberOfRestarts() {

Review comment:
       > maybe we could query this somehow from JobManagerMetricGroup
   
   Good idea!
   
   It's actually `JobManagerJobMetricGroup`, which is different from `JobManagerMetricGroup`.
   All job-global metrics are in this group.
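   
   A rough sketch of the idea, in plain Java rather than Flink's metric classes: the scheduler registers a gauge for its restart counter with the job-level metric group, and a test reads that gauge instead of calling a widened `getNumberOfRestarts()`. `JobMetricGroup`, `Scheduler`, and the gauge name `numRestarts` are hypothetical stand-ins here, not the real API.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Illustration only: hypothetical stand-ins, not Flink classes. */
final class RestartMetricSketch {

    /** Hypothetical job-level metric group: a plain name -> gauge map. */
    static final class JobMetricGroup {
        private final Map<String, Supplier<Long>> gauges = new HashMap<>();

        void gauge(String name, Supplier<Long> gauge) {
            gauges.put(name, gauge);
        }

        long read(String name) {
            Supplier<Long> gauge = gauges.get(name);
            if (gauge == null) {
                throw new IllegalArgumentException("No gauge registered under " + name);
            }
            return gauge.get();
        }
    }

    /** Hypothetical scheduler that keeps its restart counter private. */
    static final class Scheduler {
        private long numberOfRestarts;

        Scheduler(JobMetricGroup metrics) {
            // Assumed gauge name; the counter is exposed only through the metric group.
            metrics.gauge("numRestarts", () -> numberOfRestarts);
        }

        void restart() {
            numberOfRestarts++;
        }
    }

    public static void main(String[] args) {
        JobMetricGroup metrics = new JobMetricGroup();
        Scheduler scheduler = new Scheduler(metrics);
        scheduler.restart();
        // The test side only needs the metric group, not a public getter.
        System.out.println("restarts seen via gauge: " + metrics.read("numRestarts"));
    }
}
```

   With something like this, the method could stay `protected` and tests would depend only on the metric group.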







[GitHub] [flink] azagrebin commented on a change in pull request #13641: [FLINK-17760][tests] Rework tests to not rely on legacy scheduling codes in ExecutionGraph components

Posted by GitBox <gi...@apache.org>.
azagrebin commented on a change in pull request #13641:
URL: https://github.com/apache/flink/pull/13641#discussion_r536055982



##########
File path: flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/ExecutionGraphVariousFailuesTest.java
##########
@@ -20,86 +20,31 @@
 
 import org.apache.flink.api.common.JobStatus;
 import org.apache.flink.runtime.concurrent.ComponentMainThreadExecutorServiceAdapter;
-import org.apache.flink.runtime.execution.SuppressRestartsException;
-import org.apache.flink.runtime.executiongraph.restart.InfiniteDelayRestartStrategy;
 import org.apache.flink.runtime.io.network.partition.ResultPartitionID;
 import org.apache.flink.runtime.jobgraph.IntermediateResultPartitionID;
+import org.apache.flink.runtime.jobgraph.JobGraph;
+import org.apache.flink.runtime.scheduler.SchedulerBase;
+import org.apache.flink.runtime.scheduler.SchedulerTestingUtils;
 import org.apache.flink.util.TestLogger;
 
 import org.junit.Test;
 
 import static org.junit.Assert.assertEquals;
 import static org.junit.Assert.fail;
 
-
 public class ExecutionGraphVariousFailuesTest extends TestLogger {
 
-	/**
-	 * Test that failing in state restarting will retrigger the restarting logic. This means that
-	 * it only goes into the state FAILED after the restart strategy says the job is no longer
-	 * restartable.
-	 */
-	@Test
-	public void testFailureWhileRestarting() throws Exception {
-		final ExecutionGraph eg = ExecutionGraphTestUtils.createSimpleTestGraph(new InfiniteDelayRestartStrategy(2));
-		eg.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-		eg.scheduleForExecution();
-
-		assertEquals(JobStatus.RUNNING, eg.getState());
-		ExecutionGraphTestUtils.switchAllVerticesToRunning(eg);
-
-		eg.failGlobal(new Exception("Test 1"));
-		assertEquals(JobStatus.FAILING, eg.getState());
-		ExecutionGraphTestUtils.completeCancellingForAllVertices(eg);
-
-		// we should restart since we have two restart attempts left
-		assertEquals(JobStatus.RESTARTING, eg.getState());
-
-		eg.failGlobal(new Exception("Test 2"));
-
-		// we should restart since we have one restart attempts left
-		assertEquals(JobStatus.RESTARTING, eg.getState());
-
-		eg.failGlobal(new Exception("Test 3"));
-
-		// after depleting all our restart attempts we should go into Failed
-		assertEquals(JobStatus.FAILED, eg.getState());
-	}
-
-	/**
-	 * Tests that a {@link SuppressRestartsException} in state RESTARTING stops the restarting
-	 * immediately and sets the execution graph's state to FAILED.
-	 */
-	@Test
-	public void testSuppressRestartFailureWhileRestarting() throws Exception {
-		final ExecutionGraph eg = ExecutionGraphTestUtils.createSimpleTestGraph(new InfiniteDelayRestartStrategy(10));
-		eg.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-		eg.scheduleForExecution();
-
-		assertEquals(JobStatus.RUNNING, eg.getState());
-		ExecutionGraphTestUtils.switchAllVerticesToRunning(eg);
-
-		eg.failGlobal(new Exception("test"));
-		assertEquals(JobStatus.FAILING, eg.getState());
-
-		ExecutionGraphTestUtils.completeCancellingForAllVertices(eg);
-		assertEquals(JobStatus.RESTARTING, eg.getState());
-
-		// suppress a possible restart
-		eg.failGlobal(new SuppressRestartsException(new Exception("Test")));
-
-		assertEquals(JobStatus.FAILED, eg.getState());
-	}
-
 	/**
 	 * Tests that a failing scheduleOrUpdateConsumers call with a non-existing execution attempt
 	 * id, will not fail the execution graph.
 	 */
 	@Test
 	public void testFailingScheduleOrUpdateConsumers() throws Exception {
-		final ExecutionGraph eg = ExecutionGraphTestUtils.createSimpleTestGraph(new InfiniteDelayRestartStrategy(10));
-		eg.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-		eg.scheduleForExecution();
+		final SchedulerBase scheduler = SchedulerTestingUtils.newSchedulerBuilder(new JobGraph()).build();
+		scheduler.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
+		scheduler.startScheduling();
+
+		final ExecutionGraph eg = scheduler.getExecutionGraph();

Review comment:
       `ExecutionGraphDeploymentTest#testEagerSchedulingWaitsOnAllInputPreferredLocations` seems to test that consumers wait on producers. The JobGraph in `DefaultSchedulerTest#deployTasksOnlyWhenAllSlotRequestsAreFulfilled` has only one vertex, so it seems to test something different. Is that intended?
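   
   To make the question concrete, here is a minimal, self-contained sketch (plain JDK, not Flink code) of the behavior that the name `deployTasksOnlyWhenAllSlotRequestsAreFulfilled` suggests: nothing is deployed until every slot future has completed. The class `DeployWhenAllSlotsReady`, the method `deployWhenAllFulfilled`, and the task/slot naming are hypothetical and only illustrate the idea; the actual test may be structured differently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

/** Illustration only: hypothetical stand-in, not Flink code. */
final class DeployWhenAllSlotsReady {

    /**
     * Returns a future that completes with the "deployed" task descriptions
     * only after every slot future in the list has completed.
     */
    static CompletableFuture<List<String>> deployWhenAllFulfilled(
            List<CompletableFuture<String>> slotFutures) {
        CompletableFuture<Void> allSlots =
                CompletableFuture.allOf(slotFutures.toArray(new CompletableFuture[0]));

        return allSlots.thenApply(ignored -> {
            List<String> deployed = new ArrayList<>();
            for (int i = 0; i < slotFutures.size(); i++) {
                // join() does not block here: allOf guarantees each future is done.
                deployed.add("task-" + i + "@" + slotFutures.get(i).join());
            }
            return deployed;
        });
    }

    public static void main(String[] args) {
        List<CompletableFuture<String>> slots = new ArrayList<>();
        slots.add(new CompletableFuture<>());
        slots.add(new CompletableFuture<>());

        CompletableFuture<List<String>> deployment = deployWhenAllFulfilled(slots);

        slots.get(0).complete("slot-a");
        System.out.println("deployed after first slot: " + deployment.isDone());

        slots.get(1).complete("slot-b");
        System.out.println("deployed after second slot: " + deployment.isDone());
        System.out.println(deployment.join());
    }
}
```

   A check like this only covers all-or-nothing deployment for one set of slot requests; it says nothing about consumers waiting for their producers' locations.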







[GitHub] [flink] flinkbot edited a comment on pull request #13641: [WIP][FLINK-17760][tests] Rework tests to not rely on legacy scheduling logics in ExecutionGraph anymore

Posted by GitBox <gi...@apache.org>.
flinkbot edited a comment on pull request #13641:
URL: https://github.com/apache/flink/pull/13641#issuecomment-708569491


   ## CI report:
   
   * 09d8deb89416f53dfe8b5c16fb9d723cbd98612c UNKNOWN
   * 9ea4cf7454f9b8e0c237ba8f540b224abdf9f7b0 Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8279) 
   
   <details>
   <summary>Bot commands</summary>
     The @flinkbot bot supports the following commands:
   
    - `@flinkbot run travis` re-run the last Travis build
    - `@flinkbot run azure` re-run the last Azure build
   </details>





[GitHub] [flink] flinkbot edited a comment on pull request #13641: [WIP][FLINK-17760][tests] Rework tests to not rely on legacy scheduling logics in ExecutionGraph anymore

Posted by GitBox <gi...@apache.org>.
flinkbot edited a comment on pull request #13641:
URL: https://github.com/apache/flink/pull/13641#issuecomment-708569491


   ## CI report:
   
   * 09d8deb89416f53dfe8b5c16fb9d723cbd98612c UNKNOWN
   * 659fb7eddb0acfa0ef49f76c5fafca21c389f3c0 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8562) 
   * f7c0de826fcaa9b28eb8054902d3a01fb93dfcc3 UNKNOWN
   





[GitHub] [flink] zhuzhurk commented on a change in pull request #13641: [FLINK-17760][tests] Rework tests to not rely on legacy scheduling codes in ExecutionGraph components

Posted by GitBox <gi...@apache.org>.
zhuzhurk commented on a change in pull request #13641:
URL: https://github.com/apache/flink/pull/13641#discussion_r533211602



##########
File path: flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/ExecutionGraphVariousFailuesTest.java
##########
@@ -20,86 +20,31 @@
 
 import org.apache.flink.api.common.JobStatus;
 import org.apache.flink.runtime.concurrent.ComponentMainThreadExecutorServiceAdapter;
-import org.apache.flink.runtime.execution.SuppressRestartsException;
-import org.apache.flink.runtime.executiongraph.restart.InfiniteDelayRestartStrategy;
 import org.apache.flink.runtime.io.network.partition.ResultPartitionID;
 import org.apache.flink.runtime.jobgraph.IntermediateResultPartitionID;
+import org.apache.flink.runtime.jobgraph.JobGraph;
+import org.apache.flink.runtime.scheduler.SchedulerBase;
+import org.apache.flink.runtime.scheduler.SchedulerTestingUtils;
 import org.apache.flink.util.TestLogger;
 
 import org.junit.Test;
 
 import static org.junit.Assert.assertEquals;
 import static org.junit.Assert.fail;
 
-
 public class ExecutionGraphVariousFailuesTest extends TestLogger {
 
-	/**
-	 * Test that failing in state restarting will retrigger the restarting logic. This means that
-	 * it only goes into the state FAILED after the restart strategy says the job is no longer
-	 * restartable.
-	 */
-	@Test
-	public void testFailureWhileRestarting() throws Exception {
-		final ExecutionGraph eg = ExecutionGraphTestUtils.createSimpleTestGraph(new InfiniteDelayRestartStrategy(2));
-		eg.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-		eg.scheduleForExecution();
-
-		assertEquals(JobStatus.RUNNING, eg.getState());
-		ExecutionGraphTestUtils.switchAllVerticesToRunning(eg);
-
-		eg.failGlobal(new Exception("Test 1"));
-		assertEquals(JobStatus.FAILING, eg.getState());
-		ExecutionGraphTestUtils.completeCancellingForAllVertices(eg);
-
-		// we should restart since we have two restart attempts left
-		assertEquals(JobStatus.RESTARTING, eg.getState());
-
-		eg.failGlobal(new Exception("Test 2"));
-
-		// we should restart since we have one restart attempts left
-		assertEquals(JobStatus.RESTARTING, eg.getState());
-
-		eg.failGlobal(new Exception("Test 3"));
-
-		// after depleting all our restart attempts we should go into Failed
-		assertEquals(JobStatus.FAILED, eg.getState());
-	}
-
-	/**
-	 * Tests that a {@link SuppressRestartsException} in state RESTARTING stops the restarting
-	 * immediately and sets the execution graph's state to FAILED.
-	 */
-	@Test
-	public void testSuppressRestartFailureWhileRestarting() throws Exception {
-		final ExecutionGraph eg = ExecutionGraphTestUtils.createSimpleTestGraph(new InfiniteDelayRestartStrategy(10));
-		eg.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-		eg.scheduleForExecution();
-
-		assertEquals(JobStatus.RUNNING, eg.getState());
-		ExecutionGraphTestUtils.switchAllVerticesToRunning(eg);
-
-		eg.failGlobal(new Exception("test"));
-		assertEquals(JobStatus.FAILING, eg.getState());
-
-		ExecutionGraphTestUtils.completeCancellingForAllVertices(eg);
-		assertEquals(JobStatus.RESTARTING, eg.getState());
-
-		// suppress a possible restart
-		eg.failGlobal(new SuppressRestartsException(new Exception("Test")));
-
-		assertEquals(JobStatus.FAILED, eg.getState());
-	}
-
 	/**
 	 * Tests that a failing scheduleOrUpdateConsumers call with a non-existing execution attempt
 	 * id, will not fail the execution graph.
 	 */
 	@Test
 	public void testFailingScheduleOrUpdateConsumers() throws Exception {
-		final ExecutionGraph eg = ExecutionGraphTestUtils.createSimpleTestGraph(new InfiniteDelayRestartStrategy(10));
-		eg.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-		eg.scheduleForExecution();
+		final SchedulerBase scheduler = SchedulerTestingUtils.newSchedulerBuilder(new JobGraph()).build();
+		scheduler.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
+		scheduler.startScheduling();
+
+		final ExecutionGraph eg = scheduler.getExecutionGraph();

Review comment:
       I checked the tests and found only two that relate to legacy eager/lazy scheduling. I think I can directly remove or rework them, because they test other things and were merely designed to use eager/lazy scheduling. That way, we no longer need an annotation like `@TestLegacyScheduling`.
   - `ExecutionGraphNotEnoughResourceTest`: can be removed because it is superseded by `DefaultSchedulerTest#failJobIfNotEnoughResources()`
   - `ExecutionGraphDeploymentTest#testEagerSchedulingWaitsOnAllInputPreferredLocations()`: replaced by a new test, `DefaultSchedulerTest#deployTasksOnlyWhenAllSlotRequestsAreFulfilled()`
   
   Regarding tests for `scheduleOrUpdateConsumers`: I think the mechanism deserves a separate discussion, so I opened FLINK-20439 to track and discuss it.
   
   







[GitHub] [flink] zhuzhurk commented on a change in pull request #13641: [FLINK-17760][tests] Rework tests to not rely on legacy scheduling codes in ExecutionGraph components

Posted by GitBox <gi...@apache.org>.
zhuzhurk commented on a change in pull request #13641:
URL: https://github.com/apache/flink/pull/13641#discussion_r532462500



##########
File path: flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/ExecutionGraphVariousFailuesTest.java
##########
@@ -20,86 +20,31 @@
 
 import org.apache.flink.api.common.JobStatus;
 import org.apache.flink.runtime.concurrent.ComponentMainThreadExecutorServiceAdapter;
-import org.apache.flink.runtime.execution.SuppressRestartsException;
-import org.apache.flink.runtime.executiongraph.restart.InfiniteDelayRestartStrategy;
 import org.apache.flink.runtime.io.network.partition.ResultPartitionID;
 import org.apache.flink.runtime.jobgraph.IntermediateResultPartitionID;
+import org.apache.flink.runtime.jobgraph.JobGraph;
+import org.apache.flink.runtime.scheduler.SchedulerBase;
+import org.apache.flink.runtime.scheduler.SchedulerTestingUtils;
 import org.apache.flink.util.TestLogger;
 
 import org.junit.Test;
 
 import static org.junit.Assert.assertEquals;
 import static org.junit.Assert.fail;
 
-
 public class ExecutionGraphVariousFailuesTest extends TestLogger {
 
-	/**
-	 * Test that failing in state restarting will retrigger the restarting logic. This means that
-	 * it only goes into the state FAILED after the restart strategy says the job is no longer
-	 * restartable.
-	 */
-	@Test
-	public void testFailureWhileRestarting() throws Exception {
-		final ExecutionGraph eg = ExecutionGraphTestUtils.createSimpleTestGraph(new InfiniteDelayRestartStrategy(2));
-		eg.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-		eg.scheduleForExecution();
-
-		assertEquals(JobStatus.RUNNING, eg.getState());
-		ExecutionGraphTestUtils.switchAllVerticesToRunning(eg);
-
-		eg.failGlobal(new Exception("Test 1"));
-		assertEquals(JobStatus.FAILING, eg.getState());
-		ExecutionGraphTestUtils.completeCancellingForAllVertices(eg);
-
-		// we should restart since we have two restart attempts left
-		assertEquals(JobStatus.RESTARTING, eg.getState());
-
-		eg.failGlobal(new Exception("Test 2"));
-
-		// we should restart since we have one restart attempts left
-		assertEquals(JobStatus.RESTARTING, eg.getState());
-
-		eg.failGlobal(new Exception("Test 3"));
-
-		// after depleting all our restart attempts we should go into Failed
-		assertEquals(JobStatus.FAILED, eg.getState());
-	}
-
-	/**
-	 * Tests that a {@link SuppressRestartsException} in state RESTARTING stops the restarting
-	 * immediately and sets the execution graph's state to FAILED.
-	 */
-	@Test
-	public void testSuppressRestartFailureWhileRestarting() throws Exception {
-		final ExecutionGraph eg = ExecutionGraphTestUtils.createSimpleTestGraph(new InfiniteDelayRestartStrategy(10));
-		eg.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-		eg.scheduleForExecution();
-
-		assertEquals(JobStatus.RUNNING, eg.getState());
-		ExecutionGraphTestUtils.switchAllVerticesToRunning(eg);
-
-		eg.failGlobal(new Exception("test"));
-		assertEquals(JobStatus.FAILING, eg.getState());
-
-		ExecutionGraphTestUtils.completeCancellingForAllVertices(eg);
-		assertEquals(JobStatus.RESTARTING, eg.getState());
-
-		// suppress a possible restart
-		eg.failGlobal(new SuppressRestartsException(new Exception("Test")));
-
-		assertEquals(JobStatus.FAILED, eg.getState());
-	}
-
 	/**
 	 * Tests that a failing scheduleOrUpdateConsumers call with a non-existing execution attempt
 	 * id, will not fail the execution graph.
 	 */
 	@Test
 	public void testFailingScheduleOrUpdateConsumers() throws Exception {
-		final ExecutionGraph eg = ExecutionGraphTestUtils.createSimpleTestGraph(new InfiniteDelayRestartStrategy(10));
-		eg.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-		eg.scheduleForExecution();
+		final SchedulerBase scheduler = SchedulerTestingUtils.newSchedulerBuilder(new JobGraph()).build();
+		scheduler.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
+		scheduler.startScheduling();
+
+		final ExecutionGraph eg = scheduler.getExecutionGraph();

Review comment:
       `scheduleOrUpdateConsumers` is still used by lazy-from-sources scheduling with `SchedulerNG`, so we still need this test.
   It is not needed by pipelined region scheduling, though. In fact, the partition cache mechanism is no longer needed for pipelined region scheduling either.
   But maybe we still need to keep this mechanism for other possible scheduling strategies that may schedule a task before all of its upstream tasks finish.
   WDYT?







[GitHub] [flink] flinkbot edited a comment on pull request #13641: [FLINK-17760][tests] Rework tests to not rely on legacy scheduling codes in ExecutionGraph components

Posted by GitBox <gi...@apache.org>.
flinkbot edited a comment on pull request #13641:
URL: https://github.com/apache/flink/pull/13641#issuecomment-708569491


   ## CI report:
   
   * 09d8deb89416f53dfe8b5c16fb9d723cbd98612c UNKNOWN
   * 021cac170ea26cddfd8af0a2bec5fea4e6a76b69 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=9756) 
   * fe1562c5cda8ecb15f6af1afdf7b6217e6c20c42 UNKNOWN
   * 20de9eefa4d4eb7c8b740cd88fc69f1e764bfa7a UNKNOWN
   



[GitHub] [flink] tillrohrmann commented on a change in pull request #13641: [FLINK-17760][tests] Rework tests to not rely on legacy scheduling codes in ExecutionGraph components

Posted by GitBox <gi...@apache.org>.
tillrohrmann commented on a change in pull request #13641:
URL: https://github.com/apache/flink/pull/13641#discussion_r532473677



##########
File path: flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/ExecutionGraphVariousFailuesTest.java
##########
@@ -20,86 +20,31 @@
 
 import org.apache.flink.api.common.JobStatus;
 import org.apache.flink.runtime.concurrent.ComponentMainThreadExecutorServiceAdapter;
-import org.apache.flink.runtime.execution.SuppressRestartsException;
-import org.apache.flink.runtime.executiongraph.restart.InfiniteDelayRestartStrategy;
 import org.apache.flink.runtime.io.network.partition.ResultPartitionID;
 import org.apache.flink.runtime.jobgraph.IntermediateResultPartitionID;
+import org.apache.flink.runtime.jobgraph.JobGraph;
+import org.apache.flink.runtime.scheduler.SchedulerBase;
+import org.apache.flink.runtime.scheduler.SchedulerTestingUtils;
 import org.apache.flink.util.TestLogger;
 
 import org.junit.Test;
 
 import static org.junit.Assert.assertEquals;
 import static org.junit.Assert.fail;
 
-
 public class ExecutionGraphVariousFailuesTest extends TestLogger {
 
-	/**
-	 * Test that failing in state restarting will retrigger the restarting logic. This means that
-	 * it only goes into the state FAILED after the restart strategy says the job is no longer
-	 * restartable.
-	 */
-	@Test
-	public void testFailureWhileRestarting() throws Exception {
-		final ExecutionGraph eg = ExecutionGraphTestUtils.createSimpleTestGraph(new InfiniteDelayRestartStrategy(2));
-		eg.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-		eg.scheduleForExecution();
-
-		assertEquals(JobStatus.RUNNING, eg.getState());
-		ExecutionGraphTestUtils.switchAllVerticesToRunning(eg);
-
-		eg.failGlobal(new Exception("Test 1"));
-		assertEquals(JobStatus.FAILING, eg.getState());
-		ExecutionGraphTestUtils.completeCancellingForAllVertices(eg);
-
-		// we should restart since we have two restart attempts left
-		assertEquals(JobStatus.RESTARTING, eg.getState());
-
-		eg.failGlobal(new Exception("Test 2"));
-
-		// we should restart since we have one restart attempts left
-		assertEquals(JobStatus.RESTARTING, eg.getState());
-
-		eg.failGlobal(new Exception("Test 3"));
-
-		// after depleting all our restart attempts we should go into Failed
-		assertEquals(JobStatus.FAILED, eg.getState());
-	}
-
-	/**
-	 * Tests that a {@link SuppressRestartsException} in state RESTARTING stops the restarting
-	 * immediately and sets the execution graph's state to FAILED.
-	 */
-	@Test
-	public void testSuppressRestartFailureWhileRestarting() throws Exception {
-		final ExecutionGraph eg = ExecutionGraphTestUtils.createSimpleTestGraph(new InfiniteDelayRestartStrategy(10));
-		eg.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-		eg.scheduleForExecution();
-
-		assertEquals(JobStatus.RUNNING, eg.getState());
-		ExecutionGraphTestUtils.switchAllVerticesToRunning(eg);
-
-		eg.failGlobal(new Exception("test"));
-		assertEquals(JobStatus.FAILING, eg.getState());
-
-		ExecutionGraphTestUtils.completeCancellingForAllVertices(eg);
-		assertEquals(JobStatus.RESTARTING, eg.getState());
-
-		// suppress a possible restart
-		eg.failGlobal(new SuppressRestartsException(new Exception("Test")));
-
-		assertEquals(JobStatus.FAILED, eg.getState());
-	}
-
 	/**
 	 * Tests that a failing scheduleOrUpdateConsumers call with a non-existing execution attempt
 	 * id, will not fail the execution graph.
 	 */
 	@Test
 	public void testFailingScheduleOrUpdateConsumers() throws Exception {
-		final ExecutionGraph eg = ExecutionGraphTestUtils.createSimpleTestGraph(new InfiniteDelayRestartStrategy(10));
-		eg.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-		eg.scheduleForExecution();
+		final SchedulerBase scheduler = SchedulerTestingUtils.newSchedulerBuilder(new JobGraph()).build();
+		scheduler.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
+		scheduler.startScheduling();
+
+		final ExecutionGraph eg = scheduler.getExecutionGraph();

Review comment:
       Since we are touching these tests anyway, would it make sense to mark the tests that are written specifically for the non-pipelined region scheduler with a special annotation? That would make them much easier to find when we throw out the old scheduling strategies.
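
A minimal sketch of what such a marker annotation could look like, assuming plain Java and reflection. The names `LegacySchedulingOnly`, `reason`, `ExampleTests`, and `findMarkedMethods` are hypothetical and do not exist in Flink; this only illustrates how annotated tests could be located later.

```java
import java.lang.annotation.Documented;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

public class LegacySchedulingMarkerSketch {

    /** Hypothetical marker for tests that only make sense with the legacy scheduling strategies. */
    @Documented
    @Retention(RetentionPolicy.RUNTIME) // kept at runtime so reflection can find it
    @Target({ElementType.TYPE, ElementType.METHOD})
    public @interface LegacySchedulingOnly {
        /** Optional note on why the test depends on the legacy code path. */
        String reason() default "";
    }

    /** Stand-in for a real test class; the method bodies are intentionally empty. */
    public static class ExampleTests {
        @LegacySchedulingOnly(reason = "exercises a legacy scheduling strategy")
        public void testLegacyPath() {}

        public void testSchedulerAgnosticPath() {}
    }

    /** Returns the names of all methods declared in testClass that carry the marker. */
    public static List<String> findMarkedMethods(Class<?> testClass) {
        List<String> names = new ArrayList<>();
        for (Method method : testClass.getDeclaredMethods()) {
            if (method.isAnnotationPresent(LegacySchedulingOnly.class)) {
                names.add(method.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // Lists the tests that would need attention once the old strategies are removed.
        System.out.println(findMarkedMethods(ExampleTests.class));
    }
}
```

Because the marker is just an annotation, a plain text search for its name would work as well; the reflective lookup only shows that the marked tests can also be collected programmatically.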







[GitHub] [flink] zhuzhurk commented on a change in pull request #13641: [WIP][FLINK-17760][tests] Rework tests to not rely on legacy scheduling logics in ExecutionGraph anymore

Posted by GitBox <gi...@apache.org>.
zhuzhurk commented on a change in pull request #13641:
URL: https://github.com/apache/flink/pull/13641#discussion_r514123249



##########
File path: flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/ExecutionVertexSchedulingTest.java
##########
@@ -1,126 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- *     http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-
-package org.apache.flink.runtime.executiongraph;
-
-import org.apache.flink.runtime.execution.ExecutionState;
-import org.apache.flink.runtime.jobmanager.scheduler.LocationPreferenceConstraint;
-import org.apache.flink.runtime.jobmaster.LogicalSlot;
-import org.apache.flink.runtime.jobmaster.TestingLogicalSlotBuilder;
-import org.apache.flink.util.TestLogger;
-
-import org.junit.Test;
-
-import java.util.Collections;
-import java.util.concurrent.CompletableFuture;
-
-import static org.apache.flink.runtime.executiongraph.ExecutionGraphTestUtils.getExecutionVertex;
-import static org.junit.Assert.assertEquals;
-import static org.junit.Assert.assertFalse;
-import static org.junit.Assert.fail;
-
-public class ExecutionVertexSchedulingTest extends TestLogger {

Review comment:
       This test class can be removed because it tests the legacy scheduling code path `ExecutionVertex#scheduleForExecution()`.







[GitHub] [flink] flinkbot edited a comment on pull request #13641: [FLINK-17760][tests] Rework tests to not rely on legacy scheduling codes in ExecutionGraph components

Posted by GitBox <gi...@apache.org>.
flinkbot edited a comment on pull request #13641:
URL: https://github.com/apache/flink/pull/13641#issuecomment-708569491


   ## CI report:
   
   * 09d8deb89416f53dfe8b5c16fb9d723cbd98612c UNKNOWN
   * fe1562c5cda8ecb15f6af1afdf7b6217e6c20c42 UNKNOWN
   * dfd82cd0de7a46432eae70f1f86aafb823ed7ff2 Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10935) 
   * cf675a9aff300f1d9e7af38bbae24a44377d404d Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10957) 
   



[GitHub] [flink] zhuzhurk commented on a change in pull request #13641: [WIP][FLINK-17760][tests] Rework tests to not rely on legacy scheduling logics in ExecutionGraph anymore

Posted by GitBox <gi...@apache.org>.
zhuzhurk commented on a change in pull request #13641:
URL: https://github.com/apache/flink/pull/13641#discussion_r514115275



##########
File path: flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/ExecutionTest.java
##########
@@ -1,570 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- *     http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-
-package org.apache.flink.runtime.executiongraph;
-
-import org.apache.flink.runtime.checkpoint.JobManagerTaskRestore;
-import org.apache.flink.runtime.checkpoint.TaskStateSnapshot;
-import org.apache.flink.runtime.concurrent.ComponentMainThreadExecutorServiceAdapter;
-import org.apache.flink.runtime.execution.ExecutionState;
-import org.apache.flink.runtime.executiongraph.restart.NoRestartStrategy;
-import org.apache.flink.runtime.executiongraph.utils.SimpleAckingTaskManagerGateway;
-import org.apache.flink.runtime.jobgraph.JobGraph;
-import org.apache.flink.runtime.jobgraph.JobVertex;
-import org.apache.flink.runtime.jobgraph.JobVertexID;
-import org.apache.flink.runtime.jobgraph.tasks.AbstractInvokable;
-import org.apache.flink.runtime.jobmanager.scheduler.LocationPreferenceConstraint;
-import org.apache.flink.runtime.jobmaster.LogicalSlot;
-import org.apache.flink.runtime.jobmaster.SlotOwner;
-import org.apache.flink.runtime.jobmaster.SlotRequestId;
-import org.apache.flink.runtime.jobmaster.TestingLogicalSlot;
-import org.apache.flink.runtime.jobmaster.TestingLogicalSlotBuilder;
-import org.apache.flink.runtime.jobmaster.slotpool.SingleLogicalSlot;
-import org.apache.flink.runtime.jobmaster.slotpool.SlotProvider;
-import org.apache.flink.runtime.shuffle.PartitionDescriptor;
-import org.apache.flink.runtime.shuffle.ProducerDescriptor;
-import org.apache.flink.runtime.shuffle.ShuffleDescriptor;
-import org.apache.flink.runtime.shuffle.ShuffleMaster;
-import org.apache.flink.runtime.taskmanager.LocalTaskManagerLocation;
-import org.apache.flink.runtime.testtasks.NoOpInvokable;
-import org.apache.flink.util.FlinkException;
-import org.apache.flink.util.TestLogger;
-
-import org.junit.ClassRule;
-import org.junit.Test;
-
-import javax.annotation.Nonnull;
-
-import java.util.Collection;
-import java.util.Collections;
-import java.util.Set;
-import java.util.concurrent.CompletableFuture;
-import java.util.concurrent.CountDownLatch;
-import java.util.concurrent.ExecutionException;
-
-import static org.apache.flink.runtime.io.network.partition.ResultPartitionType.PIPELINED;
-import static org.apache.flink.runtime.jobgraph.DistributionPattern.POINTWISE;
-import static org.hamcrest.Matchers.equalTo;
-import static org.hamcrest.Matchers.hasSize;
-import static org.hamcrest.Matchers.is;
-import static org.hamcrest.Matchers.notNullValue;
-import static org.hamcrest.Matchers.nullValue;
-import static org.hamcrest.Matchers.sameInstance;
-import static org.junit.Assert.assertEquals;
-import static org.junit.Assert.assertFalse;
-import static org.junit.Assert.assertThat;
-import static org.junit.Assert.assertTrue;
-
-/**
- * Tests for the {@link Execution}.
- */
-public class ExecutionTest extends TestLogger {

Review comment:
       Removed the cases that test the legacy scheduling code paths, e.g. `allocateResourcesForExecution()`.
   `ExecutionSlotAllocator`s have taken over slot allocation, and there are tests for them.







[GitHub] [flink] flinkbot edited a comment on pull request #13641: [FLINK-17760][tests] Rework tests to not rely on legacy scheduling codes in ExecutionGraph components

Posted by GitBox <gi...@apache.org>.
flinkbot edited a comment on pull request #13641:
URL: https://github.com/apache/flink/pull/13641#issuecomment-708569491


   ## CI report:
   
   * 09d8deb89416f53dfe8b5c16fb9d723cbd98612c UNKNOWN
   * fe1562c5cda8ecb15f6af1afdf7b6217e6c20c42 UNKNOWN
   * 20de9eefa4d4eb7c8b740cd88fc69f1e764bfa7a Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10331) 
   * e7941f905ec697ad09a3f1010f90a2a69a512ce0 UNKNOWN
   



[GitHub] [flink] tillrohrmann commented on a change in pull request #13641: [FLINK-17760][tests] Rework tests to not rely on legacy scheduling codes in ExecutionGraph components

Posted by GitBox <gi...@apache.org>.
tillrohrmann commented on a change in pull request #13641:
URL: https://github.com/apache/flink/pull/13641#discussion_r543323092



##########
File path: flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/ExecutionTest.java
##########
@@ -82,202 +82,12 @@
 	private final TestingComponentMainThreadExecutor testMainThreadUtil =
 		EXECUTOR_RESOURCE.getComponentMainThreadTestExecutor();
 
-	/**
-	 * Tests that slots are released if we cannot assign the allocated resource to the
-	 * Execution.
-	 */
-	@Test
-	public void testSlotReleaseOnFailedResourceAssignment() throws Exception {
-		final JobVertex jobVertex = createNoOpJobVertex();
-		final JobVertexID jobVertexId = jobVertex.getID();
-
-		final CompletableFuture<LogicalSlot> slotFuture = new CompletableFuture<>();
-		final ProgrammedSlotProvider slotProvider = new ProgrammedSlotProvider(1);
-		slotProvider.addSlot(jobVertexId, 0, slotFuture);
-
-		ExecutionGraph executionGraph = ExecutionGraphTestUtils.createSimpleTestGraph(
-			slotProvider,
-			new NoRestartStrategy(),
-			jobVertex);
-
-		executionGraph.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-
-		ExecutionJobVertex executionJobVertex = executionGraph.getJobVertex(jobVertexId);
-
-		final Execution execution = executionJobVertex.getTaskVertices()[0].getCurrentExecutionAttempt();
-
-		final SingleSlotTestingSlotOwner slotOwner = new SingleSlotTestingSlotOwner();
-
-		final LogicalSlot slot = createTestingLogicalSlot(slotOwner);
-
-		final LogicalSlot otherSlot = new TestingLogicalSlotBuilder().createTestingLogicalSlot();
-
-		CompletableFuture<Execution> allocationFuture = execution.allocateResourcesForExecution(
-			executionGraph.getSlotProviderStrategy(),
-			LocationPreferenceConstraint.ALL,
-			Collections.emptySet());
-
-		assertFalse(allocationFuture.isDone());
-
-		assertEquals(ExecutionState.SCHEDULED, execution.getState());
-
-		// assign a different resource to the execution
-		assertTrue(execution.tryAssignResource(otherSlot));
-
-		// completing now the future should cause the slot to be released
-		slotFuture.complete(slot);
-
-		assertEquals(slot, slotOwner.getReturnedSlotFuture().get());
-	}
-
 	private TestingLogicalSlot createTestingLogicalSlot(SlotOwner slotOwner) {
 		return new TestingLogicalSlotBuilder()
 			.setSlotOwner(slotOwner)
 			.createTestingLogicalSlot();
 	}
 
-	/**
-	 * Tests that the slot is released in case of a execution cancellation when having
-	 * a slot assigned and being in state SCHEDULED.
-	 */
-	@Test
-	public void testSlotReleaseOnExecutionCancellationInScheduled() throws Exception {
-		final JobVertex jobVertex = createNoOpJobVertex();
-		final JobVertexID jobVertexId = jobVertex.getID();
-
-		final SingleSlotTestingSlotOwner slotOwner = new SingleSlotTestingSlotOwner();
-
-		final LogicalSlot slot = createTestingLogicalSlot(slotOwner);
-
-		final ProgrammedSlotProvider slotProvider = new ProgrammedSlotProvider(1);
-		slotProvider.addSlot(jobVertexId, 0, CompletableFuture.completedFuture(slot));
-
-		ExecutionGraph executionGraph = ExecutionGraphTestUtils.createSimpleTestGraph(
-			slotProvider,
-			new NoRestartStrategy(),
-			jobVertex);
-
-		executionGraph.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-
-		ExecutionJobVertex executionJobVertex = executionGraph.getJobVertex(jobVertexId);
-
-		final Execution execution = executionJobVertex.getTaskVertices()[0].getCurrentExecutionAttempt();
-
-		CompletableFuture<Execution> allocationFuture = execution.allocateResourcesForExecution(
-			executionGraph.getSlotProviderStrategy(),
-			LocationPreferenceConstraint.ALL,
-			Collections.emptySet());
-
-		assertTrue(allocationFuture.isDone());
-
-		assertEquals(ExecutionState.SCHEDULED, execution.getState());
-
-		assertEquals(slot, execution.getAssignedResource());
-
-		// cancelling the execution should move it into state CANCELED
-		execution.cancel();
-		assertEquals(ExecutionState.CANCELED, execution.getState());
-
-		assertEquals(slot, slotOwner.getReturnedSlotFuture().get());
-	}
-
-	/**
-	 * Tests that the slot is released in case of a execution cancellation when being in state
-	 * RUNNING.
-	 */
-	@Test
-	public void testSlotReleaseOnExecutionCancellationInRunning() throws Exception {

Review comment:
       Thanks for the information @zhuzhurk. I think you are right.







[GitHub] [flink] flinkbot edited a comment on pull request #13641: [WIP][FLINK-17760][tests] Rework tests to not rely on legacy scheduling logics in ExecutionGraph anymore

Posted by GitBox <gi...@apache.org>.
flinkbot edited a comment on pull request #13641:
URL: https://github.com/apache/flink/pull/13641#issuecomment-708569491


   ## CI report:
   
   * 09d8deb89416f53dfe8b5c16fb9d723cbd98612c UNKNOWN
   * 1e959ffb3e7837247842ae4ade724a999ad7ca3b Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=7734) 
   
   <details>
   <summary>Bot commands</summary>
     The @flinkbot bot supports the following commands:
   
    - `@flinkbot run travis` re-run the last Travis build
    - `@flinkbot run azure` re-run the last Azure build
   </details>


----------------------------------------------------------------



[GitHub] [flink] flinkbot edited a comment on pull request #13641: [WIP][FLINK-17760][tests] Rework tests to not rely on legacy scheduling logics in ExecutionGraph anymore

Posted by GitBox <gi...@apache.org>.
flinkbot edited a comment on pull request #13641:
URL: https://github.com/apache/flink/pull/13641#issuecomment-708569491


   ## CI report:
   
   * 09d8deb89416f53dfe8b5c16fb9d723cbd98612c UNKNOWN
   * 9ea4cf7454f9b8e0c237ba8f540b224abdf9f7b0 Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8279) 
   * 659fb7eddb0acfa0ef49f76c5fafca21c389f3c0 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8562) 
   
   <details>
   <summary>Bot commands</summary>
     The @flinkbot bot supports the following commands:
   
    - `@flinkbot run travis` re-run the last Travis build
    - `@flinkbot run azure` re-run the last Azure build
   </details>


----------------------------------------------------------------



[GitHub] [flink] zhuzhurk commented on a change in pull request #13641: [FLINK-17760][tests] Rework tests to not rely on legacy scheduling codes in ExecutionGraph components

Posted by GitBox <gi...@apache.org>.
zhuzhurk commented on a change in pull request #13641:
URL: https://github.com/apache/flink/pull/13641#discussion_r543428414



##########
File path: flink-runtime/src/test/java/org/apache/flink/runtime/checkpoint/ExecutionGraphCheckpointCoordinatorTest.java
##########
@@ -159,7 +164,7 @@ private ExecutionGraph createExecutionGraphAndEnableCheckpointing(
 			false,
 			0);
 
-		executionGraph.enableCheckpointing(
+		scheduler.getExecutionGraph().enableCheckpointing(

Review comment:
       Good idea. A `TestingCheckpointRecoveryFactory` can really help in this case.

##########
File path: flink-runtime/src/test/java/org/apache/flink/runtime/checkpoint/ExecutionGraphCheckpointCoordinatorTest.java
##########
@@ -159,7 +164,7 @@ private ExecutionGraph createExecutionGraphAndEnableCheckpointing(
 			false,
 			0);
 
-		executionGraph.enableCheckpointing(
+		scheduler.getExecutionGraph().enableCheckpointing(

Review comment:
       done.




----------------------------------------------------------------



[GitHub] [flink] flinkbot edited a comment on pull request #13641: [FLINK-17760][tests] Rework tests to not rely on legacy scheduling codes in ExecutionGraph components

Posted by GitBox <gi...@apache.org>.
flinkbot edited a comment on pull request #13641:
URL: https://github.com/apache/flink/pull/13641#issuecomment-708569491






----------------------------------------------------------------



[GitHub] [flink] zhuzhurk commented on a change in pull request #13641: [FLINK-17760][tests] Rework tests to not rely on legacy scheduling codes in ExecutionGraph components

Posted by GitBox <gi...@apache.org>.
zhuzhurk commented on a change in pull request #13641:
URL: https://github.com/apache/flink/pull/13641#discussion_r533130210



##########
File path: flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/ExecutionGraphNotEnoughResourceTest.java
##########
@@ -113,29 +119,28 @@ public void testRestartWithSlotSharingAndNotEnoughResources() throws Exception {
 			final JobGraph jobGraph = new JobGraph(TEST_JOB_ID, "Test Job", source, sink);
 			jobGraph.setScheduleMode(ScheduleMode.EAGER);
 
-			TestRestartStrategy restartStrategy = new TestRestartStrategy(numRestarts, false);
+			RestartBackoffTimeStrategy restartStrategy = new FixedDelayRestartBackoffTimeStrategy.FixedDelayRestartBackoffTimeStrategyFactory(numRestarts, 0).create();
 
-			final ExecutionGraph eg = TestingExecutionGraphBuilder
-				.newBuilder()
-				.setJobGraph(jobGraph)
-				.setSlotProvider(scheduler)
-				.setRestartStrategy(restartStrategy)
-				.setAllocationTimeout(Time.milliseconds(1L))
+			final SchedulerBase schedulerNG = SchedulerTestingUtils
+				.newSchedulerBuilderWithDefaultSlotAllocator(jobGraph, scheduler, Time.milliseconds(1))
+				.setRestartBackoffTimeStrategy(restartStrategy)
+				.setSchedulingStrategyFactory(new EagerSchedulingStrategy.Factory())
+				.setFailoverStrategyFactory(new RestartAllFailoverStrategy.Factory())
 				.build();
+			final ExecutionGraph eg = schedulerNG.getExecutionGraph();

Review comment:
       Now I think we can remove this test, because it actually tests job failure on slot allocation, which is already covered by `DefaultSchedulerTest#failJobIfNotEnoughResources()` and `PipelinedRegionSchedulingITCase#testFailsOnInsufficientSlot()`.
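
   For readers unfamiliar with the restart configuration in the hunk above, here is a minimal sketch of how a fixed-delay restart policy with a bounded number of attempts behaves. The class `FixedDelayRestartSketch` and its methods are hypothetical stand-ins written for illustration; they are not Flink's `FixedDelayRestartBackoffTimeStrategy`, and the exact semantics of the real class should be checked against the Flink sources.

```java
/** Hypothetical fixed-delay restart policy; illustrative only, not Flink's implementation. */
final class FixedDelayRestartSketch {
    private final int maxRestartAttempts;
    private final long backoffMillis;
    private int failures;

    FixedDelayRestartSketch(int maxRestartAttempts, long backoffMillis) {
        if (maxRestartAttempts < 0 || backoffMillis < 0) {
            throw new IllegalArgumentException("arguments must be non-negative");
        }
        this.maxRestartAttempts = maxRestartAttempts;
        this.backoffMillis = backoffMillis;
    }

    /** Records one more failure of the job. */
    void notifyFailure() {
        failures++;
    }

    /** A restart is allowed while the number of failures does not exceed the attempt budget. */
    boolean canRestart() {
        return failures <= maxRestartAttempts;
    }

    /** The delay is the same for every restart. */
    long getBackoffMillis() {
        return backoffMillis;
    }
}
```

   Under these assumed semantics, a policy created with two attempts and a delay of 0 permits a restart after the first and second failure and refuses after the third, at which point the job would fail terminally.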




----------------------------------------------------------------



[GitHub] [flink] flinkbot commented on pull request #13641: [WIP][FLINK-17760][tests] Rework tests to not rely on legacy scheduling logics in ExecutionGraph anymore

Posted by GitBox <gi...@apache.org>.
flinkbot commented on pull request #13641:
URL: https://github.com/apache/flink/pull/13641#issuecomment-708560922


   Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress of the review.
   
   
   ## Automated Checks
   Last check on commit 09d8deb89416f53dfe8b5c16fb9d723cbd98612c (Wed Oct 14 17:50:08 UTC 2020)
   
   **Warnings:**
    * No documentation files were touched! Remember to keep the Flink docs up to date!
   
   
   <sub>Mention the bot in a comment to re-run the automated checks.</sub>
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process.<details>
    The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer or PMC member is required. <summary>Bot commands</summary>
     The @flinkbot bot supports the following commands:
   
    - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`)
    - `@flinkbot approve all` to approve all aspects
    - `@flinkbot approve-until architecture` to approve everything until `architecture`
    - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention
    - `@flinkbot disapprove architecture` to remove an approval you gave earlier
   </details>


----------------------------------------------------------------



[GitHub] [flink] tillrohrmann commented on a change in pull request #13641: [FLINK-17760][tests] Rework tests to not rely on legacy scheduling codes in ExecutionGraph components

Posted by GitBox <gi...@apache.org>.
tillrohrmann commented on a change in pull request #13641:
URL: https://github.com/apache/flink/pull/13641#discussion_r532473677



##########
File path: flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/ExecutionGraphVariousFailuesTest.java
##########
@@ -20,86 +20,31 @@
 
 import org.apache.flink.api.common.JobStatus;
 import org.apache.flink.runtime.concurrent.ComponentMainThreadExecutorServiceAdapter;
-import org.apache.flink.runtime.execution.SuppressRestartsException;
-import org.apache.flink.runtime.executiongraph.restart.InfiniteDelayRestartStrategy;
 import org.apache.flink.runtime.io.network.partition.ResultPartitionID;
 import org.apache.flink.runtime.jobgraph.IntermediateResultPartitionID;
+import org.apache.flink.runtime.jobgraph.JobGraph;
+import org.apache.flink.runtime.scheduler.SchedulerBase;
+import org.apache.flink.runtime.scheduler.SchedulerTestingUtils;
 import org.apache.flink.util.TestLogger;
 
 import org.junit.Test;
 
 import static org.junit.Assert.assertEquals;
 import static org.junit.Assert.fail;
 
-
 public class ExecutionGraphVariousFailuesTest extends TestLogger {
 
-	/**
-	 * Test that failing in state restarting will retrigger the restarting logic. This means that
-	 * it only goes into the state FAILED after the restart strategy says the job is no longer
-	 * restartable.
-	 */
-	@Test
-	public void testFailureWhileRestarting() throws Exception {
-		final ExecutionGraph eg = ExecutionGraphTestUtils.createSimpleTestGraph(new InfiniteDelayRestartStrategy(2));
-		eg.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-		eg.scheduleForExecution();
-
-		assertEquals(JobStatus.RUNNING, eg.getState());
-		ExecutionGraphTestUtils.switchAllVerticesToRunning(eg);
-
-		eg.failGlobal(new Exception("Test 1"));
-		assertEquals(JobStatus.FAILING, eg.getState());
-		ExecutionGraphTestUtils.completeCancellingForAllVertices(eg);
-
-		// we should restart since we have two restart attempts left
-		assertEquals(JobStatus.RESTARTING, eg.getState());
-
-		eg.failGlobal(new Exception("Test 2"));
-
-		// we should restart since we have one restart attempts left
-		assertEquals(JobStatus.RESTARTING, eg.getState());
-
-		eg.failGlobal(new Exception("Test 3"));
-
-		// after depleting all our restart attempts we should go into Failed
-		assertEquals(JobStatus.FAILED, eg.getState());
-	}
-
-	/**
-	 * Tests that a {@link SuppressRestartsException} in state RESTARTING stops the restarting
-	 * immediately and sets the execution graph's state to FAILED.
-	 */
-	@Test
-	public void testSuppressRestartFailureWhileRestarting() throws Exception {
-		final ExecutionGraph eg = ExecutionGraphTestUtils.createSimpleTestGraph(new InfiniteDelayRestartStrategy(10));
-		eg.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-		eg.scheduleForExecution();
-
-		assertEquals(JobStatus.RUNNING, eg.getState());
-		ExecutionGraphTestUtils.switchAllVerticesToRunning(eg);
-
-		eg.failGlobal(new Exception("test"));
-		assertEquals(JobStatus.FAILING, eg.getState());
-
-		ExecutionGraphTestUtils.completeCancellingForAllVertices(eg);
-		assertEquals(JobStatus.RESTARTING, eg.getState());
-
-		// suppress a possible restart
-		eg.failGlobal(new SuppressRestartsException(new Exception("Test")));
-
-		assertEquals(JobStatus.FAILED, eg.getState());
-	}
-
 	/**
 	 * Tests that a failing scheduleOrUpdateConsumers call with a non-existing execution attempt
 	 * id, will not fail the execution graph.
 	 */
 	@Test
 	public void testFailingScheduleOrUpdateConsumers() throws Exception {
-		final ExecutionGraph eg = ExecutionGraphTestUtils.createSimpleTestGraph(new InfiniteDelayRestartStrategy(10));
-		eg.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-		eg.scheduleForExecution();
+		final SchedulerBase scheduler = SchedulerTestingUtils.newSchedulerBuilder(new JobGraph()).build();
+		scheduler.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
+		scheduler.startScheduling();
+
+		final ExecutionGraph eg = scheduler.getExecutionGraph();

Review comment:
       Would it make sense to annotate tests which are specifically written for the non-pipelined region scheduler with a special annotation? That way it would be much easier to find them when we throw out the old scheduling strategies.
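
   One possible shape for such a marker, as a minimal sketch: the annotation name `LegacySchedulingStrategyOnly`, the stand-in test class, and the reflective finder are all hypothetical and do not exist in Flink. The point is only that a runtime-retained annotation lets the affected tests be located mechanically (or by a plain text search) once the old scheduling strategies are removed.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical marker for tests that only matter to the non-pipelined-region scheduling strategies. */
@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.TYPE, ElementType.METHOD})
@interface LegacySchedulingStrategyOnly {
    String reason() default "";
}

/** Stand-in for a test class; not a real Flink test. */
class ExampleFailuresTest {
    @LegacySchedulingStrategyOnly(reason = "only exercised by lazy-from-sources scheduling")
    public void testFailingScheduleOrUpdateConsumers() {}

    public void testUnrelatedBehaviour() {}
}

final class LegacyTestFinder {
    /** Returns the sorted names of the methods declared in the class that carry the marker. */
    static List<String> findMarkedTests(Class<?> testClass) {
        List<String> names = new ArrayList<>();
        for (Method method : testClass.getDeclaredMethods()) {
            if (method.isAnnotationPresent(LegacySchedulingStrategyOnly.class)) {
                names.add(method.getName());
            }
        }
        Collections.sort(names);
        return names;
    }
}
```

   Calling `LegacyTestFinder.findMarkedTests(ExampleFailuresTest.class)` would list only the annotated method. Whether a dedicated annotation is preferable to a naming convention or a JIRA follow-up is a judgment call for the reviewers.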




----------------------------------------------------------------



[GitHub] [flink] zhuzhurk commented on a change in pull request #13641: [FLINK-17760][tests] Rework tests to not rely on legacy scheduling codes in ExecutionGraph components

Posted by GitBox <gi...@apache.org>.
zhuzhurk commented on a change in pull request #13641:
URL: https://github.com/apache/flink/pull/13641#discussion_r532462500



##########
File path: flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/ExecutionGraphVariousFailuesTest.java
##########
@@ -20,86 +20,31 @@
 
 import org.apache.flink.api.common.JobStatus;
 import org.apache.flink.runtime.concurrent.ComponentMainThreadExecutorServiceAdapter;
-import org.apache.flink.runtime.execution.SuppressRestartsException;
-import org.apache.flink.runtime.executiongraph.restart.InfiniteDelayRestartStrategy;
 import org.apache.flink.runtime.io.network.partition.ResultPartitionID;
 import org.apache.flink.runtime.jobgraph.IntermediateResultPartitionID;
+import org.apache.flink.runtime.jobgraph.JobGraph;
+import org.apache.flink.runtime.scheduler.SchedulerBase;
+import org.apache.flink.runtime.scheduler.SchedulerTestingUtils;
 import org.apache.flink.util.TestLogger;
 
 import org.junit.Test;
 
 import static org.junit.Assert.assertEquals;
 import static org.junit.Assert.fail;
 
-
 public class ExecutionGraphVariousFailuesTest extends TestLogger {
 
-	/**
-	 * Test that failing in state restarting will retrigger the restarting logic. This means that
-	 * it only goes into the state FAILED after the restart strategy says the job is no longer
-	 * restartable.
-	 */
-	@Test
-	public void testFailureWhileRestarting() throws Exception {
-		final ExecutionGraph eg = ExecutionGraphTestUtils.createSimpleTestGraph(new InfiniteDelayRestartStrategy(2));
-		eg.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-		eg.scheduleForExecution();
-
-		assertEquals(JobStatus.RUNNING, eg.getState());
-		ExecutionGraphTestUtils.switchAllVerticesToRunning(eg);
-
-		eg.failGlobal(new Exception("Test 1"));
-		assertEquals(JobStatus.FAILING, eg.getState());
-		ExecutionGraphTestUtils.completeCancellingForAllVertices(eg);
-
-		// we should restart since we have two restart attempts left
-		assertEquals(JobStatus.RESTARTING, eg.getState());
-
-		eg.failGlobal(new Exception("Test 2"));
-
-		// we should restart since we have one restart attempts left
-		assertEquals(JobStatus.RESTARTING, eg.getState());
-
-		eg.failGlobal(new Exception("Test 3"));
-
-		// after depleting all our restart attempts we should go into Failed
-		assertEquals(JobStatus.FAILED, eg.getState());
-	}
-
-	/**
-	 * Tests that a {@link SuppressRestartsException} in state RESTARTING stops the restarting
-	 * immediately and sets the execution graph's state to FAILED.
-	 */
-	@Test
-	public void testSuppressRestartFailureWhileRestarting() throws Exception {
-		final ExecutionGraph eg = ExecutionGraphTestUtils.createSimpleTestGraph(new InfiniteDelayRestartStrategy(10));
-		eg.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-		eg.scheduleForExecution();
-
-		assertEquals(JobStatus.RUNNING, eg.getState());
-		ExecutionGraphTestUtils.switchAllVerticesToRunning(eg);
-
-		eg.failGlobal(new Exception("test"));
-		assertEquals(JobStatus.FAILING, eg.getState());
-
-		ExecutionGraphTestUtils.completeCancellingForAllVertices(eg);
-		assertEquals(JobStatus.RESTARTING, eg.getState());
-
-		// suppress a possible restart
-		eg.failGlobal(new SuppressRestartsException(new Exception("Test")));
-
-		assertEquals(JobStatus.FAILED, eg.getState());
-	}
-
 	/**
 	 * Tests that a failing scheduleOrUpdateConsumers call with a non-existing execution attempt
 	 * id, will not fail the execution graph.
 	 */
 	@Test
 	public void testFailingScheduleOrUpdateConsumers() throws Exception {
-		final ExecutionGraph eg = ExecutionGraphTestUtils.createSimpleTestGraph(new InfiniteDelayRestartStrategy(10));
-		eg.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-		eg.scheduleForExecution();
+		final SchedulerBase scheduler = SchedulerTestingUtils.newSchedulerBuilder(new JobGraph()).build();
+		scheduler.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
+		scheduler.startScheduling();
+
+		final ExecutionGraph eg = scheduler.getExecutionGraph();

Review comment:
       `scheduleOrUpdateConsumers` is still used by lazy-from-sources scheduling with schedulerNG, so we still need this test.
   It is not needed by pipelined region scheduling, though. In fact, the partition cache mechanism is no longer needed for pipelined region scheduling.
   But maybe we still need to keep this mechanism for possible future `SchedulingStrategy` implementations, which may schedule a task before all of its upstream tasks have finished.
   WDYT?
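
   To make the caching idea concrete, here is a minimal sketch of the general pattern: a consumer that is told about upstream partitions before it has been deployed buffers that information and flushes it on deployment. The class `ConsumerTaskSketch` and its method names are hypothetical and heavily simplified; this is an assumption about the shape of the mechanism, not the actual `Execution` code.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical consumer task that may learn about upstream partitions before it is deployed. */
final class ConsumerTaskSketch {
    private final List<String> cachedPartitions = new ArrayList<>();
    private final List<String> deliveredPartitions = new ArrayList<>();
    private boolean deployed;

    /** Called when an upstream partition becomes consumable. */
    void onPartitionConsumable(String partitionId) {
        if (deployed) {
            // The task is already deployed, so the update can be forwarded right away.
            deliveredPartitions.add(partitionId);
        } else {
            // Not deployed yet: remember the partition until deployment.
            cachedPartitions.add(partitionId);
        }
    }

    /** Marks the task as deployed and flushes everything cached so far, in arrival order. */
    void onDeployed() {
        deployed = true;
        deliveredPartitions.addAll(cachedPartitions);
        cachedPartitions.clear();
    }

    /** Returns a copy of the partitions forwarded to the task so far. */
    List<String> delivered() {
        return new ArrayList<>(deliveredPartitions);
    }

    /** Returns how many partitions are still waiting for deployment. */
    int cachedCount() {
        return cachedPartitions.size();
    }
}
```

   If every consumer is only scheduled after its inputs are ready, the cache branch is never taken, which matches the point above that pipelined region scheduling would not need it.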




----------------------------------------------------------------



[GitHub] [flink] zhuzhurk commented on a change in pull request #13641: [WIP][FLINK-17760][tests] Rework tests to not rely on legacy scheduling logics in ExecutionGraph anymore

Posted by GitBox <gi...@apache.org>.
zhuzhurk commented on a change in pull request #13641:
URL: https://github.com/apache/flink/pull/13641#discussion_r512792107



##########
File path: flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/ExecutionGraphDeploymentTest.java
##########
@@ -566,174 +552,6 @@ public void testSettingIllegalMaxNumberOfCheckpointsToRetain() throws Exception
 			eg.getCheckpointCoordinator().getCheckpointStore().getMaxNumberOfRetainedCheckpoints());
 	}
 
-	/**
-	 * Tests that eager scheduling will wait until all input locations have been set before
-	 * scheduling a task.
-	 */
-	@Test
-	public void testEagerSchedulingWaitsOnAllInputPreferredLocations() throws Exception {
-		final int parallelism = 2;
-		final ProgrammedSlotProvider slotProvider = new ProgrammedSlotProvider(parallelism);
-
-		final Time timeout = Time.hours(1L);
-		final JobVertexID sourceVertexId = new JobVertexID();
-		final JobVertex sourceVertex = new JobVertex("Test source", sourceVertexId);
-		sourceVertex.setInvokableClass(NoOpInvokable.class);
-		sourceVertex.setParallelism(parallelism);
-
-		final JobVertexID sinkVertexId = new JobVertexID();
-		final JobVertex sinkVertex = new JobVertex("Test sink", sinkVertexId);
-		sinkVertex.setInvokableClass(NoOpInvokable.class);
-		sinkVertex.setParallelism(parallelism);
-
-		sinkVertex.connectNewDataSetAsInput(sourceVertex, DistributionPattern.ALL_TO_ALL, ResultPartitionType.PIPELINED);
-
-		final Map<JobVertexID, CompletableFuture<LogicalSlot>[]> slotFutures = new HashMap<>(2);
-
-		for (JobVertexID jobVertexID : Arrays.asList(sourceVertexId, sinkVertexId)) {
-			CompletableFuture<LogicalSlot>[] slotFutureArray = new CompletableFuture[parallelism];
-
-			for (int i = 0; i < parallelism; i++) {
-				slotFutureArray[i] = new CompletableFuture<>();
-			}
-
-			slotFutures.put(jobVertexID, slotFutureArray);
-			slotProvider.addSlots(jobVertexID, slotFutureArray);
-		}
-
-		final ScheduledExecutorService scheduledExecutorService = new ScheduledThreadPoolExecutor(3);
-
-		final JobGraph jobGraph = new JobGraph(sourceVertex, sinkVertex);
-		jobGraph.setScheduleMode(ScheduleMode.EAGER);
-
-		final ExecutionGraph executionGraph = TestingExecutionGraphBuilder
-			.newBuilder()
-			.setJobGraph(jobGraph)
-			.setSlotProvider(slotProvider)
-			.setIoExecutor(scheduledExecutorService)
-			.setFutureExecutor(scheduledExecutorService)
-			.setAllocationTimeout(timeout)
-			.setRpcTimeout(timeout)
-			.build();
-
-		executionGraph.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-		executionGraph.scheduleForExecution();
-
-		// all tasks should be in state SCHEDULED
-		for (ExecutionVertex executionVertex : executionGraph.getAllExecutionVertices()) {
-			assertEquals(ExecutionState.SCHEDULED, executionVertex.getCurrentExecutionAttempt().getState());
-		}
-
-		// wait until the source vertex slots have been requested
-		assertTrue(slotProvider.getSlotRequestedFuture(sourceVertexId, 0).get());
-		assertTrue(slotProvider.getSlotRequestedFuture(sourceVertexId, 1).get());
-
-		// check that the sinks have not requested their slots because they need the location
-		// information of the sources
-		assertFalse(slotProvider.getSlotRequestedFuture(sinkVertexId, 0).isDone());
-		assertFalse(slotProvider.getSlotRequestedFuture(sinkVertexId, 1).isDone());
-
-		final TaskManagerLocation localTaskManagerLocation = new LocalTaskManagerLocation();
-
-		final LogicalSlot sourceSlot1 = createSlot(localTaskManagerLocation, 0);
-		final LogicalSlot sourceSlot2 = createSlot(localTaskManagerLocation, 1);
-
-		final LogicalSlot sinkSlot1 = createSlot(localTaskManagerLocation, 0);
-		final LogicalSlot sinkSlot2 = createSlot(localTaskManagerLocation, 1);
-
-		slotFutures.get(sourceVertexId)[0].complete(sourceSlot1);
-		slotFutures.get(sourceVertexId)[1].complete(sourceSlot2);
-
-		// wait until the sink vertex slots have been requested after we completed the source slots
-		assertTrue(slotProvider.getSlotRequestedFuture(sinkVertexId, 0).get());
-		assertTrue(slotProvider.getSlotRequestedFuture(sinkVertexId, 1).get());
-
-		slotFutures.get(sinkVertexId)[0].complete(sinkSlot1);
-		slotFutures.get(sinkVertexId)[1].complete(sinkSlot2);
-
-		for (ExecutionVertex executionVertex : executionGraph.getAllExecutionVertices()) {
-			ExecutionGraphTestUtils.waitUntilExecutionState(executionVertex.getCurrentExecutionAttempt(), ExecutionState.DEPLOYING, 5000L);
-		}
-	}
-
-	/**
-	 * Tests that the {@link ExecutionGraph} is deployed in topological order.
-	 */
-	@Test
-	public void testExecutionGraphIsDeployedInTopologicalOrder() throws Exception {

Review comment:
       This can be removed because we already have `DefaultSchedulerTest#scheduledVertexOrderFromSchedulingStrategyIsRespected()`.




----------------------------------------------------------------



[GitHub] [flink] zhuzhurk commented on a change in pull request #13641: [WIP][FLINK-17760][tests] Rework tests to not rely on legacy scheduling logics in ExecutionGraph anymore

Posted by GitBox <gi...@apache.org>.
zhuzhurk commented on a change in pull request #13641:
URL: https://github.com/apache/flink/pull/13641#discussion_r514115275



##########
File path: flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/ExecutionTest.java
##########
@@ -1,570 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- *     http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-
-package org.apache.flink.runtime.executiongraph;
-
-import org.apache.flink.runtime.checkpoint.JobManagerTaskRestore;
-import org.apache.flink.runtime.checkpoint.TaskStateSnapshot;
-import org.apache.flink.runtime.concurrent.ComponentMainThreadExecutorServiceAdapter;
-import org.apache.flink.runtime.execution.ExecutionState;
-import org.apache.flink.runtime.executiongraph.restart.NoRestartStrategy;
-import org.apache.flink.runtime.executiongraph.utils.SimpleAckingTaskManagerGateway;
-import org.apache.flink.runtime.jobgraph.JobGraph;
-import org.apache.flink.runtime.jobgraph.JobVertex;
-import org.apache.flink.runtime.jobgraph.JobVertexID;
-import org.apache.flink.runtime.jobgraph.tasks.AbstractInvokable;
-import org.apache.flink.runtime.jobmanager.scheduler.LocationPreferenceConstraint;
-import org.apache.flink.runtime.jobmaster.LogicalSlot;
-import org.apache.flink.runtime.jobmaster.SlotOwner;
-import org.apache.flink.runtime.jobmaster.SlotRequestId;
-import org.apache.flink.runtime.jobmaster.TestingLogicalSlot;
-import org.apache.flink.runtime.jobmaster.TestingLogicalSlotBuilder;
-import org.apache.flink.runtime.jobmaster.slotpool.SingleLogicalSlot;
-import org.apache.flink.runtime.jobmaster.slotpool.SlotProvider;
-import org.apache.flink.runtime.shuffle.PartitionDescriptor;
-import org.apache.flink.runtime.shuffle.ProducerDescriptor;
-import org.apache.flink.runtime.shuffle.ShuffleDescriptor;
-import org.apache.flink.runtime.shuffle.ShuffleMaster;
-import org.apache.flink.runtime.taskmanager.LocalTaskManagerLocation;
-import org.apache.flink.runtime.testtasks.NoOpInvokable;
-import org.apache.flink.util.FlinkException;
-import org.apache.flink.util.TestLogger;
-
-import org.junit.ClassRule;
-import org.junit.Test;
-
-import javax.annotation.Nonnull;
-
-import java.util.Collection;
-import java.util.Collections;
-import java.util.Set;
-import java.util.concurrent.CompletableFuture;
-import java.util.concurrent.CountDownLatch;
-import java.util.concurrent.ExecutionException;
-
-import static org.apache.flink.runtime.io.network.partition.ResultPartitionType.PIPELINED;
-import static org.apache.flink.runtime.jobgraph.DistributionPattern.POINTWISE;
-import static org.hamcrest.Matchers.equalTo;
-import static org.hamcrest.Matchers.hasSize;
-import static org.hamcrest.Matchers.is;
-import static org.hamcrest.Matchers.notNullValue;
-import static org.hamcrest.Matchers.nullValue;
-import static org.hamcrest.Matchers.sameInstance;
-import static org.junit.Assert.assertEquals;
-import static org.junit.Assert.assertFalse;
-import static org.junit.Assert.assertThat;
-import static org.junit.Assert.assertTrue;
-
-/**
- * Tests for the {@link Execution}.
- */
-public class ExecutionTest extends TestLogger {

Review comment:
       Removed the cases which test the legacy scheduling code paths, e.g. `allocateResourcesForExecution()`.
   `ExecutionSlotAllocator`s have taken over slot allocation, and there are tests for them.




----------------------------------------------------------------



[GitHub] [flink] azagrebin commented on a change in pull request #13641: [FLINK-17760][tests] Rework tests to not rely on legacy scheduling codes in ExecutionGraph components

Posted by GitBox <gi...@apache.org>.
azagrebin commented on a change in pull request #13641:
URL: https://github.com/apache/flink/pull/13641#discussion_r532697063



##########
File path: flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/ExecutionGraphSchedulingTest.java
##########
@@ -1,637 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- *     http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-
-package org.apache.flink.runtime.executiongraph;
-
-import org.apache.flink.api.common.JobID;
-import org.apache.flink.api.common.JobStatus;
-import org.apache.flink.api.common.time.Time;
-import org.apache.flink.configuration.Configuration;
-import org.apache.flink.metrics.groups.UnregisteredMetricsGroup;
-import org.apache.flink.runtime.blob.VoidBlobWriter;
-import org.apache.flink.runtime.checkpoint.StandaloneCheckpointRecoveryFactory;
-import org.apache.flink.runtime.clusterframework.types.AllocationID;
-import org.apache.flink.runtime.clusterframework.types.ResourceID;
-import org.apache.flink.runtime.concurrent.ComponentMainThreadExecutorServiceAdapter;
-import org.apache.flink.runtime.execution.ExecutionState;
-import org.apache.flink.runtime.executiongraph.restart.NoRestartStrategy;
-import org.apache.flink.runtime.executiongraph.utils.SimpleAckingTaskManagerGateway;
-import org.apache.flink.runtime.instance.SimpleSlotContext;
-import org.apache.flink.runtime.io.network.partition.NoOpJobMasterPartitionTracker;
-import org.apache.flink.runtime.io.network.partition.ResultPartitionType;
-import org.apache.flink.runtime.jobgraph.DistributionPattern;
-import org.apache.flink.runtime.jobgraph.JobGraph;
-import org.apache.flink.runtime.jobgraph.JobVertex;
-import org.apache.flink.runtime.jobgraph.ScheduleMode;
-import org.apache.flink.runtime.jobmanager.scheduler.Locality;
-import org.apache.flink.runtime.jobmanager.slots.DummySlotOwner;
-import org.apache.flink.runtime.jobmanager.slots.TaskManagerGateway;
-import org.apache.flink.runtime.jobmanager.slots.TestingSlotOwner;
-import org.apache.flink.runtime.jobmaster.LogicalSlot;
-import org.apache.flink.runtime.jobmaster.SlotOwner;
-import org.apache.flink.runtime.jobmaster.SlotRequestId;
-import org.apache.flink.runtime.jobmaster.TestingLogicalSlot;
-import org.apache.flink.runtime.jobmaster.TestingLogicalSlotBuilder;
-import org.apache.flink.runtime.jobmaster.slotpool.SingleLogicalSlot;
-import org.apache.flink.runtime.jobmaster.slotpool.SlotProvider;
-import org.apache.flink.runtime.shuffle.NettyShuffleMaster;
-import org.apache.flink.runtime.taskmanager.TaskManagerLocation;
-import org.apache.flink.runtime.testtasks.NoOpInvokable;
-import org.apache.flink.runtime.testutils.DirectScheduledExecutorService;
-import org.apache.flink.util.FlinkException;
-import org.apache.flink.util.TestLogger;
-
-import org.junit.After;
-import org.junit.Test;
-
-import javax.annotation.Nonnull;
-
-import java.net.InetAddress;
-import java.util.Set;
-import java.util.concurrent.ArrayBlockingQueue;
-import java.util.concurrent.BlockingQueue;
-import java.util.concurrent.CompletableFuture;
-import java.util.concurrent.ConcurrentHashMap;
-import java.util.concurrent.ConcurrentMap;
-import java.util.concurrent.ScheduledExecutorService;
-import java.util.concurrent.TimeUnit;
-import java.util.concurrent.TimeoutException;
-
-import static org.hamcrest.Matchers.empty;
-import static org.hamcrest.Matchers.is;
-import static org.junit.Assert.assertEquals;
-import static org.junit.Assert.assertFalse;
-import static org.junit.Assert.assertThat;
-
-/**
- * Tests for the scheduling of the execution graph. This tests that
- * for example the order of deployments is correct and that bulk slot allocation
- * works properly.
- */
-public class ExecutionGraphSchedulingTest extends TestLogger {

Review comment:
       I am not sure whether we test how slot release/timeout affects the new scheduling, i.e. failed allocation assignments.
   The same goes for `testCancellationOfIncompleteScheduling`.

##########
File path: flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/ExecutionVertexCancelTest.java
##########
@@ -244,104 +250,6 @@ public void testSendCancelAndReceiveFail() throws Exception {
 		assertEquals(vertices.length - 1, exec.getVertex().getExecutionGraph().getRegisteredExecutions().size());
 	}
 
-	// --------------------------------------------------------------------------------------------
-	//  Actions after a vertex has been canceled or while canceling
-	// --------------------------------------------------------------------------------------------
-
-	@Test
-	public void testScheduleOrDeployAfterCancel() {
-		try {
-			final ExecutionVertex vertex = getExecutionVertex();
-			setVertexState(vertex, ExecutionState.CANCELED);
-
-			assertEquals(ExecutionState.CANCELED, vertex.getExecutionState());
-
-			// 1)
-			// scheduling after being canceled should be tolerated (no exception) because
-			// it can occur as the result of races
-			{
-				vertex.scheduleForExecution(
-					TestingSlotProviderStrategy.from(new ProgrammedSlotProvider(1)),
-					LocationPreferenceConstraint.ALL,
-					Collections.emptySet());
-
-				assertEquals(ExecutionState.CANCELED, vertex.getExecutionState());
-			}
-
-			// 2)
-			// deploying after canceling from CREATED needs to raise an exception, because
-			// the scheduler (or any caller) needs to know that the slot should be released
-			try {
-
-				final LogicalSlot slot = new TestingLogicalSlotBuilder().createTestingLogicalSlot();
-
-				vertex.deployToSlot(slot);
-				fail("Method should throw an exception");
-			}
-			catch (IllegalStateException e) {
-				assertEquals(ExecutionState.CANCELED, vertex.getExecutionState());
-			}
-		}
-		catch (Exception e) {
-			e.printStackTrace();
-			fail(e.getMessage());
-		}
-	}
-
-	@Test
-	public void testActionsWhileCancelling() {

Review comment:
       The tests here look to me more like checks of the valid state transitions.
   Are the same tests not still relevant for the new scheduler (`scheduler.updateTaskExecutionState`)?
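
To make the question concrete, here is a minimal, self-contained sketch of what a state-transition check amounts to. The `SketchState` enum, the transition table, and the `TransitionSketch` class are simplified assumptions for illustration only, not Flink's actual `ExecutionState` rules; a real test would presumably drive the transitions through `scheduler.updateTaskExecutionState` instead.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical, simplified task states; NOT Flink's actual ExecutionState rules. */
enum SketchState { CREATED, SCHEDULED, DEPLOYING, RUNNING, CANCELING, CANCELED, FINISHED, FAILED }

/** Toy state machine: disallowed transitions are ignored instead of throwing. */
final class TransitionSketch {

    // Assumed transition table, chosen only for illustration.
    private static final Map<SketchState, Set<SketchState>> ALLOWED =
        new EnumMap<>(SketchState.class);

    static {
        ALLOWED.put(SketchState.CREATED,
            EnumSet.of(SketchState.SCHEDULED, SketchState.CANCELED, SketchState.FAILED));
        ALLOWED.put(SketchState.SCHEDULED,
            EnumSet.of(SketchState.DEPLOYING, SketchState.CANCELED, SketchState.FAILED));
        ALLOWED.put(SketchState.DEPLOYING,
            EnumSet.of(SketchState.RUNNING, SketchState.CANCELING, SketchState.FAILED));
        ALLOWED.put(SketchState.RUNNING,
            EnumSet.of(SketchState.FINISHED, SketchState.CANCELING, SketchState.FAILED));
        ALLOWED.put(SketchState.CANCELING,
            EnumSet.of(SketchState.CANCELED, SketchState.FAILED));
        // Terminal states allow no further transitions.
        ALLOWED.put(SketchState.CANCELED, EnumSet.noneOf(SketchState.class));
        ALLOWED.put(SketchState.FINISHED, EnumSet.noneOf(SketchState.class));
        ALLOWED.put(SketchState.FAILED, EnumSet.noneOf(SketchState.class));
    }

    private SketchState current = SketchState.CREATED;

    SketchState current() {
        return current;
    }

    /** Applies the transition if the table allows it; returns whether it was applied. */
    boolean tryTransition(SketchState target) {
        if (!ALLOWED.get(current).contains(target)) {
            return false; // tolerated, e.g. a late "schedule" racing with a cancel
        }
        current = target;
        return true;
    }

    public static void main(String[] args) {
        TransitionSketch vertex = new TransitionSketch();
        vertex.tryTransition(SketchState.CANCELED);
        // Scheduling after cancellation is ignored; the state stays CANCELED.
        boolean applied = vertex.tryTransition(SketchState.SCHEDULED);
        System.out.println(applied + " " + vertex.current());
    }
}
```

The point of such a check is the same as in the removed `testScheduleOrDeployAfterCancel`: a scheduling request that arrives after cancellation must leave the vertex in `CANCELED`.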

##########
File path: flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/ExecutionVertexInputConstraintTest.java
##########
@@ -1,276 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- *     http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-
-package org.apache.flink.runtime.executiongraph;
-
-import org.apache.flink.api.common.InputDependencyConstraint;
-import org.apache.flink.api.common.JobStatus;
-import org.apache.flink.runtime.concurrent.ComponentMainThreadExecutor;
-import org.apache.flink.runtime.concurrent.ComponentMainThreadExecutorServiceAdapter;
-import org.apache.flink.runtime.execution.ExecutionState;
-import org.apache.flink.runtime.executiongraph.utils.SimpleSlotProvider;
-import org.apache.flink.runtime.io.network.partition.ResultPartitionID;
-import org.apache.flink.runtime.io.network.partition.ResultPartitionType;
-import org.apache.flink.runtime.jobgraph.DistributionPattern;
-import org.apache.flink.runtime.jobgraph.JobGraph;
-import org.apache.flink.runtime.jobgraph.JobVertex;
-import org.apache.flink.runtime.jobgraph.JobVertexID;
-import org.apache.flink.runtime.jobgraph.tasks.AbstractInvokable;
-import org.apache.flink.runtime.jobmaster.slotpool.SlotProvider;
-import org.apache.flink.runtime.taskmanager.TaskExecutionState;
-import org.apache.flink.util.TestLogger;
-
-import org.junit.Test;
-
-import java.time.Duration;
-import java.util.Arrays;
-import java.util.List;
-
-import static org.apache.flink.runtime.executiongraph.ExecutionGraphTestUtils.isInExecutionState;
-import static org.apache.flink.runtime.executiongraph.ExecutionGraphTestUtils.waitForAllExecutionsPredicate;
-import static org.apache.flink.runtime.executiongraph.ExecutionGraphTestUtils.waitUntilExecutionVertexState;
-import static org.apache.flink.runtime.executiongraph.ExecutionGraphTestUtils.waitUntilJobStatus;
-import static org.hamcrest.Matchers.lessThan;
-import static org.junit.Assert.assertFalse;
-import static org.junit.Assert.assertThat;
-import static org.junit.Assert.assertTrue;
-
-/**
- * Tests for the inputs constraint for {@link ExecutionVertex}.
- */
-public class ExecutionVertexInputConstraintTest extends TestLogger {

Review comment:
       Is the performance test not still relevant for the `InputDependencyConstraintChecker`?

##########
File path: flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/ExecutionGraphRestartTest.java
##########
@@ -1,866 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- *     http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-
-package org.apache.flink.runtime.executiongraph;
-
-import org.apache.flink.api.common.ExecutionConfig;
-import org.apache.flink.api.common.JobID;
-import org.apache.flink.api.common.JobStatus;
-import org.apache.flink.api.common.restartstrategy.RestartStrategies;
-import org.apache.flink.api.common.time.Time;
-import org.apache.flink.runtime.clusterframework.types.AllocationID;
-import org.apache.flink.runtime.clusterframework.types.ResourceProfile;
-import org.apache.flink.runtime.concurrent.ComponentMainThreadExecutor;
-import org.apache.flink.runtime.concurrent.ComponentMainThreadExecutorServiceAdapter;
-import org.apache.flink.runtime.execution.ExecutionState;
-import org.apache.flink.runtime.execution.SuppressRestartsException;
-import org.apache.flink.runtime.executiongraph.failover.FailoverStrategy;
-import org.apache.flink.runtime.executiongraph.restart.FixedDelayRestartStrategy;
-import org.apache.flink.runtime.executiongraph.restart.InfiniteDelayRestartStrategy;
-import org.apache.flink.runtime.executiongraph.restart.RestartStrategy;
-import org.apache.flink.runtime.executiongraph.utils.NotCancelAckingTaskGateway;
-import org.apache.flink.runtime.executiongraph.utils.SimpleAckingTaskManagerGateway;
-import org.apache.flink.runtime.executiongraph.utils.SimpleSlotProvider;
-import org.apache.flink.runtime.io.network.partition.ResultPartitionType;
-import org.apache.flink.runtime.jobgraph.DistributionPattern;
-import org.apache.flink.runtime.jobgraph.JobGraph;
-import org.apache.flink.runtime.jobgraph.JobVertex;
-import org.apache.flink.runtime.jobgraph.ScheduleMode;
-import org.apache.flink.runtime.jobmanager.scheduler.SlotSharingGroup;
-import org.apache.flink.runtime.jobmanager.slots.TaskManagerGateway;
-import org.apache.flink.runtime.jobmaster.JobMasterId;
-import org.apache.flink.runtime.jobmaster.slotpool.LocationPreferenceSlotSelectionStrategy;
-import org.apache.flink.runtime.jobmaster.slotpool.Scheduler;
-import org.apache.flink.runtime.jobmaster.slotpool.SchedulerImpl;
-import org.apache.flink.runtime.jobmaster.slotpool.SlotPool;
-import org.apache.flink.runtime.jobmaster.slotpool.SlotPoolImpl;
-import org.apache.flink.runtime.jobmaster.slotpool.SlotProvider;
-import org.apache.flink.runtime.jobmaster.slotpool.TestingSlotPoolImpl;
-import org.apache.flink.runtime.resourcemanager.ResourceManagerGateway;
-import org.apache.flink.runtime.resourcemanager.utils.TestingResourceManagerGateway;
-import org.apache.flink.runtime.taskexecutor.slot.SlotOffer;
-import org.apache.flink.runtime.taskmanager.LocalTaskManagerLocation;
-import org.apache.flink.runtime.taskmanager.TaskManagerLocation;
-import org.apache.flink.runtime.testtasks.NoOpInvokable;
-import org.apache.flink.util.FlinkException;
-import org.apache.flink.util.TestLogger;
-
-import org.junit.Test;
-
-import javax.annotation.Nonnull;
-
-import java.io.IOException;
-import java.util.ArrayList;
-import java.util.Iterator;
-import java.util.List;
-import java.util.concurrent.CompletableFuture;
-import java.util.function.Consumer;
-
-import static org.apache.flink.runtime.executiongraph.ExecutionGraphTestUtils.completeCancellingForAllVertices;
-import static org.apache.flink.runtime.executiongraph.ExecutionGraphTestUtils.createNoOpVertex;
-import static org.apache.flink.runtime.executiongraph.ExecutionGraphTestUtils.finishAllVertices;
-import static org.apache.flink.runtime.executiongraph.ExecutionGraphTestUtils.switchToRunning;
-import static org.hamcrest.Matchers.is;
-import static org.hamcrest.Matchers.lessThanOrEqualTo;
-import static org.junit.Assert.assertEquals;
-import static org.junit.Assert.assertNotEquals;
-import static org.junit.Assert.assertNotNull;
-import static org.junit.Assert.assertThat;
-import static org.junit.Assert.assertTrue;
-
-/**
- * Tests the restart behaviour of the {@link ExecutionGraph}.
- */
-public class ExecutionGraphRestartTest extends TestLogger {

Review comment:
       I am not sure the following tests are covered in the new scheduler:
   ```
   testCancelWhileFailing
   testFailWhileCanceling
   testTaskFailingWhileGlobalFailing
   testFailingExecutionAfterRestart
   testFailExecution(Graph)AfterCancel
   testSuspendWhileRestarting
   testFailureWhileRestarting
   ```

##########
File path: flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/ExecutionGraphNotEnoughResourceTest.java
##########
@@ -113,29 +119,28 @@ public void testRestartWithSlotSharingAndNotEnoughResources() throws Exception {
 			final JobGraph jobGraph = new JobGraph(TEST_JOB_ID, "Test Job", source, sink);
 			jobGraph.setScheduleMode(ScheduleMode.EAGER);
 
-			TestRestartStrategy restartStrategy = new TestRestartStrategy(numRestarts, false);
+			RestartBackoffTimeStrategy restartStrategy = new FixedDelayRestartBackoffTimeStrategy.FixedDelayRestartBackoffTimeStrategyFactory(numRestarts, 0).create();
 
-			final ExecutionGraph eg = TestingExecutionGraphBuilder
-				.newBuilder()
-				.setJobGraph(jobGraph)
-				.setSlotProvider(scheduler)
-				.setRestartStrategy(restartStrategy)
-				.setAllocationTimeout(Time.milliseconds(1L))
+			final SchedulerBase schedulerNG = SchedulerTestingUtils
+				.newSchedulerBuilderWithDefaultSlotAllocator(jobGraph, scheduler, Time.milliseconds(1))
+				.setRestartBackoffTimeStrategy(restartStrategy)
+				.setSchedulingStrategyFactory(new EagerSchedulingStrategy.Factory())
+				.setFailoverStrategyFactory(new RestartAllFailoverStrategy.Factory())
 				.build();
+			final ExecutionGraph eg = schedulerNG.getExecutionGraph();

Review comment:
       ok

##########
File path: flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/ExecutionGraphVariousFailuesTest.java
##########
@@ -20,86 +20,31 @@
 
 import org.apache.flink.api.common.JobStatus;
 import org.apache.flink.runtime.concurrent.ComponentMainThreadExecutorServiceAdapter;
-import org.apache.flink.runtime.execution.SuppressRestartsException;
-import org.apache.flink.runtime.executiongraph.restart.InfiniteDelayRestartStrategy;
 import org.apache.flink.runtime.io.network.partition.ResultPartitionID;
 import org.apache.flink.runtime.jobgraph.IntermediateResultPartitionID;
+import org.apache.flink.runtime.jobgraph.JobGraph;
+import org.apache.flink.runtime.scheduler.SchedulerBase;
+import org.apache.flink.runtime.scheduler.SchedulerTestingUtils;
 import org.apache.flink.util.TestLogger;
 
 import org.junit.Test;
 
 import static org.junit.Assert.assertEquals;
 import static org.junit.Assert.fail;
 
-
 public class ExecutionGraphVariousFailuesTest extends TestLogger {
 
-	/**
-	 * Test that failing in state restarting will retrigger the restarting logic. This means that
-	 * it only goes into the state FAILED after the restart strategy says the job is no longer
-	 * restartable.
-	 */
-	@Test
-	public void testFailureWhileRestarting() throws Exception {
-		final ExecutionGraph eg = ExecutionGraphTestUtils.createSimpleTestGraph(new InfiniteDelayRestartStrategy(2));
-		eg.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-		eg.scheduleForExecution();
-
-		assertEquals(JobStatus.RUNNING, eg.getState());
-		ExecutionGraphTestUtils.switchAllVerticesToRunning(eg);
-
-		eg.failGlobal(new Exception("Test 1"));
-		assertEquals(JobStatus.FAILING, eg.getState());
-		ExecutionGraphTestUtils.completeCancellingForAllVertices(eg);
-
-		// we should restart since we have two restart attempts left
-		assertEquals(JobStatus.RESTARTING, eg.getState());
-
-		eg.failGlobal(new Exception("Test 2"));
-
-		// we should restart since we have one restart attempts left
-		assertEquals(JobStatus.RESTARTING, eg.getState());
-
-		eg.failGlobal(new Exception("Test 3"));
-
-		// after depleting all our restart attempts we should go into Failed
-		assertEquals(JobStatus.FAILED, eg.getState());
-	}
-
-	/**
-	 * Tests that a {@link SuppressRestartsException} in state RESTARTING stops the restarting
-	 * immediately and sets the execution graph's state to FAILED.
-	 */
-	@Test
-	public void testSuppressRestartFailureWhileRestarting() throws Exception {
-		final ExecutionGraph eg = ExecutionGraphTestUtils.createSimpleTestGraph(new InfiniteDelayRestartStrategy(10));
-		eg.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-		eg.scheduleForExecution();
-
-		assertEquals(JobStatus.RUNNING, eg.getState());
-		ExecutionGraphTestUtils.switchAllVerticesToRunning(eg);
-
-		eg.failGlobal(new Exception("test"));
-		assertEquals(JobStatus.FAILING, eg.getState());
-
-		ExecutionGraphTestUtils.completeCancellingForAllVertices(eg);
-		assertEquals(JobStatus.RESTARTING, eg.getState());
-
-		// suppress a possible restart
-		eg.failGlobal(new SuppressRestartsException(new Exception("Test")));
-
-		assertEquals(JobStatus.FAILED, eg.getState());
-	}
-
 	/**
 	 * Tests that a failing scheduleOrUpdateConsumers call with a non-existing execution attempt
 	 * id, will not fail the execution graph.
 	 */
 	@Test
 	public void testFailingScheduleOrUpdateConsumers() throws Exception {
-		final ExecutionGraph eg = ExecutionGraphTestUtils.createSimpleTestGraph(new InfiniteDelayRestartStrategy(10));
-		eg.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-		eg.scheduleForExecution();
+		final SchedulerBase scheduler = SchedulerTestingUtils.newSchedulerBuilder(new JobGraph()).build();
+		scheduler.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
+		scheduler.startScheduling();
+
+		final ExecutionGraph eg = scheduler.getExecutionGraph();

Review comment:
       I think I meant we could call `SchedulerBase::scheduleOrUpdateConsumers` to test.

##########
File path: flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/ExecutionTest.java
##########
@@ -82,202 +82,12 @@
 	private final TestingComponentMainThreadExecutor testMainThreadUtil =
 		EXECUTOR_RESOURCE.getComponentMainThreadTestExecutor();
 
-	/**
-	 * Tests that slots are released if we cannot assign the allocated resource to the
-	 * Execution.
-	 */
-	@Test
-	public void testSlotReleaseOnFailedResourceAssignment() throws Exception {

Review comment:
       I am not sure we test allocation cancellation with the new scheduler.

##########
File path: flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/GlobalModVersionTest.java
##########
@@ -1,200 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- *     http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-
-package org.apache.flink.runtime.executiongraph;
-
-import org.apache.flink.api.common.JobID;
-import org.apache.flink.api.common.JobStatus;
-import org.apache.flink.runtime.concurrent.ComponentMainThreadExecutorServiceAdapter;
-import org.apache.flink.runtime.execution.ExecutionState;
-import org.apache.flink.runtime.executiongraph.failover.FailoverStrategy;
-import org.apache.flink.runtime.executiongraph.failover.FailoverStrategy.Factory;
-import org.apache.flink.runtime.executiongraph.restart.InfiniteDelayRestartStrategy;
-import org.apache.flink.runtime.executiongraph.utils.SimpleSlotProvider;
-import org.apache.flink.runtime.jobgraph.JobGraph;
-import org.apache.flink.runtime.jobgraph.JobVertex;
-import org.apache.flink.runtime.testtasks.NoOpInvokable;
-import org.apache.flink.util.TestLogger;
-
-import org.junit.Test;
-
-import java.util.Random;
-
-import static org.apache.flink.runtime.executiongraph.ExecutionGraphTestUtils.waitUntilExecutionState;
-import static org.junit.Assert.assertEquals;
-import static org.mockito.Mockito.any;
-import static org.mockito.Mockito.mock;
-import static org.mockito.Mockito.times;
-import static org.mockito.Mockito.verify;
-
-public class GlobalModVersionTest extends TestLogger {

Review comment:
       I am not sure these tests are irrelevant for the new scheduling, e.g. the concurrent global/local failure attempts.
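
As a rough illustration of the concurrency concern, the sketch below models a global modification version: a counter that is bumped on every global failover, so that a local recovery started under an older version can be detected and dropped. This is a toy model under that assumption; `VersionedFailoverSketch` and all of its methods are hypothetical names, not Flink classes or APIs.

```java
/** Hypothetical toy model of version-guarded failover; not a Flink class. */
final class VersionedFailoverSketch {

    private long globalModVersion = 0;
    private int appliedLocalRecoveries = 0;

    /** Version a local recovery should remember when it starts. */
    long currentVersion() {
        return globalModVersion;
    }

    int appliedLocalRecoveries() {
        return appliedLocalRecoveries;
    }

    /** A global failover invalidates every local recovery that is still in flight. */
    void failGlobal() {
        globalModVersion++;
    }

    /**
     * Completes a local recovery only if no global failover happened since it started.
     * Returns whether the recovery was applied.
     */
    boolean tryLocalRecovery(long versionSeenAtStart) {
        if (versionSeenAtStart != globalModVersion) {
            return false; // stale: the local recovery yields to the global failover
        }
        appliedLocalRecoveries++;
        return true;
    }

    public static void main(String[] args) {
        VersionedFailoverSketch graph = new VersionedFailoverSketch();
        long seen = graph.currentVersion(); // local recovery starts
        graph.failGlobal();                 // concurrent global failover
        System.out.println(graph.tryLocalRecovery(seen));
    }
}
```

A test for the new scheduler would need to cover the equivalent situation, whatever mechanism replaces the version counter there.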







[GitHub] [flink] zhuzhurk commented on a change in pull request #13641: [FLINK-17760][tests] Rework tests to not rely on legacy scheduling codes in ExecutionGraph components

Posted by GitBox <gi...@apache.org>.
zhuzhurk commented on a change in pull request #13641:
URL: https://github.com/apache/flink/pull/13641#discussion_r533275742



##########
File path: flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/ExecutionGraphNotEnoughResourceTest.java
##########
@@ -113,29 +119,28 @@ public void testRestartWithSlotSharingAndNotEnoughResources() throws Exception {
 			final JobGraph jobGraph = new JobGraph(TEST_JOB_ID, "Test Job", source, sink);
 			jobGraph.setScheduleMode(ScheduleMode.EAGER);
 
-			TestRestartStrategy restartStrategy = new TestRestartStrategy(numRestarts, false);
+			RestartBackoffTimeStrategy restartStrategy = new FixedDelayRestartBackoffTimeStrategy.FixedDelayRestartBackoffTimeStrategyFactory(numRestarts, 0).create();
 
-			final ExecutionGraph eg = TestingExecutionGraphBuilder
-				.newBuilder()
-				.setJobGraph(jobGraph)
-				.setSlotProvider(scheduler)
-				.setRestartStrategy(restartStrategy)
-				.setAllocationTimeout(Time.milliseconds(1L))
+			final SchedulerBase schedulerNG = SchedulerTestingUtils
+				.newSchedulerBuilderWithDefaultSlotAllocator(jobGraph, scheduler, Time.milliseconds(1))
+				.setRestartBackoffTimeStrategy(restartStrategy)
+				.setSchedulingStrategyFactory(new EagerSchedulingStrategy.Factory())
+				.setFailoverStrategyFactory(new RestartAllFailoverStrategy.Factory())
 				.build();
+			final ExecutionGraph eg = schedulerNG.getExecutionGraph();

Review comment:
       done.







[GitHub] [flink] zhuzhurk commented on a change in pull request #13641: [FLINK-17760][tests] Rework tests to not rely on legacy scheduling codes in ExecutionGraph components

Posted by GitBox <gi...@apache.org>.
zhuzhurk commented on a change in pull request #13641:
URL: https://github.com/apache/flink/pull/13641#discussion_r532479406



##########
File path: flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/ExecutionGraphNotEnoughResourceTest.java
##########
@@ -113,29 +119,28 @@ public void testRestartWithSlotSharingAndNotEnoughResources() throws Exception {
 			final JobGraph jobGraph = new JobGraph(TEST_JOB_ID, "Test Job", source, sink);
 			jobGraph.setScheduleMode(ScheduleMode.EAGER);
 
-			TestRestartStrategy restartStrategy = new TestRestartStrategy(numRestarts, false);
+			RestartBackoffTimeStrategy restartStrategy = new FixedDelayRestartBackoffTimeStrategy.FixedDelayRestartBackoffTimeStrategyFactory(numRestarts, 0).create();
 
-			final ExecutionGraph eg = TestingExecutionGraphBuilder
-				.newBuilder()
-				.setJobGraph(jobGraph)
-				.setSlotProvider(scheduler)
-				.setRestartStrategy(restartStrategy)
-				.setAllocationTimeout(Time.milliseconds(1L))
+			final SchedulerBase schedulerNG = SchedulerTestingUtils
+				.newSchedulerBuilderWithDefaultSlotAllocator(jobGraph, scheduler, Time.milliseconds(1))
+				.setRestartBackoffTimeStrategy(restartStrategy)
+				.setSchedulingStrategyFactory(new EagerSchedulingStrategy.Factory())
+				.setFailoverStrategyFactory(new RestartAllFailoverStrategy.Factory())
 				.build();
+			final ExecutionGraph eg = schedulerNG.getExecutionGraph();

Review comment:
       I agree that in the future we will make ExecutionGraph a pure topology, but I think we need to do it step by step. In my mind, the first step could be factoring the actions out of the ExecutionGraph components, and the second step could be factoring out all the fields unrelated to the topology.
   The first step itself consists of multiple sub-steps, and FLINK-15626 is just one of them. So I'd like to limit the scope of this PR to avoid making it too complex with multiple goals.
   
   Regarding this specific test, you are right that it can be removed along with eager/lazy scheduling once pipelined region scheduling is stable. So maybe let's keep the name and just remove it later?
   







[GitHub] [flink] zhuzhurk commented on a change in pull request #13641: [FLINK-17760][tests] Rework tests to not rely on legacy scheduling codes in ExecutionGraph components

Posted by GitBox <gi...@apache.org>.
zhuzhurk commented on a change in pull request #13641:
URL: https://github.com/apache/flink/pull/13641#discussion_r533358362



##########
File path: flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/ExecutionGraphVariousFailuesTest.java
##########
@@ -20,86 +20,31 @@
 
 import org.apache.flink.api.common.JobStatus;
 import org.apache.flink.runtime.concurrent.ComponentMainThreadExecutorServiceAdapter;
-import org.apache.flink.runtime.execution.SuppressRestartsException;
-import org.apache.flink.runtime.executiongraph.restart.InfiniteDelayRestartStrategy;
 import org.apache.flink.runtime.io.network.partition.ResultPartitionID;
 import org.apache.flink.runtime.jobgraph.IntermediateResultPartitionID;
+import org.apache.flink.runtime.jobgraph.JobGraph;
+import org.apache.flink.runtime.scheduler.SchedulerBase;
+import org.apache.flink.runtime.scheduler.SchedulerTestingUtils;
 import org.apache.flink.util.TestLogger;
 
 import org.junit.Test;
 
 import static org.junit.Assert.assertEquals;
 import static org.junit.Assert.fail;
 
-
 public class ExecutionGraphVariousFailuesTest extends TestLogger {
 
-	/**
-	 * Test that failing in state restarting will retrigger the restarting logic. This means that
-	 * it only goes into the state FAILED after the restart strategy says the job is no longer
-	 * restartable.
-	 */
-	@Test
-	public void testFailureWhileRestarting() throws Exception {
-		final ExecutionGraph eg = ExecutionGraphTestUtils.createSimpleTestGraph(new InfiniteDelayRestartStrategy(2));
-		eg.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-		eg.scheduleForExecution();
-
-		assertEquals(JobStatus.RUNNING, eg.getState());
-		ExecutionGraphTestUtils.switchAllVerticesToRunning(eg);
-
-		eg.failGlobal(new Exception("Test 1"));
-		assertEquals(JobStatus.FAILING, eg.getState());
-		ExecutionGraphTestUtils.completeCancellingForAllVertices(eg);
-
-		// we should restart since we have two restart attempts left
-		assertEquals(JobStatus.RESTARTING, eg.getState());
-
-		eg.failGlobal(new Exception("Test 2"));
-
-		// we should restart since we have one restart attempts left
-		assertEquals(JobStatus.RESTARTING, eg.getState());
-
-		eg.failGlobal(new Exception("Test 3"));
-
-		// after depleting all our restart attempts we should go into Failed
-		assertEquals(JobStatus.FAILED, eg.getState());
-	}
-
-	/**
-	 * Tests that a {@link SuppressRestartsException} in state RESTARTING stops the restarting
-	 * immediately and sets the execution graph's state to FAILED.
-	 */
-	@Test
-	public void testSuppressRestartFailureWhileRestarting() throws Exception {
-		final ExecutionGraph eg = ExecutionGraphTestUtils.createSimpleTestGraph(new InfiniteDelayRestartStrategy(10));
-		eg.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-		eg.scheduleForExecution();
-
-		assertEquals(JobStatus.RUNNING, eg.getState());
-		ExecutionGraphTestUtils.switchAllVerticesToRunning(eg);
-
-		eg.failGlobal(new Exception("test"));
-		assertEquals(JobStatus.FAILING, eg.getState());
-
-		ExecutionGraphTestUtils.completeCancellingForAllVertices(eg);
-		assertEquals(JobStatus.RESTARTING, eg.getState());
-
-		// suppress a possible restart
-		eg.failGlobal(new SuppressRestartsException(new Exception("Test")));
-
-		assertEquals(JobStatus.FAILED, eg.getState());
-	}
-
 	/**
 	 * Tests that a failing scheduleOrUpdateConsumers call with a non-existing execution attempt
 	 * id, will not fail the execution graph.
 	 */
 	@Test
 	public void testFailingScheduleOrUpdateConsumers() throws Exception {
-		final ExecutionGraph eg = ExecutionGraphTestUtils.createSimpleTestGraph(new InfiniteDelayRestartStrategy(10));
-		eg.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-		eg.scheduleForExecution();
+		final SchedulerBase scheduler = SchedulerTestingUtils.newSchedulerBuilder(new JobGraph()).build();
+		scheduler.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
+		scheduler.startScheduling();
+
+		final ExecutionGraph eg = scheduler.getExecutionGraph();

Review comment:
       @azagrebin I see. Agreed, it's better to test against `SchedulerBase::scheduleOrUpdateConsumers`.







[GitHub] [flink] flinkbot edited a comment on pull request #13641: [WIP][FLINK-17760][tests] Rework tests to not rely on legacy scheduling logics in ExecutionGraph anymore

Posted by GitBox <gi...@apache.org>.
flinkbot edited a comment on pull request #13641:
URL: https://github.com/apache/flink/pull/13641#issuecomment-708569491


   ## CI report:
   
   * 09d8deb89416f53dfe8b5c16fb9d723cbd98612c UNKNOWN
   * fef9bff28a988ffab789fc6bb0cbde754273a2e0 Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=7630) 
   
   <details>
   <summary>Bot commands</summary>
     The @flinkbot bot supports the following commands:
   
    - `@flinkbot run travis` re-run the last Travis build
    - `@flinkbot run azure` re-run the last Azure build
   </details>





[GitHub] [flink] flinkbot edited a comment on pull request #13641: [FLINK-17760][tests] Rework tests to not rely on legacy scheduling codes in ExecutionGraph components

Posted by GitBox <gi...@apache.org>.
flinkbot edited a comment on pull request #13641:
URL: https://github.com/apache/flink/pull/13641#issuecomment-708569491


   ## CI report:
   
   * 09d8deb89416f53dfe8b5c16fb9d723cbd98612c UNKNOWN
   * fe1562c5cda8ecb15f6af1afdf7b6217e6c20c42 UNKNOWN
   * e7941f905ec697ad09a3f1010f90a2a69a512ce0 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10374) 
   
   <details>
   <summary>Bot commands</summary>
     The @flinkbot bot supports the following commands:
   
    - `@flinkbot run travis` re-run the last Travis build
    - `@flinkbot run azure` re-run the last Azure build
   </details>





[GitHub] [flink] flinkbot edited a comment on pull request #13641: [FLINK-17760][tests] Rework tests to not rely on legacy scheduling codes in ExecutionGraph components

Posted by GitBox <gi...@apache.org>.
flinkbot edited a comment on pull request #13641:
URL: https://github.com/apache/flink/pull/13641#issuecomment-708569491


   ## CI report:
   
   * 09d8deb89416f53dfe8b5c16fb9d723cbd98612c UNKNOWN
   * fe1562c5cda8ecb15f6af1afdf7b6217e6c20c42 UNKNOWN
   * 89bea5233d5efb9db88eacc21b445a617a8c3c27 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10899) 
   * dfd82cd0de7a46432eae70f1f86aafb823ed7ff2 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     The @flinkbot bot supports the following commands:
   
    - `@flinkbot run travis` re-run the last Travis build
    - `@flinkbot run azure` re-run the last Azure build
   </details>





[GitHub] [flink] zhuzhurk commented on a change in pull request #13641: [FLINK-17760][tests] Rework tests to not rely on legacy scheduling codes in ExecutionGraph components

Posted by GitBox <gi...@apache.org>.
zhuzhurk commented on a change in pull request #13641:
URL: https://github.com/apache/flink/pull/13641#discussion_r544100796



##########
File path: flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/ExecutionGraphSchedulingTest.java
##########
@@ -1,637 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- *     http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-
-package org.apache.flink.runtime.executiongraph;
-
-import org.apache.flink.api.common.JobID;
-import org.apache.flink.api.common.JobStatus;
-import org.apache.flink.api.common.time.Time;
-import org.apache.flink.configuration.Configuration;
-import org.apache.flink.metrics.groups.UnregisteredMetricsGroup;
-import org.apache.flink.runtime.blob.VoidBlobWriter;
-import org.apache.flink.runtime.checkpoint.StandaloneCheckpointRecoveryFactory;
-import org.apache.flink.runtime.clusterframework.types.AllocationID;
-import org.apache.flink.runtime.clusterframework.types.ResourceID;
-import org.apache.flink.runtime.concurrent.ComponentMainThreadExecutorServiceAdapter;
-import org.apache.flink.runtime.execution.ExecutionState;
-import org.apache.flink.runtime.executiongraph.restart.NoRestartStrategy;
-import org.apache.flink.runtime.executiongraph.utils.SimpleAckingTaskManagerGateway;
-import org.apache.flink.runtime.instance.SimpleSlotContext;
-import org.apache.flink.runtime.io.network.partition.NoOpJobMasterPartitionTracker;
-import org.apache.flink.runtime.io.network.partition.ResultPartitionType;
-import org.apache.flink.runtime.jobgraph.DistributionPattern;
-import org.apache.flink.runtime.jobgraph.JobGraph;
-import org.apache.flink.runtime.jobgraph.JobVertex;
-import org.apache.flink.runtime.jobgraph.ScheduleMode;
-import org.apache.flink.runtime.jobmanager.scheduler.Locality;
-import org.apache.flink.runtime.jobmanager.slots.DummySlotOwner;
-import org.apache.flink.runtime.jobmanager.slots.TaskManagerGateway;
-import org.apache.flink.runtime.jobmanager.slots.TestingSlotOwner;
-import org.apache.flink.runtime.jobmaster.LogicalSlot;
-import org.apache.flink.runtime.jobmaster.SlotOwner;
-import org.apache.flink.runtime.jobmaster.SlotRequestId;
-import org.apache.flink.runtime.jobmaster.TestingLogicalSlot;
-import org.apache.flink.runtime.jobmaster.TestingLogicalSlotBuilder;
-import org.apache.flink.runtime.jobmaster.slotpool.SingleLogicalSlot;
-import org.apache.flink.runtime.jobmaster.slotpool.SlotProvider;
-import org.apache.flink.runtime.shuffle.NettyShuffleMaster;
-import org.apache.flink.runtime.taskmanager.TaskManagerLocation;
-import org.apache.flink.runtime.testtasks.NoOpInvokable;
-import org.apache.flink.runtime.testutils.DirectScheduledExecutorService;
-import org.apache.flink.util.FlinkException;
-import org.apache.flink.util.TestLogger;
-
-import org.junit.After;
-import org.junit.Test;
-
-import javax.annotation.Nonnull;
-
-import java.net.InetAddress;
-import java.util.Set;
-import java.util.concurrent.ArrayBlockingQueue;
-import java.util.concurrent.BlockingQueue;
-import java.util.concurrent.CompletableFuture;
-import java.util.concurrent.ConcurrentHashMap;
-import java.util.concurrent.ConcurrentMap;
-import java.util.concurrent.ScheduledExecutorService;
-import java.util.concurrent.TimeUnit;
-import java.util.concurrent.TimeoutException;
-
-import static org.hamcrest.Matchers.empty;
-import static org.hamcrest.Matchers.is;
-import static org.junit.Assert.assertEquals;
-import static org.junit.Assert.assertFalse;
-import static org.junit.Assert.assertThat;
-
-/**
- * Tests for the scheduling of the execution graph. This tests that
- * for example the order of deployments is correct and that bulk slot allocation
- * works properly.
- */
-public class ExecutionGraphSchedulingTest extends TestLogger {

Review comment:
       True, we lack tests for how slot release/timeout affects the new scheduling.
   I will add tests for them.
   Regarding `testCancellationOfIncompleteScheduling`, I think it's not needed; see https://github.com/apache/flink/pull/13641#discussion_r543404219.
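
   As a rough illustration of the slot-timeout case mentioned above, here is a minimal sketch. It is a toy model built only on `java.util.concurrent`, not on Flink's scheduler classes: `AllOrNothingDeploymentSketch`, `ToySlot`, and `schedule` are hypothetical names invented for this example, and the real tests would go through the scheduler test utilities instead. It only shows the property such a test would check: nothing is deployed until every slot future has completed, and a failed slot request aborts the deployment and releases the slots that did arrive.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeoutException;

/** Toy model (not a Flink class): deploys only after every requested slot has arrived. */
final class AllOrNothingDeploymentSketch {

    /** Hypothetical stand-in for a slot; records whether it was handed back to its owner. */
    static final class ToySlot {
        final String name;
        boolean released;

        ToySlot(String name) {
            this.name = name;
        }
    }

    /** Names of the slots that tasks were deployed to, in request order. */
    final List<String> deployed = new ArrayList<>();

    /** Completes with true if all slots arrived and tasks were deployed, false if aborted. */
    CompletableFuture<Boolean> schedule(List<CompletableFuture<ToySlot>> slotFutures) {
        // allOf completes only after every input future is done, even if one of them failed.
        CompletableFuture<Void> all =
            CompletableFuture.allOf(slotFutures.toArray(new CompletableFuture<?>[0]));

        return all.handle((ignored, failure) -> {
            if (failure == null) {
                for (CompletableFuture<ToySlot> future : slotFutures) {
                    deployed.add(future.join().name);
                }
                return true;
            }
            // Abort: hand back every slot that was successfully allocated.
            for (CompletableFuture<ToySlot> future : slotFutures) {
                if (!future.isCompletedExceptionally()) {
                    future.join().released = true;
                }
            }
            return false;
        });
    }

    public static void main(String[] args) {
        AllOrNothingDeploymentSketch scheduler = new AllOrNothingDeploymentSketch();
        CompletableFuture<ToySlot> source = new CompletableFuture<>();
        CompletableFuture<ToySlot> target = new CompletableFuture<>();
        CompletableFuture<Boolean> result = scheduler.schedule(Arrays.asList(source, target));

        ToySlot sourceSlot = new ToySlot("source");
        source.complete(sourceSlot);
        // Simulate a slot request timeout for the target vertex.
        target.completeExceptionally(new TimeoutException("slot request timed out"));

        System.out.println("deployed=" + scheduler.deployed
            + ", sourceReleased=" + sourceSlot.released
            + ", success=" + result.join());
    }
}
```

   The sketch leans on `CompletableFuture.allOf`, which waits for all inputs before completing, so a slot that arrives late is still seen and released in the abort path.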







[GitHub] [flink] zhuzhurk commented on a change in pull request #13641: [FLINK-17760][tests] Rework tests to not rely on legacy scheduling codes in ExecutionGraph components

Posted by GitBox <gi...@apache.org>.
zhuzhurk commented on a change in pull request #13641:
URL: https://github.com/apache/flink/pull/13641#discussion_r543404219



##########
File path: flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/ExecutionGraphSchedulingTest.java
##########
@@ -1,637 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- *     http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-
-package org.apache.flink.runtime.executiongraph;
-
-import org.apache.flink.api.common.JobID;
-import org.apache.flink.api.common.JobStatus;
-import org.apache.flink.api.common.time.Time;
-import org.apache.flink.configuration.Configuration;
-import org.apache.flink.metrics.groups.UnregisteredMetricsGroup;
-import org.apache.flink.runtime.blob.VoidBlobWriter;
-import org.apache.flink.runtime.checkpoint.StandaloneCheckpointRecoveryFactory;
-import org.apache.flink.runtime.clusterframework.types.AllocationID;
-import org.apache.flink.runtime.clusterframework.types.ResourceID;
-import org.apache.flink.runtime.concurrent.ComponentMainThreadExecutorServiceAdapter;
-import org.apache.flink.runtime.execution.ExecutionState;
-import org.apache.flink.runtime.executiongraph.restart.NoRestartStrategy;
-import org.apache.flink.runtime.executiongraph.utils.SimpleAckingTaskManagerGateway;
-import org.apache.flink.runtime.instance.SimpleSlotContext;
-import org.apache.flink.runtime.io.network.partition.NoOpJobMasterPartitionTracker;
-import org.apache.flink.runtime.io.network.partition.ResultPartitionType;
-import org.apache.flink.runtime.jobgraph.DistributionPattern;
-import org.apache.flink.runtime.jobgraph.JobGraph;
-import org.apache.flink.runtime.jobgraph.JobVertex;
-import org.apache.flink.runtime.jobgraph.ScheduleMode;
-import org.apache.flink.runtime.jobmanager.scheduler.Locality;
-import org.apache.flink.runtime.jobmanager.slots.DummySlotOwner;
-import org.apache.flink.runtime.jobmanager.slots.TaskManagerGateway;
-import org.apache.flink.runtime.jobmanager.slots.TestingSlotOwner;
-import org.apache.flink.runtime.jobmaster.LogicalSlot;
-import org.apache.flink.runtime.jobmaster.SlotOwner;
-import org.apache.flink.runtime.jobmaster.SlotRequestId;
-import org.apache.flink.runtime.jobmaster.TestingLogicalSlot;
-import org.apache.flink.runtime.jobmaster.TestingLogicalSlotBuilder;
-import org.apache.flink.runtime.jobmaster.slotpool.SingleLogicalSlot;
-import org.apache.flink.runtime.jobmaster.slotpool.SlotProvider;
-import org.apache.flink.runtime.shuffle.NettyShuffleMaster;
-import org.apache.flink.runtime.taskmanager.TaskManagerLocation;
-import org.apache.flink.runtime.testtasks.NoOpInvokable;
-import org.apache.flink.runtime.testutils.DirectScheduledExecutorService;
-import org.apache.flink.util.FlinkException;
-import org.apache.flink.util.TestLogger;
-
-import org.junit.After;
-import org.junit.Test;
-
-import javax.annotation.Nonnull;
-
-import java.net.InetAddress;
-import java.util.Set;
-import java.util.concurrent.ArrayBlockingQueue;
-import java.util.concurrent.BlockingQueue;
-import java.util.concurrent.CompletableFuture;
-import java.util.concurrent.ConcurrentHashMap;
-import java.util.concurrent.ConcurrentMap;
-import java.util.concurrent.ScheduledExecutorService;
-import java.util.concurrent.TimeUnit;
-import java.util.concurrent.TimeoutException;
-
-import static org.hamcrest.Matchers.empty;
-import static org.hamcrest.Matchers.is;
-import static org.junit.Assert.assertEquals;
-import static org.junit.Assert.assertFalse;
-import static org.junit.Assert.assertThat;
-
-/**
- * Tests for the scheduling of the execution graph. This tests that
- * for example the order of deployments is correct and that bulk slot allocation
- * works properly.
- */
-public class ExecutionGraphSchedulingTest extends TestLogger {
-
-	private final ScheduledExecutorService executor = new DirectScheduledExecutorService();
-
-	@After
-	public void shutdown() {
-		executor.shutdownNow();
-	}
-
-	// ------------------------------------------------------------------------
-	//  Tests
-	// ------------------------------------------------------------------------
-
-	/**
-	 * Tests that with scheduling futures and pipelined deployment, the target vertex will
-	 * not deploy its task before the source vertex does.
-	 */
-	@Test
-	public void testScheduleSourceBeforeTarget() throws Exception {
-
-		//                                            [pipelined]
-		//  we construct a simple graph    (source) ----------------> (target)
-
-		final int parallelism = 1;
-
-		final JobVertex sourceVertex = new JobVertex("source");
-		sourceVertex.setParallelism(parallelism);
-		sourceVertex.setInvokableClass(NoOpInvokable.class);
-
-		final JobVertex targetVertex = new JobVertex("target");
-		targetVertex.setParallelism(parallelism);
-		targetVertex.setInvokableClass(NoOpInvokable.class);
-
-		targetVertex.connectNewDataSetAsInput(sourceVertex, DistributionPattern.ALL_TO_ALL, ResultPartitionType.PIPELINED);
-
-		final JobID jobId = new JobID();
-		final JobGraph jobGraph = new JobGraph(jobId, "test", sourceVertex, targetVertex);
-		jobGraph.setScheduleMode(ScheduleMode.EAGER);
-
-		final CompletableFuture<LogicalSlot> sourceFuture = new CompletableFuture<>();
-		final CompletableFuture<LogicalSlot> targetFuture = new CompletableFuture<>();
-
-		ProgrammedSlotProvider slotProvider = new ProgrammedSlotProvider(parallelism);
-		slotProvider.addSlot(sourceVertex.getID(), 0, sourceFuture);
-		slotProvider.addSlot(targetVertex.getID(), 0, targetFuture);
-
-		final ExecutionGraph eg = createExecutionGraph(jobGraph, slotProvider);
-		eg.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-
-		//  set up two TaskManager gateways and slots
-
-		final InteractionsCountingTaskManagerGateway gatewaySource = createTaskManager();
-		final InteractionsCountingTaskManagerGateway gatewayTarget = createTaskManager();
-
-		final LogicalSlot sourceSlot = createTestingLogicalSlot(gatewaySource);
-		final LogicalSlot targetSlot = createTestingLogicalSlot(gatewayTarget);
-
-		eg.scheduleForExecution();
-
-		// job should be running
-		assertEquals(JobStatus.RUNNING, eg.getState());
-
-		// we fulfill the target slot before the source slot
-		// that should not cause a deployment or deployment related failure
-		targetFuture.complete(targetSlot);
-
-		assertThat(gatewayTarget.getSubmitTaskCount(), is(0));
-		assertEquals(JobStatus.RUNNING, eg.getState());
-
-		// now supply the source slot
-		sourceFuture.complete(sourceSlot);
-
-		// by now, all deployments should have happened
-		assertThat(gatewaySource.getSubmitTaskCount(), is(1));
-		assertThat(gatewayTarget.getSubmitTaskCount(), is(1));
-
-		assertEquals(JobStatus.RUNNING, eg.getState());
-	}
-
-	private TestingLogicalSlot createTestingLogicalSlot(InteractionsCountingTaskManagerGateway gatewaySource) {
-		return new TestingLogicalSlotBuilder()
-			.setTaskManagerGateway(gatewaySource)
-			.createTestingLogicalSlot();
-	}
-
-	/**
-	 * This test verifies that before deploying a pipelined connected component, the
-	 * full set of slots is available, and that not some tasks are deployed, and later the
-	 * system realizes that not enough resources are available.
-	 */
-	@Test
-	public void testDeployPipelinedConnectedComponentsTogether() throws Exception {
-
-		//                                            [pipelined]
-		//  we construct a simple graph    (source) ----------------> (target)
-
-		final int parallelism = 8;
-
-		final JobVertex sourceVertex = new JobVertex("source");
-		sourceVertex.setParallelism(parallelism);
-		sourceVertex.setInvokableClass(NoOpInvokable.class);
-
-		final JobVertex targetVertex = new JobVertex("target");
-		targetVertex.setParallelism(parallelism);
-		targetVertex.setInvokableClass(NoOpInvokable.class);
-
-		targetVertex.connectNewDataSetAsInput(sourceVertex, DistributionPattern.ALL_TO_ALL, ResultPartitionType.PIPELINED);
-
-		final JobID jobId = new JobID();
-		final JobGraph jobGraph = new JobGraph(jobId, "test", sourceVertex, targetVertex);
-		jobGraph.setScheduleMode(ScheduleMode.EAGER);
-
-		@SuppressWarnings({"unchecked", "rawtypes"})
-		final CompletableFuture<LogicalSlot>[] sourceFutures = new CompletableFuture[parallelism];
-		@SuppressWarnings({"unchecked", "rawtypes"})
-		final CompletableFuture<LogicalSlot>[] targetFutures = new CompletableFuture[parallelism];
-
-		//
-		//  Create the slots, futures, and the slot provider
-
-		final InteractionsCountingTaskManagerGateway[] sourceTaskManagers = new InteractionsCountingTaskManagerGateway[parallelism];
-		final InteractionsCountingTaskManagerGateway[] targetTaskManagers = new InteractionsCountingTaskManagerGateway[parallelism];
-
-		final LogicalSlot[] sourceSlots = new LogicalSlot[parallelism];
-		final LogicalSlot[] targetSlots = new LogicalSlot[parallelism];
-
-		for (int i = 0; i < parallelism; i++) {
-			sourceTaskManagers[i] = createTaskManager();
-			targetTaskManagers[i] = createTaskManager();
-
-			sourceSlots[i] = createTestingLogicalSlot(sourceTaskManagers[i]);
-			targetSlots[i] = createTestingLogicalSlot(targetTaskManagers[i]);
-
-			sourceFutures[i] = new CompletableFuture<>();
-			targetFutures[i] = new CompletableFuture<>();
-		}
-
-		ProgrammedSlotProvider slotProvider = new ProgrammedSlotProvider(parallelism);
-		slotProvider.addSlots(sourceVertex.getID(), sourceFutures);
-		slotProvider.addSlots(targetVertex.getID(), targetFutures);
-
-		final ExecutionGraph eg = createExecutionGraph(jobGraph, slotProvider);
-
-		//
-		//  we complete some of the futures
-
-		for (int i = 0; i < parallelism; i += 2) {
-			sourceFutures[i].complete(sourceSlots[i]);
-		}
-
-		//
-		//  kick off the scheduling
-		eg.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-		eg.scheduleForExecution();
-
-		verifyNothingDeployed(eg, sourceTaskManagers);
-
-		//  complete the remaining sources
-		for (int i = 1; i < parallelism; i += 2) {
-			sourceFutures[i].complete(sourceSlots[i]);
-		}
-		verifyNothingDeployed(eg, sourceTaskManagers);
-
-		//  complete the targets except for one
-		for (int i = 1; i < parallelism; i++) {
-			targetFutures[i].complete(targetSlots[i]);
-		}
-		verifyNothingDeployed(eg, targetTaskManagers);
-
-		//  complete the last target slot future
-		targetFutures[0].complete(targetSlots[0]);
-
-		//
-		//  verify that all deployments have happened
-
-		for (InteractionsCountingTaskManagerGateway gateway : sourceTaskManagers) {
-			assertThat(gateway.getSubmitTaskCount(), is(1));
-		}
-		for (InteractionsCountingTaskManagerGateway gateway : targetTaskManagers) {
-			assertThat(gateway.getSubmitTaskCount(), is(1));
-		}
-	}
-
-	/**
-	 * This test verifies that if one slot future fails, the deployment will be aborted.
-	 */
-	@Test
-	public void testOneSlotFailureAbortsDeploy() throws Exception {
-
-		//                                            [pipelined]
-		//  we construct a simple graph    (source) ----------------> (target)
-
-		final int parallelism = 6;
-
-		final JobVertex sourceVertex = new JobVertex("source");
-		sourceVertex.setParallelism(parallelism);
-		sourceVertex.setInvokableClass(NoOpInvokable.class);
-
-		final JobVertex targetVertex = new JobVertex("target");
-		targetVertex.setParallelism(parallelism);
-		targetVertex.setInvokableClass(NoOpInvokable.class);
-
-		targetVertex.connectNewDataSetAsInput(sourceVertex, DistributionPattern.POINTWISE, ResultPartitionType.PIPELINED);
-
-		final JobID jobId = new JobID();
-		final JobGraph jobGraph = new JobGraph(jobId, "test", sourceVertex, targetVertex);
-		jobGraph.setScheduleMode(ScheduleMode.EAGER);
-
-		//
-		//  Create the slots, futures, and the slot provider
-
-		final InteractionsCountingTaskManagerGateway taskManager = createTaskManager();
-		final BlockingQueue<AllocationID> returnedSlots = new ArrayBlockingQueue<>(parallelism);
-		final TestingSlotOwner slotOwner = new TestingSlotOwner();
-		slotOwner.setReturnAllocatedSlotConsumer(
-			(LogicalSlot logicalSlot) -> returnedSlots.offer(logicalSlot.getAllocationId()));
-
-		final LogicalSlot[] sourceSlots = new LogicalSlot[parallelism];
-		final LogicalSlot[] targetSlots = new LogicalSlot[parallelism];
-
-		@SuppressWarnings({"unchecked", "rawtypes"})
-		final CompletableFuture<LogicalSlot>[] sourceFutures = new CompletableFuture[parallelism];
-		@SuppressWarnings({"unchecked", "rawtypes"})
-		final CompletableFuture<LogicalSlot>[] targetFutures = new CompletableFuture[parallelism];
-
-		for (int i = 0; i < parallelism; i++) {
-			sourceSlots[i] = createSingleLogicalSlot(slotOwner, taskManager, new SlotRequestId());
-			targetSlots[i] = createSingleLogicalSlot(slotOwner, taskManager, new SlotRequestId());
-
-			sourceFutures[i] = new CompletableFuture<>();
-			targetFutures[i] = new CompletableFuture<>();
-		}
-
-		ProgrammedSlotProvider slotProvider = new ProgrammedSlotProvider(parallelism);
-		slotProvider.addSlots(sourceVertex.getID(), sourceFutures);
-		slotProvider.addSlots(targetVertex.getID(), targetFutures);
-
-		final ExecutionGraph eg = createExecutionGraph(jobGraph, slotProvider);
-		eg.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-
-		//
-		//  we complete some of the futures
-
-		for (int i = 0; i < parallelism; i += 2) {
-			sourceFutures[i].complete(sourceSlots[i]);
-			targetFutures[i].complete(targetSlots[i]);
-		}
-
-		//  kick off the scheduling
-		eg.scheduleForExecution();
-
-		// fail one slot
-		sourceFutures[1].completeExceptionally(new TestRuntimeException());
-
-		// wait until the job failed as a whole
-		eg.getTerminationFuture().get(2000, TimeUnit.MILLISECONDS);
-
-		// wait until all slots are back
-		for (int i = 0; i < parallelism; i++) {
-			returnedSlots.poll(2000L, TimeUnit.MILLISECONDS);
-		}
-
-		// no deployment calls must have happened
-		assertThat(taskManager.getSubmitTaskCount(), is(0));
-
-		// all completed futures must have been returns
-		for (int i = 0; i < parallelism; i += 2) {
-			assertFalse(sourceSlots[i].isAlive());
-			assertFalse(targetSlots[i].isAlive());
-		}
-	}
-
-	/**
-	 * This tests makes sure that with eager scheduling no task is deployed if a single
-	 * slot allocation fails. Moreover we check that allocated slots will be returned.
-	 */
-	@Test
-	public void testEagerSchedulingWithSlotTimeout() throws Exception {
-
-		//  we construct a simple graph:    (task)
-
-		final int parallelism = 3;
-
-		final JobVertex vertex = new JobVertex("task");
-		vertex.setParallelism(parallelism);
-		vertex.setInvokableClass(NoOpInvokable.class);
-
-		final JobID jobId = new JobID();
-		final JobGraph jobGraph = new JobGraph(jobId, "test", vertex);
-		jobGraph.setScheduleMode(ScheduleMode.EAGER);
-
-		final BlockingQueue<AllocationID> returnedSlots = new ArrayBlockingQueue<>(2);
-		final TestingSlotOwner slotOwner = new TestingSlotOwner();
-		slotOwner.setReturnAllocatedSlotConsumer(
-			(LogicalSlot logicalSlot) -> returnedSlots.offer(logicalSlot.getAllocationId()));
-
-		final InteractionsCountingTaskManagerGateway taskManager = createTaskManager();
-		final LogicalSlot[] slots = new LogicalSlot[parallelism];
-		@SuppressWarnings({"unchecked", "rawtypes"})
-		final CompletableFuture<LogicalSlot>[] slotFutures = new CompletableFuture[parallelism];
-
-		for (int i = 0; i < parallelism; i++) {
-			slots[i] = createSingleLogicalSlot(slotOwner, taskManager, new SlotRequestId());
-			slotFutures[i] = new CompletableFuture<>();
-		}
-
-		ProgrammedSlotProvider slotProvider = new ProgrammedSlotProvider(parallelism);
-		slotProvider.addSlots(vertex.getID(), slotFutures);
-
-		final ExecutionGraph eg = createExecutionGraph(jobGraph, slotProvider);
-
-		//  we complete one future
-		slotFutures[1].complete(slots[1]);
-
-		//  kick off the scheduling
-		eg.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-		eg.scheduleForExecution();
-
-		//  we complete another future
-		slotFutures[2].complete(slots[2]);
-
-		// check that the ExecutionGraph is not terminated yet
-		assertThat(eg.getTerminationFuture().isDone(), is(false));
-
-		// time out one of the slot futures
-		slotFutures[0].completeExceptionally(new TimeoutException("Test time out"));
-
-		assertThat(eg.getTerminationFuture().get(), is(JobStatus.FAILED));
-
-		// wait until all slots are back
-		for (int i = 0; i < parallelism - 1; i++) {
-			returnedSlots.poll(2000, TimeUnit.MILLISECONDS);
-		}
-
-		//  verify that no deployments have happened
-		assertThat(taskManager.getSubmitTaskCount(), is(0));
-	}
-
-	/**
-	 * Tests that an ongoing scheduling operation does not fail the {@link ExecutionGraph}
-	 * if it gets concurrently cancelled.
-	 */
-	@Test
-	public void testSchedulingOperationCancellationWhenCancel() throws Exception {
-		final JobVertex jobVertex = new JobVertex("NoOp JobVertex");
-		jobVertex.setInvokableClass(NoOpInvokable.class);
-		jobVertex.setParallelism(2);
-		final JobGraph jobGraph = new JobGraph(jobVertex);
-		jobGraph.setScheduleMode(ScheduleMode.EAGER);
-
-		final CompletableFuture<LogicalSlot> slotFuture1 = new CompletableFuture<>();
-		final CompletableFuture<LogicalSlot> slotFuture2 = new CompletableFuture<>();
-		final ProgrammedSlotProvider slotProvider = new ProgrammedSlotProvider(2);
-		slotProvider.addSlots(jobVertex.getID(), new CompletableFuture[]{slotFuture1, slotFuture2});
-		final ExecutionGraph executionGraph = createExecutionGraph(jobGraph, slotProvider);
-
-		executionGraph.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-		executionGraph.scheduleForExecution();
-
-		final TestingLogicalSlot slot = createTestingSlot();
-		final CompletableFuture<?> releaseFuture = slot.getReleaseFuture();
-		slotFuture1.complete(slot);
-
-		// cancel should change the state of all executions to CANCELLED
-		executionGraph.cancel();
-
-		// complete the now CANCELLED execution --> this should cause a failure
-		slotFuture2.complete(new TestingLogicalSlotBuilder().createTestingLogicalSlot());
-
-		Thread.sleep(1L);
-		// release the first slot to finish the cancellation
-		releaseFuture.complete(null);
-
-		// NOTE: This test will only occasionally fail without the fix since there is
-		// a race between the releaseFuture and the slotFuture2
-		assertThat(executionGraph.getTerminationFuture().get(), is(JobStatus.CANCELED));
-	}
-
-	/**
-	 * Tests that a partially completed eager scheduling operation fails if a
-	 * completed slot is released. See FLINK-9099.
-	 */
-	@Test
-	public void testSlotReleasingFailsSchedulingOperation() throws Exception {
-		final int parallelism = 2;
-
-		final JobVertex jobVertex = new JobVertex("Testing job vertex");
-		jobVertex.setInvokableClass(NoOpInvokable.class);
-		jobVertex.setParallelism(parallelism);
-		final JobGraph jobGraph = new JobGraph(jobVertex);
-		jobGraph.setScheduleMode(ScheduleMode.EAGER);
-
-		final ProgrammedSlotProvider slotProvider = new ProgrammedSlotProvider(parallelism);
-
-		final LogicalSlot slot = createSingleLogicalSlot(new DummySlotOwner(), new SimpleAckingTaskManagerGateway(), new SlotRequestId());
-		slotProvider.addSlot(jobVertex.getID(), 0, CompletableFuture.completedFuture(slot));
-
-		final CompletableFuture<LogicalSlot> slotFuture = new CompletableFuture<>();
-		slotProvider.addSlot(jobVertex.getID(), 1, slotFuture);
-
-		final ExecutionGraph executionGraph = createExecutionGraph(jobGraph, slotProvider);
-
-		executionGraph.start(ComponentMainThreadExecutorServiceAdapter.forMainThread());
-		executionGraph.scheduleForExecution();
-
-		assertThat(executionGraph.getState(), is(JobStatus.RUNNING));
-
-		final ExecutionJobVertex executionJobVertex = executionGraph.getJobVertex(jobVertex.getID());
-		final ExecutionVertex[] taskVertices = executionJobVertex.getTaskVertices();
-		assertThat(taskVertices[0].getExecutionState(), is(ExecutionState.SCHEDULED));
-		assertThat(taskVertices[1].getExecutionState(), is(ExecutionState.SCHEDULED));
-
-		// fail the single allocated slot --> this should fail the scheduling operation
-		slot.releaseSlot(new FlinkException("Test failure"));
-
-		assertThat(executionGraph.getTerminationFuture().get(), is(JobStatus.FAILED));
-	}
-
-	/**
-	 * Tests that all slots are being returned to the {@link SlotOwner} if the
-	 * {@link ExecutionGraph} is being cancelled. See FLINK-9908
-	 */
-	@Test
-	public void testCancellationOfIncompleteScheduling() throws Exception {

Review comment:
       I think that on invocation of `DefaultScheduler.cancel()`, the job will be fully canceled, and the JobMaster will shut down and `close` the SlotPool. `SlotPoolImpl#close()` will release all slots, both allocated and available ones. So it looks to me that the scheduler does not need to take care of slot releasing. Please correct me if I missed anything.







[GitHub] [flink] zhuzhurk commented on a change in pull request #13641: [FLINK-17760][tests] Rework tests to not rely on legacy scheduling codes in ExecutionGraph components

Posted by GitBox <gi...@apache.org>.
zhuzhurk commented on a change in pull request #13641:
URL: https://github.com/apache/flink/pull/13641#discussion_r533363159



##########
File path: flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/GlobalModVersionTest.java
##########
@@ -1,200 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- *     http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-
-package org.apache.flink.runtime.executiongraph;
-
-import org.apache.flink.api.common.JobID;
-import org.apache.flink.api.common.JobStatus;
-import org.apache.flink.runtime.concurrent.ComponentMainThreadExecutorServiceAdapter;
-import org.apache.flink.runtime.execution.ExecutionState;
-import org.apache.flink.runtime.executiongraph.failover.FailoverStrategy;
-import org.apache.flink.runtime.executiongraph.failover.FailoverStrategy.Factory;
-import org.apache.flink.runtime.executiongraph.restart.InfiniteDelayRestartStrategy;
-import org.apache.flink.runtime.executiongraph.utils.SimpleSlotProvider;
-import org.apache.flink.runtime.jobgraph.JobGraph;
-import org.apache.flink.runtime.jobgraph.JobVertex;
-import org.apache.flink.runtime.testtasks.NoOpInvokable;
-import org.apache.flink.util.TestLogger;
-
-import org.junit.Test;
-
-import java.util.Random;
-
-import static org.apache.flink.runtime.executiongraph.ExecutionGraphTestUtils.waitUntilExecutionState;
-import static org.junit.Assert.assertEquals;
-import static org.mockito.Mockito.any;
-import static org.mockito.Mockito.mock;
-import static org.mockito.Mockito.times;
-import static org.mockito.Mockito.verify;
-
-public class GlobalModVersionTest extends TestLogger {

Review comment:
       `globalModVersion` does not make any difference when using `DefaultScheduler`, so it is not needed anymore.
   Concurrent global/local failures are handled in `DefaultScheduler` with `ExecutionVertexVersioner`, and the correctness is tested in `DefaultSchedulerTest`, i.e. `skipDeploymentIfVertexVersionOutdated()`, `releaseSlotIfVertexVersionOutdated()` and `vertexIsNotAffectedByOutdatedDeployment()`.
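
To illustrate the versioning idea mentioned above, here is a minimal sketch. It is not Flink's `ExecutionVertexVersioner`; the class `VertexVersionerSketch` and its methods are hypothetical names chosen for illustration. The assumption is that each vertex carries a counter that is bumped on every failure or restart, and that a pending deployment is skipped when the version it captured no longer matches.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the versioning idea; this is NOT Flink's ExecutionVertexVersioner.
final class VertexVersionerSketch {
    // Current version per vertex id; a vertex with no entry is treated as version 0.
    private final Map<String, Long> currentVersions = new HashMap<>();

    // Bumps the version of the vertex (for example on a restart) and returns the new value.
    long recordModification(String vertexId) {
        return currentVersions.merge(vertexId, 1L, Long::sum);
    }

    // A captured version is outdated once the vertex has been modified again.
    boolean isModified(String vertexId, long capturedVersion) {
        return currentVersions.getOrDefault(vertexId, 0L) != capturedVersion;
    }

    public static void main(String[] args) {
        VertexVersionerSketch versioner = new VertexVersionerSketch();

        // A scheduling round captures the version before requesting a slot.
        long captured = versioner.recordModification("vertex-0");

        // A concurrent failure restarts the vertex while the slot request is pending.
        versioner.recordModification("vertex-0");

        // When the slot arrives, the deployment belongs to an outdated round and is skipped.
        if (versioner.isModified("vertex-0", captured)) {
            System.out.println("skip deployment: vertex version is outdated");
        } else {
            System.out.println("deploy");
        }
    }
}
```

With a per-vertex check like this, a global counter on the graph is not required to detect stale scheduling actions, which is the reason given above for dropping the test.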







[GitHub] [flink] zhuzhurk commented on a change in pull request #13641: [FLINK-17760][tests] Rework tests to not rely on legacy scheduling codes in ExecutionGraph components

Posted by GitBox <gi...@apache.org>.
zhuzhurk commented on a change in pull request #13641:
URL: https://github.com/apache/flink/pull/13641#discussion_r532454865



##########
File path: flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/ExecutionTest.java
##########
@@ -361,11 +168,7 @@ public void testTaskRestoreStateIsNulledAfterDeployment() throws Exception {
 		assertThat(execution.getTaskRestore(), is(notNullValue()));
 
 		// schedule the execution vertex and wait for its deployment
-		executionVertex.scheduleForExecution(
-			executionGraph.getSlotProviderStrategy(),
-			LocationPreferenceConstraint.ANY,
-			Collections.emptySet())
-			.get();
+		scheduler.startScheduling();

Review comment:
       The slot request will be fulfilled immediately and trigger the deployment of the Execution.
   This was also the assumption in the previous version of this test.
   I'd like to avoid calling `Execution::deploy` directly, because this is also one of the actions that we'd like to factor out of the ExecutionGraph components, although we are not reworking it right now.
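
As a rough illustration of why an immediately fulfilled slot request leads to deployment without any extra waiting, here is a sketch using plain `CompletableFuture`. It assumes only that the deployment is registered as a callback on the slot future; `ImmediateSlotSketch` and the string "slot" are hypothetical and do not reflect the actual scheduler code.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch; the "slot" is just a String and "deployment" is a flag.
final class ImmediateSlotSketch {

    // Registers a deployment callback on the slot future and reports whether
    // the callback had already run by the time the registration returned.
    static boolean deployedOnRegistration(CompletableFuture<String> slotFuture) {
        AtomicBoolean deployed = new AtomicBoolean(false);
        slotFuture.thenAccept(slot -> deployed.set(true));
        return deployed.get();
    }

    public static void main(String[] args) {
        // Slot request fulfilled immediately: the callback runs during registration.
        System.out.println(deployedOnRegistration(CompletableFuture.completedFuture("slot-0")));

        // Slot request still pending: nothing is deployed yet.
        System.out.println(deployedOnRegistration(new CompletableFuture<>()));
    }
}
```

In the first case the callback has run by the time registration returns, so a test that uses an immediately fulfilling slot provider can assert on the deployed state right after starting the scheduler.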







[GitHub] [flink] flinkbot edited a comment on pull request #13641: [FLINK-17760][tests] Rework tests to not rely on legacy scheduling codes in ExecutionGraph components

Posted by GitBox <gi...@apache.org>.
flinkbot edited a comment on pull request #13641:
URL: https://github.com/apache/flink/pull/13641#issuecomment-708569491


   ## CI report:
   
   * 09d8deb89416f53dfe8b5c16fb9d723cbd98612c UNKNOWN
   * fe1562c5cda8ecb15f6af1afdf7b6217e6c20c42 UNKNOWN
   * dfd82cd0de7a46432eae70f1f86aafb823ed7ff2 Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10935) 
   * cf675a9aff300f1d9e7af38bbae24a44377d404d UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     The @flinkbot bot supports the following commands:
   
    - `@flinkbot run travis` re-run the last Travis build
    - `@flinkbot run azure` re-run the last Azure build
   </details>





[GitHub] [flink] zhuzhurk commented on a change in pull request #13641: [FLINK-17760][tests] Rework tests to not rely on legacy scheduling codes in ExecutionGraph components

Posted by GitBox <gi...@apache.org>.
zhuzhurk commented on a change in pull request #13641:
URL: https://github.com/apache/flink/pull/13641#discussion_r532547199



##########
File path: flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/metrics/RestartTimeGauge.java
##########
@@ -39,19 +41,21 @@
 
 	// ------------------------------------------------------------------------
 
-	private final ExecutionGraph eg;
+	private final Supplier<JobStatus> statusSupplier;
+	private final Function<JobStatus, Long> statusTimestampRetriever;
 
-	public RestartTimeGauge(ExecutionGraph executionGraph) {
-		this.eg = checkNotNull(executionGraph);
+	public RestartTimeGauge(Supplier<JobStatus> statusSupplier, Function<JobStatus, Long> statusTimestampRetriever) {

Review comment:
       sounds good.

##########
File path: flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/metrics/RestartTimeGauge.java
##########
@@ -39,19 +41,21 @@
 
 	// ------------------------------------------------------------------------
 
-	private final ExecutionGraph eg;
+	private final Supplier<JobStatus> statusSupplier;
+	private final Function<JobStatus, Long> statusTimestampRetriever;
 
-	public RestartTimeGauge(ExecutionGraph executionGraph) {
-		this.eg = checkNotNull(executionGraph);
+	public RestartTimeGauge(Supplier<JobStatus> statusSupplier, Function<JobStatus, Long> statusTimestampRetriever) {

Review comment:
       done.
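
For reference, the shape of the constructor change in the hunk above can be sketched as follows. This is a simplified illustration, not the actual `RestartTimeGauge`: the `SketchStatus` enum, the injected `clock`, and the exact computation in `getValue()` are assumptions made so the example is self-contained and deterministic.

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.function.Function;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Hypothetical, reduced status enum; Flink's JobStatus has more states.
enum SketchStatus { RUNNING, RESTARTING }

// Simplified gauge that depends on two functions instead of the whole ExecutionGraph.
final class RestartTimeGaugeSketch {
    private final Supplier<SketchStatus> statusSupplier;
    private final Function<SketchStatus, Long> statusTimestampRetriever;
    private final LongSupplier clock; // assumption: injected so tests are deterministic

    RestartTimeGaugeSketch(
            Supplier<SketchStatus> statusSupplier,
            Function<SketchStatus, Long> statusTimestampRetriever,
            LongSupplier clock) {
        this.statusSupplier = statusSupplier;
        this.statusTimestampRetriever = statusTimestampRetriever;
        this.clock = clock;
    }

    // Returns the duration of the last (or ongoing) restart in milliseconds.
    long getValue() {
        long restartingTimestamp = statusTimestampRetriever.apply(SketchStatus.RESTARTING);
        if (restartingTimestamp <= 0L) {
            return 0L; // the job has never restarted
        }
        if (statusSupplier.get() == SketchStatus.RESTARTING) {
            // Restart still in progress: measure against the current time.
            return Math.max(clock.getAsLong() - restartingTimestamp, 0L);
        }
        // Restart finished: measure until the job entered RUNNING again.
        long runningTimestamp = statusTimestampRetriever.apply(SketchStatus.RUNNING);
        return Math.max(runningTimestamp - restartingTimestamp, 0L);
    }

    public static void main(String[] args) {
        Map<SketchStatus, Long> timestamps = new EnumMap<>(SketchStatus.class);
        timestamps.put(SketchStatus.RESTARTING, 100L);
        timestamps.put(SketchStatus.RUNNING, 250L);

        RestartTimeGaugeSketch gauge = new RestartTimeGaugeSketch(
                () -> SketchStatus.RUNNING,
                status -> timestamps.getOrDefault(status, 0L),
                () -> 400L);
        System.out.println(gauge.getValue());
    }
}
```

Because the gauge only needs a status supplier and a timestamp lookup, a test can feed it lambdas and does not have to construct an `ExecutionGraph`.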



