You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@aurora.apache.org by sa...@apache.org on 2017/08/24 16:37:18 UTC

aurora git commit: Fix race condition where rescinds are received but not processed before offer is accepted

Repository: aurora
Updated Branches:
  refs/heads/master aae2b0dc7 -> 62e46cdea


Fix race condition where rescinds are received but not processed before offer is accepted

The current race condition for offers is possible:
```
1. Scheduler receives an offer and adds it to the executor queue for processing.
2. The executor processes the offer and adds it to the HostOffers list.
3. Scheduler receives a rescind for that offer and adds it to the executor queue for processing. However, there is a lot of load on the executor so there might be a delay between receiving the rescind and processing it.
4. Scheduler accepts the offer before the rescind is processed by the executor. This will result in launching a task with an invalid offer leading to TASK_LOST.
```
The following logs show this in action:

Mesos:
```
I0810 14:33:45.744372 19274 master.cpp:6065] Removing offer OFFER_X with revocable resources...
W0810 14:34:23.640905 19279 master.cpp:3696] Ignoring accept of offer OFFER_X since it is no longer valid
W0810 14:34:23.640923 19279 master.cpp:3709] ACCEPT call used invalid offers '[ OFFER_X ]': Offer OFFER_X is no longer valid
I0810 14:34:23.640974 19279 master.cpp:6253] Sending status update TASK_LOST for task TASK_Y with invalid offers: Offer OFFER_X is no longer valid'
```
Aurora:
```
I0810 14:28:45.676 [SchedulerImpl-0, MesosCallbackHandler$MesosCallbackHandlerImpl] Received offer: OFFER_X
I0810 14:34:23.635 [TaskGroupBatchWorker, VersionedSchedulerDriverService] Accepting offer OFFER_X with ops [LAUNCH]
I0810 14:34:24.186 [Thread-4471585, MesosCallbackHandler$MesosCallbackHandlerImpl] Received status update for task TASK_Y in state TASK_LOST from SOURCE_MASTER with REASON_INVALID_OFFERS: Task launched with invalid offers: Offer_X is no longer valid
I0810 14:34:32.972 [SchedulerImpl-0, MesosCallbackHandler$MesosCallbackHandlerImpl] Offer rescinded: OFFER_X
W0810 14:34:32.972 [SchedulerImpl-0, OfferManager$OfferManagerImpl] Failed to cancel offer: OFFER_X.
```
I would like to temporarily ban offers if we receive a rescind but the offer has not yet been added (ie. still in the executor queue). Then, when we actually process the offer we will not assign it to tasks since we know it has been rescinded already. When we ban the offer, we will also add a command to unban the offer to the executor queue so that future offers will not be affected. This solution should also avoid the race condition fixed in: https://issues.apache.org/jira/browse/AURORA-1933

Testing Done:
`./gradlew test`

Ran `./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh` successfully.

I will verify this patch on a live cluster as well before submitting.

Bugs closed: AURORA-1945

Reviewed at https://reviews.apache.org/r/61804/


Project: http://git-wip-us.apache.org/repos/asf/aurora/repo
Commit: http://git-wip-us.apache.org/repos/asf/aurora/commit/62e46cde
Tree: http://git-wip-us.apache.org/repos/asf/aurora/tree/62e46cde
Diff: http://git-wip-us.apache.org/repos/asf/aurora/diff/62e46cde

Branch: refs/heads/master
Commit: 62e46cdea5b3a2143e2fed601aca814346af750b
Parents: aae2b0d
Author: Jordan Ly <jo...@gmail.com>
Authored: Thu Aug 24 09:36:57 2017 -0700
Committer: Santhosh Kumar <ss...@twitter.com>
Committed: Thu Aug 24 09:36:57 2017 -0700

----------------------------------------------------------------------
 .../benchmark/fakes/FakeOfferManager.java       |  9 ++-
 .../scheduler/mesos/MesosCallbackHandler.java   | 31 +++++---
 .../aurora/scheduler/offers/OfferManager.java   | 64 ++++++++++++----
 .../aurora/scheduler/state/TaskAssigner.java    |  2 +-
 .../mesos/MesosCallbackHandlerTest.java         | 79 +++++++++++++++++++-
 .../scheduler/offers/OfferManagerImplTest.java  | 31 +++++++-
 .../state/FirstFitTaskAssignerTest.java         |  8 +-
 7 files changed, 186 insertions(+), 38 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/aurora/blob/62e46cde/src/jmh/java/org/apache/aurora/benchmark/fakes/FakeOfferManager.java
----------------------------------------------------------------------
diff --git a/src/jmh/java/org/apache/aurora/benchmark/fakes/FakeOfferManager.java b/src/jmh/java/org/apache/aurora/benchmark/fakes/FakeOfferManager.java
index 6f2ca35..201aa81 100644
--- a/src/jmh/java/org/apache/aurora/benchmark/fakes/FakeOfferManager.java
+++ b/src/jmh/java/org/apache/aurora/benchmark/fakes/FakeOfferManager.java
@@ -28,7 +28,12 @@ public class FakeOfferManager implements OfferManager {
   }
 
   @Override
-  public void cancelOffer(Protos.OfferID offerId) {
+  public boolean cancelOffer(Protos.OfferID offerId) {
+    return false;
+  }
+
+  @Override
+  public void banOffer(Protos.OfferID offerId) {
     // no-op
   }
 
@@ -38,7 +43,7 @@ public class FakeOfferManager implements OfferManager {
   }
 
   @Override
-  public void banOffer(Protos.OfferID offerId, TaskGroupKey groupKey) {
+  public void banOfferForTaskGroup(Protos.OfferID offerId, TaskGroupKey groupKey) {
     // no-op
   }
 

http://git-wip-us.apache.org/repos/asf/aurora/blob/62e46cde/src/main/java/org/apache/aurora/scheduler/mesos/MesosCallbackHandler.java
----------------------------------------------------------------------
diff --git a/src/main/java/org/apache/aurora/scheduler/mesos/MesosCallbackHandler.java b/src/main/java/org/apache/aurora/scheduler/mesos/MesosCallbackHandler.java
index 2a42cac..68d19ec 100644
--- a/src/main/java/org/apache/aurora/scheduler/mesos/MesosCallbackHandler.java
+++ b/src/main/java/org/apache/aurora/scheduler/mesos/MesosCallbackHandler.java
@@ -228,15 +228,28 @@ public interface MesosCallbackHandler {
 
     @Override
     public void handleRescind(OfferID offerId) {
-      // NOTE: We need to use the executor here to prevent racing against processing the offers.
-      // If the callback thread rescinded the offers directly, it could rescind the offer
-      // before the thread for processing the offer got the storage lock and was able to write.
-      // Therefore we use the executor here to also process the rescind to maintain the ordering.
-      executor.execute(() -> {
-        log.info("Offer rescinded: {}", offerId.getValue());
-        offerManager.cancelOffer(offerId);
-        offersRescinded.incrementAndGet();
-      });
+      log.info("Offer rescinded: {}", offerId.getValue());
+
+      // For rescinds, we want to ensure they are processed quickly before we attempt to use an
+      // invalid offer. There are a few scenarios we want to be aware of:
+      //   1. We receive an offer, add it to OfferManager, and then get a rescind. In this scenario,
+      //      we can just remove the offer from the offers list.
+      //   2. We receive an offer, but before we add it to the OfferManager list we get a rescind.
+      //      In this scenario, we want to ensure that we do not use it/accept it when the executor
+      //      finally processes the offer. We will temporarily ban it and add a command for the
+      //      executor to unban it so future offers can be processed normally.
+      boolean offerCancelled = offerManager.cancelOffer(offerId);
+      if (!offerCancelled) {
+        log.info(
+            "Received rescind before adding offer: {}, temporarily banning.",
+            offerId.getValue());
+        offerManager.banOffer(offerId);
+        executor.execute(() -> {
+          log.info("Cancelling and unbanning offer: {}.", offerId.getValue());
+          offerManager.cancelOffer(offerId);
+        });
+      }
+      offersRescinded.incrementAndGet();
     }
 
     @Override

http://git-wip-us.apache.org/repos/asf/aurora/blob/62e46cde/src/main/java/org/apache/aurora/scheduler/offers/OfferManager.java
----------------------------------------------------------------------
diff --git a/src/main/java/org/apache/aurora/scheduler/offers/OfferManager.java b/src/main/java/org/apache/aurora/scheduler/offers/OfferManager.java
index a55f8ad..5697d2e 100644
--- a/src/main/java/org/apache/aurora/scheduler/offers/OfferManager.java
+++ b/src/main/java/org/apache/aurora/scheduler/offers/OfferManager.java
@@ -25,11 +25,11 @@ import com.google.common.base.Optional;
 import com.google.common.collect.FluentIterable;
 import com.google.common.collect.HashMultimap;
 import com.google.common.collect.ImmutableList;
-import com.google.common.collect.ImmutableSet;
 import com.google.common.collect.Iterables;
 import com.google.common.collect.Maps;
 import com.google.common.collect.Multimap;
 import com.google.common.collect.Ordering;
+import com.google.common.collect.Sets;
 import com.google.common.eventbus.Subscribe;
 
 import org.apache.aurora.common.inject.TimedInterceptor.Timed;
@@ -70,9 +70,17 @@ public interface OfferManager extends EventSubscriber {
    * Invalidates an offer.  This indicates that the scheduler should not attempt to match any
    * tasks against the offer.
    *
-   * @param offerId Canceled offer.
+   * @param offerId Cancelled offer.
+   * @return A boolean on whether or not the offer was successfully cancelled.
    */
-  void cancelOffer(OfferID offerId);
+  boolean cancelOffer(OfferID offerId);
+
+  /**
+   * Exclude an offer from being matched against all tasks.
+   *
+   * @param offerId Offer ID to ban.
+   */
+  void banOffer(OfferID offerId);
 
   /**
    * Exclude an offer that results in a static mismatch from further attempts to match against all
@@ -81,7 +89,7 @@ public interface OfferManager extends EventSubscriber {
    * @param offerId Offer ID to exclude for the given {@code groupKey}.
    * @param groupKey Task group key to exclude.
    */
-  void banOffer(OfferID offerId, TaskGroupKey groupKey);
+  void banOfferForTaskGroup(OfferID offerId, TaskGroupKey groupKey);
 
   /**
    * Launches the task matched against the offer.
@@ -100,14 +108,14 @@ public interface OfferManager extends EventSubscriber {
   void hostAttributesChanged(HostAttributesChanged change);
 
   /**
-   * Gets the offers that the scheduler is holding.
+   * Gets the offers that the scheduler is holding, excluding banned offers.
    *
    * @return A snapshot of the offers that the scheduler is currently holding.
    */
   Iterable<HostOffer> getOffers();
 
   /**
-   * Gets all offers that are not statically banned for the given {@code groupKey}.
+   * Gets all offers that are not banned for the given {@code groupKey}.
    *
    * @param groupKey Task group key to check offers for.
    * @return A snapshot of all offers eligible for the given {@code groupKey}.
@@ -147,6 +155,8 @@ public interface OfferManager extends EventSubscriber {
     static final String STATICALLY_BANNED_OFFERS = "statically_banned_offers_size";
     @VisibleForTesting
     static final String OFFER_CANCEL_FAILURES = "offer_cancel_failures";
+    @VisibleForTesting
+    static final String GLOBALLY_BANNED_OFFERS = "globally_banned_offers_size";
 
     private final HostOffers hostOffers;
     private final AtomicLong offerRaces;
@@ -178,7 +188,7 @@ public interface OfferManager extends EventSubscriber {
       // temporarily hold two offers for the same host, which should be corrected when we return
       // them after the return delay.
       // There's also a chance that we return an offer for compaction ~simultaneously with the
-      // same-host offer being canceled/returned.  This is also fine.
+      // same-host offer being cancelled/returned.  This is also fine.
       Optional<HostOffer> sameSlave = hostOffers.get(offer.getOffer().getAgentId());
       if (sameSlave.isPresent()) {
         // If there are existing offers for the slave, decline all of them so the master can
@@ -195,13 +205,13 @@ public interface OfferManager extends EventSubscriber {
       }
     }
 
-    void removeAndDecline(OfferID id) {
+    private void removeAndDecline(OfferID id) {
       if (removeFromHostOffers(id)) {
         decline(id);
       }
     }
 
-    void decline(OfferID id) {
+    private void decline(OfferID id) {
       LOG.debug("Declining offer {}", id);
       driver.declineOffer(id, getOfferFilter());
     }
@@ -213,7 +223,7 @@ public interface OfferManager extends EventSubscriber {
     }
 
     @Override
-    public void cancelOffer(final OfferID offerId) {
+    public boolean cancelOffer(final OfferID offerId) {
       boolean success = removeFromHostOffers(offerId);
       if (!success) {
         // This will happen rarely when we race to process this rescind against accepting the offer
@@ -222,6 +232,12 @@ public interface OfferManager extends EventSubscriber {
         LOG.warn("Failed to cancel offer: {}.", offerId.getValue());
         this.offerCancelFailures.incrementAndGet();
       }
+      return success;
+    }
+
+    @Override
+    public void banOffer(OfferID offerId) {
+      hostOffers.addGlobalBan(offerId);
     }
 
     private boolean removeFromHostOffers(final OfferID offerId) {
@@ -287,16 +303,25 @@ public interface OfferManager extends EventSubscriber {
       // scheduling attempts. See VetoGroup for more details on static ban.
       private final Multimap<OfferID, TaskGroupKey> staticallyBannedOffers = HashMultimap.create();
 
+      // Keep track of globally banned offers that will never be matched to anything.
+      private final Set<OfferID> globallyBannedOffers = Sets.newHashSet();
+
       HostOffers(StatsProvider statsProvider, Ordering<HostOffer> offerOrder) {
         offers = new ConcurrentSkipListSet<>(offerOrder);
         // Potential gotcha - since this is a ConcurrentSkipListSet, size() is more expensive.
         // Could track this separately if it turns out to pose problems.
         statsProvider.exportSize(OUTSTANDING_OFFERS, offers);
         statsProvider.makeGauge(STATICALLY_BANNED_OFFERS, () -> staticallyBannedOffers.size());
+        statsProvider.makeGauge(GLOBALLY_BANNED_OFFERS, () -> globallyBannedOffers.size());
       }
 
       synchronized Optional<HostOffer> get(AgentID slaveId) {
-        return Optional.fromNullable(offersBySlave.get(slaveId));
+        HostOffer offer = offersBySlave.get(slaveId);
+        if (offer == null || globallyBannedOffers.contains(offer.getOffer().getId())) {
+          return Optional.absent();
+        }
+
+        return Optional.of(offer);
       }
 
       synchronized void add(HostOffer offer) {
@@ -312,8 +337,9 @@ public interface OfferManager extends EventSubscriber {
           offers.remove(removed);
           offersBySlave.remove(removed.getOffer().getAgentId());
           offersByHost.remove(removed.getOffer().getHostname());
-          staticallyBannedOffers.removeAll(id);
         }
+        staticallyBannedOffers.removeAll(id);
+        globallyBannedOffers.remove(id);
         return removed != null;
       }
 
@@ -327,12 +353,19 @@ public interface OfferManager extends EventSubscriber {
       }
 
       synchronized Iterable<HostOffer> getOffers() {
-        return ImmutableSet.copyOf(offers);
+        return Iterables.unmodifiableIterable(FluentIterable.from(offers).filter(
+            e -> !globallyBannedOffers.contains(e.getOffer().getId())
+        ));
       }
 
       synchronized Iterable<HostOffer> getWeaklyConsistentOffers(TaskGroupKey groupKey) {
         return Iterables.unmodifiableIterable(FluentIterable.from(offers).filter(
-            e -> !staticallyBannedOffers.containsEntry(e.getOffer().getId(), groupKey)));
+            e -> !staticallyBannedOffers.containsEntry(e.getOffer().getId(), groupKey)
+                && !globallyBannedOffers.contains(e.getOffer().getId())));
+      }
+
+      synchronized void addGlobalBan(OfferID offerId) {
+        globallyBannedOffers.add(offerId);
       }
 
       synchronized void addStaticGroupBan(OfferID offerId, TaskGroupKey groupKey) {
@@ -347,11 +380,12 @@ public interface OfferManager extends EventSubscriber {
         offersBySlave.clear();
         offersByHost.clear();
         staticallyBannedOffers.clear();
+        globallyBannedOffers.clear();
       }
     }
 
     @Override
-    public void banOffer(OfferID offerId, TaskGroupKey groupKey) {
+    public void banOfferForTaskGroup(OfferID offerId, TaskGroupKey groupKey) {
       hostOffers.addStaticGroupBan(offerId, groupKey);
     }
 

http://git-wip-us.apache.org/repos/asf/aurora/blob/62e46cde/src/main/java/org/apache/aurora/scheduler/state/TaskAssigner.java
----------------------------------------------------------------------
diff --git a/src/main/java/org/apache/aurora/scheduler/state/TaskAssigner.java b/src/main/java/org/apache/aurora/scheduler/state/TaskAssigner.java
index 25399e4..e35720f 100644
--- a/src/main/java/org/apache/aurora/scheduler/state/TaskAssigner.java
+++ b/src/main/java/org/apache/aurora/scheduler/state/TaskAssigner.java
@@ -196,7 +196,7 @@ public interface TaskAssigner {
       } else {
         if (Veto.identifyGroup(vetoes) == VetoGroup.STATIC) {
           // Never attempt to match this offer/groupKey pair again.
-          offerManager.banOffer(offer.getOffer().getId(), groupKey);
+          offerManager.banOfferForTaskGroup(offer.getOffer().getId(), groupKey);
         }
         LOG.debug("Agent {} vetoed task {}: {}", offer.getOffer().getHostname(), taskId, vetoes);
       }

http://git-wip-us.apache.org/repos/asf/aurora/blob/62e46cde/src/test/java/org/apache/aurora/scheduler/mesos/MesosCallbackHandlerTest.java
----------------------------------------------------------------------
diff --git a/src/test/java/org/apache/aurora/scheduler/mesos/MesosCallbackHandlerTest.java b/src/test/java/org/apache/aurora/scheduler/mesos/MesosCallbackHandlerTest.java
index b5fa1c8..4d1a676 100644
--- a/src/test/java/org/apache/aurora/scheduler/mesos/MesosCallbackHandlerTest.java
+++ b/src/test/java/org/apache/aurora/scheduler/mesos/MesosCallbackHandlerTest.java
@@ -13,6 +13,9 @@
  */
 package org.apache.aurora.scheduler.mesos;
 
+import java.util.concurrent.Executor;
+import java.util.concurrent.Executors;
+
 import com.google.common.base.Optional;
 import com.google.common.collect.ImmutableList;
 import com.google.common.collect.ImmutableSet;
@@ -48,6 +51,9 @@ import static org.apache.aurora.gen.MaintenanceMode.DRAINING;
 import static org.apache.aurora.gen.MaintenanceMode.NONE;
 import static org.easymock.EasyMock.expect;
 import static org.easymock.EasyMock.expectLastCall;
+import static org.easymock.EasyMock.replay;
+import static org.easymock.EasyMock.strictMock;
+import static org.easymock.EasyMock.verify;
 import static org.junit.Assert.assertEquals;
 
 public class MesosCallbackHandlerTest extends EasyMockTest {
@@ -177,6 +183,10 @@ public class MesosCallbackHandlerTest extends EasyMockTest {
   }
 
   private void createHandler(boolean mockLogger) {
+    createHandler(mockLogger, MoreExecutors.directExecutor());
+  }
+
+  private void createHandler(boolean mockLogger, Executor customExecutor) {
     if (mockLogger) {
       injectedLog = createMock(Logger.class);
     } else {
@@ -189,14 +199,13 @@ public class MesosCallbackHandlerTest extends EasyMockTest {
         statusHandler,
         offerManager,
         eventSink,
-        MoreExecutors.directExecutor(),
+        customExecutor,
         injectedLog,
         statsProvider,
         driver,
         clock,
         controller,
         DRAIN_THRESHOLD);
-
   }
 
   @Test
@@ -302,7 +311,7 @@ public class MesosCallbackHandlerTest extends EasyMockTest {
 
   @Test
   public void testRescind() {
-    offerManager.cancelOffer(OFFER_ID);
+    expect(offerManager.cancelOffer(OFFER_ID)).andReturn(true);
 
     control.replay();
 
@@ -310,6 +319,44 @@ public class MesosCallbackHandlerTest extends EasyMockTest {
     assertEquals(1L, statsProvider.getLongValue("offers_rescinded"));
   }
 
+  /**
+   * Ensure that if we get a rescind and the offer has not been added yet, we will ban it and
+   * eventually unban.
+   */
+  @Test
+  public void testRescindBeforeAdd() throws InterruptedException {
+    // We want to observe the order of the offerManager calls to we create a strict mock.
+    offerManager = strictMock(OfferManager.class);
+
+    long delayInMilliseconds = 50;
+    createHandler(false, new DelayExecutor(delayInMilliseconds));
+
+    expect(offerManager.cancelOffer(OFFER_ID)).andReturn(false);
+    offerManager.banOffer(OFFER_ID);
+    storageUtil.expectOperations();
+    expectOfferAttributesSaved(HOST_OFFER);
+    offerManager.addOffer(HOST_OFFER);
+    expect(offerManager.cancelOffer(OFFER_ID)).andReturn(true);
+
+    control.replay();
+    replay(offerManager);
+
+    // Offer comes in, it will be put on the executor queue to add.
+    handler.handleOffers(ImmutableList.of(HOST_OFFER.getOffer()));
+
+    // Rescind comes in asynchronously, and we do not see HOST_OFFER in available list so we will
+    // temporarily ban it and add a command to the executor to unban it later. As the executor
+    // processes commands, we will try to add HOST_OFFER but it should be already banned.
+    // Eventually, we unban the offer.
+    handler.handleRescind(OFFER_ID);
+
+    // 2 commands executed (addOffer and unbanOffer) so we wait the length of 3.
+    Thread.sleep(delayInMilliseconds * 3);
+
+    assertEquals(1L, statsProvider.getLongValue("offers_rescinded"));
+    verify(offerManager);
+  }
+
   @Test
   public void testError() {
     shutdownCommand.execute();
@@ -487,4 +534,30 @@ public class MesosCallbackHandlerTest extends EasyMockTest {
     handler.handleInverseOffer(ImmutableList.of(INVERSE_OFFER));
     assertEquals(1L, statsProvider.getLongValue("scheduler_inverse_offers"));
   }
+
+  /**
+   * Test executor that will execute commands after waiting {@code delayTimeInMilliseconds}.
+   */
+  private static final class DelayExecutor implements Executor {
+    private final Executor executor = Executors.newSingleThreadExecutor();
+    private final long delayTimeInMilliseconds;
+
+    DelayExecutor(long delayTimeInMilliseconds) {
+      this.delayTimeInMilliseconds = delayTimeInMilliseconds;
+    }
+
+    @Override
+    public void execute(Runnable command) {
+      executor.execute(() -> {
+        try {
+          Thread.sleep(delayTimeInMilliseconds);
+        } catch (InterruptedException e) {
+          // Do not interrupt this thread.
+          throw new RuntimeException("DelayExecutor sleep interrupted.", e);
+        }
+
+        command.run();
+      });
+    }
+  }
 }

http://git-wip-us.apache.org/repos/asf/aurora/blob/62e46cde/src/test/java/org/apache/aurora/scheduler/offers/OfferManagerImplTest.java
----------------------------------------------------------------------
diff --git a/src/test/java/org/apache/aurora/scheduler/offers/OfferManagerImplTest.java b/src/test/java/org/apache/aurora/scheduler/offers/OfferManagerImplTest.java
index be02449..b92ad35 100644
--- a/src/test/java/org/apache/aurora/scheduler/offers/OfferManagerImplTest.java
+++ b/src/test/java/org/apache/aurora/scheduler/offers/OfferManagerImplTest.java
@@ -50,6 +50,7 @@ import static org.apache.aurora.gen.MaintenanceMode.DRAINING;
 import static org.apache.aurora.gen.MaintenanceMode.NONE;
 import static org.apache.aurora.scheduler.base.TaskTestUtil.JOB;
 import static org.apache.aurora.scheduler.base.TaskTestUtil.makeTask;
+import static org.apache.aurora.scheduler.offers.OfferManager.OfferManagerImpl.GLOBALLY_BANNED_OFFERS;
 import static org.apache.aurora.scheduler.offers.OfferManager.OfferManagerImpl.OFFER_ACCEPT_RACES;
 import static org.apache.aurora.scheduler.offers.OfferManager.OfferManagerImpl.OFFER_CANCEL_FAILURES;
 import static org.apache.aurora.scheduler.offers.OfferManager.OfferManagerImpl.OUTSTANDING_OFFERS;
@@ -250,14 +251,14 @@ public class OfferManagerImplTest extends EasyMockTest {
     control.replay();
 
     // Static ban ignored when now offers.
-    offerManager.banOffer(OFFER_A_ID, GROUP_KEY);
+    offerManager.banOfferForTaskGroup(OFFER_A_ID, GROUP_KEY);
     assertEquals(0L, statsProvider.getLongValue(STATICALLY_BANNED_OFFERS));
     offerManager.addOffer(OFFER_A);
     assertEquals(OFFER_A, Iterables.getOnlyElement(offerManager.getOffers(GROUP_KEY)));
     assertEquals(OFFER_A, Iterables.getOnlyElement(offerManager.getOffers()));
 
     // Add static ban.
-    offerManager.banOffer(OFFER_A_ID, GROUP_KEY);
+    offerManager.banOfferForTaskGroup(OFFER_A_ID, GROUP_KEY);
     assertEquals(1L, statsProvider.getLongValue(STATICALLY_BANNED_OFFERS));
     assertEquals(OFFER_A, Iterables.getOnlyElement(offerManager.getOffers()));
     assertTrue(Iterables.isEmpty(offerManager.getOffers(GROUP_KEY)));
@@ -274,7 +275,7 @@ public class OfferManagerImplTest extends EasyMockTest {
     control.replay();
 
     offerManager.addOffer(OFFER_A);
-    offerManager.banOffer(OFFER_A_ID, GROUP_KEY);
+    offerManager.banOfferForTaskGroup(OFFER_A_ID, GROUP_KEY);
     assertEquals(OFFER_A, Iterables.getOnlyElement(offerManager.getOffers()));
     assertTrue(Iterables.isEmpty(offerManager.getOffers(GROUP_KEY)));
     assertEquals(1L, statsProvider.getLongValue(STATICALLY_BANNED_OFFERS));
@@ -296,7 +297,7 @@ public class OfferManagerImplTest extends EasyMockTest {
     control.replay();
 
     offerManager.addOffer(OFFER_A);
-    offerManager.banOffer(OFFER_A_ID, GROUP_KEY);
+    offerManager.banOfferForTaskGroup(OFFER_A_ID, GROUP_KEY);
     assertEquals(OFFER_A, Iterables.getOnlyElement(offerManager.getOffers()));
     assertTrue(Iterables.isEmpty(offerManager.getOffers(GROUP_KEY)));
     assertEquals(1L, statsProvider.getLongValue(STATICALLY_BANNED_OFFERS));
@@ -383,6 +384,28 @@ public class OfferManagerImplTest extends EasyMockTest {
     assertEquals(0L, statsProvider.getLongValue(OUTSTANDING_OFFERS));
   }
 
+  @Test
+  public void testBanAndUnbanOffer() throws Exception {
+    driver.declineOffer(OFFER_A.getOffer().getId(), OFFER_FILTER);
+    expectLastCall().times(2);
+    control.replay();
+
+    // After adding a banned offer, user can see it is in OUTSTANDING_OFFERS but cannot retrieve it.
+    offerManager.banOffer(OFFER_A_ID);
+    offerManager.addOffer(OFFER_A);
+    assertEquals(1L, statsProvider.getLongValue(OUTSTANDING_OFFERS));
+    assertEquals(1L, statsProvider.getLongValue(GLOBALLY_BANNED_OFFERS));
+    assertTrue(Iterables.isEmpty(offerManager.getOffers(GROUP_KEY)));
+    clock.advance(RETURN_DELAY);
+
+    offerManager.cancelOffer(OFFER_A_ID);
+    offerManager.addOffer(OFFER_A);
+    assertEquals(1L, statsProvider.getLongValue(OUTSTANDING_OFFERS));
+    assertEquals(0L, statsProvider.getLongValue(GLOBALLY_BANNED_OFFERS));
+    assertEquals(OFFER_A, Iterables.getOnlyElement(offerManager.getOffers(GROUP_KEY)));
+    clock.advance(RETURN_DELAY);
+  }
+
   private static HostOffer setUnavailability(HostOffer offer, Long startMs) {
     Unavailability unavailability = Unavailability.newBuilder()
         .setStart(TimeInfo.newBuilder().setNanoseconds(startMs * 1000L)).build();

http://git-wip-us.apache.org/repos/asf/aurora/blob/62e46cde/src/test/java/org/apache/aurora/scheduler/state/FirstFitTaskAssignerTest.java
----------------------------------------------------------------------
diff --git a/src/test/java/org/apache/aurora/scheduler/state/FirstFitTaskAssignerTest.java b/src/test/java/org/apache/aurora/scheduler/state/FirstFitTaskAssignerTest.java
index 25c1137..a74efbd 100644
--- a/src/test/java/org/apache/aurora/scheduler/state/FirstFitTaskAssignerTest.java
+++ b/src/test/java/org/apache/aurora/scheduler/state/FirstFitTaskAssignerTest.java
@@ -200,7 +200,7 @@ public class FirstFitTaskAssignerTest extends EasyMockTest {
   public void testAssignVetoesWithStaticBan() throws Exception {
     expectNoUpdateReservations(1);
     expect(offerManager.getOffers(GROUP_KEY)).andReturn(ImmutableSet.of(OFFER));
-    offerManager.banOffer(MESOS_OFFER.getId(), GROUP_KEY);
+    offerManager.banOfferForTaskGroup(MESOS_OFFER.getId(), GROUP_KEY);
     expect(tierManager.getTier(TASK.getAssignedTask().getTask())).andReturn(DEV_TIER);
     expect(filter.filter(UNUSED, resourceRequest))
         .andReturn(ImmutableSet.of(Veto.constraintMismatch("denied")));
@@ -359,7 +359,7 @@ public class FirstFitTaskAssignerTest extends EasyMockTest {
             mismatched.getAttributes()),
         resourceRequest))
         .andReturn(ImmutableSet.of(Veto.constraintMismatch("constraint mismatch")));
-    offerManager.banOffer(mismatched.getOffer().getId(), GROUP_KEY);
+    offerManager.banOfferForTaskGroup(mismatched.getOffer().getId(), GROUP_KEY);
     expect(filter.filter(
         new UnusedResource(
             bagFromMesosResources(MESOS_OFFER.getResourcesList()), OFFER.getAttributes()),
@@ -436,7 +436,7 @@ public class FirstFitTaskAssignerTest extends EasyMockTest {
     expect(filter.filter(UNUSED, resourceRequest))
         .andReturn(ImmutableSet.of(Veto.insufficientResources("cpu", 1)));
     expect(tierManager.getTier(TASK.getAssignedTask().getTask())).andReturn(DEV_TIER);
-    offerManager.banOffer(MESOS_OFFER.getId(), GROUP_KEY);
+    offerManager.banOfferForTaskGroup(MESOS_OFFER.getId(), GROUP_KEY);
     expectLastCall();
 
     control.replay();
@@ -479,7 +479,7 @@ public class FirstFitTaskAssignerTest extends EasyMockTest {
         .andReturn(ImmutableSet.of());
     expect(filter.filter(UNUSED, resources))
         .andReturn(ImmutableSet.of(Veto.constraintMismatch("lol")));
-    offerManager.banOffer(MESOS_OFFER.getId(), GROUP_KEY);
+    offerManager.banOfferForTaskGroup(MESOS_OFFER.getId(), GROUP_KEY);
 
     control.replay();