You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@aurora.apache.org by "Jordan Ly (JIRA)" <ji...@apache.org> on 2017/09/26 23:11:00 UTC
[jira] [Updated] (AURORA-1949) PreemptionVictimFilterImpl
comparator violates transitivity causing exceptions
[ https://issues.apache.org/jira/browse/AURORA-1949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jordan Ly updated AURORA-1949:
------------------------------
Description:
The PreemptionVictimFilterImpl uses a comparator to sort ResourceBags in order to preempt the biggest tasks first when searching for a victim. However, the current implementation causes an exception which causes the Scheduler to fail:
{noformat}
SEVERE: Service PreemptorService [FAILED] has failed in the RUNNING state.
java.lang.IllegalArgumentException: Comparison method violates its general contract!
at java.util.TimSort.mergeLo(TimSort.java:777)
at java.util.TimSort.mergeAt(TimSort.java:514)
at java.util.TimSort.mergeCollapse(TimSort.java:441)
at java.util.TimSort.sort(TimSort.java:245)
at java.util.Arrays.sort(Arrays.java:1438)
at com.google.common.collect.Ordering.immutableSortedCopy(Ordering.java:882)
at org.apache.aurora.scheduler.preemptor.PreemptionVictimFilter$PreemptionVictimFilterImpl.filterPreemptionVictims(PreemptionVictimFilter.java:210)
at org.apache.aurora.scheduler.preemptor.PendingTaskProcessor.lambda$run$0(PendingTaskProcessor.java:178)
at org.apache.aurora.scheduler.storage.db.DbStorage.read(DbStorage.java:147)
at org.mybatis.guice.transactional.TransactionalMethodInterceptor.invoke(TransactionalMethodInterceptor.java:101)
at org.apache.aurora.common.inject.TimedInterceptor.invoke(TimedInterceptor.java:83)
at org.apache.aurora.scheduler.storage.log.LogStorage.read(LogStorage.java:562)
at org.apache.aurora.scheduler.storage.CallOrderEnforcingStorage.read(CallOrderEnforcingStorage.java:113)
at org.apache.aurora.scheduler.preemptor.PendingTaskProcessor.run(PendingTaskProcessor.java:135)
at org.apache.aurora.common.inject.TimedInterceptor.invoke(TimedInterceptor.java:83)
at org.apache.aurora.scheduler.preemptor.PreemptorModule$PreemptorService.runOneIteration(PreemptorModule.java:205)
at com.google.common.util.concurrent.AbstractScheduledService$ServiceDelegate$Task.run(AbstractScheduledService.java:188)
at com.google.common.util.concurrent.Callables$4.run(Callables.java:122)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
{noformat}
Looking at the code, it seems it violates transitivity:
{code:java}
@VisibleForTesting
static final Ordering<ResourceBag> ORDER = new Ordering<ResourceBag>() {
@Override
public int compare(ResourceBag left, ResourceBag right) {
Set<ResourceType> types = ImmutableSet.<ResourceType>builder()
.addAll(left.streamResourceVectors().map(e -> e.getKey()).iterator())
.addAll(right.streamResourceVectors().map(e -> e.getKey()).iterator())
.build();
boolean allZero = true;
boolean allGreaterOrEqual = true;
boolean allLessOrEqual = true;
for (ResourceType type : types) {
int compare = left.valueOf(type).compareTo(right.valueOf(type));
if (compare != 0) {
allZero = false;
}
if (compare < 0) {
allGreaterOrEqual = false;
}
if (compare > 0) {
allLessOrEqual = false;
}
}
if (allZero) {
return 0;
}
if (allGreaterOrEqual) {
return 1;
}
if (allLessOrEqual) {
return -1;
}
return 0;
}
};
{code}
The example below illustrates the error:
{noformat}
Resource: X Y Z
Bag A: 2 0 2
Bag B: 1 2 1
Bag C: 2 2 1
{noformat}
We can see that A = B, B < C, and C = A which would cause an exception.
There are a couple of routes we can take to solve this. What we really want is to be able to define partial ordering (comparator does total ordering) so we can do a topological sort. Additionally, we can give priority to particular resources (Dominant Resource Fairness).
was:
The PreemptionVictimFilterImpl uses a comparator to sort ResourceBags in order to preempt the biggest tasks first when searching for a victim. However, the current implementation causes an exception which causes the Scheduler to fail:
{noformat}
SEVERE: Service PreemptorService [FAILED] has failed in the RUNNING state.
java.lang.IllegalArgumentException: Comparison method violates its general contract!
at java.util.TimSort.mergeLo(TimSort.java:777)
at java.util.TimSort.mergeAt(TimSort.java:514)
at java.util.TimSort.mergeCollapse(TimSort.java:441)
at java.util.TimSort.sort(TimSort.java:245)
at java.util.Arrays.sort(Arrays.java:1438)
at com.google.common.collect.Ordering.immutableSortedCopy(Ordering.java:882)
at org.apache.aurora.scheduler.preemptor.PreemptionVictimFilter$PreemptionVictimFilterImpl.filterPreemptionVictims(PreemptionVictimFilter.java:210)
at org.apache.aurora.scheduler.preemptor.PendingTaskProcessor.lambda$run$0(PendingTaskProcessor.java:178)
at org.apache.aurora.scheduler.storage.db.DbStorage.read(DbStorage.java:147)
at org.mybatis.guice.transactional.TransactionalMethodInterceptor.invoke(TransactionalMethodInterceptor.java:101)
at org.apache.aurora.common.inject.TimedInterceptor.invoke(TimedInterceptor.java:83)
at org.apache.aurora.scheduler.storage.log.LogStorage.read(LogStorage.java:562)
at org.apache.aurora.scheduler.storage.CallOrderEnforcingStorage.read(CallOrderEnforcingStorage.java:113)
at org.apache.aurora.scheduler.preemptor.PendingTaskProcessor.run(PendingTaskProcessor.java:135)
at org.apache.aurora.common.inject.TimedInterceptor.invoke(TimedInterceptor.java:83)
at org.apache.aurora.scheduler.preemptor.PreemptorModule$PreemptorService.runOneIteration(PreemptorModule.java:205)
at com.google.common.util.concurrent.AbstractScheduledService$ServiceDelegate$Task.run(AbstractScheduledService.java:188)
at com.google.common.util.concurrent.Callables$4.run(Callables.java:122)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
{noformat}
Looking at the code, it seems it violates transitivity:
{code:java}
@VisibleForTesting
static final Ordering<ResourceBag> ORDER = new Ordering<ResourceBag>() {
@Override
public int compare(ResourceBag left, ResourceBag right) {
Set<ResourceType> types = ImmutableSet.<ResourceType>builder()
.addAll(left.streamResourceVectors().map(e -> e.getKey()).iterator())
.addAll(right.streamResourceVectors().map(e -> e.getKey()).iterator())
.build();
boolean allZero = true;
boolean allGreaterOrEqual = true;
boolean allLessOrEqual = true;
for (ResourceType type : types) {
int compare = left.valueOf(type).compareTo(right.valueOf(type));
if (compare != 0) {
allZero = false;
}
if (compare < 0) {
allGreaterOrEqual = false;
}
if (compare > 0) {
allLessOrEqual = false;
}
}
if (allZero) {
return 0;
}
if (allGreaterOrEqual) {
return 1;
}
if (allLessOrEqual) {
return -1;
}
return 0;
}
};
{code}
The example below illustrates the error:
{noformat}
Resource: X Y Z
Bag A: 2 0 2
Bag B: 1 2 1
Bag C: 2 2 1
{noformat}
We can see that A = B, B < C, and C = A which would cause an exception.
There are a couple of routes we can take to solve this. What we really want is to be able to define partial ordering (comparator does total ordering) so we can do a topological sort. Additionally, we can give priority to particular resources (Dominant Resource Fairness).
> PreemptionVictimFilterImpl comparator violates transitivity causing exceptions
> ------------------------------------------------------------------------------
>
> Key: AURORA-1949
> URL: https://issues.apache.org/jira/browse/AURORA-1949
> Project: Aurora
> Issue Type: Bug
> Components: Scheduler
> Reporter: Jordan Ly
> Priority: Critical
>
> The PreemptionVictimFilterImpl uses a comparator to sort ResourceBags in order to preempt the biggest tasks first when searching for a victim. However, the current implementation causes an exception which causes the Scheduler to fail:
> {noformat}
> SEVERE: Service PreemptorService [FAILED] has failed in the RUNNING state.
> java.lang.IllegalArgumentException: Comparison method violates its general contract!
> at java.util.TimSort.mergeLo(TimSort.java:777)
> at java.util.TimSort.mergeAt(TimSort.java:514)
> at java.util.TimSort.mergeCollapse(TimSort.java:441)
> at java.util.TimSort.sort(TimSort.java:245)
> at java.util.Arrays.sort(Arrays.java:1438)
> at com.google.common.collect.Ordering.immutableSortedCopy(Ordering.java:882)
> at org.apache.aurora.scheduler.preemptor.PreemptionVictimFilter$PreemptionVictimFilterImpl.filterPreemptionVictims(PreemptionVictimFilter.java:210)
> at org.apache.aurora.scheduler.preemptor.PendingTaskProcessor.lambda$run$0(PendingTaskProcessor.java:178)
> at org.apache.aurora.scheduler.storage.db.DbStorage.read(DbStorage.java:147)
> at org.mybatis.guice.transactional.TransactionalMethodInterceptor.invoke(TransactionalMethodInterceptor.java:101)
> at org.apache.aurora.common.inject.TimedInterceptor.invoke(TimedInterceptor.java:83)
> at org.apache.aurora.scheduler.storage.log.LogStorage.read(LogStorage.java:562)
> at org.apache.aurora.scheduler.storage.CallOrderEnforcingStorage.read(CallOrderEnforcingStorage.java:113)
> at org.apache.aurora.scheduler.preemptor.PendingTaskProcessor.run(PendingTaskProcessor.java:135)
> at org.apache.aurora.common.inject.TimedInterceptor.invoke(TimedInterceptor.java:83)
> at org.apache.aurora.scheduler.preemptor.PreemptorModule$PreemptorService.runOneIteration(PreemptorModule.java:205)
> at com.google.common.util.concurrent.AbstractScheduledService$ServiceDelegate$Task.run(AbstractScheduledService.java:188)
> at com.google.common.util.concurrent.Callables$4.run(Callables.java:122)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> {noformat}
> Looking at the code, it seems it violates transitivity:
> {code:java}
> @VisibleForTesting
> static final Ordering<ResourceBag> ORDER = new Ordering<ResourceBag>() {
> @Override
> public int compare(ResourceBag left, ResourceBag right) {
> Set<ResourceType> types = ImmutableSet.<ResourceType>builder()
> .addAll(left.streamResourceVectors().map(e -> e.getKey()).iterator())
> .addAll(right.streamResourceVectors().map(e -> e.getKey()).iterator())
> .build();
> boolean allZero = true;
> boolean allGreaterOrEqual = true;
> boolean allLessOrEqual = true;
> for (ResourceType type : types) {
> int compare = left.valueOf(type).compareTo(right.valueOf(type));
> if (compare != 0) {
> allZero = false;
> }
> if (compare < 0) {
> allGreaterOrEqual = false;
> }
> if (compare > 0) {
> allLessOrEqual = false;
> }
> }
> if (allZero) {
> return 0;
> }
> if (allGreaterOrEqual) {
> return 1;
> }
> if (allLessOrEqual) {
> return -1;
> }
> return 0;
> }
> };
> {code}
> The example below illustrates the error:
> {noformat}
> Resource: X Y Z
> Bag A: 2 0 2
> Bag B: 1 2 1
> Bag C: 2 2 1
> {noformat}
> We can see that A = B, B < C, and C = A which would cause an exception.
> There are a couple of routes we can take to solve this. What we really want is to be able to define partial ordering (comparator does total ordering) so we can do a topological sort. Additionally, we can give priority to particular resources (Dominant Resource Fairness).
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)