Posted to reviews@aurora.apache.org by David McLaughlin <da...@dmclaughlin.com> on 2017/02/18 00:13:53 UTC
Review Request 56797: Move task conversion during reconciliation into
the delayed closure.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/56797/
-----------------------------------------------------------
Review request for Aurora, Mehrdad Nurolahzade and Zameer Manji.
Repository: aurora
Description
-------
This is a small change to relieve GC pressure while explicit reconciliation runs. It moves the IScheduledTask -> TaskStatus conversion into the batch processing closure so that any object allocation and collection overhead is delayed until the batch is actually processed. It has a noticeable effect on GC for large numbers of RUNNING tasks.
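To illustrate the idea, here is a minimal sketch. It is not the actual TaskReconciler code: `Task`, `Status`, `toStatus`, and the `eager`/`lazy` helpers are hypothetical stand-ins for `IScheduledTask`, `TaskStatus`, and the real conversion and batching logic. It only contrasts converting everything up front with converting inside each batch's closure.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; none of these names come from Aurora.
final class LazyBatchConversion {
  static final class Task {
    final String id;
    Task(String id) { this.id = id; }
  }

  static final class Status {
    final String taskId;
    Status(String taskId) { this.taskId = taskId; }
  }

  // Counts conversions so the timing of allocation is observable.
  static int conversions = 0;

  static Status toStatus(Task task) {
    conversions++;
    return new Status(task.id);
  }

  // Eager: every Status is allocated before any batch runs, so all of
  // them stay reachable until the last closure has been processed.
  static List<Runnable> eager(List<Task> tasks, int batchSize, List<Status> sink) {
    List<Status> all = new ArrayList<>();
    for (Task task : tasks) {
      all.add(toStatus(task));
    }
    List<Runnable> closures = new ArrayList<>();
    for (int i = 0; i < all.size(); i += batchSize) {
      List<Status> batch = all.subList(i, Math.min(i + batchSize, all.size()));
      closures.add(() -> sink.addAll(batch));
    }
    return closures;
  }

  // Lazy: each closure captures only its slice of source tasks and
  // converts them when it actually runs.
  static List<Runnable> lazy(List<Task> tasks, int batchSize, List<Status> sink) {
    List<Runnable> closures = new ArrayList<>();
    for (int i = 0; i < tasks.size(); i += batchSize) {
      List<Task> batch = tasks.subList(i, Math.min(i + batchSize, tasks.size()));
      closures.add(() -> {
        for (Task task : batch) {
          sink.add(toStatus(task));
        }
      });
    }
    return closures;
  }

  public static void main(String[] args) {
    List<Task> tasks = new ArrayList<>();
    for (int i = 0; i < 5; i++) {
      tasks.add(new Task("task-" + i));
    }
    List<Status> sink = new ArrayList<>();
    List<Runnable> closures = lazy(tasks, 2, sink);
    System.out.println("closures=" + closures.size()
        + " conversions before run=" + conversions);
    closures.forEach(Runnable::run);
    System.out.println("conversions after run=" + conversions);
  }
}
```

In the lazy variant the converted objects for a batch only come into existence when that batch's closure runs, which is the property the description relies on. In a real scheduler the closures would be run with a delay between them rather than back to back.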
Diffs
-----
src/main/java/org/apache/aurora/scheduler/reconciliation/TaskReconciler.java ec7ccafcd360c00beceb067963bc430b6b8d8256
Diff: https://reviews.apache.org/r/56797/diff/
Testing
-------
This is running in prod at Twitter. Our post-snapshot stop-the-world GC hit is reduced dramatically, maybe about 80% of the time, with this change.
Thanks,
David McLaughlin
Re: Review Request 56797: Move task conversion during reconciliation
into the delayed closure.
Posted by Reza Motamedi <re...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/56797/#review166041
-----------------------------------------------------------
Ship it!
- Reza Motamedi
Re: Review Request 56797: Move task conversion during reconciliation
into the delayed closure.
Posted by Mehrdad Nurolahzade <me...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/56797/#review166014
-----------------------------------------------------------
Ship it!
This brings up the discussion we had around `TaskHistoryPruner` design alternatives ([rb](https://reviews.apache.org/r/56575/)):
1. Load all expired tasks at once, filter and delete.
2. Load in smaller batch sizes (perhaps per job), filter, and delete (maybe also add a `Thread.sleep()` pause).
The takeaway here is that converting tasks from `IScheduledTask` to `TaskStatus` in smaller batches, with delays in between, relieves heap pressure. By the same logic, I would assume pruning expired tasks in batches (option 2 above) would produce less heap pressure (even though it is not as efficient).
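As a rough sketch of option 2, assuming a simple in-memory map in place of the real task store: `BatchedPruner`, `pruneInBatches`, and the `Map<String, Long>` of task ID to timestamp are hypothetical and are not the `TaskHistoryPruner` API. Only IDs are snapshotted up front; records are examined and deleted one batch at a time, with an optional pause between batches.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not the TaskHistoryPruner implementation.
final class BatchedPruner {
  // Deletes entries whose timestamp is older than cutoff, batchSize IDs at a
  // time, sleeping pauseMillis between batches. Returns the number deleted.
  static int pruneInBatches(
      Map<String, Long> store, long cutoff, int batchSize, long pauseMillis) {
    // Snapshot the IDs only; the per-batch working set stays small.
    List<String> ids = new ArrayList<>(store.keySet());
    int deleted = 0;
    for (int i = 0; i < ids.size(); i += batchSize) {
      List<String> batchIds = ids.subList(i, Math.min(i + batchSize, ids.size()));

      // Filter: collect the expired IDs in this batch.
      List<String> expired = new ArrayList<>();
      for (String id : batchIds) {
        Long timestamp = store.get(id);
        if (timestamp != null && timestamp < cutoff) {
          expired.add(id);
        }
      }

      // Delete this batch's expired entries.
      for (String id : expired) {
        store.remove(id);
      }
      deleted += expired.size();

      // Pause before the next batch, if there is one.
      boolean moreBatches = i + batchSize < ids.size();
      if (pauseMillis > 0 && moreBatches) {
        try {
          Thread.sleep(pauseMillis);
        } catch (InterruptedException e) {
          // Restore the interrupt flag and stop pruning early.
          Thread.currentThread().interrupt();
          return deleted;
        }
      }
    }
    return deleted;
  }
}
```

The trade-off is the one noted above: more passes and some idle time in exchange for a smaller set of live objects at any one moment.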
- Mehrdad Nurolahzade
Re: Review Request 56797: Move task conversion during reconciliation
into the delayed closure.
Posted by Zameer Manji <zm...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/56797/#review166007
-----------------------------------------------------------
Ship it!
- Zameer Manji
Re: Review Request 56797: Move task conversion during reconciliation
into the delayed closure.
Posted by Santhosh Kumar Shanmugham <sa...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/56797/#review166013
-----------------------------------------------------------
Ship it!
- Santhosh Kumar Shanmugham
Re: Review Request 56797: Move task conversion during reconciliation
into the delayed closure.
Posted by Aurora ReviewBot <wf...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/56797/#review166008
-----------------------------------------------------------
Master (4ab4b2b) is red with this patch.
./build-support/jenkins/build.sh
Test coverage missing for org/apache/aurora/scheduler/events/Webhook
Test coverage missing for org/apache/aurora/scheduler/events/WebhookInfo
Test coverage missing for org/apache/aurora/scheduler/storage/log/SnapshotStoreImpl
Test coverage missing for org/apache/aurora/scheduler/storage/log/EntrySerializer$EntrySerializerImpl$1
Test coverage missing for org/apache/aurora/scheduler/storage/log/SnapshotStoreImpl$8
Test coverage missing for org/apache/aurora/scheduler/storage/log/SnapshotStoreImpl$7
Test coverage missing for org/apache/aurora/scheduler/storage/log/SnapshotStoreImpl$4
Test coverage missing for org/apache/aurora/scheduler/storage/log/SnapshotStoreImpl$3
Test coverage missing for org/apache/aurora/scheduler/storage/log/SnapshotStoreImpl$6
Test coverage missing for org/apache/aurora/scheduler/storage/log/SnapshotStoreImpl$5
Test coverage missing for org/apache/aurora/scheduler/storage/log/SnapshotStoreImpl$2
Test coverage missing for org/apache/aurora/scheduler/storage/log/SnapshotStoreImpl$1
Test coverage missing for org/apache/aurora/scheduler/storage/log/LogStorage$Settings
Test coverage missing for org/apache/aurora/scheduler/storage/log/LogStorage$ScheduledExecutorSchedulingService
Test coverage missing for org/apache/aurora/scheduler/storage/log/LogStorageModule
Test coverage missing for org/apache/aurora/scheduler/storage/backup/TemporaryStorage$TemporaryStorageFactory$1
Test coverage missing for org/apache/aurora/scheduler/storage/backup/BackupModule
Test coverage missing for org/apache/aurora/scheduler/storage/backup/Recovery$RecoveryImpl
Test coverage missing for org/apache/aurora/scheduler/storage/backup/TemporaryStorage$TemporaryStorageFactory
Test coverage missing for org/apache/aurora/scheduler/storage/backup/Recovery$RecoveryImpl$PendingRecovery
Test coverage missing for org/apache/aurora/scheduler/TaskVars
Test coverage missing for org/apache/aurora/scheduler/SchedulerLifecycle$DefaultDelayedActions
Test coverage missing for org/apache/aurora/scheduler/TierManager$TierManagerImpl$TierConfig
Test coverage missing for org/apache/aurora/scheduler/TaskVars$Counter
Test coverage missing for org/apache/aurora/scheduler/TaskVars$1
Test coverage missing for org/apache/aurora/scheduler/SchedulerModule$TaskEventBatchWorker
Test coverage missing for org/apache/aurora/scheduler/HostOffer$1
Test coverage missing for org/apache/aurora/scheduler/SchedulerModule
Test coverage missing for org/apache/aurora/scheduler/TaskIdGenerator$TaskIdGeneratorImpl
Test coverage missing for org/apache/aurora/scheduler/SchedulerModule$1
Test coverage missing for org/apache/aurora/scheduler/TaskStatusHandlerImpl
Test coverage missing for org/apache/aurora/scheduler/TaskStatusHandlerImpl$1
* Try:
Run with --stacktrace option to get the stack trace. Run with --info or --debug option to get more log output.
==============================================================================
BUILD FAILED
Total time: 5 mins 40.989 secs
I will refresh this build result if you post a review containing "@ReviewBot retry"
- Aurora ReviewBot