You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@aurora.apache.org by Zameer Manji <zm...@apache.org> on 2016/12/01 23:12:00 UTC
Review Request 54269: Improve scheduling throughput via logging
changes.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54269/
-----------------------------------------------------------
Review request for Aurora, David McLaughlin, Joshua Cohen, and Stephan Erb.
Bugs: AURORA-1831
https://issues.apache.org/jira/browse/AURORA-1831
Repository: aurora
Description
-------
This patch makes two logging performance changes.
First, it reduces the cost of logging by replacing the costly class and line
patterns with the cheaper logger pattern. We lose line numbers and inner class
information for much cheaper logging.
Before
````
I1201 15:08:40.560 [AsyncProcessor-0, StateMachine$Builder:389] SchedulerLifecycle state machine transition LEADER_AWAITING_REGISTRATION -> ACTIVE
````
After
````
I1201 15:06:47.181 [AsyncProcessor-0, StateMachine] SchedulerLifecycle state machine transition LEADER_AWAITING_REGISTRATION -> ACTIVE
````
Second, it reduces the verbosity of the `TaskStateMachine` logging where it logs
the work command from the transition. I don't think there is any operator value
in logging this (unlike the task state transitions) so I have lowered it to
debug.
Performance Before:
````
Benchmark (numPendingTasks) (numTasksToDelete) Mode Cnt Score Error Units
StateManagerBenchmarks.DeleteTasksBenchmark.run N/A 1000 thrpt 10 2.510 � 0.557 ops/s
StateManagerBenchmarks.DeleteTasksBenchmark.run N/A 10000 thrpt 10 0.272 � 0.030 ops/s
StateManagerBenchmarks.DeleteTasksBenchmark.run N/A 50000 thrpt 10 0.053 � 0.011 ops/s
StateManagerBenchmarks.InsertPendingTasksBenchmark.run 1000 N/A thrpt 10 2.446 � 0.698 ops/s
StateManagerBenchmarks.InsertPendingTasksBenchmark.run 10000 N/A thrpt 10 0.246 � 0.018 ops/s
StateManagerBenchmarks.InsertPendingTasksBenchmark.run 50000 N/A thrpt 10 0.041 � 0.006 ops/s
````
Performance After:
````
Benchmark (numPendingTasks) (numTasksToDelete) Mode Cnt Score Error Units
StateManagerBenchmarks.DeleteTasksBenchmark.run N/A 1000 thrpt 10 14.520 � 5.696 ops/s
StateManagerBenchmarks.DeleteTasksBenchmark.run N/A 10000 thrpt 10 1.290 � 0.361 ops/s
StateManagerBenchmarks.DeleteTasksBenchmark.run N/A 50000 thrpt 10 0.254 � 0.097 ops/s
StateManagerBenchmarks.InsertPendingTasksBenchmark.run 1000 N/A thrpt 5 7.303 � 5.662 ops/s
StateManagerBenchmarks.InsertPendingTasksBenchmark.run 10000 N/A thrpt 5 0.726 � 0.624 ops/s
StateManagerBenchmarks.InsertPendingTasksBenchmark.run 50000 N/A thrpt 5 0.124 � 0.058 ops/s
````
There is a performance improvement in the smaller case and no noticable
degredation in the larger cases. I also verified on a small cluster that the
improvements exist for the larger cases. I am unable to reduce the error bars
locally on the `InsertPendingTasksBenchmark`. I suspect the bencmark needs to be
tweaked to be more consistent.
Diffs
-----
src/main/java/org/apache/aurora/scheduler/state/TaskStateMachine.java 23f256a7d467c79dcd5d906f76af4f0261dfd81d
src/main/resources/logback.xml 84c175cec811fd95db44fd8dfcd514385606042d
Diff: https://reviews.apache.org/r/54269/diff/
Testing
-------
Thanks,
Zameer Manji
Re: Review Request 54269: Improve scheduling throughput via logging
changes.
Posted by Zameer Manji <zm...@apache.org>.
> On Dec. 1, 2016, 3:32 p.m., Mehrdad Nurolahzade wrote:
> > src/main/java/org/apache/aurora/scheduler/state/TaskStateMachine.java, line 474
> > <https://reviews.apache.org/r/54269/diff/1/?file=1574443#file1574443line474>
> >
> > Did you experiment with switching state transition logs off? https://github.com/apache/aurora/blob/master/commons/src/main/java/org/apache/aurora/common/util/StateMachine.java#L389
Yes, and there is another throughput improvement.
However, I find these statements invaluable for debugging issues. I frequently grep a single task id in Aurora's logs as well as the Mesos master and agents to see the timeline for a single task.
For now, I don't think we should reduce the log level.
- Zameer
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54269/#review157673
-----------------------------------------------------------
On Dec. 1, 2016, 3:13 p.m., Zameer Manji wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/54269/
> -----------------------------------------------------------
>
> (Updated Dec. 1, 2016, 3:13 p.m.)
>
>
> Review request for Aurora, David McLaughlin, Joshua Cohen, and Stephan Erb.
>
>
> Bugs: AURORA-1831
> https://issues.apache.org/jira/browse/AURORA-1831
>
>
> Repository: aurora
>
>
> Description
> -------
>
> This patch makes two logging performance changes.
>
> First, it reduces the cost of logging by replacing the costly class and line
> patterns with the cheaper logger pattern. We lose line numbers and inner class
> information for much cheaper logging.
>
> Before
> ````
> I1201 15:08:40.560 [AsyncProcessor-0, StateMachine$Builder:389] SchedulerLifecycle state machine transition LEADER_AWAITING_REGISTRATION -> ACTIVE
> ````
>
> After
> ````
> I1201 15:06:47.181 [AsyncProcessor-0, StateMachine] SchedulerLifecycle state machine transition LEADER_AWAITING_REGISTRATION -> ACTIVE
> ````
>
> Second, it reduces the verbosity of the `TaskStateMachine` logging where it logs
> the work command from the transition. I don't think there is any operator value
> in logging this (unlike the task state transitions) so I have lowered it to
> debug.
>
> Performance Before:
>
> ````
> Benchmark (numPendingTasks) (numTasksToDelete) Mode Cnt Score Error Units
> StateManagerBenchmarks.DeleteTasksBenchmark.run N/A 1000 thrpt 10 2.510 � 0.557 ops/s
> StateManagerBenchmarks.DeleteTasksBenchmark.run N/A 10000 thrpt 10 0.272 � 0.030 ops/s
> StateManagerBenchmarks.DeleteTasksBenchmark.run N/A 50000 thrpt 10 0.053 � 0.011 ops/s
> StateManagerBenchmarks.InsertPendingTasksBenchmark.run 1000 N/A thrpt 10 2.446 � 0.698 ops/s
> StateManagerBenchmarks.InsertPendingTasksBenchmark.run 10000 N/A thrpt 10 0.246 � 0.018 ops/s
> StateManagerBenchmarks.InsertPendingTasksBenchmark.run 50000 N/A thrpt 10 0.041 � 0.006 ops/s
> ````
>
> Performance After:
> ````
> Benchmark (numPendingTasks) (numTasksToDelete) Mode Cnt Score Error Units
> StateManagerBenchmarks.DeleteTasksBenchmark.run N/A 1000 thrpt 10 14.520 � 5.696 ops/s
> StateManagerBenchmarks.DeleteTasksBenchmark.run N/A 10000 thrpt 10 1.290 � 0.361 ops/s
> StateManagerBenchmarks.DeleteTasksBenchmark.run N/A 50000 thrpt 10 0.254 � 0.097 ops/s
> StateManagerBenchmarks.InsertPendingTasksBenchmark.run 1000 N/A thrpt 5 7.303 � 5.662 ops/s
> StateManagerBenchmarks.InsertPendingTasksBenchmark.run 10000 N/A thrpt 5 0.726 � 0.624 ops/s
> StateManagerBenchmarks.InsertPendingTasksBenchmark.run 50000 N/A thrpt 5 0.124 � 0.058 ops/s
> ````
>
> There is a performance improvement in the smaller case and no noticable
> degredation in the larger cases. I also verified on a small cluster that the
> improvements exist for the larger cases. I am unable to reduce the error bars
> locally on the `InsertPendingTasksBenchmark`. I suspect the bencmark needs to be
> tweaked to be more consistent.
>
>
> Diffs
> -----
>
> src/main/java/org/apache/aurora/scheduler/state/TaskStateMachine.java 23f256a7d467c79dcd5d906f76af4f0261dfd81d
> src/main/resources/logback.xml 84c175cec811fd95db44fd8dfcd514385606042d
>
> Diff: https://reviews.apache.org/r/54269/diff/
>
>
> Testing
> -------
>
>
> Thanks,
>
> Zameer Manji
>
>
Re: Review Request 54269: Improve scheduling throughput via logging
changes.
Posted by Mehrdad Nurolahzade <me...@nurolahzade.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54269/#review157673
-----------------------------------------------------------
src/main/java/org/apache/aurora/scheduler/state/TaskStateMachine.java (line 474)
<https://reviews.apache.org/r/54269/#comment228304>
Did you experiment with switching state transition logs off? https://github.com/apache/aurora/blob/master/commons/src/main/java/org/apache/aurora/common/util/StateMachine.java#L389
- Mehrdad Nurolahzade
On Dec. 1, 2016, 3:13 p.m., Zameer Manji wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/54269/
> -----------------------------------------------------------
>
> (Updated Dec. 1, 2016, 3:13 p.m.)
>
>
> Review request for Aurora, David McLaughlin, Joshua Cohen, and Stephan Erb.
>
>
> Bugs: AURORA-1831
> https://issues.apache.org/jira/browse/AURORA-1831
>
>
> Repository: aurora
>
>
> Description
> -------
>
> This patch makes two logging performance changes.
>
> First, it reduces the cost of logging by replacing the costly class and line
> patterns with the cheaper logger pattern. We lose line numbers and inner class
> information for much cheaper logging.
>
> Before
> ````
> I1201 15:08:40.560 [AsyncProcessor-0, StateMachine$Builder:389] SchedulerLifecycle state machine transition LEADER_AWAITING_REGISTRATION -> ACTIVE
> ````
>
> After
> ````
> I1201 15:06:47.181 [AsyncProcessor-0, StateMachine] SchedulerLifecycle state machine transition LEADER_AWAITING_REGISTRATION -> ACTIVE
> ````
>
> Second, it reduces the verbosity of the `TaskStateMachine` logging where it logs
> the work command from the transition. I don't think there is any operator value
> in logging this (unlike the task state transitions) so I have lowered it to
> debug.
>
> Performance Before:
>
> ````
> Benchmark (numPendingTasks) (numTasksToDelete) Mode Cnt Score Error Units
> StateManagerBenchmarks.DeleteTasksBenchmark.run N/A 1000 thrpt 10 2.510 � 0.557 ops/s
> StateManagerBenchmarks.DeleteTasksBenchmark.run N/A 10000 thrpt 10 0.272 � 0.030 ops/s
> StateManagerBenchmarks.DeleteTasksBenchmark.run N/A 50000 thrpt 10 0.053 � 0.011 ops/s
> StateManagerBenchmarks.InsertPendingTasksBenchmark.run 1000 N/A thrpt 10 2.446 � 0.698 ops/s
> StateManagerBenchmarks.InsertPendingTasksBenchmark.run 10000 N/A thrpt 10 0.246 � 0.018 ops/s
> StateManagerBenchmarks.InsertPendingTasksBenchmark.run 50000 N/A thrpt 10 0.041 � 0.006 ops/s
> ````
>
> Performance After:
> ````
> Benchmark (numPendingTasks) (numTasksToDelete) Mode Cnt Score Error Units
> StateManagerBenchmarks.DeleteTasksBenchmark.run N/A 1000 thrpt 10 14.520 � 5.696 ops/s
> StateManagerBenchmarks.DeleteTasksBenchmark.run N/A 10000 thrpt 10 1.290 � 0.361 ops/s
> StateManagerBenchmarks.DeleteTasksBenchmark.run N/A 50000 thrpt 10 0.254 � 0.097 ops/s
> StateManagerBenchmarks.InsertPendingTasksBenchmark.run 1000 N/A thrpt 5 7.303 � 5.662 ops/s
> StateManagerBenchmarks.InsertPendingTasksBenchmark.run 10000 N/A thrpt 5 0.726 � 0.624 ops/s
> StateManagerBenchmarks.InsertPendingTasksBenchmark.run 50000 N/A thrpt 5 0.124 � 0.058 ops/s
> ````
>
> There is a performance improvement in the smaller case and no noticable
> degredation in the larger cases. I also verified on a small cluster that the
> improvements exist for the larger cases. I am unable to reduce the error bars
> locally on the `InsertPendingTasksBenchmark`. I suspect the bencmark needs to be
> tweaked to be more consistent.
>
>
> Diffs
> -----
>
> src/main/java/org/apache/aurora/scheduler/state/TaskStateMachine.java 23f256a7d467c79dcd5d906f76af4f0261dfd81d
> src/main/resources/logback.xml 84c175cec811fd95db44fd8dfcd514385606042d
>
> Diff: https://reviews.apache.org/r/54269/diff/
>
>
> Testing
> -------
>
>
> Thanks,
>
> Zameer Manji
>
>
Re: Review Request 54269: Improve scheduling throughput via logging
changes.
Posted by David McLaughlin <da...@dmclaughlin.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54269/#review157676
-----------------------------------------------------------
Ship it!
Ship It!
- David McLaughlin
On Dec. 1, 2016, 11:13 p.m., Zameer Manji wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/54269/
> -----------------------------------------------------------
>
> (Updated Dec. 1, 2016, 11:13 p.m.)
>
>
> Review request for Aurora, David McLaughlin, Joshua Cohen, and Stephan Erb.
>
>
> Bugs: AURORA-1831
> https://issues.apache.org/jira/browse/AURORA-1831
>
>
> Repository: aurora
>
>
> Description
> -------
>
> This patch makes two logging performance changes.
>
> First, it reduces the cost of logging by replacing the costly class and line
> patterns with the cheaper logger pattern. We lose line numbers and inner class
> information for much cheaper logging.
>
> Before
> ````
> I1201 15:08:40.560 [AsyncProcessor-0, StateMachine$Builder:389] SchedulerLifecycle state machine transition LEADER_AWAITING_REGISTRATION -> ACTIVE
> ````
>
> After
> ````
> I1201 15:06:47.181 [AsyncProcessor-0, StateMachine] SchedulerLifecycle state machine transition LEADER_AWAITING_REGISTRATION -> ACTIVE
> ````
>
> Second, it reduces the verbosity of the `TaskStateMachine` logging where it logs
> the work command from the transition. I don't think there is any operator value
> in logging this (unlike the task state transitions) so I have lowered it to
> debug.
>
> Performance Before:
>
> ````
> Benchmark (numPendingTasks) (numTasksToDelete) Mode Cnt Score Error Units
> StateManagerBenchmarks.DeleteTasksBenchmark.run N/A 1000 thrpt 10 2.510 � 0.557 ops/s
> StateManagerBenchmarks.DeleteTasksBenchmark.run N/A 10000 thrpt 10 0.272 � 0.030 ops/s
> StateManagerBenchmarks.DeleteTasksBenchmark.run N/A 50000 thrpt 10 0.053 � 0.011 ops/s
> StateManagerBenchmarks.InsertPendingTasksBenchmark.run 1000 N/A thrpt 10 2.446 � 0.698 ops/s
> StateManagerBenchmarks.InsertPendingTasksBenchmark.run 10000 N/A thrpt 10 0.246 � 0.018 ops/s
> StateManagerBenchmarks.InsertPendingTasksBenchmark.run 50000 N/A thrpt 10 0.041 � 0.006 ops/s
> ````
>
> Performance After:
> ````
> Benchmark (numPendingTasks) (numTasksToDelete) Mode Cnt Score Error Units
> StateManagerBenchmarks.DeleteTasksBenchmark.run N/A 1000 thrpt 10 14.520 � 5.696 ops/s
> StateManagerBenchmarks.DeleteTasksBenchmark.run N/A 10000 thrpt 10 1.290 � 0.361 ops/s
> StateManagerBenchmarks.DeleteTasksBenchmark.run N/A 50000 thrpt 10 0.254 � 0.097 ops/s
> StateManagerBenchmarks.InsertPendingTasksBenchmark.run 1000 N/A thrpt 5 7.303 � 5.662 ops/s
> StateManagerBenchmarks.InsertPendingTasksBenchmark.run 10000 N/A thrpt 5 0.726 � 0.624 ops/s
> StateManagerBenchmarks.InsertPendingTasksBenchmark.run 50000 N/A thrpt 5 0.124 � 0.058 ops/s
> ````
>
> There is a performance improvement in the smaller case and no noticable
> degredation in the larger cases. I also verified on a small cluster that the
> improvements exist for the larger cases. I am unable to reduce the error bars
> locally on the `InsertPendingTasksBenchmark`. I suspect the bencmark needs to be
> tweaked to be more consistent.
>
>
> Diffs
> -----
>
> src/main/java/org/apache/aurora/scheduler/state/TaskStateMachine.java 23f256a7d467c79dcd5d906f76af4f0261dfd81d
> src/main/resources/logback.xml 84c175cec811fd95db44fd8dfcd514385606042d
>
> Diff: https://reviews.apache.org/r/54269/diff/
>
>
> Testing
> -------
>
>
> Thanks,
>
> Zameer Manji
>
>
Re: Review Request 54269: Improve scheduling throughput via logging
changes.
Posted by Zameer Manji <zm...@apache.org>.
> On Dec. 2, 2016, 11:29 a.m., Joshua Cohen wrote:
> > src/main/resources/logback.xml, line 26
> > <https://reviews.apache.org/r/54269/diff/1/?file=1574444#file1574444line26>
> >
> > Can you add a comment here explaining why we're not using `%class` and `%line`? They're definitely valuable when debugging, so I can imagine someone coming back in the future to change this.
Done.
- Zameer
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54269/#review157799
-----------------------------------------------------------
On Dec. 1, 2016, 3:13 p.m., Zameer Manji wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/54269/
> -----------------------------------------------------------
>
> (Updated Dec. 1, 2016, 3:13 p.m.)
>
>
> Review request for Aurora, David McLaughlin, Joshua Cohen, and Stephan Erb.
>
>
> Bugs: AURORA-1831
> https://issues.apache.org/jira/browse/AURORA-1831
>
>
> Repository: aurora
>
>
> Description
> -------
>
> This patch makes two logging performance changes.
>
> First, it reduces the cost of logging by replacing the costly class and line
> patterns with the cheaper logger pattern. We lose line numbers and inner class
> information for much cheaper logging.
>
> Before
> ````
> I1201 15:08:40.560 [AsyncProcessor-0, StateMachine$Builder:389] SchedulerLifecycle state machine transition LEADER_AWAITING_REGISTRATION -> ACTIVE
> ````
>
> After
> ````
> I1201 15:06:47.181 [AsyncProcessor-0, StateMachine] SchedulerLifecycle state machine transition LEADER_AWAITING_REGISTRATION -> ACTIVE
> ````
>
> Second, it reduces the verbosity of the `TaskStateMachine` logging where it logs
> the work command from the transition. I don't think there is any operator value
> in logging this (unlike the task state transitions) so I have lowered it to
> debug.
>
> Performance Before:
>
> ````
> Benchmark (numPendingTasks) (numTasksToDelete) Mode Cnt Score Error Units
> StateManagerBenchmarks.DeleteTasksBenchmark.run N/A 1000 thrpt 10 2.510 � 0.557 ops/s
> StateManagerBenchmarks.DeleteTasksBenchmark.run N/A 10000 thrpt 10 0.272 � 0.030 ops/s
> StateManagerBenchmarks.DeleteTasksBenchmark.run N/A 50000 thrpt 10 0.053 � 0.011 ops/s
> StateManagerBenchmarks.InsertPendingTasksBenchmark.run 1000 N/A thrpt 10 2.446 � 0.698 ops/s
> StateManagerBenchmarks.InsertPendingTasksBenchmark.run 10000 N/A thrpt 10 0.246 � 0.018 ops/s
> StateManagerBenchmarks.InsertPendingTasksBenchmark.run 50000 N/A thrpt 10 0.041 � 0.006 ops/s
> ````
>
> Performance After:
> ````
> Benchmark (numPendingTasks) (numTasksToDelete) Mode Cnt Score Error Units
> StateManagerBenchmarks.DeleteTasksBenchmark.run N/A 1000 thrpt 10 14.520 � 5.696 ops/s
> StateManagerBenchmarks.DeleteTasksBenchmark.run N/A 10000 thrpt 10 1.290 � 0.361 ops/s
> StateManagerBenchmarks.DeleteTasksBenchmark.run N/A 50000 thrpt 10 0.254 � 0.097 ops/s
> StateManagerBenchmarks.InsertPendingTasksBenchmark.run 1000 N/A thrpt 5 7.303 � 5.662 ops/s
> StateManagerBenchmarks.InsertPendingTasksBenchmark.run 10000 N/A thrpt 5 0.726 � 0.624 ops/s
> StateManagerBenchmarks.InsertPendingTasksBenchmark.run 50000 N/A thrpt 5 0.124 � 0.058 ops/s
> ````
>
> There is a performance improvement in the smaller case and no noticable
> degredation in the larger cases. I also verified on a small cluster that the
> improvements exist for the larger cases. I am unable to reduce the error bars
> locally on the `InsertPendingTasksBenchmark`. I suspect the bencmark needs to be
> tweaked to be more consistent.
>
>
> Diffs
> -----
>
> src/main/java/org/apache/aurora/scheduler/state/TaskStateMachine.java 23f256a7d467c79dcd5d906f76af4f0261dfd81d
> src/main/resources/logback.xml 84c175cec811fd95db44fd8dfcd514385606042d
>
> Diff: https://reviews.apache.org/r/54269/diff/
>
>
> Testing
> -------
>
>
> Thanks,
>
> Zameer Manji
>
>
Re: Review Request 54269: Improve scheduling throughput via logging
changes.
Posted by Joshua Cohen <jc...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54269/#review157799
-----------------------------------------------------------
Ship it!
src/main/resources/logback.xml (line 26)
<https://reviews.apache.org/r/54269/#comment228400>
Can you add a comment here explaining why we're not using `%class` and `%line`? They're definitely valuable when debugging, so I can imagine someone coming back in the future to change this.
- Joshua Cohen
On Dec. 1, 2016, 11:13 p.m., Zameer Manji wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/54269/
> -----------------------------------------------------------
>
> (Updated Dec. 1, 2016, 11:13 p.m.)
>
>
> Review request for Aurora, David McLaughlin, Joshua Cohen, and Stephan Erb.
>
>
> Bugs: AURORA-1831
> https://issues.apache.org/jira/browse/AURORA-1831
>
>
> Repository: aurora
>
>
> Description
> -------
>
> This patch makes two logging performance changes.
>
> First, it reduces the cost of logging by replacing the costly class and line
> patterns with the cheaper logger pattern. We lose line numbers and inner class
> information for much cheaper logging.
>
> Before
> ````
> I1201 15:08:40.560 [AsyncProcessor-0, StateMachine$Builder:389] SchedulerLifecycle state machine transition LEADER_AWAITING_REGISTRATION -> ACTIVE
> ````
>
> After
> ````
> I1201 15:06:47.181 [AsyncProcessor-0, StateMachine] SchedulerLifecycle state machine transition LEADER_AWAITING_REGISTRATION -> ACTIVE
> ````
>
> Second, it reduces the verbosity of the `TaskStateMachine` logging where it logs
> the work command from the transition. I don't think there is any operator value
> in logging this (unlike the task state transitions) so I have lowered it to
> debug.
>
> Performance Before:
>
> ````
> Benchmark (numPendingTasks) (numTasksToDelete) Mode Cnt Score Error Units
> StateManagerBenchmarks.DeleteTasksBenchmark.run N/A 1000 thrpt 10 2.510 � 0.557 ops/s
> StateManagerBenchmarks.DeleteTasksBenchmark.run N/A 10000 thrpt 10 0.272 � 0.030 ops/s
> StateManagerBenchmarks.DeleteTasksBenchmark.run N/A 50000 thrpt 10 0.053 � 0.011 ops/s
> StateManagerBenchmarks.InsertPendingTasksBenchmark.run 1000 N/A thrpt 10 2.446 � 0.698 ops/s
> StateManagerBenchmarks.InsertPendingTasksBenchmark.run 10000 N/A thrpt 10 0.246 � 0.018 ops/s
> StateManagerBenchmarks.InsertPendingTasksBenchmark.run 50000 N/A thrpt 10 0.041 � 0.006 ops/s
> ````
>
> Performance After:
> ````
> Benchmark (numPendingTasks) (numTasksToDelete) Mode Cnt Score Error Units
> StateManagerBenchmarks.DeleteTasksBenchmark.run N/A 1000 thrpt 10 14.520 � 5.696 ops/s
> StateManagerBenchmarks.DeleteTasksBenchmark.run N/A 10000 thrpt 10 1.290 � 0.361 ops/s
> StateManagerBenchmarks.DeleteTasksBenchmark.run N/A 50000 thrpt 10 0.254 � 0.097 ops/s
> StateManagerBenchmarks.InsertPendingTasksBenchmark.run 1000 N/A thrpt 5 7.303 � 5.662 ops/s
> StateManagerBenchmarks.InsertPendingTasksBenchmark.run 10000 N/A thrpt 5 0.726 � 0.624 ops/s
> StateManagerBenchmarks.InsertPendingTasksBenchmark.run 50000 N/A thrpt 5 0.124 � 0.058 ops/s
> ````
>
> There is a performance improvement in the smaller case and no noticable
> degredation in the larger cases. I also verified on a small cluster that the
> improvements exist for the larger cases. I am unable to reduce the error bars
> locally on the `InsertPendingTasksBenchmark`. I suspect the bencmark needs to be
> tweaked to be more consistent.
>
>
> Diffs
> -----
>
> src/main/java/org/apache/aurora/scheduler/state/TaskStateMachine.java 23f256a7d467c79dcd5d906f76af4f0261dfd81d
> src/main/resources/logback.xml 84c175cec811fd95db44fd8dfcd514385606042d
>
> Diff: https://reviews.apache.org/r/54269/diff/
>
>
> Testing
> -------
>
>
> Thanks,
>
> Zameer Manji
>
>
Re: Review Request 54269: Improve scheduling throughput via logging
changes.
Posted by Zameer Manji <zm...@apache.org>.
> On Dec. 1, 2016, 5:41 p.m., Stephan Erb wrote:
> > src/main/java/org/apache/aurora/scheduler/state/TaskStateMachine.java, line 474
> > <https://reviews.apache.org/r/54269/diff/1/?file=1574443#file1574443line474>
> >
> > Please use the logging builtin formatting rather than doing the string concatenate all the time.
Good catch, done!
On Dec. 1, 2016, 5:41 p.m., Zameer Manji wrote:
> > Please mention the logger config change in the release notes.
Done.
- Zameer
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54269/#review157687
-----------------------------------------------------------
On Dec. 1, 2016, 3:13 p.m., Zameer Manji wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/54269/
> -----------------------------------------------------------
>
> (Updated Dec. 1, 2016, 3:13 p.m.)
>
>
> Review request for Aurora, David McLaughlin, Joshua Cohen, and Stephan Erb.
>
>
> Bugs: AURORA-1831
> https://issues.apache.org/jira/browse/AURORA-1831
>
>
> Repository: aurora
>
>
> Description
> -------
>
> This patch makes two logging performance changes.
>
> First, it reduces the cost of logging by replacing the costly class and line
> patterns with the cheaper logger pattern. We lose line numbers and inner class
> information for much cheaper logging.
>
> Before
> ````
> I1201 15:08:40.560 [AsyncProcessor-0, StateMachine$Builder:389] SchedulerLifecycle state machine transition LEADER_AWAITING_REGISTRATION -> ACTIVE
> ````
>
> After
> ````
> I1201 15:06:47.181 [AsyncProcessor-0, StateMachine] SchedulerLifecycle state machine transition LEADER_AWAITING_REGISTRATION -> ACTIVE
> ````
>
> Second, it reduces the verbosity of the `TaskStateMachine` logging where it logs
> the work command from the transition. I don't think there is any operator value
> in logging this (unlike the task state transitions) so I have lowered it to
> debug.
>
> Performance Before:
>
> ````
> Benchmark (numPendingTasks) (numTasksToDelete) Mode Cnt Score Error Units
> StateManagerBenchmarks.DeleteTasksBenchmark.run N/A 1000 thrpt 10 2.510 � 0.557 ops/s
> StateManagerBenchmarks.DeleteTasksBenchmark.run N/A 10000 thrpt 10 0.272 � 0.030 ops/s
> StateManagerBenchmarks.DeleteTasksBenchmark.run N/A 50000 thrpt 10 0.053 � 0.011 ops/s
> StateManagerBenchmarks.InsertPendingTasksBenchmark.run 1000 N/A thrpt 10 2.446 � 0.698 ops/s
> StateManagerBenchmarks.InsertPendingTasksBenchmark.run 10000 N/A thrpt 10 0.246 � 0.018 ops/s
> StateManagerBenchmarks.InsertPendingTasksBenchmark.run 50000 N/A thrpt 10 0.041 � 0.006 ops/s
> ````
>
> Performance After:
> ````
> Benchmark (numPendingTasks) (numTasksToDelete) Mode Cnt Score Error Units
> StateManagerBenchmarks.DeleteTasksBenchmark.run N/A 1000 thrpt 10 14.520 � 5.696 ops/s
> StateManagerBenchmarks.DeleteTasksBenchmark.run N/A 10000 thrpt 10 1.290 � 0.361 ops/s
> StateManagerBenchmarks.DeleteTasksBenchmark.run N/A 50000 thrpt 10 0.254 � 0.097 ops/s
> StateManagerBenchmarks.InsertPendingTasksBenchmark.run 1000 N/A thrpt 5 7.303 � 5.662 ops/s
> StateManagerBenchmarks.InsertPendingTasksBenchmark.run 10000 N/A thrpt 5 0.726 � 0.624 ops/s
> StateManagerBenchmarks.InsertPendingTasksBenchmark.run 50000 N/A thrpt 5 0.124 � 0.058 ops/s
> ````
>
> There is a performance improvement in the smaller case and no noticable
> degredation in the larger cases. I also verified on a small cluster that the
> improvements exist for the larger cases. I am unable to reduce the error bars
> locally on the `InsertPendingTasksBenchmark`. I suspect the bencmark needs to be
> tweaked to be more consistent.
>
>
> Diffs
> -----
>
> src/main/java/org/apache/aurora/scheduler/state/TaskStateMachine.java 23f256a7d467c79dcd5d906f76af4f0261dfd81d
> src/main/resources/logback.xml 84c175cec811fd95db44fd8dfcd514385606042d
>
> Diff: https://reviews.apache.org/r/54269/diff/
>
>
> Testing
> -------
>
>
> Thanks,
>
> Zameer Manji
>
>
Re: Review Request 54269: Improve scheduling throughput via logging
changes.
Posted by Stephan Erb <se...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54269/#review157687
-----------------------------------------------------------
Fix it, then Ship it!
Lgtm. Two comments below.
src/main/java/org/apache/aurora/scheduler/state/TaskStateMachine.java (line 474)
<https://reviews.apache.org/r/54269/#comment228312>
Please use the logging builtin formatting rather than doing the string concatenate all the time.
Please mention the logger config change in the release notes.
- Stephan Erb
On Dez. 2, 2016, 12:13 vorm., Zameer Manji wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/54269/
> -----------------------------------------------------------
>
> (Updated Dez. 2, 2016, 12:13 vorm.)
>
>
> Review request for Aurora, David McLaughlin, Joshua Cohen, and Stephan Erb.
>
>
> Bugs: AURORA-1831
> https://issues.apache.org/jira/browse/AURORA-1831
>
>
> Repository: aurora
>
>
> Description
> -------
>
> This patch makes two logging performance changes.
>
> First, it reduces the cost of logging by replacing the costly class and line
> patterns with the cheaper logger pattern. We lose line numbers and inner class
> information for much cheaper logging.
>
> Before
> ````
> I1201 15:08:40.560 [AsyncProcessor-0, StateMachine$Builder:389] SchedulerLifecycle state machine transition LEADER_AWAITING_REGISTRATION -> ACTIVE
> ````
>
> After
> ````
> I1201 15:06:47.181 [AsyncProcessor-0, StateMachine] SchedulerLifecycle state machine transition LEADER_AWAITING_REGISTRATION -> ACTIVE
> ````
>
> Second, it reduces the verbosity of the `TaskStateMachine` logging where it logs
> the work command from the transition. I don't think there is any operator value
> in logging this (unlike the task state transitions) so I have lowered it to
> debug.
>
> Performance Before:
>
> ````
> Benchmark (numPendingTasks) (numTasksToDelete) Mode Cnt Score Error Units
> StateManagerBenchmarks.DeleteTasksBenchmark.run N/A 1000 thrpt 10 2.510 � 0.557 ops/s
> StateManagerBenchmarks.DeleteTasksBenchmark.run N/A 10000 thrpt 10 0.272 � 0.030 ops/s
> StateManagerBenchmarks.DeleteTasksBenchmark.run N/A 50000 thrpt 10 0.053 � 0.011 ops/s
> StateManagerBenchmarks.InsertPendingTasksBenchmark.run 1000 N/A thrpt 10 2.446 � 0.698 ops/s
> StateManagerBenchmarks.InsertPendingTasksBenchmark.run 10000 N/A thrpt 10 0.246 � 0.018 ops/s
> StateManagerBenchmarks.InsertPendingTasksBenchmark.run 50000 N/A thrpt 10 0.041 � 0.006 ops/s
> ````
>
> Performance After:
> ````
> Benchmark (numPendingTasks) (numTasksToDelete) Mode Cnt Score Error Units
> StateManagerBenchmarks.DeleteTasksBenchmark.run N/A 1000 thrpt 10 14.520 � 5.696 ops/s
> StateManagerBenchmarks.DeleteTasksBenchmark.run N/A 10000 thrpt 10 1.290 � 0.361 ops/s
> StateManagerBenchmarks.DeleteTasksBenchmark.run N/A 50000 thrpt 10 0.254 � 0.097 ops/s
> StateManagerBenchmarks.InsertPendingTasksBenchmark.run 1000 N/A thrpt 5 7.303 � 5.662 ops/s
> StateManagerBenchmarks.InsertPendingTasksBenchmark.run 10000 N/A thrpt 5 0.726 � 0.624 ops/s
> StateManagerBenchmarks.InsertPendingTasksBenchmark.run 50000 N/A thrpt 5 0.124 � 0.058 ops/s
> ````
>
> There is a performance improvement in the smaller case and no noticable
> degredation in the larger cases. I also verified on a small cluster that the
> improvements exist for the larger cases. I am unable to reduce the error bars
> locally on the `InsertPendingTasksBenchmark`. I suspect the bencmark needs to be
> tweaked to be more consistent.
>
>
> Diffs
> -----
>
> src/main/java/org/apache/aurora/scheduler/state/TaskStateMachine.java 23f256a7d467c79dcd5d906f76af4f0261dfd81d
> src/main/resources/logback.xml 84c175cec811fd95db44fd8dfcd514385606042d
>
> Diff: https://reviews.apache.org/r/54269/diff/
>
>
> Testing
> -------
>
>
> Thanks,
>
> Zameer Manji
>
>
Re: Review Request 54269: Improve scheduling throughput via logging
changes.
Posted by Aurora ReviewBot <wf...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54269/#review157838
-----------------------------------------------------------
Master (3ea0331) is green with this patch.
./build-support/jenkins/build.sh
However, it appears that it might lack test coverage.
I will refresh this build result if you post a review containing "@ReviewBot retry"
- Aurora ReviewBot
On Dec. 2, 2016, 9:51 p.m., Zameer Manji wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/54269/
> -----------------------------------------------------------
>
> (Updated Dec. 2, 2016, 9:51 p.m.)
>
>
> Review request for Aurora, David McLaughlin, Joshua Cohen, and Stephan Erb.
>
>
> Bugs: AURORA-1831
> https://issues.apache.org/jira/browse/AURORA-1831
>
>
> Repository: aurora
>
>
> Description
> -------
>
> This patch makes two logging performance changes.
>
> First, it reduces the cost of logging by replacing the costly class and line
> patterns with the cheaper logger pattern. We lose line numbers and inner class
> information for much cheaper logging.
>
> Before
> ````
> I1201 15:08:40.560 [AsyncProcessor-0, StateMachine$Builder:389] SchedulerLifecycle state machine transition LEADER_AWAITING_REGISTRATION -> ACTIVE
> ````
>
> After
> ````
> I1201 15:06:47.181 [AsyncProcessor-0, StateMachine] SchedulerLifecycle state machine transition LEADER_AWAITING_REGISTRATION -> ACTIVE
> ````
>
> Second, it reduces the verbosity of the `TaskStateMachine` logging where it logs
> the work command from the transition. I don't think there is any operator value
> in logging this (unlike the task state transitions) so I have lowered it to
> debug.
>
> Performance Before:
>
> ````
> Benchmark (numPendingTasks) (numTasksToDelete) Mode Cnt Score Error Units
> StateManagerBenchmarks.DeleteTasksBenchmark.run N/A 1000 thrpt 10 2.510 � 0.557 ops/s
> StateManagerBenchmarks.DeleteTasksBenchmark.run N/A 10000 thrpt 10 0.272 � 0.030 ops/s
> StateManagerBenchmarks.DeleteTasksBenchmark.run N/A 50000 thrpt 10 0.053 � 0.011 ops/s
> StateManagerBenchmarks.InsertPendingTasksBenchmark.run 1000 N/A thrpt 10 2.446 � 0.698 ops/s
> StateManagerBenchmarks.InsertPendingTasksBenchmark.run 10000 N/A thrpt 10 0.246 � 0.018 ops/s
> StateManagerBenchmarks.InsertPendingTasksBenchmark.run 50000 N/A thrpt 10 0.041 � 0.006 ops/s
> ````
>
> Performance After:
> ````
> Benchmark (numPendingTasks) (numTasksToDelete) Mode Cnt Score Error Units
> StateManagerBenchmarks.DeleteTasksBenchmark.run N/A 1000 thrpt 10 14.520 � 5.696 ops/s
> StateManagerBenchmarks.DeleteTasksBenchmark.run N/A 10000 thrpt 10 1.290 � 0.361 ops/s
> StateManagerBenchmarks.DeleteTasksBenchmark.run N/A 50000 thrpt 10 0.254 � 0.097 ops/s
> StateManagerBenchmarks.InsertPendingTasksBenchmark.run 1000 N/A thrpt 5 7.303 � 5.662 ops/s
> StateManagerBenchmarks.InsertPendingTasksBenchmark.run 10000 N/A thrpt 5 0.726 � 0.624 ops/s
> StateManagerBenchmarks.InsertPendingTasksBenchmark.run 50000 N/A thrpt 5 0.124 � 0.058 ops/s
> ````
>
> There is a performance improvement in the smaller case and no noticable
> degredation in the larger cases. I also verified on a small cluster that the
> improvements exist for the larger cases. I am unable to reduce the error bars
> locally on the `InsertPendingTasksBenchmark`. I suspect the bencmark needs to be
> tweaked to be more consistent.
>
>
> Diffs
> -----
>
> RELEASE-NOTES.md 7a3d33106c08a1cf690b3fce922000433d61bc62
> src/main/java/org/apache/aurora/scheduler/state/TaskStateMachine.java 23f256a7d467c79dcd5d906f76af4f0261dfd81d
> src/main/resources/logback.xml 84c175cec811fd95db44fd8dfcd514385606042d
>
> Diff: https://reviews.apache.org/r/54269/diff/
>
>
> Testing
> -------
>
>
> Thanks,
>
> Zameer Manji
>
>
Re: Review Request 54269: Improve scheduling throughput via logging
changes.
Posted by Zameer Manji <zm...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54269/
-----------------------------------------------------------
(Updated Dec. 2, 2016, 1:51 p.m.)
Review request for Aurora, David McLaughlin, Joshua Cohen, and Stephan Erb.
Bugs: AURORA-1831
https://issues.apache.org/jira/browse/AURORA-1831
Repository: aurora
Description
-------
This patch makes two logging performance changes.
First, it reduces the cost of logging by replacing the costly class and line
patterns with the cheaper logger pattern. We lose line numbers and inner class
information for much cheaper logging.
Before
````
I1201 15:08:40.560 [AsyncProcessor-0, StateMachine$Builder:389] SchedulerLifecycle state machine transition LEADER_AWAITING_REGISTRATION -> ACTIVE
````
After
````
I1201 15:06:47.181 [AsyncProcessor-0, StateMachine] SchedulerLifecycle state machine transition LEADER_AWAITING_REGISTRATION -> ACTIVE
````
Second, it reduces the verbosity of the `TaskStateMachine` logging where it logs
the work command from the transition. I don't think there is any operator value
in logging this (unlike the task state transitions) so I have lowered it to
debug.
Performance Before:
````
Benchmark (numPendingTasks) (numTasksToDelete) Mode Cnt Score Error Units
StateManagerBenchmarks.DeleteTasksBenchmark.run N/A 1000 thrpt 10 2.510 � 0.557 ops/s
StateManagerBenchmarks.DeleteTasksBenchmark.run N/A 10000 thrpt 10 0.272 � 0.030 ops/s
StateManagerBenchmarks.DeleteTasksBenchmark.run N/A 50000 thrpt 10 0.053 � 0.011 ops/s
StateManagerBenchmarks.InsertPendingTasksBenchmark.run 1000 N/A thrpt 10 2.446 � 0.698 ops/s
StateManagerBenchmarks.InsertPendingTasksBenchmark.run 10000 N/A thrpt 10 0.246 � 0.018 ops/s
StateManagerBenchmarks.InsertPendingTasksBenchmark.run 50000 N/A thrpt 10 0.041 � 0.006 ops/s
````
Performance After:
````
Benchmark (numPendingTasks) (numTasksToDelete) Mode Cnt Score Error Units
StateManagerBenchmarks.DeleteTasksBenchmark.run N/A 1000 thrpt 10 14.520 � 5.696 ops/s
StateManagerBenchmarks.DeleteTasksBenchmark.run N/A 10000 thrpt 10 1.290 � 0.361 ops/s
StateManagerBenchmarks.DeleteTasksBenchmark.run N/A 50000 thrpt 10 0.254 � 0.097 ops/s
StateManagerBenchmarks.InsertPendingTasksBenchmark.run 1000 N/A thrpt 5 7.303 � 5.662 ops/s
StateManagerBenchmarks.InsertPendingTasksBenchmark.run 10000 N/A thrpt 5 0.726 � 0.624 ops/s
StateManagerBenchmarks.InsertPendingTasksBenchmark.run 50000 N/A thrpt 5 0.124 � 0.058 ops/s
````
There is a performance improvement in the smaller case and no noticable
degredation in the larger cases. I also verified on a small cluster that the
improvements exist for the larger cases. I am unable to reduce the error bars
locally on the `InsertPendingTasksBenchmark`. I suspect the bencmark needs to be
tweaked to be more consistent.
Diffs (updated)
-----
RELEASE-NOTES.md 7a3d33106c08a1cf690b3fce922000433d61bc62
src/main/java/org/apache/aurora/scheduler/state/TaskStateMachine.java 23f256a7d467c79dcd5d906f76af4f0261dfd81d
src/main/resources/logback.xml 84c175cec811fd95db44fd8dfcd514385606042d
Diff: https://reviews.apache.org/r/54269/diff/
Testing
-------
Thanks,
Zameer Manji
Re: Review Request 54269: Improve scheduling throughput via logging
changes.
Posted by Aurora ReviewBot <wf...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54269/#review157674
-----------------------------------------------------------
Master (16e4651) is green with this patch.
./build-support/jenkins/build.sh
However, it appears that it might lack test coverage.
I will refresh this build result if you post a review containing "@ReviewBot retry"
- Aurora ReviewBot
On Dec. 1, 2016, 11:13 p.m., Zameer Manji wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/54269/
> -----------------------------------------------------------
>
> (Updated Dec. 1, 2016, 11:13 p.m.)
>
>
> Review request for Aurora, David McLaughlin, Joshua Cohen, and Stephan Erb.
>
>
> Bugs: AURORA-1831
> https://issues.apache.org/jira/browse/AURORA-1831
>
>
> Repository: aurora
>
>
> Description
> -------
>
> This patch makes two logging performance changes.
>
> First, it reduces the cost of logging by replacing the costly class and line
> patterns with the cheaper logger pattern. We lose line numbers and inner class
> information for much cheaper logging.
>
> Before
> ````
> I1201 15:08:40.560 [AsyncProcessor-0, StateMachine$Builder:389] SchedulerLifecycle state machine transition LEADER_AWAITING_REGISTRATION -> ACTIVE
> ````
>
> After
> ````
> I1201 15:06:47.181 [AsyncProcessor-0, StateMachine] SchedulerLifecycle state machine transition LEADER_AWAITING_REGISTRATION -> ACTIVE
> ````
>
> Second, it reduces the verbosity of the `TaskStateMachine` logging where it logs
> the work command from the transition. I don't think there is any operator value
> in logging this (unlike the task state transitions) so I have lowered it to
> debug.
>
> Performance Before:
>
> ````
> Benchmark (numPendingTasks) (numTasksToDelete) Mode Cnt Score Error Units
> StateManagerBenchmarks.DeleteTasksBenchmark.run N/A 1000 thrpt 10 2.510 � 0.557 ops/s
> StateManagerBenchmarks.DeleteTasksBenchmark.run N/A 10000 thrpt 10 0.272 � 0.030 ops/s
> StateManagerBenchmarks.DeleteTasksBenchmark.run N/A 50000 thrpt 10 0.053 � 0.011 ops/s
> StateManagerBenchmarks.InsertPendingTasksBenchmark.run 1000 N/A thrpt 10 2.446 � 0.698 ops/s
> StateManagerBenchmarks.InsertPendingTasksBenchmark.run 10000 N/A thrpt 10 0.246 � 0.018 ops/s
> StateManagerBenchmarks.InsertPendingTasksBenchmark.run 50000 N/A thrpt 10 0.041 � 0.006 ops/s
> ````
>
> Performance After:
> ````
> Benchmark (numPendingTasks) (numTasksToDelete) Mode Cnt Score Error Units
> StateManagerBenchmarks.DeleteTasksBenchmark.run N/A 1000 thrpt 10 14.520 � 5.696 ops/s
> StateManagerBenchmarks.DeleteTasksBenchmark.run N/A 10000 thrpt 10 1.290 � 0.361 ops/s
> StateManagerBenchmarks.DeleteTasksBenchmark.run N/A 50000 thrpt 10 0.254 � 0.097 ops/s
> StateManagerBenchmarks.InsertPendingTasksBenchmark.run 1000 N/A thrpt 5 7.303 � 5.662 ops/s
> StateManagerBenchmarks.InsertPendingTasksBenchmark.run 10000 N/A thrpt 5 0.726 � 0.624 ops/s
> StateManagerBenchmarks.InsertPendingTasksBenchmark.run 50000 N/A thrpt 5 0.124 � 0.058 ops/s
> ````
>
> There is a performance improvement in the smaller case and no noticable
> degredation in the larger cases. I also verified on a small cluster that the
> improvements exist for the larger cases. I am unable to reduce the error bars
> locally on the `InsertPendingTasksBenchmark`. I suspect the bencmark needs to be
> tweaked to be more consistent.
>
>
> Diffs
> -----
>
> src/main/java/org/apache/aurora/scheduler/state/TaskStateMachine.java 23f256a7d467c79dcd5d906f76af4f0261dfd81d
> src/main/resources/logback.xml 84c175cec811fd95db44fd8dfcd514385606042d
>
> Diff: https://reviews.apache.org/r/54269/diff/
>
>
> Testing
> -------
>
>
> Thanks,
>
> Zameer Manji
>
>
Re: Review Request 54269: Improve scheduling throughput via logging
changes.
Posted by Zameer Manji <zm...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54269/
-----------------------------------------------------------
(Updated Dec. 1, 2016, 3:13 p.m.)
Review request for Aurora, David McLaughlin, Joshua Cohen, and Stephan Erb.
Bugs: AURORA-1831
https://issues.apache.org/jira/browse/AURORA-1831
Repository: aurora
Description (updated)
-------
This patch makes two logging performance changes.
First, it reduces the cost of logging by replacing the costly class and line
patterns with the cheaper logger pattern. We lose line numbers and inner class
information for much cheaper logging.
Before
````
I1201 15:08:40.560 [AsyncProcessor-0, StateMachine$Builder:389] SchedulerLifecycle state machine transition LEADER_AWAITING_REGISTRATION -> ACTIVE
````
After
````
I1201 15:06:47.181 [AsyncProcessor-0, StateMachine] SchedulerLifecycle state machine transition LEADER_AWAITING_REGISTRATION -> ACTIVE
````
Second, it reduces the verbosity of the `TaskStateMachine` logging where it logs
the work command from the transition. I don't think there is any operator value
in logging this (unlike the task state transitions) so I have lowered it to
debug.
Performance Before:
````
Benchmark (numPendingTasks) (numTasksToDelete) Mode Cnt Score Error Units
StateManagerBenchmarks.DeleteTasksBenchmark.run N/A 1000 thrpt 10 2.510 � 0.557 ops/s
StateManagerBenchmarks.DeleteTasksBenchmark.run N/A 10000 thrpt 10 0.272 � 0.030 ops/s
StateManagerBenchmarks.DeleteTasksBenchmark.run N/A 50000 thrpt 10 0.053 � 0.011 ops/s
StateManagerBenchmarks.InsertPendingTasksBenchmark.run 1000 N/A thrpt 10 2.446 � 0.698 ops/s
StateManagerBenchmarks.InsertPendingTasksBenchmark.run 10000 N/A thrpt 10 0.246 � 0.018 ops/s
StateManagerBenchmarks.InsertPendingTasksBenchmark.run 50000 N/A thrpt 10 0.041 � 0.006 ops/s
````
Performance After:
````
Benchmark (numPendingTasks) (numTasksToDelete) Mode Cnt Score Error Units
StateManagerBenchmarks.DeleteTasksBenchmark.run N/A 1000 thrpt 10 14.520 � 5.696 ops/s
StateManagerBenchmarks.DeleteTasksBenchmark.run N/A 10000 thrpt 10 1.290 � 0.361 ops/s
StateManagerBenchmarks.DeleteTasksBenchmark.run N/A 50000 thrpt 10 0.254 � 0.097 ops/s
StateManagerBenchmarks.InsertPendingTasksBenchmark.run 1000 N/A thrpt 5 7.303 � 5.662 ops/s
StateManagerBenchmarks.InsertPendingTasksBenchmark.run 10000 N/A thrpt 5 0.726 � 0.624 ops/s
StateManagerBenchmarks.InsertPendingTasksBenchmark.run 50000 N/A thrpt 5 0.124 � 0.058 ops/s
````
There is a performance improvement in the smaller case and no noticable
degredation in the larger cases. I also verified on a small cluster that the
improvements exist for the larger cases. I am unable to reduce the error bars
locally on the `InsertPendingTasksBenchmark`. I suspect the bencmark needs to be
tweaked to be more consistent.
Diffs
-----
src/main/java/org/apache/aurora/scheduler/state/TaskStateMachine.java 23f256a7d467c79dcd5d906f76af4f0261dfd81d
src/main/resources/logback.xml 84c175cec811fd95db44fd8dfcd514385606042d
Diff: https://reviews.apache.org/r/54269/diff/
Testing
-------
Thanks,
Zameer Manji
Re: Review Request 54269: Improve scheduling throughput via logging
changes.
Posted by Zameer Manji <zm...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54269/
-----------------------------------------------------------
(Updated Dec. 1, 2016, 3:13 p.m.)
Review request for Aurora, David McLaughlin, Joshua Cohen, and Stephan Erb.
Bugs: AURORA-1831
https://issues.apache.org/jira/browse/AURORA-1831
Repository: aurora
Description (updated)
-------
This patch makes two logging performance changes.
First, it reduces the cost of logging by replacing the costly class and line
patterns with the cheaper logger pattern. We lose line numbers and inner class
information for much cheaper logging.
Before
````
I1201 15:08:40.560 [AsyncProcessor-0, StateMachine$Builder:389] SchedulerLifecycle state machine transition LEADER_AWAITING_REGISTRATION -> ACTIVE
````
After
````
I1201 15:06:47.181 [AsyncProcessor-0, StateMachine] SchedulerLifecycle state machine transition LEADER_AWAITING_REGISTRATION -> ACTIVE
````
Second, it reduces the verbosity of the `TaskStateMachine` logging where it logs
the work command from the transition. I don't think there is any operator value
in logging this (unlike the task state transitions) so I have lowered it to
debug.
Performance Before:
````
Benchmark (numPendingTasks) (numTasksToDelete) Mode Cnt Score Error Units
StateManagerBenchmarks.DeleteTasksBenchmark.run N/A 1000 thrpt 10 2.510 � 0.557 ops/s
StateManagerBenchmarks.DeleteTasksBenchmark.run N/A 10000 thrpt 10 0.272 � 0.030 ops/s
StateManagerBenchmarks.DeleteTasksBenchmark.run N/A 50000 thrpt 10 0.053 � 0.011 ops/s
StateManagerBenchmarks.InsertPendingTasksBenchmark.run 1000 N/A thrpt 10 2.446 � 0.698 ops/s
StateManagerBenchmarks.InsertPendingTasksBenchmark.run 10000 N/A thrpt 10 0.246 � 0.018 ops/s
StateManagerBenchmarks.InsertPendingTasksBenchmark.run 50000 N/A thrpt 10 0.041 � 0.006 ops/s
````
Performance After:
````
Benchmark (numPendingTasks) (numTasksToDelete) Mode Cnt Score Error Units
StateManagerBenchmarks.DeleteTasksBenchmark.run N/A 1000 thrpt 10 14.520 � 5.696 ops/s
StateManagerBenchmarks.DeleteTasksBenchmark.run N/A 10000 thrpt 10 1.290 � 0.361 ops/s
StateManagerBenchmarks.DeleteTasksBenchmark.run N/A 50000 thrpt 10 0.254 � 0.097 ops/s
StateManagerBenchmarks.InsertPendingTasksBenchmark.run 1000 N/A thrpt 5 7.303 � 5.662 ops/s
StateManagerBenchmarks.InsertPendingTasksBenchmark.run 10000 N/A thrpt 5 0.726 � 0.624 ops/s
StateManagerBenchmarks.InsertPendingTasksBenchmark.run 50000 N/A thrpt 5 0.124 � 0.058 ops/s
````
There is a performance improvement in the smaller case and no noticable
degredation in the larger cases. I also verified on a small cluster that the
improvements exist for the larger cases. I am unable to reduce the error bars
locally on the `InsertPendingTasksBenchmark`. I suspect the bencmark needs to be
tweaked to be more consistent.
Diffs
-----
src/main/java/org/apache/aurora/scheduler/state/TaskStateMachine.java 23f256a7d467c79dcd5d906f76af4f0261dfd81d
src/main/resources/logback.xml 84c175cec811fd95db44fd8dfcd514385606042d
Diff: https://reviews.apache.org/r/54269/diff/
Testing
-------
Thanks,
Zameer Manji