You are viewing a plain text version of this content. The canonical link for it is here.

Posted to reviews@aurora.apache.org by Zameer Manji <zm...@apache.org> on 2016/02/11 01:15:20 UTC

Review Request 43457: Increase throughput of DbTaskStore

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/43457/
-----------------------------------------------------------

Review request for Aurora, Maxim Khutornenko and Bill Farner.


Repository: aurora


Description
-------

Profiling master indicated that the bottleneck was MyBatis populating ResultSets and populating the resulting objects. This patch removes subselects, which reduces the number of ResultSets and removes the population of an object via a constructor which is slower than populating an object via setters.


Diffs
-----

  src/main/java/org/apache/aurora/scheduler/storage/db/views/DbAssginedPort.java PRE-CREATION 
  src/main/java/org/apache/aurora/scheduler/storage/db/views/DbAssignedTask.java 93722395ed9fcd22dcb12e34e648e6e410952d43 
  src/main/java/org/apache/aurora/scheduler/storage/db/views/DbScheduledTask.java 502a1fa6fc141df498f0f09af292ce24e269731d 
  src/main/resources/org/apache/aurora/scheduler/storage/db/TaskConfigMapper.xml b1394cf44b7ddafcbc47bb1968306d0b33293380 
  src/main/resources/org/apache/aurora/scheduler/storage/db/TaskMapper.xml ea469cce31544221c34ae05a1c65f71271985655 

Diff: https://reviews.apache.org/r/43457/diff/


Testing
-------

Master:
Benchmark                                      (numTasks)   Mode  Cnt   Score    Error  Units
TaskStoreBenchmarks.DBFetchTasksBenchmark.run       10000  thrpt    5  44.052 ± 14.689  ops/s
TaskStoreBenchmarks.DBFetchTasksBenchmark.run       50000  thrpt    5   0.179 ±  0.052  ops/s
TaskStoreBenchmarks.DBFetchTasksBenchmark.run      100000  thrpt    5   0.087 ±  0.022  ops/s

This Patch:
Benchmark                                      (numTasks)   Mode  Cnt   Score   Error  Units
TaskStoreBenchmarks.DBFetchTasksBenchmark.run       10000  thrpt    5  51.531 ± 7.236  ops/s
TaskStoreBenchmarks.DBFetchTasksBenchmark.run       50000  thrpt    5   7.370 ± 1.320  ops/s
TaskStoreBenchmarks.DBFetchTasksBenchmark.run      100000  thrpt    5   2.143 ± 1.234  ops/s


Thanks,

Zameer Manji

Re: Review Request 43457: Increase throughput of DbTaskStore

Posted by John Sirois <js...@apache.org>.


> On Feb. 10, 2016, 5:33 p.m., Maxim Khutornenko wrote:
> > src/main/resources/org/apache/aurora/scheduler/storage/db/TaskMapper.xml, line 83
> > <https://reviews.apache.org/r/43457/diff/1/?file=1240027#file1240027line83>
> >
> >     You may improve perf even further if you include `id` element here:
> >     
> >     "Very Important: id elements play a very important role in Nested Result mapping. You should always specify one or more properties that can be used to uniquely identify the results. The truth is that MyBatis will still work if you leave it out, but at a severe performance cost. Choose as few properties as possible that can uniquely identify the result. The primary key is an obvious choice (even if composite)."
> >     
> >     http://www.mybatis.org/mybatis-3/sqlmap-xml.html#Result_Maps
> 
> John Sirois wrote:
>     If Maxim's suggestion works, I'd rather have that fix on the books if just to help suggest <collection/> mappings should be scrutinized with care.  We're using MyBatis - I think - to be able to go lower-level than - say Hibernate - but clearly the same pitfalls are exist!

Erm, aha - LGPLv2... which I'm very happy with, but Apache is not :/


- John


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/43457/#review118774
-----------------------------------------------------------


On Feb. 10, 2016, 5:35 p.m., Zameer Manji wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/43457/
> -----------------------------------------------------------
> 
> (Updated Feb. 10, 2016, 5:35 p.m.)
> 
> 
> Review request for Aurora, John Sirois and Maxim Khutornenko.
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> Profiling master indicated that the bottleneck was MyBatis populating ResultSets and populating the resulting objects. This patch removes subselects, which reduces the number of ResultSets and removes the population of an object via a constructor which is slower than populating an object via setters.
> 
> 
> Diffs
> -----
> 
>   src/main/java/org/apache/aurora/scheduler/storage/db/views/DbAssginedPort.java PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/storage/db/views/DbAssignedTask.java 93722395ed9fcd22dcb12e34e648e6e410952d43 
>   src/main/java/org/apache/aurora/scheduler/storage/db/views/DbScheduledTask.java 502a1fa6fc141df498f0f09af292ce24e269731d 
>   src/main/resources/org/apache/aurora/scheduler/storage/db/TaskConfigMapper.xml b1394cf44b7ddafcbc47bb1968306d0b33293380 
>   src/main/resources/org/apache/aurora/scheduler/storage/db/TaskMapper.xml ea469cce31544221c34ae05a1c65f71271985655 
> 
> Diff: https://reviews.apache.org/r/43457/diff/
> 
> 
> Testing
> -------
> 
> Master:
> Benchmark                                      (numTasks)   Mode  Cnt   Score    Error  Units
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run       10000  thrpt    5  44.052 ± 14.689  ops/s
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run       50000  thrpt    5   0.179 ±  0.052  ops/s
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run      100000  thrpt    5   0.087 ±  0.022  ops/s
> 
> This Patch:
> Benchmark                                      (numTasks)   Mode  Cnt   Score   Error  Units
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run       10000  thrpt    5  51.531 ± 7.236  ops/s
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run       50000  thrpt    5   7.370 ± 1.320  ops/s
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run      100000  thrpt    5   2.143 ± 1.234  ops/s
> 
> 
> Thanks,
> 
> Zameer Manji
> 
>

Re: Review Request 43457: Increase throughput of DbTaskStore

Posted by Maxim Khutornenko <ma...@apache.org>.


> On Feb. 11, 2016, 12:33 a.m., Maxim Khutornenko wrote:
> > src/main/resources/org/apache/aurora/scheduler/storage/db/TaskMapper.xml, line 83
> > <https://reviews.apache.org/r/43457/diff/1/?file=1240027#file1240027line83>
> >
> >     You may improve perf even further if you include `id` element here:
> >     
> >     "Very Important: id elements play a very important role in Nested Result mapping. You should always specify one or more properties that can be used to uniquely identify the results. The truth is that MyBatis will still work if you leave it out, but at a severe performance cost. Choose as few properties as possible that can uniquely identify the result. The primary key is an obvious choice (even if composite)."
> >     
> >     http://www.mybatis.org/mybatis-3/sqlmap-xml.html#Result_Maps
> 
> John Sirois wrote:
>     If Maxim's suggestion works, I'd rather have that fix on the books if just to help suggest <collection/> mappings should be scrutinized with care.  We're using MyBatis - I think - to be able to go lower-level than - say Hibernate - but clearly the same pitfalls are exist!
> 
> John Sirois wrote:
>     Erm, aha - LGPLv2... which I'm very happy with, but Apache is not :/
> 
> Zameer Manji wrote:
>     There isn't a performance improvement here probably because of the `<id>` element in the `portMap` resultMap.

I'd still suggest having explicit ID columns for consistency sake with the rest of the codebase. Besides, having an ID element is a good perf protection against future changes.


- Maxim


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/43457/#review118774
-----------------------------------------------------------


On Feb. 11, 2016, 8:03 p.m., Zameer Manji wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/43457/
> -----------------------------------------------------------
> 
> (Updated Feb. 11, 2016, 8:03 p.m.)
> 
> 
> Review request for Aurora, John Sirois and Maxim Khutornenko.
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> Profiling master indicated that the bottleneck was MyBatis populating ResultSets and populating the resulting objects. This patch removes subselects, which reduces the number of ResultSets and removes the population of an object via a constructor which is slower than populating an object via setters.
> 
> 
> Diffs
> -----
> 
>   src/main/java/org/apache/aurora/scheduler/storage/db/views/DbAssginedPort.java PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/storage/db/views/DbAssignedTask.java 93722395ed9fcd22dcb12e34e648e6e410952d43 
>   src/main/java/org/apache/aurora/scheduler/storage/db/views/DbScheduledTask.java 502a1fa6fc141df498f0f09af292ce24e269731d 
>   src/main/resources/org/apache/aurora/scheduler/storage/db/TaskConfigMapper.xml b1394cf44b7ddafcbc47bb1968306d0b33293380 
>   src/main/resources/org/apache/aurora/scheduler/storage/db/TaskMapper.xml ea469cce31544221c34ae05a1c65f71271985655 
> 
> Diff: https://reviews.apache.org/r/43457/diff/
> 
> 
> Testing
> -------
> 
> Master:
> Benchmark                                      (numTasks)   Mode  Cnt   Score    Error  Units
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run       10000  thrpt    5  44.052 ± 14.689  ops/s
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run       50000  thrpt    5   0.179 ±  0.052  ops/s
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run      100000  thrpt    5   0.087 ±  0.022  ops/s
> 
> This Patch:
> Benchmark                                      (numTasks)   Mode  Cnt   Score   Error  Units
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run       10000  thrpt    5  51.531 ± 7.236  ops/s
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run       50000  thrpt    5   7.370 ± 1.320  ops/s
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run      100000  thrpt    5   2.143 ± 1.234  ops/s
> 
> 
> Thanks,
> 
> Zameer Manji
> 
>

Re: Review Request 43457: Increase throughput of DbTaskStore

Posted by John Sirois <js...@apache.org>.


> On Feb. 10, 2016, 5:33 p.m., Maxim Khutornenko wrote:
> > src/main/resources/org/apache/aurora/scheduler/storage/db/TaskMapper.xml, line 83
> > <https://reviews.apache.org/r/43457/diff/1/?file=1240027#file1240027line83>
> >
> >     You may improve perf even further if you include `id` element here:
> >     
> >     "Very Important: id elements play a very important role in Nested Result mapping. You should always specify one or more properties that can be used to uniquely identify the results. The truth is that MyBatis will still work if you leave it out, but at a severe performance cost. Choose as few properties as possible that can uniquely identify the result. The primary key is an obvious choice (even if composite)."
> >     
> >     http://www.mybatis.org/mybatis-3/sqlmap-xml.html#Result_Maps

If Maxim's suggestion works, I'd rather have that fix on the books if just to help suggest <collection/> mappings should be scrutinized with care.  We're using MyBatis - I think - to be able to go lower-level than - say Hibernate - but clearly the same pitfalls are exist!


- John


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/43457/#review118774
-----------------------------------------------------------


On Feb. 10, 2016, 5:35 p.m., Zameer Manji wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/43457/
> -----------------------------------------------------------
> 
> (Updated Feb. 10, 2016, 5:35 p.m.)
> 
> 
> Review request for Aurora, John Sirois and Maxim Khutornenko.
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> Profiling master indicated that the bottleneck was MyBatis populating ResultSets and populating the resulting objects. This patch removes subselects, which reduces the number of ResultSets and removes the population of an object via a constructor which is slower than populating an object via setters.
> 
> 
> Diffs
> -----
> 
>   src/main/java/org/apache/aurora/scheduler/storage/db/views/DbAssginedPort.java PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/storage/db/views/DbAssignedTask.java 93722395ed9fcd22dcb12e34e648e6e410952d43 
>   src/main/java/org/apache/aurora/scheduler/storage/db/views/DbScheduledTask.java 502a1fa6fc141df498f0f09af292ce24e269731d 
>   src/main/resources/org/apache/aurora/scheduler/storage/db/TaskConfigMapper.xml b1394cf44b7ddafcbc47bb1968306d0b33293380 
>   src/main/resources/org/apache/aurora/scheduler/storage/db/TaskMapper.xml ea469cce31544221c34ae05a1c65f71271985655 
> 
> Diff: https://reviews.apache.org/r/43457/diff/
> 
> 
> Testing
> -------
> 
> Master:
> Benchmark                                      (numTasks)   Mode  Cnt   Score    Error  Units
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run       10000  thrpt    5  44.052 ± 14.689  ops/s
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run       50000  thrpt    5   0.179 ±  0.052  ops/s
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run      100000  thrpt    5   0.087 ±  0.022  ops/s
> 
> This Patch:
> Benchmark                                      (numTasks)   Mode  Cnt   Score   Error  Units
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run       10000  thrpt    5  51.531 ± 7.236  ops/s
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run       50000  thrpt    5   7.370 ± 1.320  ops/s
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run      100000  thrpt    5   2.143 ± 1.234  ops/s
> 
> 
> Thanks,
> 
> Zameer Manji
> 
>

Re: Review Request 43457: Increase throughput of DbTaskStore

Posted by Zameer Manji <zm...@apache.org>.


> On Feb. 10, 2016, 4:33 p.m., Maxim Khutornenko wrote:
> > src/main/resources/org/apache/aurora/scheduler/storage/db/TaskMapper.xml, line 83
> > <https://reviews.apache.org/r/43457/diff/1/?file=1240027#file1240027line83>
> >
> >     You may improve perf even further if you include `id` element here:
> >     
> >     "Very Important: id elements play a very important role in Nested Result mapping. You should always specify one or more properties that can be used to uniquely identify the results. The truth is that MyBatis will still work if you leave it out, but at a severe performance cost. Choose as few properties as possible that can uniquely identify the result. The primary key is an obvious choice (even if composite)."
> >     
> >     http://www.mybatis.org/mybatis-3/sqlmap-xml.html#Result_Maps
> 
> John Sirois wrote:
>     If Maxim's suggestion works, I'd rather have that fix on the books if just to help suggest <collection/> mappings should be scrutinized with care.  We're using MyBatis - I think - to be able to go lower-level than - say Hibernate - but clearly the same pitfalls are exist!
> 
> John Sirois wrote:
>     Erm, aha - LGPLv2... which I'm very happy with, but Apache is not :/

There isn't a performance improvement here probably because of the `<id>` element in the `portMap` resultMap.


> On Feb. 10, 2016, 4:33 p.m., Maxim Khutornenko wrote:
> > src/main/java/org/apache/aurora/scheduler/storage/db/views/DbScheduledTask.java, line 42
> > <https://reviews.apache.org/r/43457/diff/1/?file=1240025#file1240025line42>
> >
> >     Will `ORDER BY te_timestamp` accomplish the same without explicit sorting in the mapper? If order invariant is perserved AND there is no perf impact query change may be preferrable (less code).

The `ORDER BY` does not accomplish the same thing for some reason. I will instead reaplce the sortedCopy with an inplace copy.


- Zameer


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/43457/#review118774
-----------------------------------------------------------


On Feb. 10, 2016, 4:35 p.m., Zameer Manji wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/43457/
> -----------------------------------------------------------
> 
> (Updated Feb. 10, 2016, 4:35 p.m.)
> 
> 
> Review request for Aurora, John Sirois and Maxim Khutornenko.
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> Profiling master indicated that the bottleneck was MyBatis populating ResultSets and populating the resulting objects. This patch removes subselects, which reduces the number of ResultSets and removes the population of an object via a constructor which is slower than populating an object via setters.
> 
> 
> Diffs
> -----
> 
>   src/main/java/org/apache/aurora/scheduler/storage/db/views/DbAssginedPort.java PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/storage/db/views/DbAssignedTask.java 93722395ed9fcd22dcb12e34e648e6e410952d43 
>   src/main/java/org/apache/aurora/scheduler/storage/db/views/DbScheduledTask.java 502a1fa6fc141df498f0f09af292ce24e269731d 
>   src/main/resources/org/apache/aurora/scheduler/storage/db/TaskConfigMapper.xml b1394cf44b7ddafcbc47bb1968306d0b33293380 
>   src/main/resources/org/apache/aurora/scheduler/storage/db/TaskMapper.xml ea469cce31544221c34ae05a1c65f71271985655 
> 
> Diff: https://reviews.apache.org/r/43457/diff/
> 
> 
> Testing
> -------
> 
> Master:
> Benchmark                                      (numTasks)   Mode  Cnt   Score    Error  Units
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run       10000  thrpt    5  44.052 ± 14.689  ops/s
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run       50000  thrpt    5   0.179 ±  0.052  ops/s
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run      100000  thrpt    5   0.087 ±  0.022  ops/s
> 
> This Patch:
> Benchmark                                      (numTasks)   Mode  Cnt   Score   Error  Units
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run       10000  thrpt    5  51.531 ± 7.236  ops/s
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run       50000  thrpt    5   7.370 ± 1.320  ops/s
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run      100000  thrpt    5   2.143 ± 1.234  ops/s
> 
> 
> Thanks,
> 
> Zameer Manji
> 
>

Re: Review Request 43457: Increase throughput of DbTaskStore

Posted by Maxim Khutornenko <ma...@apache.org>.

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/43457/#review118774
-----------------------------------------------------------




src/main/java/org/apache/aurora/scheduler/storage/db/views/DbScheduledTask.java (line 42)
<https://reviews.apache.org/r/43457/#comment180089>

    Will `ORDER BY te_timestamp` accomplish the same without explicit sorting in the mapper? If order invariant is perserved AND there is no perf impact query change may be preferrable (less code).



src/main/resources/org/apache/aurora/scheduler/storage/db/TaskMapper.xml (line 71)
<https://reviews.apache.org/r/43457/#comment180091>

    You may improve perf even further if you include `id` element here:
    
    "Very Important: id elements play a very important role in Nested Result mapping. You should always specify one or more properties that can be used to uniquely identify the results. The truth is that MyBatis will still work if you leave it out, but at a severe performance cost. Choose as few properties as possible that can uniquely identify the result. The primary key is an obvious choice (even if composite)."
    
    http://www.mybatis.org/mybatis-3/sqlmap-xml.html#Result_Maps


- Maxim Khutornenko


On Feb. 11, 2016, 12:15 a.m., Zameer Manji wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/43457/
> -----------------------------------------------------------
> 
> (Updated Feb. 11, 2016, 12:15 a.m.)
> 
> 
> Review request for Aurora, Maxim Khutornenko and Bill Farner.
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> Profiling master indicated that the bottleneck was MyBatis populating ResultSets and populating the resulting objects. This patch removes subselects, which reduces the number of ResultSets and removes the population of an object via a constructor which is slower than populating an object via setters.
> 
> 
> Diffs
> -----
> 
>   src/main/java/org/apache/aurora/scheduler/storage/db/views/DbAssginedPort.java PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/storage/db/views/DbAssignedTask.java 93722395ed9fcd22dcb12e34e648e6e410952d43 
>   src/main/java/org/apache/aurora/scheduler/storage/db/views/DbScheduledTask.java 502a1fa6fc141df498f0f09af292ce24e269731d 
>   src/main/resources/org/apache/aurora/scheduler/storage/db/TaskConfigMapper.xml b1394cf44b7ddafcbc47bb1968306d0b33293380 
>   src/main/resources/org/apache/aurora/scheduler/storage/db/TaskMapper.xml ea469cce31544221c34ae05a1c65f71271985655 
> 
> Diff: https://reviews.apache.org/r/43457/diff/
> 
> 
> Testing
> -------
> 
> Master:
> Benchmark                                      (numTasks)   Mode  Cnt   Score    Error  Units
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run       10000  thrpt    5  44.052 ± 14.689  ops/s
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run       50000  thrpt    5   0.179 ±  0.052  ops/s
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run      100000  thrpt    5   0.087 ±  0.022  ops/s
> 
> This Patch:
> Benchmark                                      (numTasks)   Mode  Cnt   Score   Error  Units
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run       10000  thrpt    5  51.531 ± 7.236  ops/s
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run       50000  thrpt    5   7.370 ± 1.320  ops/s
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run      100000  thrpt    5   2.143 ± 1.234  ops/s
> 
> 
> Thanks,
> 
> Zameer Manji
> 
>

Re: Review Request 43457: Increase throughput of DbTaskStore

Posted by Maxim Khutornenko <ma...@apache.org>.


> On Feb. 11, 2016, 12:27 a.m., Bill Farner wrote:
> > Please substitute Maxim for me on this review.

I am already in the list :)


- Maxim


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/43457/#review118775
-----------------------------------------------------------


On Feb. 11, 2016, 12:15 a.m., Zameer Manji wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/43457/
> -----------------------------------------------------------
> 
> (Updated Feb. 11, 2016, 12:15 a.m.)
> 
> 
> Review request for Aurora, Maxim Khutornenko and Bill Farner.
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> Profiling master indicated that the bottleneck was MyBatis populating ResultSets and populating the resulting objects. This patch removes subselects, which reduces the number of ResultSets and removes the population of an object via a constructor which is slower than populating an object via setters.
> 
> 
> Diffs
> -----
> 
>   src/main/java/org/apache/aurora/scheduler/storage/db/views/DbAssginedPort.java PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/storage/db/views/DbAssignedTask.java 93722395ed9fcd22dcb12e34e648e6e410952d43 
>   src/main/java/org/apache/aurora/scheduler/storage/db/views/DbScheduledTask.java 502a1fa6fc141df498f0f09af292ce24e269731d 
>   src/main/resources/org/apache/aurora/scheduler/storage/db/TaskConfigMapper.xml b1394cf44b7ddafcbc47bb1968306d0b33293380 
>   src/main/resources/org/apache/aurora/scheduler/storage/db/TaskMapper.xml ea469cce31544221c34ae05a1c65f71271985655 
> 
> Diff: https://reviews.apache.org/r/43457/diff/
> 
> 
> Testing
> -------
> 
> Master:
> Benchmark                                      (numTasks)   Mode  Cnt   Score    Error  Units
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run       10000  thrpt    5  44.052 ± 14.689  ops/s
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run       50000  thrpt    5   0.179 ±  0.052  ops/s
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run      100000  thrpt    5   0.087 ±  0.022  ops/s
> 
> This Patch:
> Benchmark                                      (numTasks)   Mode  Cnt   Score   Error  Units
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run       10000  thrpt    5  51.531 ± 7.236  ops/s
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run       50000  thrpt    5   7.370 ± 1.320  ops/s
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run      100000  thrpt    5   2.143 ± 1.234  ops/s
> 
> 
> Thanks,
> 
> Zameer Manji
> 
>

Re: Review Request 43457: Increase throughput of DbTaskStore

Posted by Zameer Manji <zm...@apache.org>.


> On Feb. 10, 2016, 4:27 p.m., Bill Farner wrote:
> > Please substitute Maxim for me on this review.
> 
> Maxim Khutornenko wrote:
>     I am already in the list :)

Maixm is already on this review. I have replaced you with John.


- Zameer


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/43457/#review118775
-----------------------------------------------------------


On Feb. 10, 2016, 4:15 p.m., Zameer Manji wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/43457/
> -----------------------------------------------------------
> 
> (Updated Feb. 10, 2016, 4:15 p.m.)
> 
> 
> Review request for Aurora, Maxim Khutornenko and Bill Farner.
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> Profiling master indicated that the bottleneck was MyBatis populating ResultSets and populating the resulting objects. This patch removes subselects, which reduces the number of ResultSets and removes the population of an object via a constructor which is slower than populating an object via setters.
> 
> 
> Diffs
> -----
> 
>   src/main/java/org/apache/aurora/scheduler/storage/db/views/DbAssginedPort.java PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/storage/db/views/DbAssignedTask.java 93722395ed9fcd22dcb12e34e648e6e410952d43 
>   src/main/java/org/apache/aurora/scheduler/storage/db/views/DbScheduledTask.java 502a1fa6fc141df498f0f09af292ce24e269731d 
>   src/main/resources/org/apache/aurora/scheduler/storage/db/TaskConfigMapper.xml b1394cf44b7ddafcbc47bb1968306d0b33293380 
>   src/main/resources/org/apache/aurora/scheduler/storage/db/TaskMapper.xml ea469cce31544221c34ae05a1c65f71271985655 
> 
> Diff: https://reviews.apache.org/r/43457/diff/
> 
> 
> Testing
> -------
> 
> Master:
> Benchmark                                      (numTasks)   Mode  Cnt   Score    Error  Units
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run       10000  thrpt    5  44.052 ± 14.689  ops/s
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run       50000  thrpt    5   0.179 ±  0.052  ops/s
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run      100000  thrpt    5   0.087 ±  0.022  ops/s
> 
> This Patch:
> Benchmark                                      (numTasks)   Mode  Cnt   Score   Error  Units
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run       10000  thrpt    5  51.531 ± 7.236  ops/s
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run       50000  thrpt    5   7.370 ± 1.320  ops/s
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run      100000  thrpt    5   2.143 ± 1.234  ops/s
> 
> 
> Thanks,
> 
> Zameer Manji
> 
>

Re: Review Request 43457: Increase throughput of DbTaskStore

Posted by Bill Farner <wf...@apache.org>.

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/43457/#review118775
-----------------------------------------------------------



Please substitute Maxim for me on this review.

- Bill Farner


On Feb. 10, 2016, 4:15 p.m., Zameer Manji wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/43457/
> -----------------------------------------------------------
> 
> (Updated Feb. 10, 2016, 4:15 p.m.)
> 
> 
> Review request for Aurora, Maxim Khutornenko and Bill Farner.
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> Profiling master indicated that the bottleneck was MyBatis populating ResultSets and populating the resulting objects. This patch removes subselects, which reduces the number of ResultSets and removes the population of an object via a constructor which is slower than populating an object via setters.
> 
> 
> Diffs
> -----
> 
>   src/main/java/org/apache/aurora/scheduler/storage/db/views/DbAssginedPort.java PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/storage/db/views/DbAssignedTask.java 93722395ed9fcd22dcb12e34e648e6e410952d43 
>   src/main/java/org/apache/aurora/scheduler/storage/db/views/DbScheduledTask.java 502a1fa6fc141df498f0f09af292ce24e269731d 
>   src/main/resources/org/apache/aurora/scheduler/storage/db/TaskConfigMapper.xml b1394cf44b7ddafcbc47bb1968306d0b33293380 
>   src/main/resources/org/apache/aurora/scheduler/storage/db/TaskMapper.xml ea469cce31544221c34ae05a1c65f71271985655 
> 
> Diff: https://reviews.apache.org/r/43457/diff/
> 
> 
> Testing
> -------
> 
> Master:
> Benchmark                                      (numTasks)   Mode  Cnt   Score    Error  Units
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run       10000  thrpt    5  44.052 ± 14.689  ops/s
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run       50000  thrpt    5   0.179 ±  0.052  ops/s
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run      100000  thrpt    5   0.087 ±  0.022  ops/s
> 
> This Patch:
> Benchmark                                      (numTasks)   Mode  Cnt   Score   Error  Units
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run       10000  thrpt    5  51.531 ± 7.236  ops/s
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run       50000  thrpt    5   7.370 ± 1.320  ops/s
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run      100000  thrpt    5   2.143 ± 1.234  ops/s
> 
> 
> Thanks,
> 
> Zameer Manji
> 
>

Re: Review Request 43457: Increase throughput of DbTaskStore

Posted by John Sirois <js...@apache.org>.


> On Feb. 10, 2016, 5:23 p.m., John Sirois wrote:
> > > and removes the population of an object via a constructor which is slower than populating an object via setters.
> > 
> > For clarification, is this optimization the much smaller of the 2?  Naively - I'd hope construction would beat setters, but - if not - be _very close_.  If this latter bit was significant, my faith in the jvm is duly shaken.
> 
> Zameer Manji wrote:
>     Switching to Pair and populating via constructor:
>     ````
>     Benchmark                                      (numTasks)   Mode  Cnt   Score    Error  Units
>     TaskStoreBenchmarks.DBFetchTasksBenchmark.run       10000  thrpt    5  53.040 ± 17.152  ops/s
>     TaskStoreBenchmarks.DBFetchTasksBenchmark.run       50000  thrpt    5   3.939 ±  3.357  ops/s
>     TaskStoreBenchmarks.DBFetchTasksBenchmark.run      100000  thrpt    5   0.079 ±  0.020  ops/s
>     ````
>     
>     I suspect it is because if you populate via constructor you cannot use `<id>`.
> 
> John Sirois wrote:
>     Fwict only the 100000 case has a statistically significant win (the error bars in the 2nd run are quite massive) - is that worth it?  Does Aurora actually fetch 100K tasks for any operation in practice at Twitter?
> 
> Zameer Manji wrote:
>     Yes this change is worth it. At Twitter Aurora actually fetches at least 100K tasks for some operations. This is a side effect of some unscoped `fetchTask` queries we currently have.
> 
> John Sirois wrote:
>     I won't block, but you can still use id - see constructor/idArg: http://www.mybatis.org/mybatis-3/sqlmap-xml.html
>     It would be nice to either use Pairs or not use them and this will leave is with one use for task links.

OK - you apparently understood this already - it looks like id columns are only used internally by mybatis for property-style but they are expected to be a real constructor argument for constructor-style so constructor idArg is a no-go for Pair.


- John


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/43457/#review118772
-----------------------------------------------------------


On Feb. 11, 2016, 1:03 p.m., Zameer Manji wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/43457/
> -----------------------------------------------------------
> 
> (Updated Feb. 11, 2016, 1:03 p.m.)
> 
> 
> Review request for Aurora, John Sirois and Maxim Khutornenko.
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> Profiling master indicated that the bottleneck was MyBatis populating ResultSets and populating the resulting objects. This patch removes subselects, which reduces the number of ResultSets and removes the population of an object via a constructor which is slower than populating an object via setters.
> 
> 
> Diffs
> -----
> 
>   src/main/java/org/apache/aurora/scheduler/storage/db/views/DbAssginedPort.java PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/storage/db/views/DbAssignedTask.java 93722395ed9fcd22dcb12e34e648e6e410952d43 
>   src/main/java/org/apache/aurora/scheduler/storage/db/views/DbScheduledTask.java 502a1fa6fc141df498f0f09af292ce24e269731d 
>   src/main/resources/org/apache/aurora/scheduler/storage/db/TaskConfigMapper.xml b1394cf44b7ddafcbc47bb1968306d0b33293380 
>   src/main/resources/org/apache/aurora/scheduler/storage/db/TaskMapper.xml ea469cce31544221c34ae05a1c65f71271985655 
> 
> Diff: https://reviews.apache.org/r/43457/diff/
> 
> 
> Testing
> -------
> 
> Master:
> Benchmark                                      (numTasks)   Mode  Cnt   Score    Error  Units
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run       10000  thrpt    5  44.052 ± 14.689  ops/s
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run       50000  thrpt    5   0.179 ±  0.052  ops/s
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run      100000  thrpt    5   0.087 ±  0.022  ops/s
> 
> This Patch:
> Benchmark                                      (numTasks)   Mode  Cnt   Score   Error  Units
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run       10000  thrpt    5  51.531 ± 7.236  ops/s
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run       50000  thrpt    5   7.370 ± 1.320  ops/s
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run      100000  thrpt    5   2.143 ± 1.234  ops/s
> 
> 
> Thanks,
> 
> Zameer Manji
> 
>

Re: Review Request 43457: Increase throughput of DbTaskStore

Posted by John Sirois <js...@apache.org>.


> On Feb. 10, 2016, 5:23 p.m., John Sirois wrote:
> > > and removes the population of an object via a constructor which is slower than populating an object via setters.
> > 
> > For clarification, is this optimization the much smaller of the 2?  Naively - I'd hope construction would beat setters, but - if not - be _very close_.  If this latter bit was significant, my faith in the jvm is duly shaken.
> 
> Zameer Manji wrote:
>     Switching to Pair and populating via constructor:
>     ````
>     Benchmark                                      (numTasks)   Mode  Cnt   Score    Error  Units
>     TaskStoreBenchmarks.DBFetchTasksBenchmark.run       10000  thrpt    5  53.040 ± 17.152  ops/s
>     TaskStoreBenchmarks.DBFetchTasksBenchmark.run       50000  thrpt    5   3.939 ±  3.357  ops/s
>     TaskStoreBenchmarks.DBFetchTasksBenchmark.run      100000  thrpt    5   0.079 ±  0.020  ops/s
>     ````
>     
>     I suspect it is because if you populate via constructor you cannot use `<id>`.
> 
> John Sirois wrote:
>     Fwict only the 100000 case has a statistically significant win (the error bars in the 2nd run are quite massive) - is that worth it?  Does Aurora actually fetch 100K tasks for any operation in practice at Twitter?
> 
> Zameer Manji wrote:
>     Yes this change is worth it. At Twitter Aurora actually fetches at least 100K tasks for some operations. This is a side effect of some unscoped `fetchTask` queries we currently have.

I won't block, but you can still use id - see constructor/idArg: http://www.mybatis.org/mybatis-3/sqlmap-xml.html
It would be nice to either use Pairs or not use them and this will leave is with one use for task links.


- John


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/43457/#review118772
-----------------------------------------------------------


On Feb. 11, 2016, 1:03 p.m., Zameer Manji wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/43457/
> -----------------------------------------------------------
> 
> (Updated Feb. 11, 2016, 1:03 p.m.)
> 
> 
> Review request for Aurora, John Sirois and Maxim Khutornenko.
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> Profiling master indicated that the bottleneck was MyBatis populating ResultSets and populating the resulting objects. This patch removes subselects, which reduces the number of ResultSets and removes the population of an object via a constructor which is slower than populating an object via setters.
> 
> 
> Diffs
> -----
> 
>   src/main/java/org/apache/aurora/scheduler/storage/db/views/DbAssginedPort.java PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/storage/db/views/DbAssignedTask.java 93722395ed9fcd22dcb12e34e648e6e410952d43 
>   src/main/java/org/apache/aurora/scheduler/storage/db/views/DbScheduledTask.java 502a1fa6fc141df498f0f09af292ce24e269731d 
>   src/main/resources/org/apache/aurora/scheduler/storage/db/TaskConfigMapper.xml b1394cf44b7ddafcbc47bb1968306d0b33293380 
>   src/main/resources/org/apache/aurora/scheduler/storage/db/TaskMapper.xml ea469cce31544221c34ae05a1c65f71271985655 
> 
> Diff: https://reviews.apache.org/r/43457/diff/
> 
> 
> Testing
> -------
> 
> Master:
> Benchmark                                      (numTasks)   Mode  Cnt   Score    Error  Units
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run       10000  thrpt    5  44.052 ± 14.689  ops/s
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run       50000  thrpt    5   0.179 ±  0.052  ops/s
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run      100000  thrpt    5   0.087 ±  0.022  ops/s
> 
> This Patch:
> Benchmark                                      (numTasks)   Mode  Cnt   Score   Error  Units
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run       10000  thrpt    5  51.531 ± 7.236  ops/s
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run       50000  thrpt    5   7.370 ± 1.320  ops/s
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run      100000  thrpt    5   2.143 ± 1.234  ops/s
> 
> 
> Thanks,
> 
> Zameer Manji
> 
>

Re: Review Request 43457: Increase throughput of DbTaskStore

Posted by Zameer Manji <zm...@apache.org>.


> On Feb. 10, 2016, 4:23 p.m., John Sirois wrote:
> > > and removes the population of an object via a constructor which is slower than populating an object via setters.
> > 
> > For clarification, is this optimization the much smaller of the 2?  Naively - I'd hope construction would beat setters, but - if not - be _very close_.  If this latter bit was significant, my faith in the jvm is duly shaken.

Switching to Pair and populating via constructor:
````
Benchmark                                      (numTasks)   Mode  Cnt   Score    Error  Units
TaskStoreBenchmarks.DBFetchTasksBenchmark.run       10000  thrpt    5  53.040 ± 17.152  ops/s
TaskStoreBenchmarks.DBFetchTasksBenchmark.run       50000  thrpt    5   3.939 ±  3.357  ops/s
TaskStoreBenchmarks.DBFetchTasksBenchmark.run      100000  thrpt    5   0.079 ±  0.020  ops/s
````

I suspect it is because if you populate via constructor you cannot use `<id>`.


- Zameer


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/43457/#review118772
-----------------------------------------------------------


On Feb. 10, 2016, 4:35 p.m., Zameer Manji wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/43457/
> -----------------------------------------------------------
> 
> (Updated Feb. 10, 2016, 4:35 p.m.)
> 
> 
> Review request for Aurora, John Sirois and Maxim Khutornenko.
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> Profiling master indicated that the bottleneck was MyBatis populating ResultSets and populating the resulting objects. This patch removes subselects, which reduces the number of ResultSets and removes the population of an object via a constructor which is slower than populating an object via setters.
> 
> 
> Diffs
> -----
> 
>   src/main/java/org/apache/aurora/scheduler/storage/db/views/DbAssginedPort.java PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/storage/db/views/DbAssignedTask.java 93722395ed9fcd22dcb12e34e648e6e410952d43 
>   src/main/java/org/apache/aurora/scheduler/storage/db/views/DbScheduledTask.java 502a1fa6fc141df498f0f09af292ce24e269731d 
>   src/main/resources/org/apache/aurora/scheduler/storage/db/TaskConfigMapper.xml b1394cf44b7ddafcbc47bb1968306d0b33293380 
>   src/main/resources/org/apache/aurora/scheduler/storage/db/TaskMapper.xml ea469cce31544221c34ae05a1c65f71271985655 
> 
> Diff: https://reviews.apache.org/r/43457/diff/
> 
> 
> Testing
> -------
> 
> Master:
> Benchmark                                      (numTasks)   Mode  Cnt   Score    Error  Units
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run       10000  thrpt    5  44.052 ± 14.689  ops/s
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run       50000  thrpt    5   0.179 ±  0.052  ops/s
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run      100000  thrpt    5   0.087 ±  0.022  ops/s
> 
> This Patch:
> Benchmark                                      (numTasks)   Mode  Cnt   Score   Error  Units
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run       10000  thrpt    5  51.531 ± 7.236  ops/s
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run       50000  thrpt    5   7.370 ± 1.320  ops/s
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run      100000  thrpt    5   2.143 ± 1.234  ops/s
> 
> 
> Thanks,
> 
> Zameer Manji
> 
>

Re: Review Request 43457: Increase throughput of DbTaskStore

Posted by Zameer Manji <zm...@apache.org>.


> On Feb. 10, 2016, 4:23 p.m., John Sirois wrote:
> > > and removes the population of an object via a constructor which is slower than populating an object via setters.
> > 
> > For clarification, is this optimization the much smaller of the 2?  Naively - I'd hope construction would beat setters, but - if not - be _very close_.  If this latter bit was significant, my faith in the jvm is duly shaken.
> 
> Zameer Manji wrote:
>     Switching to Pair and populating via constructor:
>     ````
>     Benchmark                                      (numTasks)   Mode  Cnt   Score    Error  Units
>     TaskStoreBenchmarks.DBFetchTasksBenchmark.run       10000  thrpt    5  53.040 ± 17.152  ops/s
>     TaskStoreBenchmarks.DBFetchTasksBenchmark.run       50000  thrpt    5   3.939 ±  3.357  ops/s
>     TaskStoreBenchmarks.DBFetchTasksBenchmark.run      100000  thrpt    5   0.079 ±  0.020  ops/s
>     ````
>     
>     I suspect it is because if you populate via constructor you cannot use `<id>`.
> 
> John Sirois wrote:
>     Fwict only the 100000 case has a statistically significant win (the error bars in the 2nd run are quite massive) - is that worth it?  Does Aurora actually fetch 100K tasks for any operation in practice at Twitter?

Yes this change is worth it. At Twitter Aurora actually fetches at least 100K tasks for some operations. This is a side effect of some unscoped `fetchTask` queries we currently have.


- Zameer


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/43457/#review118772
-----------------------------------------------------------


On Feb. 10, 2016, 4:35 p.m., Zameer Manji wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/43457/
> -----------------------------------------------------------
> 
> (Updated Feb. 10, 2016, 4:35 p.m.)
> 
> 
> Review request for Aurora, John Sirois and Maxim Khutornenko.
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> Profiling master indicated that the bottleneck was MyBatis populating ResultSets and populating the resulting objects. This patch removes subselects, which reduces the number of ResultSets and removes the population of an object via a constructor which is slower than populating an object via setters.
> 
> 
> Diffs
> -----
> 
>   src/main/java/org/apache/aurora/scheduler/storage/db/views/DbAssginedPort.java PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/storage/db/views/DbAssignedTask.java 93722395ed9fcd22dcb12e34e648e6e410952d43 
>   src/main/java/org/apache/aurora/scheduler/storage/db/views/DbScheduledTask.java 502a1fa6fc141df498f0f09af292ce24e269731d 
>   src/main/resources/org/apache/aurora/scheduler/storage/db/TaskConfigMapper.xml b1394cf44b7ddafcbc47bb1968306d0b33293380 
>   src/main/resources/org/apache/aurora/scheduler/storage/db/TaskMapper.xml ea469cce31544221c34ae05a1c65f71271985655 
> 
> Diff: https://reviews.apache.org/r/43457/diff/
> 
> 
> Testing
> -------
> 
> Master:
> Benchmark                                      (numTasks)   Mode  Cnt   Score    Error  Units
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run       10000  thrpt    5  44.052 ± 14.689  ops/s
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run       50000  thrpt    5   0.179 ±  0.052  ops/s
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run      100000  thrpt    5   0.087 ±  0.022  ops/s
> 
> This Patch:
> Benchmark                                      (numTasks)   Mode  Cnt   Score   Error  Units
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run       10000  thrpt    5  51.531 ± 7.236  ops/s
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run       50000  thrpt    5   7.370 ± 1.320  ops/s
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run      100000  thrpt    5   2.143 ± 1.234  ops/s
> 
> 
> Thanks,
> 
> Zameer Manji
> 
>

Re: Review Request 43457: Increase throughput of DbTaskStore

Posted by John Sirois <js...@apache.org>.


> On Feb. 10, 2016, 5:23 p.m., John Sirois wrote:
> > > and removes the population of an object via a constructor which is slower than populating an object via setters.
> > 
> > For clarification, is this optimization the much smaller of the 2?  Naively - I'd hope construction would beat setters, but - if not - be _very close_.  If this latter bit was significant, my faith in the jvm is duly shaken.
> 
> Zameer Manji wrote:
>     Switching to Pair and populating via constructor:
>     ````
>     Benchmark                                      (numTasks)   Mode  Cnt   Score    Error  Units
>     TaskStoreBenchmarks.DBFetchTasksBenchmark.run       10000  thrpt    5  53.040 ± 17.152  ops/s
>     TaskStoreBenchmarks.DBFetchTasksBenchmark.run       50000  thrpt    5   3.939 ±  3.357  ops/s
>     TaskStoreBenchmarks.DBFetchTasksBenchmark.run      100000  thrpt    5   0.079 ±  0.020  ops/s
>     ````
>     
>     I suspect it is because if you populate via constructor you cannot use `<id>`.

Fwict only the 100000 case has a statistically significant win (the error bars in the 2nd run are quite massive) - is that worth it?  Does Aurora actually fetch 100K tasks for any operation in practice at Twitter?


- John


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/43457/#review118772
-----------------------------------------------------------


On Feb. 10, 2016, 5:35 p.m., Zameer Manji wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/43457/
> -----------------------------------------------------------
> 
> (Updated Feb. 10, 2016, 5:35 p.m.)
> 
> 
> Review request for Aurora, John Sirois and Maxim Khutornenko.
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> Profiling master indicated that the bottleneck was MyBatis populating ResultSets and populating the resulting objects. This patch removes subselects, which reduces the number of ResultSets and removes the population of an object via a constructor which is slower than populating an object via setters.
> 
> 
> Diffs
> -----
> 
>   src/main/java/org/apache/aurora/scheduler/storage/db/views/DbAssginedPort.java PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/storage/db/views/DbAssignedTask.java 93722395ed9fcd22dcb12e34e648e6e410952d43 
>   src/main/java/org/apache/aurora/scheduler/storage/db/views/DbScheduledTask.java 502a1fa6fc141df498f0f09af292ce24e269731d 
>   src/main/resources/org/apache/aurora/scheduler/storage/db/TaskConfigMapper.xml b1394cf44b7ddafcbc47bb1968306d0b33293380 
>   src/main/resources/org/apache/aurora/scheduler/storage/db/TaskMapper.xml ea469cce31544221c34ae05a1c65f71271985655 
> 
> Diff: https://reviews.apache.org/r/43457/diff/
> 
> 
> Testing
> -------
> 
> Master:
> Benchmark                                      (numTasks)   Mode  Cnt   Score    Error  Units
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run       10000  thrpt    5  44.052 ± 14.689  ops/s
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run       50000  thrpt    5   0.179 ±  0.052  ops/s
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run      100000  thrpt    5   0.087 ±  0.022  ops/s
> 
> This Patch:
> Benchmark                                      (numTasks)   Mode  Cnt   Score   Error  Units
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run       10000  thrpt    5  51.531 ± 7.236  ops/s
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run       50000  thrpt    5   7.370 ± 1.320  ops/s
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run      100000  thrpt    5   2.143 ± 1.234  ops/s
> 
> 
> Thanks,
> 
> Zameer Manji
> 
>

Re: Review Request 43457: Increase throughput of DbTaskStore

Posted by John Sirois <js...@apache.org>.

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/43457/#review118772
-----------------------------------------------------------



> and removes the population of an object via a constructor which is slower than populating an object via setters.

For clarification, is this optimization the much smaller of the 2?  Naively - I'd hope construction would beat setters, but - if not - be _very close_.  If this latter bit was significant, my faith in the jvm is duly shaken.

- John Sirois


On Feb. 10, 2016, 5:15 p.m., Zameer Manji wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/43457/
> -----------------------------------------------------------
> 
> (Updated Feb. 10, 2016, 5:15 p.m.)
> 
> 
> Review request for Aurora, Maxim Khutornenko and Bill Farner.
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> Profiling master indicated that the bottleneck was MyBatis populating ResultSets and populating the resulting objects. This patch removes subselects, which reduces the number of ResultSets and removes the population of an object via a constructor which is slower than populating an object via setters.
> 
> 
> Diffs
> -----
> 
>   src/main/java/org/apache/aurora/scheduler/storage/db/views/DbAssginedPort.java PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/storage/db/views/DbAssignedTask.java 93722395ed9fcd22dcb12e34e648e6e410952d43 
>   src/main/java/org/apache/aurora/scheduler/storage/db/views/DbScheduledTask.java 502a1fa6fc141df498f0f09af292ce24e269731d 
>   src/main/resources/org/apache/aurora/scheduler/storage/db/TaskConfigMapper.xml b1394cf44b7ddafcbc47bb1968306d0b33293380 
>   src/main/resources/org/apache/aurora/scheduler/storage/db/TaskMapper.xml ea469cce31544221c34ae05a1c65f71271985655 
> 
> Diff: https://reviews.apache.org/r/43457/diff/
> 
> 
> Testing
> -------
> 
> Master:
> Benchmark                                      (numTasks)   Mode  Cnt   Score    Error  Units
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run       10000  thrpt    5  44.052 ± 14.689  ops/s
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run       50000  thrpt    5   0.179 ±  0.052  ops/s
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run      100000  thrpt    5   0.087 ±  0.022  ops/s
> 
> This Patch:
> Benchmark                                      (numTasks)   Mode  Cnt   Score   Error  Units
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run       10000  thrpt    5  51.531 ± 7.236  ops/s
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run       50000  thrpt    5   7.370 ± 1.320  ops/s
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run      100000  thrpt    5   2.143 ± 1.234  ops/s
> 
> 
> Thanks,
> 
> Zameer Manji
> 
>