You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@aurora.apache.org by Stephan Erb <se...@apache.org> on 2018/06/14 12:14:24 UTC

Re: Review Request 66139: Speedup regular Thermos observer checkpoint refresh


> On March 21, 2018, 6:47 p.m., Santhosh Kumar Shanmugham wrote:
> > Ship It!
> 
> Santhosh Kumar Shanmugham wrote:
>     I am a little confused about the performance testing results that are posted here, since the pervious results indicated gains from 2secs to 0.2secs, while the current one is much lesser. Can you add a little bit of context regarding the results? Thanks.

Yes definitely. I somehow missed that this one here was still open.

* Code on master, give list of filesystem paths: Parse each path into its components, do some basic validation, join components back to a path string. Afterwards filter all paths that point to the same sandbox. What is expensive here is the parsing/splitting and the realpath computation needed for the duplication check.

* First version of the patch: Eliminates both sources of a slowdown and removes some code that was now used in the Aurora repository. 

* Second version of the patch: Just removes the slowdown of realpath by switching to a simpler symlink check. The parsing/splitting is still there. I decided to keep it in there as Reza indicated that you are still using this.

What came on top was that my bechmarking environment was not stable so the overall time varied across both patches.


- Stephan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66139/#review199681
-----------------------------------------------------------


On March 20, 2018, 11:41 p.m., Stephan Erb wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66139/
> -----------------------------------------------------------
> 
> (Updated March 20, 2018, 11:41 p.m.)
> 
> 
> Review request for Aurora, Jordan Ly, Renan DelValle, and Reza Motamedi.
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> Profiling indicates that a significant part of the refresh time os spend in `os.path.realpath`. 
> This was introduced in https://reviews.apache.org/r/35580/ to properly handle the `latest`
> symlink in the Mesos folder layout. 
> 
> This patch takes a slightly different approach to solve this problem based on `os.path.islink`.
> The latter is faster as it just needs to look at a single folder rather than an entire path.
> 
> 
> Diffs
> -----
> 
>   src/main/python/apache/aurora/executor/common/path_detector.py ed264d74ef5a5a7aa681a56b340f9b16504a88ad 
>   src/test/python/apache/aurora/executor/common/test_path_detector.py 7b5ef0cf552d22d4cfbf3357071de036551026dc 
> 
> 
> Diff: https://reviews.apache.org/r/66139/diff/2/
> 
> 
> Testing
> -------
> 
> I have tested this build on a node with 55 running tasks and 2004 finished ones.
> 
> Before this patch:
> 
>     D0320 22:20:44.887248 25771 task_observer.py:142] TaskObserver: finished checkpoint refresh in 0.92s
>     D0320 22:20:50.746316 25771 task_observer.py:142] TaskObserver: finished checkpoint refresh in 0.93s
>     D0320 22:20:56.590157 25771 task_observer.py:142] TaskObserver: finished checkpoint refresh in 0.89s
>  
> With this patch:
> 
>     D0320 22:18:53.545236 16250 task_observer.py:142] TaskObserver: finished checkpoint refresh in 0.48s
>     D0320 22:18:59.031919 16250 task_observer.py:142] TaskObserver: finished checkpoint refresh in 0.49s
>     D0320 22:19:04.512358 16250 task_observer.py:142] TaskObserver: finished checkpoint refresh in 0.48s
> 
> 
> Thanks,
> 
> Stephan Erb
> 
>