Posted to issues@mesos.apache.org by "Bernd Mathiske (JIRA)" <ji...@apache.org> on 2014/11/11 16:15:33 UTC

[jira] [Commented] (MESOS-2072) Fetcher cache eviction

    [ https://issues.apache.org/jira/browse/MESOS-2072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14206531#comment-14206531 ] 

Bernd Mathiske commented on MESOS-2072:
---------------------------------------

To pick eviction victims, the fetcher cache should use something that approximates LRU or, more precisely, "least recently AND least frequently used". Even random picking should perform better than what we have today, and even the slightest attempt to avoid evicting what has just been downloaded should be a further noticeable improvement.
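
As a rough illustration of "least recently AND least frequently used", here is a minimal Python sketch. Nothing in it is existing Mesos code: CacheEntry, pick_victims, and the scoring formula (use count divided by time since last use) are assumptions made only to show the idea. Entries with the lowest score are evicted first.

```python
from dataclasses import dataclass


@dataclass
class CacheEntry:
    # Hypothetical bookkeeping record for one cached file.
    path: str
    size: int          # bytes on disk
    last_used: float   # timestamp of the most recent use
    use_count: int = 1


def pick_victims(entries, bytes_needed, now):
    """Return (victims, freed): entries to evict until bytes_needed is covered.

    Assumed score: use_count / age. A low score means "rarely used and
    not used recently", so such entries are evicted first.
    """
    def score(entry):
        age = max(now - entry.last_used, 1e-9)  # avoid division by zero
        return entry.use_count / age

    victims, freed = [], 0
    for entry in sorted(entries, key=score):
        if freed >= bytes_needed:
            break
        victims.append(entry)
        freed += entry.size
    return victims, freed
```

With this scoring, a file that was downloaded a moment ago has a small age and therefore a comparatively high score, so it is not the first pick. The caller still has to check whether freed actually reaches bytes_needed.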


> Fetcher cache eviction
> ----------------------
>
>                 Key: MESOS-2072
>                 URL: https://issues.apache.org/jira/browse/MESOS-2072
>             Project: Mesos
>          Issue Type: Improvement
>          Components: fetcher, slave
>            Reporter: Bernd Mathiske
>            Assignee: Bernd Mathiske
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> Delete files from the fetcher cache so that a given cache size is never exceeded. Succeed in doing so while concurrent downloads are on their way and new requests are pouring in.
> Idea: determine the size of each download before it begins and make enough room beforehand. This means that only download mechanisms that report the size before the main download will be supported. As far as we know, those in use so far have this property.
> The calculation of how much space to free needs to be under concurrency control, accumulating all space needed for competing, incomplete download requests. (The Python script that performs fetcher caching for Aurora does not seem to implement this. See https://gist.github.com/zmanji/f41df77510ef9d00265a and imagine several of these programs running concurrently, each one's _cache_eviction() call succeeding, each perceiving the SAME free space as available.)
> Ultimately, a conflict resolution strategy is needed if the downloads already underway exceed the cache capacity by themselves. Then, as a fallback, direct download into the work directory will be used for some tasks. How to pick which tasks get the fallback is TBD.
> At first, only support copying of any downloaded files to the work directory for task execution. This isolates the task life cycle after starting a task from cache eviction considerations. 
> (Later, we can add symbolic links that avoid copying. But then eviction of fetched files used by ongoing tasks must be blocked, which adds complexity. Another future extension is MESOS-1667 "Extract from URI while downloading into work dir".)
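
The space accounting described in the issue above could look roughly like the following Python sketch. It is not the design that was implemented: CacheSpace, try_reserve, commit, cancel, and the evict callback are hypothetical names, and a single lock is assumed to be sufficient concurrency control. The point is that space for incomplete downloads is counted as reserved, so competing requests cannot all claim the same free bytes.

```python
import threading


class CacheSpace:
    """Hypothetical accounting of cache space shared by concurrent downloads."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.used = 0       # bytes held by completed files in the cache
        self.reserved = 0   # bytes promised to downloads still in flight
        self._lock = threading.Lock()

    def try_reserve(self, size, evict):
        """Reserve size bytes before a download starts.

        evict(n) is an assumed callback that deletes cached files and
        returns the number of bytes it actually freed. A False result
        means the caller should fall back to downloading directly into
        the work directory.
        """
        with self._lock:
            # In-flight downloads alone would exceed capacity: no eviction helps.
            if self.reserved + size > self.capacity:
                return False
            shortfall = self.used + self.reserved + size - self.capacity
            if shortfall > 0:
                freed = evict(shortfall)
                self.used -= freed
                if freed < shortfall:
                    return False
            self.reserved += size
            return True

    def commit(self, size):
        """Download finished: reserved bytes become used bytes."""
        with self._lock:
            self.reserved -= size
            self.used += size

    def cancel(self, size):
        """Download failed or was abandoned: release the reservation."""
        with self._lock:
            self.reserved -= size
```

For example, with a capacity of 100 bytes, a first try_reserve(60, ...) succeeds and a second one for 60 bytes is refused while the first is still in flight, instead of both seeing 100 free bytes.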



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)