Posted to common-issues@hadoop.apache.org by "Gabor Bota (JIRA)" <ji...@apache.org> on 2018/05/07 08:43:00 UTC

[jira] [Commented] (HADOOP-13649) s3guard: implement time-based (TTL) expiry for LocalMetadataStore

    [ https://issues.apache.org/jira/browse/HADOOP-13649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16465636#comment-16465636 ] 

Gabor Bota commented on HADOOP-13649:
-------------------------------------

Thanks for the review.
 # I've created HADOOP-15423 to merge the two caches into one.
 # .expireAfterWrite() vs .expireAfterAccess()
 ** I think expireAfterAccess is the better choice here, provided no other client modifies the underlying S3 bucket (for example by deleting files) while the cache is in use. Under that assumption we can treat the cached entries as up to date.
This store is only used for testing right now, so I think expireAfterAccess is the right choice.
 # Locking
 ** The com.google.common.cache.LocalCache locks for writes (e.g. put, replace, remove) but not for simple reads (getIfPresent).
 ** LocalMetadataStore has a lock for read too: synchronized (this) in get().
 ** As the merge of the two caches will happen in HADOOP-15423, I think that's a topic to discuss further on that issue.
 # Performance testing
 ** I've done some performance testing to compare the cache vs hash performance.
 ** I hope I used sane parameters during the tests.
 ** Based on the results, this implementation is somewhat slower, but not significantly so with the basic test settings (I modified the settings slightly for my runs). The move() performance should improve once the caches are merged - it will be interesting to compare the numbers after that change.
 ** Test results are in the following gist: [https://gist.github.com/bgaborg/2220fd53e553ec971c8edd1adf2493cd] 
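
To make the expireAfterWrite vs expireAfterAccess difference concrete, here is a minimal sketch in plain Java. It is not the Guava or LocalMetadataStore code: TtlSketch, Policy and the injected clock are hypothetical names made up for illustration, and the whole-method synchronization is just the simplest way to keep the sketch thread-safe.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongSupplier;

/** Hypothetical illustration only; not the LocalMetadataStore or Guava implementation. */
final class TtlSketch<K, V> {
  /** Which event restarts an entry's expiry timer. */
  enum Policy { AFTER_WRITE, AFTER_ACCESS }

  private static final class Entry<T> {
    final T value;
    long stamp; // time of the last write, or of the last read under AFTER_ACCESS
    Entry(T value, long stamp) { this.value = value; this.stamp = stamp; }
  }

  private final Map<K, Entry<V>> map = new HashMap<>();
  private final Policy policy;
  private final long ttlMillis;
  private final LongSupplier clock; // injected so expiry can be shown without sleeping

  TtlSketch(Policy policy, long ttlMillis, LongSupplier clock) {
    this.policy = policy;
    this.ttlMillis = ttlMillis;
    this.clock = clock;
  }

  /** A write always (re)starts the timer, under both policies. */
  synchronized void put(K key, V value) {
    map.put(key, new Entry<>(value, clock.getAsLong()));
  }

  /** Returns the value, or null if the key is absent or its entry has expired. */
  synchronized V getIfPresent(K key) {
    Entry<V> e = map.get(key);
    if (e == null) {
      return null;
    }
    long now = clock.getAsLong();
    if (now - e.stamp >= ttlMillis) {
      map.remove(key); // expired: drop the entry lazily on read
      return null;
    }
    if (policy == Policy.AFTER_ACCESS) {
      e.stamp = now; // a read keeps the entry alive; AFTER_WRITE ignores reads
    }
    return e.value;
  }
}
```

With a TTL of 100 ms, an entry written at t=0 and read at t=60 is gone at t=120 under AFTER_WRITE, but still present under AFTER_ACCESS because the read at t=60 restarted its timer. That is why the access-based policy only bounds staleness if nothing else changes the bucket: a frequently read entry may never expire.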

> s3guard: implement time-based (TTL) expiry for LocalMetadataStore
> -----------------------------------------------------------------
>
>                 Key: HADOOP-13649
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13649
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 3.0.0-beta1
>            Reporter: Aaron Fabbri
>            Assignee: Gabor Bota
>            Priority: Minor
>         Attachments: HADOOP-13649.001.patch, HADOOP-13649.002.patch
>
>
> LocalMetadataStore is primarily a reference implementation for testing.  It may be useful in narrow circumstances where the workload can tolerate short-term lack of inter-node consistency:  Being in-memory, one JVM/node's LocalMetadataStore will not see another node's changes to the underlying filesystem.
> To put a bound on the time during which this inconsistency may occur, we should implement time-based (a.k.a. Time To Live / TTL) expiration for LocalMetadataStore.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
