You are viewing a plain text version of this content. The canonical link for it is here.
Posted to common-issues@hadoop.apache.org by "Gabor Bota (JIRA)" <ji...@apache.org> on 2018/05/07 08:43:00 UTC
[jira] [Commented] (HADOOP-13649) s3guard: implement time-based
(TTL) expiry for LocalMetadataStore
[ https://issues.apache.org/jira/browse/HADOOP-13649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16465636#comment-16465636 ]
Gabor Bota commented on HADOOP-13649:
-------------------------------------
Thanks for the review.
# I've created HADOOP-15423 to merge the two caches into one.
# .expireAfterWrite() vs .expireAfterAccess()
** I think that access could be better in this situation, as long as there's no
modification in the underlying bucket from another client - so no one else is modifying the s3
bucket like deleting files while the cache is in use - that way we can
say that the cache is up to date.
This store is only used for testing right now, so I can say that's right to choose expireAfterAccess.
# Locking
** The com.google.common.cache.LocalCache has locking for write (e.g put, replace, remove) but not for simple read (getIfPresent).
** LocalMetadataStore has a lock for read too: synchronized (this) in get().
** As the merge of the two caches will happen in HADOOP-15423, I think that's a topic to discuss further on that issue.
# Performance testing
** I've done some performance testing to compare the cache vs hash performance.
** I hope that used sane parameters during the tests.
** Based on this, there will be some performance decrease with this implementation, but nothing significant with the basic test settings - in my tests I've modified the settings a little bit. Move() performance should improve when merging the caches - it will be interesting to compare what's happening after that change.
** Test results are in the following gist: [https://gist.github.com/bgaborg/2220fd53e553ec971c8edd1adf2493cd]
> s3guard: implement time-based (TTL) expiry for LocalMetadataStore
> -----------------------------------------------------------------
>
> Key: HADOOP-13649
> URL: https://issues.apache.org/jira/browse/HADOOP-13649
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Affects Versions: 3.0.0-beta1
> Reporter: Aaron Fabbri
> Assignee: Gabor Bota
> Priority: Minor
> Attachments: HADOOP-13649.001.patch, HADOOP-13649.002.patch
>
>
> LocalMetadataStore is primarily a reference implementation for testing. It may be useful in narrow circumstances where the workload can tolerate short-term lack of inter-node consistency: Being in-memory, one JVM/node's LocalMetadataStore will not see another node's changes to the underlying filesystem.
> To put a bound on the time during which this inconsistency may occur, we should implement time-based (a.k.a. Time To Live / TTL) expiration for LocalMetadataStore
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org