Posted to commits@samza.apache.org by "Nicolas Maquet (JIRA)" <ji...@apache.org> on 2016/02/15 22:38:18 UTC
[jira] [Created] (SAMZA-873) Avoid unnecessary flushes in CachedStore
Nicolas Maquet created SAMZA-873:
------------------------------------
Summary: Avoid unnecessary flushes in CachedStore
Key: SAMZA-873
URL: https://issues.apache.org/jira/browse/SAMZA-873
Project: Samza
Issue Type: Improvement
Components: kv
Affects Versions: 0.10.1
Reporter: Nicolas Maquet
The class {{org.apache.samza.storage.kv.CachedStore}} currently calls {{store.flush()}} whenever it evicts a dirty entry. This in turn causes RocksDB to flush its memtables far more often than necessary, which slows the store down.
In a mixed put/get workload, e.g. 2 gets for every put with an object cache size of 1000, RocksDB flushes its memtable roughly every 333 calls to {{put()}}, i.e. every time the eldest cache entry is dirty when it is evicted. In our benchmarks, this leads to a more than 20x drop in throughput.
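The "roughly every 333 puts" figure can be checked with a small model. The sketch below is not Samza code: {{FlushCadenceSketch}} and {{countFlushes}} are hypothetical names, and it assumes that every get and put touches a distinct key (so each access inserts a new cache entry and evicts the eldest one once the cache is full) and that a flush marks every cached entry clean.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical model of the pre-patch eviction policy; not Samza code.
class FlushCadenceSketch {

    /** Counts flushes triggered while issuing `puts` puts, each preceded by `getsPerPut` gets. */
    static int countFlushes(int cacheSize, int getsPerPut, int puts) {
        // One flag per cached entry, eldest first; true means the entry is dirty.
        Deque<boolean[]> lru = new ArrayDeque<>();
        int flushes = 0;
        long totalOps = (long) puts * (getsPerPut + 1);
        for (long op = 1; op <= totalOps; op++) {
            // Assumed access pattern: getsPerPut gets, then one put, repeated.
            boolean isPut = op % (getsPerPut + 1) == 0;
            lru.addLast(new boolean[] {isPut});
            if (lru.size() > cacheSize) {
                boolean[] eldest = lru.removeFirst();
                if (eldest[0]) {
                    // Pre-patch behaviour: evicting a dirty entry forces a flush,
                    // after which every remaining entry is clean.
                    flushes++;
                    for (boolean[] entry : lru) {
                        entry[0] = false;
                    }
                }
            }
        }
        return flushes;
    }

    public static void main(String[] args) {
        int puts = 100_000;
        int flushes = countFlushes(1000, 2, puts);
        System.out.println("puts per flush: " + (double) puts / flushes);
    }
}
```

Under these assumptions the reported ratio should come out close to one third of the cache size, since a dirty entry needs about 1000 accesses (of which about a third are puts) to travel from the newest to the eldest position.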
The attached patch fixes the issue as follows:
- {{CachedStore.put()}} no longer flushes when evicting dirty entries. It calls {{store.putAll()}} with all dirty entries and resets the dirty list and count but does not call {{store.flush()}}.
- Likewise, {{CachedStore.cache.removeEldestEntry()}} no longer flushes when evicting dirty entries but calls {{store.putAll()}} on all dirty entries and resets the dirty list and count.
- {{CachedStore.flush()}}'s behaviour is unaffected.
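The behaviour described in the list above can be illustrated with a minimal sketch. This is not the attached patch and not Samza's actual {{CachedStore}}: {{WriteBehindCacheSketch}} and {{CountingStore}} are hypothetical names, keys and values are plain strings, and the underlying store is an in-memory stand-in that only counts calls. The point it shows is that eviction of a dirty entry writes all dirty entries with {{putAll()}} and leaves {{flush()}} to the explicit flush path.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of the patched eviction behaviour; not Samza's CachedStore.
class WriteBehindCacheSketch {

    /** Hypothetical stand-in for the underlying store; it only records calls. */
    static class CountingStore {
        final Map<String, String> data = new HashMap<>();
        int putAllCalls = 0;
        int flushCalls = 0;

        void putAll(Map<String, String> entries) {
            data.putAll(entries);
            putAllCalls++;
        }

        void flush() {
            flushCalls++;
        }
    }

    private final CountingStore store;
    private final Map<String, String> dirty = new LinkedHashMap<>();
    private final LinkedHashMap<String, String> cache;

    WriteBehindCacheSketch(CountingStore store, final int cacheSize) {
        this.store = store;
        // Access-ordered map, so the eldest entry is the least recently used one.
        this.cache = new LinkedHashMap<String, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                boolean evict = size() > cacheSize;
                if (evict && dirty.containsKey(eldest.getKey())) {
                    writeDirty(); // putAll only; deliberately no store.flush()
                }
                return evict;
            }
        };
    }

    void put(String key, String value) {
        dirty.put(key, value);
        cache.put(key, value); // may evict the eldest entry
    }

    String get(String key) {
        String value = cache.get(key);
        if (value == null) {
            value = store.data.get(key);
            if (value != null) {
                cache.put(key, value); // cache the miss; may evict the eldest entry
            }
        }
        return value;
    }

    /** Explicit flush: write pending entries, then flush the underlying store. */
    void flush() {
        writeDirty();
        store.flush();
    }

    /** Writes all dirty entries in one batch and resets the dirty set. */
    private void writeDirty() {
        if (!dirty.isEmpty()) {
            store.putAll(dirty);
            dirty.clear();
        }
    }
}
```

With a cache size of 2, putting three distinct keys in this sketch evicts the first one, which results in one {{putAll()}} call on the stand-in store and no {{flush()}} call; the underlying store is flushed only when {{flush()}} is called on the cache itself.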