Posted to common-commits@hadoop.apache.org by Apache Wiki <wi...@apache.org> on 2008/12/09 20:02:25 UTC
[Hadoop Wiki] Update of "Hbase/ColumnCaching" by jgray
The following page has been changed by jgray:
http://wiki.apache.org/hadoop/Hbase/ColumnCaching
New page:
Key/Val Cache
Overall goal:
To return early ("early out") at as many places as possible.
Overall structure:
The cache is implemented in Memcache.java as an extra lookup step when finding the requested columns.
Add early outs in HStore.java that check whether memCache already returned all the data needed.
Add a logger for stats, to track hits and misses in the cache.
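The hit/miss stats mentioned above could be kept in a pair of counters. The following is only a minimal sketch: the class name CacheStats and its methods are hypothetical, and java.util.logging stands in for whatever logger the region server actually uses.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.logging.Logger;

// Hypothetical helper, not part of HBase: counts cache hits and misses.
final class CacheStats {
    private static final Logger LOG = Logger.getLogger(CacheStats.class.getName());

    private final AtomicLong hits = new AtomicLong();
    private final AtomicLong misses = new AtomicLong();

    void hit() { hits.incrementAndGet(); }

    void miss() { misses.incrementAndGet(); }

    long hits() { return hits.get(); }

    long misses() { return misses.get(); }

    // Fraction of lookups served from the cache; 0.0 before any lookup.
    double hitRatio() {
        long h = hits.get();
        long total = h + misses.get();
        return total == 0 ? 0.0 : (double) h / total;
    }

    // Writes the current counters to the log.
    void log() {
        LOG.info(String.format("k/v cache hits=%d misses=%d ratio=%.3f",
                hits.get(), misses.get(), hitRatio()));
    }
}
```

Atomic counters keep the bookkeeping cheap enough to call on every lookup without taking a lock.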
Structure of cache:
cacheKey(row+column) -> Value
//and maybe row:family -> fullFlag (to indicate that all columns of the latest timestamp are in the cache)
and a HashSet with the row:family entries that are full
cacheKey class with byte[] row and byte[] column,
plus compareTo functions to sort the inputs. It should look like HStoreKey, but minimized to save memory,
since there is no need for timestamps, versions and so on.
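As an illustration of the key described above, here is a minimal sketch. It is not the HStoreKey code: the class name CacheKey, the defensive copies and the unsigned byte ordering are all assumptions made for the example.

```java
import java.util.Arrays;

// Hypothetical minimal cache key: row + column only, no timestamp or versions.
final class CacheKey implements Comparable<CacheKey> {
    private final byte[] row;
    private final byte[] column;

    CacheKey(byte[] row, byte[] column) {
        // Copy the arrays so later changes by the caller cannot alter the key.
        this.row = row.clone();
        this.column = column.clone();
    }

    // Lexicographic comparison of two byte arrays, treating bytes as unsigned.
    private static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xff;
            int y = b[i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        return a.length - b.length;
    }

    // Orders by row first, then by column.
    @Override
    public int compareTo(CacheKey other) {
        int c = compareBytes(row, other.row);
        return c != 0 ? c : compareBytes(column, other.column);
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof CacheKey)) {
            return false;
        }
        CacheKey k = (CacheKey) o;
        return Arrays.equals(row, k.row) && Arrays.equals(column, k.column);
    }

    @Override
    public int hashCode() {
        return 31 * Arrays.hashCode(row) + Arrays.hashCode(column);
    }
}
```

equals and hashCode are included because byte[] uses identity equality, so a hash-based cache would otherwise never find a key built from a fresh array.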
//Need a way to remove data from the cache when it is full; easiest is probably to have a separate list (FIFO) with just the
//keys.
Start with soft references
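One possible reading of "start with soft references" is to let the garbage collector do the eviction by holding each value through a SoftReference. The sketch below assumes that reading; the class name SoftValueCache is hypothetical.

```java
import java.lang.ref.SoftReference;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical cache whose values may be reclaimed by the GC under memory pressure.
final class SoftValueCache<K> {
    private final ConcurrentMap<K, SoftReference<byte[]>> map =
            new ConcurrentHashMap<>();

    void put(K key, byte[] value) {
        map.put(key, new SoftReference<>(value));
    }

    // Returns the cached value, or null on a miss or if the GC cleared it.
    byte[] get(K key) {
        SoftReference<byte[]> ref = map.get(key);
        if (ref == null) {
            return null;
        }
        byte[] value = ref.get();
        if (value == null) {
            // The referent was collected; drop the stale entry, but only if
            // no other thread has replaced it in the meantime.
            map.remove(key, ref);
        }
        return value;
    }

    void remove(K key) {
        map.remove(key);
    }

    int size() {
        return map.size();
    }
}
```

If a bounded FIFO turns out to be needed instead, java.util.LinkedHashMap in its default insertion order, with removeEldestEntry overridden, evicts the oldest inserted key without a separate key list.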
Input From:
get - memCache
Check if the key is in the cache; otherwise get it from mapfiles and put it in the cache on the way out.
Need to check that the number of versions wanted is 1 in order to use the K/V cache.
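The get path above amounts to a read-through lookup. In this rough sketch, StoreReader is a hypothetical stand-in for the mapfile lookup, String keys replace the byte[] row and column for brevity, and a plain HashMap replaces the real cache.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical read-through get: serve from the cache when exactly one version is wanted.
final class ReadThroughGet {
    // Stand-in for reading from the mapfiles; returns values newest first.
    interface StoreReader {
        byte[][] read(String row, String column, int versions);
    }

    private final Map<String, byte[]> cache = new HashMap<>();
    private final StoreReader store;

    ReadThroughGet(StoreReader store) {
        this.store = store;
    }

    // Length-prefixing the row keeps distinct (row, column) pairs from colliding.
    private static String key(String row, String column) {
        return row.length() + ":" + row + column;
    }

    byte[][] get(String row, String column, int versions) {
        if (versions != 1) {
            // The cache holds only the latest value, so bypass it.
            return store.read(row, column, versions);
        }
        String k = key(row, column);
        byte[] cached = cache.get(k);
        if (cached != null) {
            return new byte[][] {cached}; // early out on a hit
        }
        byte[][] result = store.read(row, column, 1);
        if (result != null && result.length > 0) {
            cache.put(k, result[0]); // put in the cache on the way out
        }
        return result;
    }
}
```

A second get for the same row and column with versions equal to 1 never reaches the store, while any request for more versions always does.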
getFull(multiple columns) - memCache
Check if the keys are in the cache; otherwise get them from mapfiles and put them in the cache on the way out.
getFull(all columns) - memCache
Check if fullFlag is set. If it is, get all columns from the cache and abort (early out); otherwise fetch the remaining columns from mapfiles, put them in the cache and set the flag on the way out.
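The fullFlag logic for getFull(all columns) might be sketched as follows. RowReader is a hypothetical stand-in for reading a row from the mapfiles, String keys are used for brevity, and the sketch assumes the store returns every column so that only the ones missing from the cache need to be added.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical row cache with a "full" set marking rows whose columns are all cached.
final class FullRowCache {
    // Stand-in for reading every column of a row from the mapfiles.
    interface RowReader {
        Map<String, byte[]> readAllColumns(String row);
    }

    private final Map<String, Map<String, byte[]>> rows = new HashMap<>();
    private final Set<String> fullRows = new HashSet<>();
    private final RowReader store;

    FullRowCache(RowReader store) {
        this.store = store;
    }

    Map<String, byte[]> getFull(String row) {
        if (fullRows.contains(row)) {
            // Early out: every column of this row is already cached.
            return new HashMap<>(rows.get(row));
        }
        Map<String, byte[]> cached = rows.get(row);
        if (cached == null) {
            cached = new HashMap<>();
            rows.put(row, cached);
        }
        // Add only the columns that are not cached yet.
        for (Map.Entry<String, byte[]> e : store.readAllColumns(row).entrySet()) {
            if (!cached.containsKey(e.getKey())) {
                cached.put(e.getKey(), e.getValue());
            }
        }
        fullRows.add(row); // set the flag on the way out
        return new HashMap<>(cached);
    }
}
```

Returning a copy keeps callers from modifying the cached row behind the cache's back.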
deleteAll - done via add in memCache
Delete row from cache, return ?
deleteFamily - done via add in memCache
Delete row from cache, return ?
add - memCache
Check fullFlag; if it is set, apply all the changes to the row, otherwise do nothing to the cache.
batchUpdate - done via add in memCache
Check fullFlag; if it is set, apply all the changes to the row, otherwise do nothing to the cache.
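The write-side rules above (add, batchUpdate, deleteAll, deleteFamily) could look roughly like this. The sketch is an assumption-laden illustration: WritePathCache and its methods are hypothetical names, entries are keyed by a String standing for the row (or row:family), and the open "return ?" question is left unanswered by returning nothing.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical write-side handling for a row cache guarded by a full flag.
final class WritePathCache {
    private final Map<String, Map<String, byte[]>> rows = new HashMap<>();
    private final Set<String> fullRows = new HashSet<>();

    // Called by the read path once every column of a row has been loaded.
    void loadFullRow(String row, Map<String, byte[]> columns) {
        rows.put(row, new HashMap<>(columns));
        fullRows.add(row);
    }

    // add / batchUpdate: apply the change only when the row is flagged full.
    void add(String row, String column, byte[] value) {
        if (!fullRows.contains(row)) {
            return; // do nothing to the cache
        }
        rows.get(row).put(column, value);
    }

    // deleteAll / deleteFamily: drop the row and clear its flag.
    void deleteRow(String row) {
        rows.remove(row);
        fullRows.remove(row);
    }

    // Returns the cached value, or null if the row or column is not cached.
    byte[] cached(String row, String column) {
        Map<String, byte[]> cols = rows.get(row);
        return cols == null ? null : cols.get(column);
    }

    boolean isFull(String row) {
        return fullRows.contains(row);
    }
}
```

In this sketch only full rows ever enter the cache. If single cells were also cached for rows that are not full, skipping them on add would leave stale values behind, which the notes above do not address.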
Things to change:
In MemCache.getFull(), add an early out if everything was found in memCache. Is there then no need to check snapShot?