Posted to notifications@accumulo.apache.org by GitBox <gi...@apache.org> on 2020/11/16 17:37:21 UTC

[GitHub] [accumulo] milleruntime opened a new issue #1781: Optimize TableId cache performance

milleruntime opened a new issue #1781:
URL: https://github.com/apache/accumulo/issues/1781


   The TableId class uses a Guava cache for all accesses to table IDs, and the cache is currently built with `weakValues()`.  [According to the Guava javadoc](https://guava.dev/releases/28.0-jre/api/docs/com/google/common/cache/CacheBuilder.html#weakValues--), weak values are a poor candidate for caching and `softValues()` should be considered instead.  It would be good to compare the performance of the two to see which is better.  Since a cluster with high table turnover may see different performance than one that maintains the same set of tables, we may want to make this configurable/pluggable.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] ctubbsii commented on issue #1781: Optimize TableId cache performance

Posted by GitBox <gi...@apache.org>.
ctubbsii commented on issue #1781:
URL: https://github.com/apache/accumulo/issues/1781#issuecomment-728239006


   I will add some comments.





[GitHub] [accumulo] milleruntime commented on issue #1781: Optimize TableId cache performance

Posted by GitBox <gi...@apache.org>.
milleruntime commented on issue #1781:
URL: https://github.com/apache/accumulo/issues/1781#issuecomment-728235040


   OK sounds good.





[GitHub] [accumulo] ctubbsii commented on issue #1781: Optimize TableId cache performance

Posted by GitBox <gi...@apache.org>.
ctubbsii commented on issue #1781:
URL: https://github.com/apache/accumulo/issues/1781#issuecomment-728231263


   SoftReferences are cleaned up at the discretion of the Garbage Collector. They are kept around longer, which is better for caching, because it is more likely you'll get a cache hit. However, that's not how we're using this cache. We're using WeakReferences, specifically because we're only using it for deduplication/canonicalization, and *not* cache hit performance.  The javadoc for WeakReference says "Weak references are most often used to implement canonicalizing mappings.", which is how we are using it. The javadoc for SoftReference says "Soft references are most often used to implement memory-sensitive caches.", which is *not* how we are using it.
   
   I think what we have is correct. Also, this is a slight internal optimization so we don't have a ton of internal objects referring to the same table id, essentially the same as String interning. It is not something that would be worth making pluggable.
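
To illustrate the canonicalization use described above, here is a minimal sketch using only the JDK. It is not the TableId implementation (which, per this issue, uses a Guava cache with `weakValues()`); `DemoId` and `WeakCanonicalizer` are hypothetical names made up for the example. The idea is that the map holds each instance only weakly, so callers asking for the same string share one object while it is in use, and the object can be collected once nothing else refers to it.

```java
import java.lang.ref.WeakReference;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical stand-in for a small immutable ID type; not Accumulo's TableId. */
final class DemoId {
  private final String canonical;

  DemoId(String canonical) {
    this.canonical = canonical;
  }

  String canonical() {
    return canonical;
  }
}

/** Canonicalizing map: at most one live DemoId per string, referenced only weakly. */
final class WeakCanonicalizer {
  private final ConcurrentMap<String, WeakReference<DemoId>> refs = new ConcurrentHashMap<>();

  DemoId of(String key) {
    // The holder keeps a strong reference so the instance cannot be
    // collected between the map update and the return statement.
    DemoId[] holder = new DemoId[1];
    refs.compute(key, (k, ref) -> {
      DemoId existing = (ref == null) ? null : ref.get();
      if (existing != null) {
        holder[0] = existing; // still strongly reachable elsewhere: reuse it
        return ref;
      }
      holder[0] = new DemoId(k); // never seen, or already collected: recreate
      return new WeakReference<>(holder[0]);
    });
    return holder[0];
  }
  // Simplification: entries whose referent was collected stay in the map
  // until the same key is requested again. A real cache would purge them.
}

class WeakCanonicalizerDemo {
  public static void main(String[] args) {
    WeakCanonicalizer ids = new WeakCanonicalizer();
    DemoId a = ids.of("2b");
    DemoId b = ids.of(new String("2b"));
    // Both variables refer to the same instance while it is strongly held.
    System.out.println(a == b);
  }
}
```

A cache miss here only costs one small allocation, which is why hit rate matters little for this use. Soft references would instead keep unused instances alive until the collector is under memory pressure, which suits caches of expensive values rather than deduplication.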
   





[GitHub] [accumulo] ctubbsii commented on issue #1781: Optimize TableId cache performance

Posted by GitBox <gi...@apache.org>.
ctubbsii commented on issue #1781:
URL: https://github.com/apache/accumulo/issues/1781#issuecomment-728250376


   Added comments in #1782





[GitHub] [accumulo] ctubbsii edited a comment on issue #1781: Optimize TableId cache performance

Posted by GitBox <gi...@apache.org>.
ctubbsii edited a comment on issue #1781:
URL: https://github.com/apache/accumulo/issues/1781#issuecomment-728231263


   SoftReferences are cleaned up at the discretion of the Garbage Collector. They are kept around longer, which is better for caching, because it is more likely you'll get a cache hit. However, that's not how we're using this cache. We're using WeakReferences, specifically because we're only using it for deduplication/canonicalization, and *not* cache hit performance.  The javadoc for WeakReference says "Weak references are most often used to implement canonicalizing mappings.", which is how we are using it. The javadoc for SoftReference says "Soft references are most often used to implement memory-sensitive caches.", which is *not* how we are using it.
   
   I think what we have is correct. Also, this is a slight internal optimization so we don't have a ton of internal objects referring to the same table id (canonicalization for storage optimization), essentially the same as String interning. It is not something that is cached for performance/speed, as it is not speed-critical code, and I certainly don't think it would be worth making pluggable.
   





[GitHub] [accumulo] milleruntime commented on issue #1781: Optimize TableId cache performance

Posted by GitBox <gi...@apache.org>.
milleruntime commented on issue #1781:
URL: https://github.com/apache/accumulo/issues/1781#issuecomment-728236123


   It is probably worth mentioning this in the code, since there are no comments about the caching in the class.





[GitHub] [accumulo] milleruntime closed issue #1781: Optimize TableId cache performance

Posted by GitBox <gi...@apache.org>.
milleruntime closed issue #1781:
URL: https://github.com/apache/accumulo/issues/1781


   

