You are viewing a plain text version of this content. The canonical link for it is here.
Posted to notifications@accumulo.apache.org by GitBox <gi...@apache.org> on 2021/06/14 07:43:46 UTC

[GitHub] [accumulo] ctubbsii commented on issue #1973: Exception during scan due to cached block eviction (Possible recurrence of issue described in #774)

ctubbsii commented on issue #1973:
URL: https://github.com/apache/accumulo/issues/1973#issuecomment-859241448


   I was looking into this, and I think I was wrong about the preventative measure. It looks like it is already the case that none of these properties will have an effect until the tserver is restarted anyway. The only one that could affect things is enabling or disabling the use of the cache, and that is a per-table setting that should be possible to turn on and off at runtime.
   
   Looking into the stack trace a bit more, I started wondering if the `cacheId` wasn't unique. Unfortunately, the exceptions don't include the `blockName`, which they should.
   
   Additionally, I saw that at least in one case, `SummaryReader.load()`, there is no `cacheId` being set at all, so it should be `null`, which seems prone to error, since that's clearly not unique. The `CacheableBuilder` object looks like a builder pattern, but doesn't enforce required fields, and doesn't have any validation function to ensure it was used correctly, so `cacheId` could easily be set to `null`. However, even if that is a problem for `SummaryReader.load()` (@keith-turner implemented summaries, so he might have some insight into that), it doesn't seem like it would cause the stack trace reported in this issue, which doesn't mention summaries at all.
   
   If the problem is the `cacheId` is not unique (`null` or otherwise), then the block name looked up could have lots of collisions for the same offset (in `CacheableBlockFile.Reader.getMetaBlock()`, the block name is determined with `String _lookup = this.cacheId + "R" + offset;`).
   
   There are a few things that can/should be done:
   
   1. Improve the exception messages to include the `blockName` to make debugging this easier; @keith-ratcliffe is this issue reproducible? If so, what blockName is printed with the IllegalStateExceptions?
   2. Ensure `CacheableBuilder` enforces the requirement for a non-null `cacheId` by adding a `build()` function with validation, like a true builder pattern (might not fix the problem)
   
   While digging into this, I noticed there were a few classes in the stack trace that I don't have the source to to track down further.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org