Posted to notifications@accumulo.apache.org by GitBox <gi...@apache.org> on 2018/06/20 19:38:11 UTC

[GitHub] ctubbsii commented on issue #535: WAL recovery enhancements and tooling

ctubbsii commented on issue #535: WAL recovery enhancements and tooling
URL: https://github.com/apache/accumulo/issues/535#issuecomment-398871117
 
 
   I spoke to @keith-turner at length about this, and we (mostly Keith) concluded that these two errors *might* occur if you stop writing to one tablet but continue writing mutations to another, and the logs roll over. In both cases, the exception could be a false positive, thrown when there is no data to recover for that tablet, because the WAL containing the `COMPACTION_START` event could already have been garbage collected.
   
   Worse, this scenario may not be tested for, because our continuous ingest tests don't ever stop writing to a tablet.
   
   The workaround would be to inspect the WALs, verify that they contain no data for the tablet that produced the exception during recovery, and then remove the entries in the affected tablet, repeating this for each affected tablet. This is not ideal, but if somebody can verify that this is what is happening (it's still just speculation right now), we could proceed with a fix for 1.9.2. The good news is that there shouldn't be any data loss if this is what is happening. It's just an error raised when there is no data that needs to be recovered.
   
   Some possible fixes we discussed, if the issue can be verified:
   
   1. Check that there are no data events in the WALs for that tablet, before throwing the exception.
   2. Don't mark a WAL inactive prematurely, even if it has only a `COMPACTION_START` event with no data.
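
   To make option 1 concrete, here is a rough sketch of the check, assuming a heavily simplified model of WAL entries. The `LogEvent` class, the `EventType` enum, and the `hasDataEvents` and `shouldThrow` methods are hypothetical names invented for illustration; they are not the actual tserver recovery code, and `inconsistencyDetected` stands in for whatever condition currently triggers the exception.

```java
import java.util.Arrays;
import java.util.List;

public class WalRecoveryCheck {

  // Hypothetical, simplified set of WAL event kinds (an assumption for this sketch).
  enum EventType { DEFINE_TABLET, MUTATION, MANY_MUTATIONS, COMPACTION_START, COMPACTION_FINISH }

  // Hypothetical view of one WAL entry: an event type plus the tablet it belongs to.
  static final class LogEvent {
    final EventType type;
    final int tabletId;

    LogEvent(EventType type, int tabletId) {
      this.type = type;
      this.tabletId = tabletId;
    }
  }

  // Returns true if any remaining WAL entry carries mutation data for the tablet.
  static boolean hasDataEvents(List<LogEvent> events, int tabletId) {
    for (LogEvent e : events) {
      boolean isData = e.type == EventType.MUTATION || e.type == EventType.MANY_MUTATIONS;
      if (isData && e.tabletId == tabletId) {
        return true;
      }
    }
    return false;
  }

  // Option 1: only treat the inconsistency as fatal when there is data to recover.
  static boolean shouldThrow(boolean inconsistencyDetected, List<LogEvent> events, int tabletId) {
    return inconsistencyDetected && hasDataEvents(events, tabletId);
  }

  public static void main(String[] args) {
    // Tablet 1 stopped receiving writes; tablet 2 kept writing after the logs rolled over.
    List<LogEvent> remainingWals = Arrays.asList(
        new LogEvent(EventType.COMPACTION_FINISH, 1),
        new LogEvent(EventType.DEFINE_TABLET, 2),
        new LogEvent(EventType.MUTATION, 2));

    System.out.println("tablet 1 fatal: " + shouldThrow(true, remainingWals, 1));
    System.out.println("tablet 2 fatal: " + shouldThrow(true, remainingWals, 2));
  }
}
```

   The same scan is essentially what the manual workaround asks an operator to do by hand: confirm that the remaining WALs hold no data events for the tablet before treating the error as harmless.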
   
   More investigation is needed to verify both the problem and the possible fixes, though.
