Posted to notifications@accumulo.apache.org by "ivakegg (via GitHub)" <gi...@apache.org> on 2023/08/25 14:35:29 UTC

[GitHub] [accumulo] ivakegg commented on issue #3724: Consider adding metadata tablet transaction logging

ivakegg commented on issue #3724:
URL: https://github.com/apache/accumulo/issues/3724#issuecomment-1693463618

   Capturing the discussion from the accumulo channel in the ASF slack:
   
   > 
   Ivarator
     [18 hours ago](https://the-asf.slack.com/archives/CERNB8NDC/p1692908949704919?thread_ts=1692709052.736109&cid=CERNB8NDC)
   After many discussions, I am now thinking along the lines of some fault-tolerant coding here. I think the metadata/in-memory checker should keep a transaction log on the side. Whenever things are determined to be consistent, it can clear that log. However, when it finds an inconsistency, it can replay that log against what it last knew to be correct to determine what the state of affairs should be. Then the appropriate repairs can be made.
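
   The idea in this message could be sketched roughly as follows. This is only an illustration under assumptions: `TabletFileLog`, `Change`, `record`, `expected`, and `check` are hypothetical names, files are plain strings, and none of this is existing Accumulo code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical sketch: a side log of file changes, replayed over the last known-good file set. */
class TabletFileLog {
  /** One logged change: a file added to, or removed from, the tablet. */
  static final class Change {
    final boolean added;
    final String file;

    Change(boolean added, String file) {
      this.added = added;
      this.file = file;
    }
  }

  private final Set<String> lastConsistent = new TreeSet<>();
  private final List<Change> log = new ArrayList<>();

  /** Appends a change to the side log. */
  void record(boolean added, String file) {
    log.add(new Change(added, file));
  }

  /** Replays the log over the last consistent snapshot to compute the expected file set. */
  Set<String> expected() {
    Set<String> result = new TreeSet<>(lastConsistent);
    for (Change c : log) {
      if (c.added) {
        result.add(c.file);
      } else {
        result.remove(c.file);
      }
    }
    return result;
  }

  /** On agreement with metadata, takes a new snapshot and clears the log; otherwise keeps the log. */
  boolean check(Set<String> metadataFiles) {
    Set<String> exp = expected();
    if (exp.equals(metadataFiles)) {
      lastConsistent.clear();
      lastConsistent.addAll(exp);
      log.clear();
      return true;
    }
    return false;
  }
}
```

   When `check` returns false, the log is left in place, so `expected()` still describes what the state should be and can drive the repair or a diagnostic message.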
   
   
   HP
     [18 hours ago](https://the-asf.slack.com/archives/CERNB8NDC/p1692909248027599?thread_ts=1692709052.736109&cid=CERNB8NDC)
   It would be cool if it could be smart enough to log results of the decision like "Metadata still had a reference to a file we successfully major compacted"  which is basically what we've had to manually search for to confirm we haven't lost data.
   Still chewing on potential edge cases like when splits / merges occur...
   
   
   
   
   
   Ivarator
     [18 hours ago](https://the-asf.slack.com/archives/CERNB8NDC/p1692909353286359?thread_ts=1692709052.736109&cid=CERNB8NDC)
   agreed.... messages to help pinpoint the error.
   
   
   Keith Turner
     [18 hours ago](https://the-asf.slack.com/archives/CERNB8NDC/p1692910049730589?thread_ts=1692709052.736109&cid=CERNB8NDC)
   One way to implement this would be to have a per tablet set of files that were compacted away.  This set would be cleared on each successful consistency check.  Unsuccessful checks could use the set to compute better messages.
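
   A minimal sketch of this suggestion, assuming files are identified by plain strings. The class `CompactedAwayTracker`, its method names, and the message wording are all hypothetical and are not taken from Accumulo.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch of a per-tablet set of files that were compacted away. */
class CompactedAwayTracker {
  private final Set<String> compactedAway = new HashSet<>();

  /** Called after a compaction commits; inputs are the files it replaced. */
  void compacted(Set<String> inputs) {
    compactedAway.addAll(inputs);
  }

  /**
   * Compares the in-memory file set with the metadata file set.
   * Returns one message per discrepancy. An empty list means the check
   * succeeded, in which case the compacted-away set is cleared.
   */
  List<String> check(Set<String> inMemory, Set<String> metadata) {
    List<String> problems = new ArrayList<>();
    for (String f : metadata) {
      if (!inMemory.contains(f)) {
        problems.add(compactedAway.contains(f)
            ? "Metadata still references " + f + ", which was compacted away"
            : "Metadata references " + f + ", which is unknown to the tablet");
      }
    }
    for (String f : inMemory) {
      if (!metadata.contains(f)) {
        problems.add("Tablet holds " + f + ", which is missing from metadata");
      }
    }
    if (problems.isEmpty()) {
      compactedAway.clear();
    }
    return problems;
  }
}
```

   The set only grows between successful checks, so its memory cost is bounded by the number of compaction inputs seen since the last consistent state.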
   
   
   Ivarator
     [18 hours ago](https://the-asf.slack.com/archives/CERNB8NDC/p1692910103192139?thread_ts=1692709052.736109&cid=CERNB8NDC)
   I expect files that are compacted away would certainly be part of this transaction log.
   
   
   Ivarator
     [18 hours ago](https://the-asf.slack.com/archives/CERNB8NDC/p1692910149781299?thread_ts=1692709052.736109&cid=CERNB8NDC)
   I will work on this tomorrow, I think. In the meantime, I look forward to any headway on #3721.
   
   
   ctubbsii
     [18 hours ago](https://the-asf.slack.com/archives/CERNB8NDC/p1692910270011849?thread_ts=1692709052.736109&cid=CERNB8NDC)
   A per-tablet set of files that were compacted away will complicate the no-chop merges feature, because a file might get compacted due to one range entry, but might need to stick around with a different range that has yet to be compacted.
   
   
   ctubbsii
     [18 hours ago](https://the-asf.slack.com/archives/CERNB8NDC/p1692910335942879?thread_ts=1692709052.736109&cid=CERNB8NDC)
   That could be mitigated by forcing the compaction code to always compact all ranges for a given file, whenever that file is included in a compaction.
   
   
   Keith Turner
     [18 hours ago](https://the-asf.slack.com/archives/CERNB8NDC/p1692910359144889?thread_ts=1692709052.736109&cid=CERNB8NDC)
   For that case, it would probably key the set on file+range, like it does when tracking the tablet's current files.
   
   
   ctubbsii
     [18 hours ago](https://the-asf.slack.com/archives/CERNB8NDC/p1692910424802869?thread_ts=1692709052.736109&cid=CERNB8NDC)
   Yeah, key on StoredTabletFile. A lot of those internal structures were changed in 3.0, though. I'm not sure if this idea would target 2.1 or 3.1. If it's targeting 2.1, it may look very different in 3.1 (and because 3.0 is non-LTM, it wouldn't have the feature at all).
   
   
   Chris Shannon
     [4 hours ago](https://the-asf.slack.com/archives/CERNB8NDC/p1692960509933319?thread_ts=1692709052.736109&cid=CERNB8NDC)
   If we tracked the set of files based on StoredTabletFile I don't think it would be too much different with the no-chop-merge changes. The internal structure was changed in 3.0 but both 2.1 and 3.0 still only rely on the path as the key for uniqueness (well really it's the file metadata which is just the path.)  With my no-chop-merges branch and changes coming we add the concept of a range to go along with the path to determine a unique file but that is transparent and internal to the StoredTabletFile structure (technically it's part of all TabletFiles). So if we just use StoredTabletFile object when tracking in the set then I would think any code written would work for the most part without too many changes.
   
   
   Chris Shannon
     [4 hours ago](https://the-asf.slack.com/archives/CERNB8NDC/p1692960704319349?thread_ts=1692709052.736109&cid=CERNB8NDC)
   One of the benefits of treating each combination of path + range as a unique file (vs. the original attempt of having one file track a collection of ranges) is that most of the code just "works" transparently. Any code that uses a tablet file doesn't really care that the paths are the same with different ranges; it just treats them as unique files, so things like splits, compactions, etc. just work without much modification, and I would expect something similar here.
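
   To illustrate why keying on path + range sidesteps the no-chop-merge concern raised earlier in the thread, here is a rough sketch. `FileRangeKey` is a hypothetical stand-in written for this example; it is not Accumulo's `StoredTabletFile`, and the row-bound fields are assumptions about how a range might be represented.

```java
import java.util.Objects;

/** Hypothetical stand-in for a path + range key; NOT Accumulo's StoredTabletFile. */
final class FileRangeKey {
  final String path;
  final String startRow; // assumed exclusive start row; null means unbounded
  final String endRow;   // assumed inclusive end row; null means unbounded

  FileRangeKey(String path, String startRow, String endRow) {
    this.path = Objects.requireNonNull(path);
    this.startRow = startRow;
    this.endRow = endRow;
  }

  /** Two keys are the same file only if both the path and the range match. */
  @Override
  public boolean equals(Object o) {
    if (!(o instanceof FileRangeKey)) {
      return false;
    }
    FileRangeKey k = (FileRangeKey) o;
    return path.equals(k.path)
        && Objects.equals(startRow, k.startRow)
        && Objects.equals(endRow, k.endRow);
  }

  @Override
  public int hashCode() {
    return Objects.hash(path, startRow, endRow);
  }
}
```

   With a key like this, a `Set<FileRangeKey>` of compacted-away entries can hold one range of a path while another range of the same path remains a current file, because the two entries compare as different files.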

