Posted to xindice-dev@xml.apache.org by Jim Wissner <ji...@jbrix.org> on 2003/03/18 04:19:52 UTC

Please - need help with fault tolerance issues...

Hello,

I made a post about fault tolerance in BTree and its subclasses, but didn't 
get any replies.  Since then I've done a fair bit of research into these 
classes.

There are points of vulnerability within the code that make the files 
corruptible in repeatable ways.  Granted, this happens only under stressful 
circumstances, but for any degree of fault tolerance these cases must be 
taken into account.

The two primary scenarios I have been replicating are (1) an I/O error due 
to inadequate space to grow/write to disk (either disk full or quota 
reached), and (2) abnormal termination mid-write.

Forcing these scenarios typically renders the entire file unreadable, and 
results in total data loss (not counting manual data rescue).

What I'm trying to do is figure out a strategy for making the files 
tolerant to such faults, and self-healing.  As I said before, I very much 
welcome any ideas and hope that someone will voice their opinion, since 
most of you are far better versed in the code than I am and must have 
thought of these issues.

Working from the page level, I think it is possible to make pages safe from 
the above problems by enlarging the file header by the page size plus room 
for status and offset bytes.  Prior to each page write, the offset and 
existing contents of the page about to be overwritten would be copied 
there, and the status would be updated to "ok" only after the page is 
successfully written.  This has the negative side effect of decreasing 
write speed.  However, this may be ok - there are many applications that 
would happily trade an acceptable percentage of write speed in exchange for 
fault tolerance.  Read speed would be unaffected.  If a crash occurs, on 
startup it is possible to reconstruct the page as it was before the failed 
write attempt occurred.
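
To make the page-level idea concrete, here is a minimal Java sketch of it.  
Everything named in it is an assumption made for illustration: the class 
GuardedPageFile, the 4096-byte page size, the status values, and the header 
layout are hypothetical and are not taken from the Xindice sources.  It 
also assumes that FileDescriptor.sync() really pushes data to stable 
storage, which depends on the OS and the disk.

```java
import java.io.IOException;
import java.io.RandomAccessFile;

/** Sketch: one "before image" slot in an enlarged file header guards each page write. */
class GuardedPageFile {
    static final int PAGE_SIZE = 4096;              // assumed page size
    static final byte OK = 0, DIRTY = 1;            // hypothetical status values
    // Assumed header layout: [status:1][page offset:8][saved page:PAGE_SIZE]
    static final long HEADER_SIZE = 1 + 8 + PAGE_SIZE;

    private final RandomAccessFile file;

    GuardedPageFile(RandomAccessFile file) throws IOException {
        this.file = file;
        if (file.length() < HEADER_SIZE) {
            file.seek(0);
            file.write(new byte[(int) HEADER_SIZE]);  // fresh file: zeroed header, status OK
            file.getFD().sync();
        } else {
            recover();                                // existing file: undo any half-done write
        }
    }

    /** Writes one page, saving its previous contents in the header first. */
    void writePage(long pageNo, byte[] data) throws IOException {
        if (data.length != PAGE_SIZE) throw new IllegalArgumentException("bad page length");
        long offset = HEADER_SIZE + pageNo * PAGE_SIZE;
        byte[] old = new byte[PAGE_SIZE];             // stays zeroed if the page is new
        if (offset + PAGE_SIZE <= file.length()) {
            file.seek(offset);
            file.readFully(old);
        }
        file.seek(1);                                 // 1. save offset + old contents
        file.writeLong(offset);
        file.write(old);
        file.getFD().sync();
        setStatus(DIRTY);                             // 2. mark the write as in progress
        file.seek(offset);                            // 3. the actual page write
        file.write(data);
        file.getFD().sync();
        setStatus(OK);                                // 4. "ok" only after success
    }

    byte[] readPage(long pageNo) throws IOException {
        byte[] page = new byte[PAGE_SIZE];
        file.seek(HEADER_SIZE + pageNo * PAGE_SIZE);
        file.readFully(page);
        return page;
    }

    /** On startup: if the last write never reached "ok", put the old page back. */
    void recover() throws IOException {
        file.seek(0);
        if (file.readByte() != DIRTY) return;
        long offset = file.readLong();
        byte[] old = new byte[PAGE_SIZE];
        file.readFully(old);
        file.seek(offset);
        file.write(old);
        file.getFD().sync();
        setStatus(OK);
    }

    private void setStatus(byte status) throws IOException {
        file.seek(0);
        file.write(status);
        file.getFD().sync();
    }
}
```

Note that the sketch adds an explicit "in progress" status that is set only 
after the saved copy is safely on disk, so a crash while the copy itself is 
being written leaves the real page untouched and the status still "ok".  
The cost is several extra writes and syncs per page write.  A page that was 
being appended when the crash happened comes back as all zeros rather than 
being truncated away.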

The question is, what does this mean for tree nodes (and values in the case 
of BTreeFiler) that may be written across multiple pages?  The only 
solution I have thought of so far is to use some kind of checksumming 
method, in which a checksum precedes a value, and thus can detect problems 
upon reading.  Setting aside the obvious performance issues of this, I'm 
not even sure it would solve the problem for BTreeNodes that span multiple 
pages.
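
As a rough illustration of the checksum idea, the Java sketch below frames 
a value as a length, a CRC-32, and then the payload, and verifies the 
checksum when reading it back.  The class name ChecksummedValue and the 
record layout are assumptions made for this example only, and CRC-32 is 
just one possible choice of checksum.  The bytes of the record could be 
spread over several pages; the check is made on the reassembled value.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.zip.CRC32;

/** Sketch: a value stored as [length:4][crc32:8][payload]. */
final class ChecksummedValue {
    static final int HEADER_BYTES = 4 + 8;   // assumed layout, not Xindice's

    /** Prefixes the value with its length and CRC-32. */
    static byte[] encode(byte[] value) {
        CRC32 crc = new CRC32();
        crc.update(value, 0, value.length);
        ByteBuffer buf = ByteBuffer.allocate(HEADER_BYTES + value.length);
        buf.putInt(value.length).putLong(crc.getValue()).put(value);
        return buf.array();
    }

    /** Returns the payload, or throws if the record is truncated or corrupt. */
    static byte[] decode(byte[] record) throws IOException {
        if (record.length < HEADER_BYTES) {
            throw new IOException("record shorter than its header");
        }
        ByteBuffer buf = ByteBuffer.wrap(record);
        int length = buf.getInt();
        long expected = buf.getLong();
        if (length < 0 || length > buf.remaining()) {
            throw new IOException("record truncated or length field corrupt");
        }
        byte[] value = new byte[length];
        buf.get(value);
        CRC32 crc = new CRC32();
        crc.update(value, 0, length);
        if (crc.getValue() != expected) {
            throw new IOException("checksum mismatch");
        }
        return value;
    }
}
```

This only detects a damaged value at read time; it does not repair it, and 
by itself it says nothing about whether the tree structure around the value 
is still consistent.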

So my question is: is it even *possible* to retain tree integrity by 
ensuring page-level integrity?

Any answers/opinions/other ideas are GREATLY encouraged!!  I need the help 
of you experts!!

Also - are there any graphical docs illustrating the structure of the core 
files as they are maintained?  (that is, the BTree file and its components 
- file header/pages/nodes/etc).

Thanks,
Jim


--
jim@jbrix.org

Visit www.jbrix.org for:
   + SpeedJAVA jEdit Code Completion Plugin
   + Xybrix XML Application Framework
   + other great Open Source Software