You are viewing a plain text version of this content. The canonical link for it is here.
Posted to slide-dev@jakarta.apache.org by Michael Smith <ms...@xn.com.au> on 2001/06/13 03:55:31 UTC

Failed PUTs and repository inconsistencies

Hi all,

I'm using slide with the JDBCDescriptorsStore and FileContentStore.
Sometime yesterday, a user PUT a file through the webdav servlet to
slide. Network problems at this inconvenient time resulted in the socket
timing out. The result: the FileContentStore found that the length of
the file didn't match the amount of data it was actually able to read,
so it threw an IOException, later rethrown as a ServiceAccessException
and then finally as an WebdavException with status set to 500 (Internal
server error). 

All of that is reasonable, and pretty much as expected. 

However, the end result was an inconsistent repository that caused
ongoing problems - a bad thing. After writing the file failed, nothing
further is done - the exception is just propagated up the call stack
until it comes out at the webdav servlet. This means that the on-disk
data (in this case around 200 kB) and the getcontentlength property
(which said it was something like 350 kB, I think) were out of sync.
Slide predictably doesn't like this, and starts throwing exceptions as
soon as it has to read the file (which isn't unreasonable). 

Now - the question is, how do we fix this? A comment in
FileContentStore.java says "FIXME : Delete the file". That in itself is
pretty easy to do (merely need a single line addition of file.delete()
just before we throw the IOException, or perhaps where we catch the
IOException). However, that isn't sufficient - we still have all the
metadata lying around, which we have to do something with. 

There are two or three things we can do here: one is significantly
easier, but it's not clear to me which is correct.

1) Set the getcontentlength property to what we actually wrote to the
file. Simple, effective. Solves all the problems. We can still give an
internal server error, but the file will exist in a consistent state in
the repository. Easy, but not neccesarily correct.

2) Delete the revision, along with any other sub-parts. This means that
Content.create() will become significantly more complex, but won't
really affect anything else. However, this means we'll be left with a
revisionless object (which I think is treated as an empty collection,
then). More complex, but probably a worse solution. 

3) Go the whole way and revert everything to how it was before we
started the PUT. This is very difficult, because we have multiple stages
in the PUT (i.e. modify structure, set properties, create revision(s),
write content, etc), which are completely seperate. The webdav servlet
would have to catch errors at each stage and be prepared to 'undo' all
the modifications it has made so far. Does the transactions stuff help
here at all? (I haven't looked at that stuff in detail). This is far
more complex than either of the other solutions. It's the most
'complete' fix, but it isn't (to me) clearly superior to option 1.

I've attached a (fairly trivial) fix that implements option 1, since at
this point that's my preferred option - Remy, feel free to commit it (or
ask me to) if you think this is the right thing to do.


Michael


Index: FileContentStore.java
===================================================================
RCS file:
/home/cvs/jakarta-slide/src/stores/slidestore/reference/FileContentStore.java,v
retrieving revision 1.7
diff -u -r1.7 FileContentStore.java
--- FileContentStore.java	2001/03/19 16:49:37	1.7
+++ FileContentStore.java	2001/06/13 01:54:49
@@ -314,6 +314,8 @@
                 
                 if (contentLength != -1) {
                     if (position != contentLength) {
+			// set content length so that repository is consistent
+			revisionDescriptor.setContentLength(position);
                         if (position < contentLength) {
                             // Not enough bytes read !!!
                             throw new IOException("Not enough bytes
read");

Re: Failed PUTs and repository inconsistencies

Posted by Remy Maucherat <re...@apache.org>.
> > If you're using the JDBC store, the metadata added should be removed
when
> > the transaction itself is reverted. If they aren't, then there is a
problem.
> >
> > I think you should add some traces in start, rollback and commit in the
JDBC
> > and file stores to see if everything is done the way it should be. It
could
> > be a bug in the transaction manager.
>
> Ah. I see. As I said, I'm not aware of how the transactions stuff works
> in any useful detail. Anyway, see below (short version: it's the DB
> we're using).
>
>
> > Well, if you're using JDBC stores for everything, 3) should happen
without
> > the need for you to do anything.
> > The WebDAV servlet should do :
> > - start a transaction
> > - do all the operations needed by the put
> > - commit the transaction
> > If something fails in between, the entire transaction should be reverted
and
> > the namespace should end up in the same state it was before the PUT.
> >
> > Obviously, I like 3), since in theory it's not that hard to achieve with
the
> > appropriate stores :-)
>
> 'In theory' :-)
> If we're going to rely on the stores having transaction capabilities,
> then we really have to enforce this. But we can't do that realistically,
> because many things simply don't support transactions (a filesystem is a
> good example. It would be difficult at best to make a FileContentStore
> that acted correctly with respect to transactions). A hard problem.

Yes indeed. Without transactions, the store should use a best effort policy,
definitely.

> So, the answer is: slide's doing the right thing,

Well, I'll have to check that too, just to be sure :-)

> but the
> filecontentstore is broken w.r.t to transactions (not really fixable,
> but the important cases can be worked around), and the
> JDBCDescriptorsStore doesn't actually ensure that the database
> implements transactional capabilities (not really a problem. If it
> doesn't implement something, then it doesn't implement it - and there's
> nothing we can do about that).

Yes.

> > OTOH, it's totally overkill to try to do it ourselves, without special
> > capabilities at the store level.
>
> Agreed. It's not impossible, but it'd probably complicate much of the
> core slide code by an order of magnitude.

Yes.

> > Here, I think it's a problem with the TM, or maybe your database. Which
DB
> > are you using ? Is it supposed to handle transactions well ?
>
> We're using Mysql. It doesn't support transactions. At all.
> Well, supposedly the most recent version does, but only with special
> table-types, and that's just a really evil ugly hack as far as I can
> tell. Basically, it doesn't support transactions (it definately doesn't
> without extra stuff which slide doesn't do). Yes, it's quite silly to
> have a RDBMS without transaction support. Tell that to the mysql people
> :-)

Mmmm, doesn't look too good ...
I want a XADataSource :-)

> > > I've attached a (fairly trivial) fix that implements option 1, since
at
> > > this point that's my preferred option - Remy, feel free to commit it
(or
> > > ask me to) if you think this is the right thing to do.
> >
> > Go ahead. Anything which adds some robustness is good :)
> > BTW, you don't need to ask for the permission. If I don't like a patch
(or
> > if you don't like a patch), we can talk about it, and revert it if
> > necessary.
>
> I'll commit it then. I guess it's better than nothing.

A lot.

> I know I don't _need_ to ask for permission, but I don't just don't like
> committing things that I'm not completely sure about in most cases. This
> was probably something I should have committed anyway, though.

Whatever you like better :)

Remy


Re: Failed PUTs and repository inconsistencies

Posted by Michael Smith <ms...@xn.com.au>.
> If you're using the JDBC store, the metadata added should be removed when
> the transaction itself is reverted. If they aren't, then there is a problem.
> 
> I think you should add some traces in start, rollback and commit in the JDBC
> and file stores to see if everything is done the way it should be. It could
> be a bug in the transaction manager.

Ah. I see. As I said, I'm not aware of how the transactions stuff works
in any useful detail. Anyway, see below (short version: it's the DB
we're using).


> Well, if you're using JDBC stores for everything, 3) should happen without
> the need for you to do anything.
> The WebDAV servlet should do :
> - start a transaction
> - do all the operations needed by the put
> - commit the transaction
> If something fails in between, the entire transaction should be reverted and
> the namespace should end up in the same state it was before the PUT.
> 
> Obviously, I like 3), since in theory it's not that hard to achieve with the
> appropriate stores :-)

'In theory' :-)
If we're going to rely on the stores having transaction capabilities,
then we really have to enforce this. But we can't do that realistically,
because many things simply don't support transactions (a filesystem is a
good example. It would be difficult at best to make a FileContentStore
that acted correctly with respect to transactions). A hard problem. 

So, the answer is: slide's doing the right thing, but the
filecontentstore is broken w.r.t to transactions (not really fixable,
but the important cases can be worked around), and the
JDBCDescriptorsStore doesn't actually ensure that the database
implements transactional capabilities (not really a problem. If it
doesn't implement something, then it doesn't implement it - and there's
nothing we can do about that).

> 
> OTOH, it's totally overkill to try to do it ourselves, without special
> capabilities at the store level.

Agreed. It's not impossible, but it'd probably complicate much of the
core slide code by an order of magnitude.

> 
> Here, I think it's a problem with the TM, or maybe your database. Which DB
> are you using ? Is it supposed to handle transactions well ?

We're using Mysql. It doesn't support transactions. At all.
Well, supposedly the most recent version does, but only with special
table-types, and that's just a really evil ugly hack as far as I can
tell. Basically, it doesn't support transactions (it definately doesn't
without extra stuff which slide doesn't do). Yes, it's quite silly to
have a RDBMS without transaction support. Tell that to the mysql people
:-)

> 
> > I've attached a (fairly trivial) fix that implements option 1, since at
> > this point that's my preferred option - Remy, feel free to commit it (or
> > ask me to) if you think this is the right thing to do.
> 
> Go ahead. Anything which adds some robustness is good :)
> BTW, you don't need to ask for the permission. If I don't like a patch (or
> if you don't like a patch), we can talk about it, and revert it if
> necessary.

I'll commit it then. I guess it's better than nothing.
I know I don't _need_ to ask for permission, but I don't just don't like
committing things that I'm not completely sure about in most cases. This
was probably something I should have committed anyway, though. 

Michael

Re: Failed PUTs and repository inconsistencies

Posted by Remy Maucherat <re...@apache.org>.
> Hi all,
>
> I'm using slide with the JDBCDescriptorsStore and FileContentStore.
> Sometime yesterday, a user PUT a file through the webdav servlet to
> slide. Network problems at this inconvenient time resulted in the socket
> timing out. The result: the FileContentStore found that the length of
> the file didn't match the amount of data it was actually able to read,
> so it threw an IOException, later rethrown as a ServiceAccessException
> and then finally as an WebdavException with status set to 500 (Internal
> server error).
>
> All of that is reasonable, and pretty much as expected.
>
> However, the end result was an inconsistent repository that caused
> ongoing problems - a bad thing. After writing the file failed, nothing
> further is done - the exception is just propagated up the call stack
> until it comes out at the webdav servlet. This means that the on-disk
> data (in this case around 200 kB) and the getcontentlength property
> (which said it was something like 350 kB, I think) were out of sync.
> Slide predictably doesn't like this, and starts throwing exceptions as
> soon as it has to read the file (which isn't unreasonable).
>
> Now - the question is, how do we fix this? A comment in
> FileContentStore.java says "FIXME : Delete the file". That in itself is
> pretty easy to do (merely need a single line addition of file.delete()
> just before we throw the IOException, or perhaps where we catch the
> IOException). However, that isn't sufficient - we still have all the
> metadata lying around, which we have to do something with.

If you're using the JDBC store, the metadata added should be removed when
the transaction itself is reverted. If they aren't, then there is a problem.

I think you should add some traces in start, rollback and commit in the JDBC
and file stores to see if everything is done the way it should be. It could
be a bug in the transaction manager.

> There are two or three things we can do here: one is significantly
> easier, but it's not clear to me which is correct.
>
> 1) Set the getcontentlength property to what we actually wrote to the
> file. Simple, effective. Solves all the problems. We can still give an
> internal server error, but the file will exist in a consistent state in
> the repository. Easy, but not neccesarily correct.
>
> 2) Delete the revision, along with any other sub-parts. This means that
> Content.create() will become significantly more complex, but won't
> really affect anything else. However, this means we'll be left with a
> revisionless object (which I think is treated as an empty collection,
> then). More complex, but probably a worse solution.
>
> 3) Go the whole way and revert everything to how it was before we
> started the PUT. This is very difficult, because we have multiple stages
> in the PUT (i.e. modify structure, set properties, create revision(s),
> write content, etc), which are completely seperate. The webdav servlet
> would have to catch errors at each stage and be prepared to 'undo' all
> the modifications it has made so far. Does the transactions stuff help
> here at all? (I haven't looked at that stuff in detail). This is far
> more complex than either of the other solutions. It's the most
> 'complete' fix, but it isn't (to me) clearly superior to option 1.

Well, if you're using JDBC stores for everything, 3) should happen without
the need for you to do anything.
The WebDAV servlet should do :
- start a transaction
- do all the operations needed by the put
- commit the transaction
If something fails in between, the entire transaction should be reverted and
the namespace should end up in the same state it was before the PUT.

Obviously, I like 3), since in theory it's not that hard to achieve with the
appropriate stores :-)

OTOH, it's totally overkill to try to do it ourselves, without special
capabilities at the store level.

Here, I think it's a problem with the TM, or maybe your database. Which DB
are you using ? Is it supposed to handle transactions well ?

> I've attached a (fairly trivial) fix that implements option 1, since at
> this point that's my preferred option - Remy, feel free to commit it (or
> ask me to) if you think this is the right thing to do.

Go ahead. Anything which adds some robustness is good :)
BTW, you don't need to ask for the permission. If I don't like a patch (or
if you don't like a patch), we can talk about it, and revert it if
necessary.

Remy