Posted to users@subversion.apache.org by Michael Ditum <mi...@mikeditum.co.uk> on 2019/08/29 18:49:34 UTC

Decompression of svndiff data failed

Hi,

We've been running a Subversion server since 2003. Over the years we've
come across the svndiff decompression error numerous times, almost always
when committing binary files to the repository. Historically we've noticed
it quite quickly and fixed the issue by deleting the offending file and
re-committing it. This solved the immediate problem, where other people
could not check out or update the repo, but it has left the offending
revisions in our repo.

As we're running a ridiculously old subversion server (1.4.4) on a
ridiculously old operating system (Fedora 7) I've decided it's time to
migrate to a new server. I'm also looking at making read only mirrors
available in our other offices to save time checking out applications over
the WAN and to provide geographically separate backups. In order to do this
I've been using svnsync to replicate a copy of the repository to a
different server.

As I've come across the svndiff errors I've been using fsfsverify.py to
fix them. So far it's worked, and I've got to revision 24953 of 83080.
Unfortunately I have now reached a revision that fsfsverify.py has been
unable to fix. It keeps detecting problems and fixing them, but with
ever-changing offsets and amounts, until it gets to the point where it
just cycles between 2 different sets of offsets and amounts.

The error fsfsverify originally comes out with is...

Error InvalidCompressedStream: Invalid compressed instr stream at offset
154848839 (Error -3 while decompressing: incorrect header check)
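
For readers unfamiliar with this message: the "Error -3 ... incorrect header check" text is what zlib reports when the first two bytes of a compressed stream are not a valid zlib header. A minimal sketch that reproduces the same class of error with Python's standard zlib module follows. The payload and the corrupted header bytes are made up for illustration and have nothing to do with the real revision file.

```python
import zlib

# Made-up payload standing in for svndiff instruction data.
payload = b"pretend this is svndiff instruction data" * 100
good = zlib.compress(payload)

# Simulate corruption by overwriting the 2-byte zlib header.
# 0x1234 is not a multiple of 31, so zlib's header check must fail.
bad = b"\x12\x34" + good[2:]


def try_decompress(blob):
    """Return (data, None) on success or (None, message) on a zlib failure."""
    try:
        return zlib.decompress(blob), None
    except zlib.error as exc:
        return None, str(exc)


ok_data, ok_err = try_decompress(good)
bad_data, bad_err = try_decompress(bad)
print(bad_err)
```

The intact stream round-trips, while the damaged one fails before any data is produced, which is consistent with the error being raised at a specific offset in the revision file.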

Unfortunately the binary file that got corrupted was a copy of an open
office install package which makes the revision 150MB and thus is rather
tricky to fix by hand.

Apart from using fsfsverify I also tried recreating the diff by creating a
Fedora 7 VM, running svnsync on it to copy the repo up to that point, and
then manually committing the file and copying the revision over to the copy
of my original repo. Whilst this allowed svnsync to get past that revision,
I then started having lots of problems with incorrect byte offsets in later
revisions, and once I had (I think correctly) fixed those I started getting
checksum errors. Once I had (again, I think correctly) fixed those I got
another error, at which point I decided I was disappearing down a rabbit
hole and gave up.

Does anyone have any ideas on how I can fix this revision? As I mentioned
before, the file gets deleted a couple of revisions later so I don't really
care about the contents of the revision but I'm currently stuck and can't
get any further in my svnsync.

Thanks in advance for your help!

Mike

Re: Decompression of svndiff data failed

Posted by Nathan Hartman <ha...@gmail.com>.
On Thu, Aug 29, 2019 at 4:21 PM Nathan Hartman <ha...@gmail.com>
wrote:

> On Thu, Aug 29, 2019 at 3:15 PM Michael Ditum <mi...@mikeditum.co.uk>
> wrote:
>
>> Thanks for the response, I hadn't tried it! I've just given that a go and
>> unfortunately the dump command failed with...
>>
>> [mike@tigger svn]$ svnadmin dump svnroot > svnroot.dump
>> * Dumped revision 1.
>> ...snip...
>> * Dumped revision 24950.
>> * Dumped revision 24951.
>> * Dumped revision 24952.
>> svnadmin: E140001: zlib (uncompress): corrupt data: Decompression of
>> svndiff data failed
>>
>
> So it fails on the same revision.
>
> I need to think about this some more.
>

Here's an idea, but with the caveat that I never tried this myself, so I
don't know whether it works or how well. A google search turns up a script
called svn2svn.

It looks like this might be the original author, last updated 7 years ago:
https://github.com/dblock/svn2svn

And it looks like this might be a newer version, forked by a different
author and with the last updates dated 2016:
https://github.com/tonyduckles/svn2svn

The idea is to automatically check out each revision from the old
repository in sequence and commit it to a new repository, with the added
twist of skipping the revisions that fail to check out properly because of
the decompression error. Perhaps there's a way to commit a "placeholder" to
the new repository in those cases, so that all of your revision numbers
will remain identical after migration to the new server.

Assuming this works -- again, I've never done this! -- I like this idea
because it avoids doing delicate surgery on dumpfiles and things like that,
and because you would retain history and not lose information, even if you
might lose the revisions that you've never been able to checkout anyway.

One issue I see is that the newer "tonyduckles" version says it requires
minimum Subversion 1.6.

Maybe try to contact the author(s) of svn2svn and ask some questions?

Re: Decompression of svndiff data failed

Posted by Nathan Hartman <ha...@gmail.com>.
On Thu, Aug 29, 2019 at 3:15 PM Michael Ditum <mi...@mikeditum.co.uk> wrote:

> Thanks for the response, I hadn't tried it! I've just given that a go and
> unfortunately the dump command failed with...
>
> [mike@tigger svn]$ svnadmin dump svnroot > svnroot.dump
> * Dumped revision 1.
> ...snip...
> * Dumped revision 24950.
> * Dumped revision 24951.
> * Dumped revision 24952.
> svnadmin: E140001: zlib (uncompress): corrupt data: Decompression of
> svndiff data failed
>

So it fails on the same revision.

I need to think about this some more.

Re: Decompression of svndiff data failed

Posted by Michael Ditum <mi...@mikeditum.co.uk>.
Thanks for the response, I hadn't tried it! I've just given that a go and
unfortunately the dump command failed with...

[mike@tigger svn]$ svnadmin dump svnroot > svnroot.dump
* Dumped revision 1.
...snip...
* Dumped revision 24950.
* Dumped revision 24951.
* Dumped revision 24952.
svnadmin: E140001: zlib (uncompress): corrupt data: Decompression of
svndiff data failed

Mike

On Thu, 29 Aug 2019 at 20:04, Nathan Hartman <ha...@gmail.com>
wrote:

> On Thu, Aug 29, 2019 at 2:49 PM Michael Ditum <mi...@mikeditum.co.uk>
> wrote:
>
>> As we're running a ridiculously old subversion server (1.4.4) on a
>> ridiculously old operating system (Fedora 7) I've decided it's time to
>> migrate to a new server.
>>
>
> Have you tried to do a dump from 1.4.4 and load on a newer version of
> Subversion? If so, did this process succeed?
>
>

Re: Decompression of svndiff data failed

Posted by Nathan Hartman <ha...@gmail.com>.
On Thu, Aug 29, 2019 at 2:49 PM Michael Ditum <mi...@mikeditum.co.uk> wrote:

> As we're running a ridiculously old subversion server (1.4.4) on a
> ridiculously old operating system (Fedora 7) I've decided it's time to
> migrate to a new server.
>

Have you tried to do a dump from 1.4.4 and load on a newer version of
Subversion? If so, did this process succeed?

Re: Decompression of svndiff data failed

Posted by Michael Ditum <mi...@mikeditum.co.uk>.
The decompression errors only seem to happen when we're sending binary
data. For a couple of years our marketing team were storing all of their
files in subversion and this seems to be the vast majority of the revisions
I'm having to fix with fsfsverify.py. So it could possibly be that they
were using a TortoiseSVN version that was built with a buggy library.

That said, the problematic revision that this thread is about was
created by our engineering team, and all of our work is done from Linux
machines, so it would have come from a different client (the command-line
Subversion client). So it sounds more likely that it's something server
side. We do still get the
issue periodically, so there's a chance it could be related to the NAS that
the repo is stored on. We did recently move to a new NAS as the old one was
getting a bit slow but we hadn't seen any corruption issues with the old
NAS, just slowness. I'm not sure if we've had any svndiff issues since
we've moved to the new NAS, I'll find out when I've finally got to the
point where it's syncing commits from the last month.

Also the issue would happen straight away. i.e. if someone tried to do an
svn up immediately after a commit had been made they would get the svndiff
error. So it seems like it was an error that occurred when the svnserve
process received it or wrote it to the file system. So it's not hard disk
corruption after the fact (but again doesn't rule out a NAS issue).

Sorry for not being able to provide more specific details. If I see it
happening again once we've moved over to a modern OS and Subversion server
I'll email back again.

Mike

On Fri, 30 Aug 2019 at 14:22, Branko Čibej <br...@apache.org> wrote:

> On 30.08.2019 15:14, Michael Ditum wrote:
> > Hi Brane,
> >
> > Thanks for the reply. Interestingly Daniel's reply had given me the
> > idea to try pretty much what you suggested and I gave it a go this
> > morning and it seems to be working.
> >
> > Stopping svnsync in the right place wasn't hard, as it dies as soon as
> > it tries to get the binary diff but before it has made any changes.
> >
> > The one bit I didn't do was update the svnsync metadata. When I
> > resumed the sync it just automatically carried on with the revision
> > after the one I had just committed. Hopefully that won't cause any
> > problems, it seems to be working ok as I'm a lot further syncing than
> > I've ever managed before (crazy how many times we've had svndiff
> > errors, luckily fsfsverify has fixed all of the others so far).
> >
> > Thanks for everyone's help!
>
> Great that it works. :)
>
> I'm curious though ... have you any idea what caused the decompression
> errors? The message you posted came from zlib -- not Subversion's code
> -- and that has been very, very stable literally for decades.
>
> Is it possible that you just had the bad luck to have a broken version
> of zlib, way back in the dawn of time? If it had been a problem with the
> storage, I'm pretty sure you'd have noticed other issues, too.
>
> -- Brane
>
>

Re: Decompression of svndiff data failed

Posted by Branko Čibej <br...@apache.org>.
On 30.08.2019 15:14, Michael Ditum wrote:
> Hi Brane,
>
> Thanks for the reply. Interestingly Daniel's reply had given me the
> idea to try pretty much what you suggested and I gave it a go this
> morning and it seems to be working.
>
> Stopping svnsync in the right place wasn't hard, as it dies as soon as
> it tries to get the binary diff but before it has made any changes.
>
> The one bit I didn't do was update the svnsync metadata. When I
> resumed the sync it just automatically carried on with the revision
> after the one I had just committed. Hopefully that won't cause any
> problems, it seems to be working ok as I'm a lot further syncing than
> I've ever managed before (crazy how many times we've had svndiff
> errors, luckily fsfsverify has fixed all of the others so far).
>
> Thanks for everyone's help!

Great that it works. :)

I'm curious though ... have you any idea what caused the decompression
errors? The message you posted came from zlib -- not Subversion's code
-- and that has been very, very stable literally for decades.

Is it possible that you just had the bad luck to have a broken version
of zlib, way back in the dawn of time? If it had been a problem with the
storage, I'm pretty sure you'd have noticed other issues, too.

-- Brane


Re: Decompression of svndiff data failed

Posted by Michael Ditum <mi...@mikeditum.co.uk>.
Hi Brane,

Thanks for the reply. Interestingly Daniel's reply had given me the idea to
try pretty much what you suggested and I gave it a go this morning and it
seems to be working.

Stopping svnsync in the right place wasn't hard, as it dies as soon as it
tries to get the binary diff but before it has made any changes.

The one bit I didn't do was update the svnsync metadata. When I resumed the
sync it just automatically carried on with the revision after the one I had
just committed. Hopefully that won't cause any problems, it seems to be
working ok as I'm a lot further syncing than I've ever managed before
(crazy how many times we've had svndiff errors, luckily fsfsverify has
fixed all of the others so far).

Thanks for everyone's help!

Mike

On Fri, 30 Aug 2019 at 12:12, Branko Čibej <br...@apache.org> wrote:

> On 29.08.2019 20:49, Michael Ditum wrote:
> > Apart from using fsfsverify I also tried recreating the diff by
> > creating a Fedora 7 VM, running svnsync on it to copy the repo up to
> > that point and then manually committing the file and copying the
> > revision over to the copy of my original repo.
>
> Yikes. No, that definitely won't work.
>
> > Whilst this allows svnsync to get past that revision I then started
> > having lots of problems with incorrect byte offsets in later
> > revisions, and once I had (I think correctly) fixed those I started
> > getting checksum errors.
>
> And that's why ... binary deltas rely on previously stored data, but
> unlike a text diff they have no context. You changed the source of the
> delta and that corrupted everything that depends on it in later revisions.
>
>
> > Does anyone have any ideas on how I can fix this revision? As I
> > mentioned before, the file gets deleted a couple of revisions later so
> > I don't really care about the contents of the revision but I'm
> > currently stuck and can't get any further in my svnsync.
>
> Daniel made the best suggestion, it would work like this:
>
>   * create a new repository
>   * svnsync up to the revision just before the broken one (stopping
>     svnsync is the tricky part here)
>   * commit that one file to the _synced_ repository, and update
>     svnsync's metadata (in revision properties on r0) to skip the
>     offending revision on the next run
>   * svnsync to the end.
>
> You can do a similar trick with svnadmin dump and (incremental) load;
> the benefit of this is that there is no "stopping problem," but it will
> be much, much slower than svnsync. You _could_ combine both methods,
> i.e., initialize your target repo with a partial dump of the source up
> to your offending rX, then commit rX to the target repo, then svnsync
> from there.
>
> -- Brane
>
>

Re: Decompression of svndiff data failed

Posted by Branko Čibej <br...@apache.org>.
On 29.08.2019 20:49, Michael Ditum wrote:
> Apart from using fsfsverify I also tried recreating the diff by
> creating a Fedora 7 VM, running svnsync on it to copy the repo up to
> that point and then manually committing the file and copying the
> revision over to the copy of my original repo.

Yikes. No, that definitely won't work.

> Whilst this allows svnsync to get past that revision I then started
> having lots of problems with incorrect byte offsets in later
> revisions, and once I had (I think correctly) fixed those I started
> getting checksum errors.

And that's why ... binary deltas rely on previously stored data, but
unlike a text diff they have no context. You changed the source of the
delta and that corrupted everything that depends on it in later revisions.
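
To illustrate the point about deltas having no context, here is a toy sketch. It is deliberately NOT the svndiff format: the ("copy", offset, length) and ("new", data) instructions and the sample byte strings are invented for the example. It only shows why replaying an unchanged delta against a re-created source produces different output, which then fails a stored checksum.

```python
import hashlib


def apply_delta(source, ops):
    """Apply a toy delta: ("copy", offset, length) reads bytes from the
    source; ("new", data) inserts literal bytes."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            _, offset, length = op
            out += source[offset:offset + length]
        else:
            out += op[1]
    return bytes(out)


# The delta was computed against this exact source.
base = b"AAAA-header-BBBB-body-CCCC"
ops = [("copy", 0, 12), ("new", b"XXXX"), ("copy", 16, 10)]
expected = apply_delta(base, ops)
expected_md5 = hashlib.md5(expected).hexdigest()

# A re-created source has the same length here but different bytes.
# The delta still applies without complaint, because it only stores
# offsets and lengths, never the content it expects to find there.
recreated_base = b"AAAA-hdr-BBBB-body-CCCC!!!"
rebuilt = apply_delta(recreated_base, ops)
checksum_ok = hashlib.md5(rebuilt).hexdigest() == expected_md5
print(expected, rebuilt, checksum_ok)
```

The delta itself is untouched in both runs; only the data it points into changed, which matches the sequence of offset errors followed by checksum errors described above.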


> Does anyone have any ideas on how I can fix this revision? As I
> mentioned before, the file gets deleted a couple of revisions later so
> I don't really care about the contents of the revision but I'm
> currently stuck and can't get any further in my svnsync.

Daniel made the best suggestion, it would work like this:

  * create a new repository
  * svnsync up to the revision just before the broken one (stopping
    svnsync is the tricky part here)
  * commit that one file to the _synced_ repository, and update
    svnsync's metadata (in revision properties on r0) to skip the
    offending revision on the next run
  * svnsync to the end.
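
The steps above could be sketched as the following command plan. This is an outline under stated assumptions, not a tested recipe: the repository path, source URL, file names and commit message are hypothetical; it assumes the broken revision only added one new file (so `svn import` can create it); and it assumes the mirror already has a pre-revprop-change hook that permits revision property changes, which svnsync needs. The function only builds the command lines and does not run anything.

```python
import shlex


def svnsync_skip_plan(source_url, mirror_path, broken_rev,
                      local_file, repo_file_path):
    """Return the commands for the steps above, in order, as argv lists."""
    mirror_url = "file://" + mirror_path
    last_merged = "svn:sync-last-merged-rev"  # revprop svnsync keeps on r0
    return [
        ["svnadmin", "create", mirror_path],
        ["svnsync", "init", mirror_url, source_url],
        # Expected to stop by itself with the decompression error.
        ["svnsync", "sync", mirror_url],
        # Stand-in commit so the mirror gains a revision numbered broken_rev.
        ["svn", "import", local_file, mirror_url + repo_file_path,
         "-m", "Placeholder for unreadable r%d" % broken_rev],
        # Tell svnsync that broken_rev is already done.
        ["svn", "propset", "--revprop", "-r", "0",
         last_merged, str(broken_rev), mirror_url],
        ["svnsync", "sync", mirror_url],
    ]


# Hypothetical values; only the revision number comes from the thread.
plan = svnsync_skip_plan("svn://old-server/svnroot", "/srv/svn/mirror",
                         24953, "placeholder.bin", "/trunk/installer.bin")
for argv in plan:
    print(shlex.join(argv))
```

Mike reports elsewhere in this thread that his svnsync resumed correctly without the property update, so that step may turn out to be unnecessary; checking the current value with `svn propget --revprop -r 0 svn:sync-last-merged-rev` before changing it seems prudent.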

You can do a similar trick with svnadmin dump and (incremental) load;
the benefit of this is that there is no "stopping problem," but it will
be much, much slower than svnsync. You _could_ combine both methods,
i.e., initialize your target repo with a partial dump of the source up
to your offending rX, then commit rX to the target repo, then svnsync
from there.
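
A rough sketch of the dump/load variant, under the same caveats: the repository paths are hypothetical, the revision numbers are the ones mentioned in this thread, and the function only builds command lines. Each dump is meant to be piped into the load command, with a placeholder commit for the broken revision made on the target between the two phases so that revision numbers stay aligned.

```python
def dump_load_plan(source_repo, target_repo, broken_rev, head_rev):
    """Return (dump_argv, load_argv) pairs; dump stdout feeds load stdin."""
    load = ["svnadmin", "load", target_repo]
    before = ["svnadmin", "dump", source_repo,
              "-r", "0:%d" % (broken_rev - 1)]
    # --incremental makes the first dumped revision a diff against the
    # previous one rather than a full copy of the tree.
    after = ["svnadmin", "dump", source_repo, "--incremental",
             "-r", "%d:%d" % (broken_rev + 1, head_rev)]
    return [(before, load), (after, load)]


# Hypothetical paths; 24953 and 83080 are the numbers quoted in the thread.
phases = dump_load_plan("/srv/svn/svnroot", "/srv/svn/new", 24953, 83080)
for dump_argv, load_argv in phases:
    print(" ".join(dump_argv), "|", " ".join(load_argv))
```

The second phase can only succeed if none of the later revisions needs the unreadable data, which appears to be the case here since the file was deleted a couple of revisions later.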

-- Brane


Re: Decompression of svndiff data failed

Posted by Nico Kadel-Garcia <nk...@gmail.com>.
Your history is toast. Stop burning cycles. Lock your working server, take clean working copies of the important branches if you can get them, import them to a new server, and start with a much lighter working repository.

And *stop* putting binaries in the same repository as source code. History is usually much more important for source, binaries harshly burden all source control systems, and even the smallest errors corrupt them irretrievably. If you *must* store binaries, be sure to commit them in entirely distinct commits from the source code, to avoid precisely this problem.



Re: Decompression of svndiff data failed

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
Michael Ditum wrote on Thu, 29 Aug 2019 18:49 +00:00:
> Does anyone have any ideas on how I can fix this revision? As I 
> mentioned before, the file gets deleted a couple of revisions later so 
> I don't really care about the contents of the revision but I'm 
> currently stuck and can't get any further in my svnsync.

Use authz on the source repository to hide that file from the user
svnsync authenticates as.
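
As a rough illustration of what such a rule might look like: in Subversion's path-based authorization file, a section names a path and an empty value after a user name denies that user all access to it. The small helper below just renders that stanza. The user name and file path are hypothetical, and the rest of the authz file (including the rule that grants the sync user read access everywhere else) is assumed to exist already.

```python
def authz_hide_stanza(repo_path, sync_user, repo_name=None):
    """Render an authz section that denies sync_user access to repo_path.

    An empty right-hand side means "no access". repo_name is optional and
    only needed when one authz file serves several repositories.
    """
    section = "%s:%s" % (repo_name, repo_path) if repo_name else repo_path
    return "[%s]\n%s =\n" % (section, sync_user)


# Hypothetical path and user name.
stanza = authz_hide_stanza("/trunk/installers/openoffice.bin", "svnsync")
print(stanza)
```

With svnserve, the authz file only takes effect if the repository's svnserve.conf points at it through the `authz-db` setting.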

If you re-committed that file a few revisions later and want to preserve
_that_ copy, then stop svnsync before the revision that re-creates the
file.  The easiest way to do this is to ^C svnsync, but you can also
install a start-commit hook on the destination repository that aborts if
HEAD is r1233, where the file was re-committed in r1234.
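
A minimal sketch of such a start-commit hook, written in Python. The revision number 1233 is the example value from the paragraph above; the function names are illustrative. It relies on `svnlook youngest`, which prints the repository's HEAD revision, and on the rule that a start-commit hook exiting non-zero stops the commit before a transaction is created.

```python
import subprocess
import sys

STOP_AT_HEAD = 1233  # example value: refuse to create r1234


def youngest_revision(repos_path):
    """Ask svnlook for the repository's current HEAD revision."""
    result = subprocess.run(["svnlook", "youngest", repos_path],
                            check=True, capture_output=True, text=True)
    return int(result.stdout.strip())


def hook_exit_code(youngest, stop_at_head=STOP_AT_HEAD):
    """Return 1 (abort the commit) only when HEAD is exactly stop_at_head."""
    return 1 if youngest == stop_at_head else 0


def main(argv):
    # Subversion passes the repository path as the first hook argument.
    code = hook_exit_code(youngest_revision(argv[1]))
    if code:
        sys.stderr.write("Sync paused: HEAD is r%d\n" % STOP_AT_HEAD)
    return code

# Installed as hooks/start-commit, the script would end with:
#     sys.exit(main(sys.argv))
```

The hook would be removed again once the file has been handled, so that the sync can continue past that revision.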

However, there's a catch.  If you re-committed the file by doing 'svn
rm' and 'svn add' as two separate commits, or under different names,
this approach would work; but if you did the 'rm' and 'add' as a
_single_ commit, under the same name, authz alone won't help.  If
that's the case, let us know.

Cheers,

Daniel

P.S. A word of warning: the trick of using start-commit to prevent r1234
from being committed works only because of a non-obvious property of
svnsync mirrors: they never have two in-flight commit attempts
simultaneously.  The trick won't work on repositories whose use pattern
doesn't have this property.