Posted to users@subversion.apache.org by Ryan Hunt <rh...@hp.com> on 2003/11/19 23:17:39 UTC

Compressed Dump Files

I have found that by sending the dump files through gzip I can save 
about 50% of the space.

-rw-rw-r--    1 rhunt    users    710977745 Nov 19 13:51 svn-dumpfile_2
-rw-rw-r--    1 rhunt    users    314348415 Nov 19 13:49 svn-dumpfile_2.gz

I can then load it without any problems with:

zcat svn-dumpfile_2.gz | svnadmin load /path/to/repos

However, if I try to pipe directly to gzip I get a failure.

=> svnadmin dump /Users/rhunt/tmp/svn_test --revision 1 --incremental | 
gzip -9 - | svnadmin load /Users/rhunt/tmp/svn_test2
* Dumped revision 1.
svn: Incomplete data
svn: Premature end of content data in dumpstream.

When piped directly to gzip and then directed to a file there is always 
a 15 byte discrepancy between it and a file that was dumped directly 
and then sent through gzip.

-rw-r--r--   1 rhunt  staff        285 Nov 19 14:06 svn-dumpfile_1.gz
-rw-r--r--   1 rhunt  staff        270 Nov 19 15:41 svn_dump_1.gz
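
A likely explanation for the 15 bytes, offered as an assumption rather than something confirmed in this thread: when gzip compresses a named file it records the original file name in the gzip header as a NUL-terminated string, and it has no name to record when it reads from a pipe. "svn-dumpfile_1" is 14 characters plus the terminator, which gives 15. The sketch below compares the two cases using a throwaway file with stand-in content (not a real dump). It assumes a gzip that stores names by default, such as GNU gzip.

```shell
#!/bin/sh
# Work in a scratch directory; the file name matches the one in the thread
# (14 characters), but the content is only a stand-in, not a real dump.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1
printf 'SVN-fs-dump-format-version: 2\n' > svn-dumpfile_1

# Compressing a named file: gzip records the original file name in the header.
named=$(( $(gzip -9 -c svn-dumpfile_1 | wc -c) ))

# Compressing from a pipe: there is no file name to record.
piped=$(( $(cat svn-dumpfile_1 | gzip -9 | wc -c) ))

# With -n (--no-name) the named case should come out the same size as the piped one.
noname=$(( $(gzip -9 -n -c svn-dumpfile_1 | wc -c) ))

echo "named=$named piped=$piped noname=$noname diff=$((named - piped))"
```

If the two archives need to be the same size, running gzip with -n on the named file should remove the difference. The compressed data itself is not affected either way.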
Anyone have any ideas why a direct pipe doesn't work??

Any thoughts on incorporating gzip compression directly into the dump 
and load processes??

-Ryan


---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@subversion.tigris.org
For additional commands, e-mail: users-help@subversion.tigris.org

Re: Compressed Dump Files

Posted by Ryan Hunt <rh...@hp.com>.
I realize that SSH has its own compression built in, but I have never 
been able to get it better than gzip -9.  Additionally, if you are 
writing it to a file on the other end, the compression through SSH won't 
help, as the far end of the tunnel will automatically uncompress.

-Ryan

On Wednesday, November 19, 2003, at 06:25  PM, Daniel Berlin wrote:

>
>
> On Wed, 19 Nov 2003, Ryan Hunt wrote:
>
>> Right, I can see my mistake on the pipeline now, but I guess I didn't
>> state the original problem as well as I should have...
>>
>> The benefit of this would be in remote piping. Like the following:
>
> Except SSH can already do compression on its own (using the same
> algorithm gzip does, through zlib).
>
> And it's better at it than doing "gzip -9 -"
>


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: Compressed Dump Files

Posted by Daniel Berlin <db...@dberlin.org>.

On Wed, 19 Nov 2003, Ryan Hunt wrote:

> Right, I can see my mistake on the pipeline now, but I guess I didn't
> state the original problem as well as I should have...
>
> The benefit of this would be in remote piping. Like the following:

Except SSH can already do compression on its own (using the same
algorithm gzip does, through zlib).

And it's better at it than doing "gzip -9 -"

Re: Compressed Dump Files

Posted by Ryan Hunt <rh...@hp.com>.
Right, I can see my mistake on the pipeline now, but I guess I didn't 
state the original problem as well as I should have...

The benefit of this would be in remote piping. Like the following:

svnadmin dump /Users/rhunt/tmp/svn_test --revision 1 --incremental | 
gzip -9 - | ssh remotehost "cat > /path/to/backups/svn-dump_1.gz"
or
svnadmin dump /Users/rhunt/tmp/svn_test --revision 1 --incremental | 
gzip -9 - | ssh remotehost "zcat | svnadmin load /path/to/new/repos"

It looks as if all the errors were self-inflicted...

Thanks for the extra set of eyes...


-Ryan


On Wednesday, November 19, 2003, at 05:02  PM, Brian Mathis wrote:

> Ryan Hunt wrote:
> [...]
>> However, if I try to pipe directly to gzip I get a failure.
>> => svnadmin dump /Users/rhunt/tmp/svn_test --revision 1 --incremental 
>> | gzip -9 - | svnadmin load /Users/rhunt/tmp/svn_test2
>> * Dumped revision 1.
>> svn: Incomplete data
>> svn: Premature end of content data in dumpstream.
>
>
> Maybe I don't understand this line here, but it looks like you're 
> trying to pipe the compressed gzip data directly into 'svnadmin load'? 
>  That doesn't really make any sense.  I think you want:
>
> > svnadmin dump /Users/rhunt/tmp/svn_test --revision 1 --incremental | 
> gzip -9 | gzip -d | svnadmin load /Users/rhunt/tmp/svn_test2
>
> Though I'm not really sure of the value of doing this.
>
> [...]
>> -Ryan
>
> -- 
> Brian Mathis
> http://directedge.com/b/
>


Re: Compressed Dump Files

Posted by Brian Mathis <bm...@directedge.com>.
Ryan Hunt wrote:
[...]
> However, if I try to pipe directly to gzip I get a failure.
> 
> => svnadmin dump /Users/rhunt/tmp/svn_test --revision 1 --incremental | 
> gzip -9 - | svnadmin load /Users/rhunt/tmp/svn_test2
> * Dumped revision 1.
> svn: Incomplete data
> svn: Premature end of content data in dumpstream.


Maybe I don't understand this line here, but it looks like you're trying 
to pipe the compressed gzip data directly into 'svnadmin load'?  That 
doesn't really make any sense.  I think you want:

 > svnadmin dump /Users/rhunt/tmp/svn_test --revision 1 --incremental | 
gzip -9 | gzip -d | svnadmin load /Users/rhunt/tmp/svn_test2

Though I'm not really sure of the value of doing this.

[...]
> 
> -Ryan

-- 
Brian Mathis
http://directedge.com/b/

Re: Compressed Dump Files

Posted by Vincent Lefevre <vi...@vinc17.org>.
On 2003-11-19 18:02:41 -0600, C. Michael Pilato wrote:
> Ryan Hunt <rh...@hp.com> writes:
> > Any thoughts on incorporating gzip compression directly into the dump
> > and load processes??
> 
> No.  That's why your system has gzip on it.  And zip.  And bzip2.
> Flexibility, baby.

I agree. However, a dedicated compression (basically with contents
reordering) would be much better, as even bzip2 -9 compression isn't
very good (due to the limitation on the block size IMHO).

-- 
Vincent Lefèvre <vi...@vinc17.org> - Web: <http://www.vinc17.org/> - 100%
validated (X)HTML - Acorn Risc PC, Yellow Pig 17, Championnat International
des Jeux Mathématiques et Logiques, TETRHEX, etc.
Work: CR INRIA - computer arithmetic / SPACES project at LORIA

Re: Compressed Dump Files

Posted by "C. Michael Pilato" <cm...@collab.net>.
Ryan Hunt <rh...@hp.com> writes:

> Anyone have any ideas why a direct pipe doesn't work??

A direct pipe does work.  But not if you toss gzip compression into
the middle of the pipe.  Well, unless you do:

  svnadmin dump | gzip - | gzip -dc | svnadmin load

But then, you sure are wasting a lot of processing time. :-)

> Any thoughts on incorporating gzip compression directly into the dump
> and load processes??

No.  That's why your system has gzip on it.  And zip.  And bzip2.
Flexibility, baby.


Re: Compressed Dump Files

Posted by Robert Spier <rs...@pobox.com>.
> However, if I try to pipe directly to gzip I get a failure.
> 
> => svnadmin dump /Users/rhunt/tmp/svn_test --revision 1 --incremental | 
> gzip -9 - | svnadmin load /Users/rhunt/tmp/svn_test2
> * Dumped revision 1.
> svn: Incomplete data
> svn: Premature end of content data in dumpstream.

> Anyone have any ideas why a direct pipe doesn't work??

Because you're not decompressing the data.  You're feeding the
compressed stream directly to svnadmin.
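
To see this concretely, here is a small sketch that uses a stand-in string in place of a real dump stream (the sample header line is only illustrative, and no svnadmin call is involved). The gzip output begins with the gzip magic bytes rather than dump text, which is why a parser expecting a dump stream rejects it. Adding a gzip -dc stage restores the original text.

```shell
#!/bin/sh
# Stand-in for the first line of a dump stream (illustrative only).
sample='SVN-fs-dump-format-version: 2'

# First three bytes of the compressed stream, as hex with whitespace removed:
# the gzip magic number (1f 8b) followed by the deflate method byte (08).
magic=$(printf '%s\n' "$sample" | gzip -9 | od -An -tx1 -N3 | tr -d ' \n')
echo "compressed stream starts with: $magic"

# With a decompression stage in the pipe, the reader sees the original text again.
restored=$(printf '%s\n' "$sample" | gzip -9 | gzip -dc)
echo "restored: $restored"
```

The same reasoning applies to the remote case: whatever reads the stream on the far end needs a zcat or gzip -dc stage in front of it, unless the compressed bytes are simply being written to a file.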
