Posted to dev@subversion.apache.org by Matt imMute Sickler <im...@msk4.ath.cx> on 2008/03/10 15:25:41 UTC

dumpfile grammar

Does a formal grammar exist for the dumpfile format (svnadmin dump ...)?
I am writing some utility scripts to munge a few repositories, and 
munging the dumpfile streams directly would be easiest.
If the grammar is written in Parse::RecDescent bonus points will be 
awarded. :)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: dumpfile grammar

Posted by Karl Fogel <kf...@red-bean.com>.
Matt imMute Sickler <im...@msk4.ath.cx> writes:
> Does a formal grammar exist for the dumpfile format (svnadmin dump ...)?
> I am writing some utility scripts to munge a few repositories, and
> munging the dumpfile streams directly would be easiest.
> If the grammar is written in Parse::RecDescent bonus points will be
> awarded. :)

I think https://svn.collab.net/repos/svn/trunk/notes/dump-load-format.txt
is the closest we have.


Re: dumpfile grammar

Posted by "C. Michael Pilato" <cm...@collab.net>.
John Peacock wrote:
> Martin Furter wrote:
>> There are also a few other scripts on the net which parse and create 
>> subversion dumpfiles.
> 
> Yeah, like
> 
>     http://search.cpan.org/search?query=SVN::Dump
> 
> though you should check the RT queue and potentially apply one or more 
> of the patches there:
> 
>     http://rt.cpan.org/Public/Bug/Display.html?id=26386
>     http://rt.cpan.org/Public/Bug/Display.html?id=25467
>     
> I have a mostly rewritten version to deal with the fact that `svnadmin 
> dump` appends an apparently random number of blank lines between certain 
> elements (sometimes two and sometimes three).  This means that you can 
> miss out on a delete+add (i.e. rename) for certain files. :(

A decent dumpstream parser shouldn't care much about the number of 
blank lines, save for knowing that a single blank line terminates 
a header block.  The dumpstream is defined to have two record types 
(revisions and nodes), each of which is identifiable by a particular header 
in the corresponding header block ("Revision-number" or "Node-path").  And 
header blocks are required to contain Content-length headers which indicate 
to parsers how many bytes of data related to that record immediately follow 
that header-block-ending single blank line.
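
The description above can be sketched in Python.  This is a minimal
illustration, not code from Subversion or from any of the tools mentioned
in this thread.  The names read_records and record_kind are hypothetical,
and treating a missing Content-length header as zero bytes of content is
an assumption of the sketch.

```python
import io


def read_records(stream):
    """Yield (headers, body) pairs from a binary dumpstream."""
    while True:
        # Skip however many blank lines separate the records.
        line = stream.readline()
        while line == b"\n":
            line = stream.readline()
        if not line:
            return  # end of stream
        # Collect "Name: value" lines up to the single blank line
        # that terminates the header block.
        headers = {}
        while line not in (b"\n", b""):
            name, _, value = line.rstrip(b"\n").partition(b": ")
            headers[name.decode("utf-8")] = value.decode("utf-8")
            line = stream.readline()
        # Assumption: no Content-length header means no content follows.
        length = int(headers.get("Content-length", "0"))
        yield headers, stream.read(length)


def record_kind(headers):
    """Classify a record by its identifying header."""
    if "Revision-number" in headers:
        return "revision"
    if "Node-path" in headers:
        return "node"
    return "other"  # e.g. the format-version or UUID header


# Hand-written sample with an uneven number of blank lines between records.
sample = (
    b"SVN-fs-dump-format-version: 2\n"
    b"\n"
    b"Revision-number: 1\n"
    b"Prop-content-length: 10\n"
    b"Content-length: 10\n"
    b"\n"
    b"PROPS-END\n"
    b"\n"
    b"Node-path: trunk/a.txt\n"
    b"Node-action: delete\n"
    b"\n"
    b"\n"
    b"\n"
    b"Node-path: trunk/b.txt\n"
    b"Node-kind: file\n"
    b"Node-action: add\n"
    b"Text-content-length: 3\n"
    b"Content-length: 3\n"
    b"\n"
    b"hi\n"
    b"\n"
    b"\n"
)

records = list(read_records(io.BytesIO(sample)))
for headers, body in records:
    print(record_kind(headers), len(body))
```

Because the reader only counts bytes after the header block and skips
blank lines before the next one, it does not matter whether two or three
blank lines follow a record.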

-- 
C. Michael Pilato <cm...@collab.net>
CollabNet   <>   www.collab.net   <>   Distributed Development On Demand


Re: dumpfile grammar

Posted by Ben Collins-Sussman <su...@red-bean.com>.
FYI, for those folks writing dumpfile parsers *not* in perl, python,
etc: libsvn_repos already has a nice abstract public C API to parse
dumpfiles.  You just plug in your own C callbacks to respond to
parsing events.


Re: dumpfile grammar

Posted by Benjamin Smith-Mannschott <bs...@gmail.com>.
On Mar 10, 2008, at 21:09, John Peacock wrote:
> Martin Furter wrote:
>> There are also a few other scripts on the net which parse and  
>> create subversion dumpfiles.
>
> Yeah, like
>
> 	http://search.cpan.org/search?query=SVN::Dump
>
> though you should check the RT queue and potentially apply one or  
> more of the patches there:
>
> 	http://rt.cpan.org/Public/Bug/Display.html?id=26386
> 	http://rt.cpan.org/Public/Bug/Display.html?id=25467
> 	
> I have a mostly rewritten version to deal with the fact that  
> `svnadmin dump` appends an apparently random number of blank lines  
> between certain elements (sometimes two and sometimes three).  This  
> means that you can miss out on a delete+add (i.e. rename) for  
> certain files. :(

I didn't find the number of blank lines to be entirely random when I  
wrote my streaming parser for svn's dumpfile format.  I worked around  
the problem though by simply passing non-optional blank lines through  
the pipeline as explicit parse events.
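
One possible reading of that approach, sketched in Python.  This is not
the revisionist source; parse_events and write_events are hypothetical
names.  The sketch emits every blank line outside a header block or body
as its own event, so writing the events back reproduces the input byte
for byte.  Whether that matches the original design is an assumption.

```python
import io


def parse_events(stream):
    """Yield ('headers', pairs), ('body', bytes) and ('blank', None) events."""
    while True:
        line = stream.readline()
        if not line:
            return  # end of stream
        if line == b"\n":
            # A blank line between records becomes an explicit event.
            yield ("blank", None)
            continue
        # Read the header block up to its terminating blank line.
        pairs = []
        while line not in (b"\n", b""):
            name, _, value = line.rstrip(b"\n").partition(b": ")
            pairs.append((name, value))
            line = stream.readline()
        yield ("headers", pairs)
        # Assumption: a missing Content-length means no body follows.
        length = 0
        for name, value in pairs:
            if name == b"Content-length":
                length = int(value)
        if length:
            yield ("body", stream.read(length))


def write_events(events, out):
    """Serialize events back into dumpstream bytes."""
    for kind, payload in events:
        if kind == "headers":
            for name, value in payload:
                out.write(name + b": " + value + b"\n")
            out.write(b"\n")  # the blank line that ends a header block
        elif kind == "body":
            out.write(payload)
        else:
            out.write(b"\n")  # a pass-through blank line


# Hand-written sample with one, two and one blank lines after the records.
SAMPLE = (
    b"Revision-number: 1\n"
    b"Prop-content-length: 10\n"
    b"Content-length: 10\n"
    b"\n"
    b"PROPS-END\n"
    b"\n"
    b"Node-path: trunk/a.txt\n"
    b"Node-action: delete\n"
    b"\n"
    b"\n"
    b"\n"
    b"Node-path: trunk/b.txt\n"
    b"Node-action: delete\n"
    b"\n"
    b"\n"
)

events = list(parse_events(io.BytesIO(SAMPLE)))
out = io.BytesIO()
write_events(events, out)
```

A filter that drops or rewrites a node can then decide for itself what to
do with the blank events around it, instead of the parser guessing.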

You might take a look at:

http://pypi.python.org/pypi/revisionist/1.0.0

While it doesn't include a formal EBNF of the dumpfile format, the  
same is derivable from the implementation:

(1) The parser is simple recursive descent.  One method per grammar  
production.
(2) The syntax describing all possible sequences of parse events is  
specified in an EBNF-like form, thus describing the abstract structure  
of the dumpfile.
(3) The generated parse events are all represented by objects whose  
__str__() methods format them for output to a dump file, so you can  
see the concrete syntax parsed here by example if the parse procedures  
leave you scratching your head.

// ben smith-mannschott



Re: dumpfile grammar

Posted by John Peacock <jo...@havurah-software.org>.
Martin Furter wrote:
> There are also a few other scripts on the net which parse and create 
> subversion dumpfiles.

Yeah, like

	http://search.cpan.org/search?query=SVN::Dump

though you should check the RT queue and potentially apply one or more 
of the patches there:

	http://rt.cpan.org/Public/Bug/Display.html?id=26386
	http://rt.cpan.org/Public/Bug/Display.html?id=25467
	
I have a mostly rewritten version to deal with the fact that `svnadmin 
dump` appends an apparently random number of blank lines between certain 
elements (sometimes two and sometimes three).  This means that you can 
miss out on a delete+add (i.e. rename) for certain files. :(

John


Re: dumpfile grammar

Posted by Martin Furter <mf...@rola.ch>.

On Mon, 10 Mar 2008, Matt imMute Sickler wrote:

> Does a formal grammar exist for the dumpfile format (svnadmin dump ...)?
> I am writing some utility scripts to munge a few repositories, and munging 
> the dumpfile streams directly would be easiest.
> If the grammar is written in Parse::RecDescent bonus points will be awarded. 
> :)

Please have a look at http://svn.borg.ch/svndumptool/ . It may already 
have the features you need, or you could use its API to read and write 
dump files.

There are also a few other scripts on the net which parse and create 
subversion dumpfiles.

HTH
Martin
