Posted to dev@subversion.apache.org by Ramkumar Ramachandra <ar...@gmail.com> on 2010/08/10 14:02:34 UTC

svnrdump: The BIG update

Hi,

I've been putting this off for some time now- it's so much easier to
write code than to write English :p Anyway, here it is- a massive
status update.

It's been a few weeks since I got partial committer access, and ~80
commits later, this is what we have:

Firstly, thanks to Daniel for motivating me and driving me to submit
the series to the list, and guiding me through everything. Without
him, I'd probably not have finished svnrdump to begin with.

The command line interface and argument parsing library is ready-
thanks to Bert and lots of others for getting me started with
this. The interface is solid and looks like the one used in the other
SVN tools.

The dump functionality is also complete- thanks to Stefan's review and
MANY others for cleaning it up. It's however hit a brick wall now
because of missing headers in the RA layer. Until I (or someone else)
figures out how to fix the RA layer, we can't do better than the XFail
copy-and-modify test I've committed. It's quite mature and dumps
surprisingly fast though. I'm tempted to run benchmarks, but I haven't
done it yet because I fear I might be biased towards the tool :p

The load functionality is also quite complete, thanks to Bert et al
for helping me debug all the cryptic errors. The code is mostly
unreviewed though- there might be plenty of bugs and code cleanup
opportunities. Not to say that I've stopped working on it- just that
the work has become less challenging, now that all the tests pass :)

TODO:
- Write more tests and start using svnrdump for real! Advertise it,
  especially to developers of other versioning systems looking to
  communicate with SVN. Remember how this project started out?
- More optimizations. Since svnrdump is already so fast compared to
  the other tools, I think we can squeeze some more speed out of it.
- Huge documentation effort. svnrdump is a hack- I just did what I
  felt like and got it to work somehow. It's very unlike svnmucc,
  which does things by the book.
- Build more infrastructure around svnrdump- I've mostly used existing
  SVN API. Although a lot of new functions were suggested, I never
  really got down to writing them.
- Make dumpfile v3 the de-facto standard and improve it for optimized
  loading/ generation. The former part was suggested by Stefan.
- Integrate it into svnadmin etc as appropriate. I think there's
  enough work here for a mini-GSoC project?
- GitHub support (?) -- I saw this discussed on IRC somewhere, but I
  didn't understand this myself. Can someone clarify?

-- Ram

Re: svnrdump: The BIG update

Posted by 'Daniel Shahaf' <d....@daniel.shahaf.name>.
(sorry for the delay; didn't want to reply while sleepy)

Bert Huijben wrote on Tue, Aug 17, 2010 at 09:30:08 -0700:
> 
> 
> > -----Original Message-----
> > From: Ramkumar Ramachandra [mailto:artagnon@gmail.com]
> > Sent: dinsdag 17 augustus 2010 9:09
> > To: Daniel Shahaf
> > Cc: Subversion-dev Mailing List
> > Subject: Re: svnrdump: The BIG update
> > 
> > Hi Daniel,
> > 
> > Daniel Shahaf writes:
> > > Ramkumar Ramachandra wrote on Thu, Aug 12, 2010 at 12:17:34 +0530:
> > > > > > The dump functionality is also complete- thanks to Stefan's review and
> > > > > > MANY others for cleaning it up. It's however hit a brick wall now
> > > > > > because of missing headers in the RA layer. Until I (or someone else)
> > > > > > figures out how to fix the RA layer, we can't do better than the XFail
> > > > > > copy-and-modify test I've committed.
> > > > >
> > > > > Part of the diff there is lack of SHA-1 headers --- which is unavoidable
> > > > > until editor is revved --- but part of it is a missing Text-copy-source-md5.
> > > > > Why don't you output that information --- doesn't the editor give it to you?
> > > >
> > > > Afaik, no. I don't see Text-copy-source-* anywhere in the RA
> > > > layer. Maybe I'm not looking hard enough?
> > > >
> > >
> > > Hmm.  It seems you're right.  So you might have to use two RA sessions in
> > > parallel...
> > >
> > > (and then, you might have to have the user authenticate twice?)
> > 
> > Hm, I also have to find out if it's allowed. The commit_editor doesn't
> > allow it for instance. Besides, it's a very inelegant solution- I'd
> > rather fix the RA layer than do this.
> 
> @Daniel, what would adding these headers add?
> 
> The extra headers are for making it easier to detect corruptions by checking
> them along the transfer. 
> 
> If we are just doing additional work to add headers via a different process
> it slows the dumping down more than a bit and it doesn't make the dump file
> any safer because it uses a different process to obtain the header.
> I think you would have to obtain the source of the copyfrom and get some
> checksum from that; maybe you can do that without transferring the file
> again, but I'm not sure about that.
> 

I'm a bit surprised, but indeed I don't see a way to obtain the checksum
via svn_ra.h.  (The word 'checksum' doesn't appear there, and it isn't
included in svn_dirent_t either.)  I wonder how we got away without
having it...

> (And without the added headers the process is already as safe as svnsync.).
> 
> Yes, we can add more and more processing to also get those new Sha1 headers
> by recalculating them while dumping, but the idea for svnrdump was to create
> a fast and secure way to dump and load repositories... not an incredibly
> slow one that has to transfer files multiple times just to make all the
> optional headers match the output of svnadmin.
> 
> Those headers were made optional for a reason: you don't always have them. 
> And different conversion processes have different headers available.
> Svnadmin looks at the FS layer for dumping, so it sees different things than
> an RA layer api. E.g. the dump in svnadmin has to create diffs from
> fulltexts itself, while svnrdump has diffs and must apply these itself to
> get full texts. The checksums have a similar mangling. The FS has access to
> some of the checksums and recalculates others for you. (See the performance
> drop in 1.6 of svnadmin dump)
> 

Okay, agreed.  I assumed the editor would provide the copyfrom's
checksum for free (or, at least, that svn_ra_stat() would provide it),
but of course I won't suggest to add those copyfrom-checksum headers if
calculating them is as expensive as it now appears to be.

> There is a similar case at the import side. Applying commits can't check all
> the checksums, but the really important ones are already handled. Svnrdump
> dump and svnrdump load are a nice match.
> 
> 	Bert
> 

Thanks for doubting,

Daniel

RE: svnrdump: The BIG update

Posted by Bert Huijben <be...@vmoo.com>.

> -----Original Message-----
> From: Ramkumar Ramachandra [mailto:artagnon@gmail.com]
> Sent: dinsdag 17 augustus 2010 9:09
> To: Daniel Shahaf
> Cc: Subversion-dev Mailing List
> Subject: Re: svnrdump: The BIG update
> 
> Hi Daniel,
> 
> Daniel Shahaf writes:
> > Ramkumar Ramachandra wrote on Thu, Aug 12, 2010 at 12:17:34 +0530:
> > > > > The dump functionality is also complete- thanks to Stefan's review and
> > > > > MANY others for cleaning it up. It's however hit a brick wall now
> > > > > because of missing headers in the RA layer. Until I (or someone else)
> > > > > figures out how to fix the RA layer, we can't do better than the XFail
> > > > > copy-and-modify test I've committed.
> > > >
> > > > Part of the diff there is lack of SHA-1 headers --- which is unavoidable
> > > > until editor is revved --- but part of it is a missing Text-copy-source-md5.
> > > > Why don't you output that information --- doesn't the editor give it to you?
> > >
> > > Afaik, no. I don't see Text-copy-source-* anywhere in the RA
> > > layer. Maybe I'm not looking hard enough?
> > >
> >
> > Hmm.  It seems you're right.  So you might have to use two RA sessions in
> > parallel...
> >
> > (and then, you might have to have the user authenticate twice?)
> 
> Hm, I also have to find out if it's allowed. The commit_editor doesn't
> allow it for instance. Besides, it's a very inelegant solution- I'd
> rather fix the RA layer than do this.

@Daniel, what would adding these headers add?

The extra headers are for making it easier to detect corruptions by checking
them along the transfer. 

If we are just doing additional work to add headers via a different process
it slows the dumping down more than a bit and it doesn't make the dump file
any safer because it uses a different process to obtain the header.
I think you would have to obtain the source of the copyfrom and get some
checksum from that; maybe you can do that without transferring the file
again, but I'm not sure about that.

(And without the added headers the process is already as safe as svnsync.).

Yes, we can add more and more processing to also get those new Sha1 headers
by recalculating them while dumping, but the idea for svnrdump was to create
a fast and secure way to dump and load repositories... not an incredibly
slow one that has to transfer files multiple times just to make all the
optional headers match the output of svnadmin.

Those headers were made optional for a reason: you don't always have them. 
And different conversion processes have different headers available.
Svnadmin looks at the FS layer for dumping, so it sees different things than
an RA layer api. E.g. the dump in svnadmin has to create diffs from
fulltexts itself, while svnrdump has diffs and must apply these itself to
get full texts. The checksums have a similar mangling. The FS has access to
some of the checksums and recalculates others for you. (See the performance
drop in 1.6 of svnadmin dump)

There is a similar case at the import side. Applying commits can't check all
the checksums, but the really important ones are already handled. Svnrdump
dump and svnrdump load are a nice match.
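
To make the cost argument concrete, here is a minimal sketch (Python, not
svnrdump's actual C code) of what "recalculating" a checksum implies: every
byte of the fulltext has to pass through the hash, so a client that only sees
the RA layer would first have to fetch the copy source. The function name and
chunk size are assumptions made for illustration only.

```python
import hashlib
import io


def checksums(stream, chunk_size=64 * 1024):
    """Return (md5_hex, sha1_hex) for a binary stream, reading it once.

    Both digests are fed from the same chunks, so the data is read a single
    time, but it still has to be read in full.
    """
    md5 = hashlib.md5()
    sha1 = hashlib.sha1()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break  # end of stream
        md5.update(chunk)
        sha1.update(chunk)
    return md5.hexdigest(), sha1.hexdigest()


# Hypothetical stand-in for the fulltext of a copy source.
fulltext = io.BytesIO(b"contents of the copied file\n")
md5_hex, sha1_hex = checksums(fulltext)
```

Computing both digests in one pass avoids reading the data twice, but it does
not avoid the transfer itself, which is the expensive part in this discussion.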

	Bert

Re: svnrdump: The BIG update

Posted by Ramkumar Ramachandra <ar...@gmail.com>.
Hi Daniel,

Daniel Shahaf writes:
> Ramkumar Ramachandra wrote on Thu, Aug 12, 2010 at 12:17:34 +0530:
> > > > The dump functionality is also complete- thanks to Stefan's review and
> > > > MANY others for cleaning it up. It's however hit a brick wall now
> > > > because of missing headers in the RA layer. Until I (or someone else)
> > > > figures out how to fix the RA layer, we can't do better than the XFail
> > > > copy-and-modify test I've committed.
> > > 
> > > Part of the diff there is lack of SHA-1 headers --- which is unavoidable
> > > until editor is revved --- but part of it is a missing Text-copy-source-md5.
> > > Why don't you output that information --- doesn't the editor give it to you?
> > 
> > Afaik, no. I don't see Text-copy-source-* anywhere in the RA
> > layer. Maybe I'm not looking hard enough?
> > 
> 
> Hmm.  It seems you're right.  So you might have to use two RA sessions in
> parallel...
> 
> (and then, you might have to have the user authenticate twice?)

Hm, I also have to find out if it's allowed. The commit_editor doesn't
allow it for instance. Besides, it's a very inelegant solution- I'd
rather fix the RA layer than do this.

> > > > - Make dumpfile v3 the de-facto standard and improve it for optimized
> > > >   loading/ generation. The former part was suggested by Stefan.
> > > > - Integrate it into svnadmin etc as appropriate. I think there's
> > > >   enough work here for a mini-GSoC project?
> > > 
> > > How would it be integrated into svnadmin?  Do you want to push the logic
> > > into the standard 'svnadmin dump' command?
> > 
> > This is something I haven't given thought either. I brought it up
> > because of an earlier discussion in which everyone seemed to be in
> > favor of NOT having a new command. It feels like we're stuffing a lot
> > of functionality into one tool though.
> > 
> 
> Personally I also like having svnadmin operate only locally (so it doesn't
> even link against libsvn_ra), but that was hashed out already on that
> moderately-long thread a few weeks ago.

Yeah. It looks like I'll have to resurrect this thread soon and reach
a concrete conclusion.

-- Ram

Re: svnrdump: The BIG update

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
Ramkumar Ramachandra wrote on Thu, Aug 12, 2010 at 12:17:34 +0530:
> > > The dump functionality is also complete- thanks to Stefan's review and
> > > MANY others for cleaning it up. It's however hit a brick wall now
> > > because of missing headers in the RA layer. Until I (or someone else)
> > > figures out how to fix the RA layer, we can't do better than the XFail
> > > copy-and-modify test I've committed.
> > 
> > Part of the diff there is lack of SHA-1 headers --- which is unavoidable
> > until editor is revved --- but part of it is a missing Text-copy-source-md5.
> > Why don't you output that information --- doesn't the editor give it to you?
> 
> Afaik, no. I don't see Text-copy-source-* anywhere in the RA
> layer. Maybe I'm not looking hard enough?
> 

Hmm.  It seems you're right.  So you might have to use two RA sessions in
parallel...

(and then, you might have to have the user authenticate twice?)

> > > - More optimizations. Since svnrdump is already so fast compared to
> > >   the other tools, I think we can squeeze some more speed out of it.
> > > - Huge documentation effort. svnrdump is a hack- I just did what I
> > >   felt like and got it to work somehow. It's very unlike svnmucc,
> > >   which does things by the book.
> > > - Build more infrastructure around svnrdump- I've mostly used existing
> > >   SVN API. Although a lot of new functions were suggested, I never
> > >   really got down to writing them.
> > 
> > Yep.  There was also talk of moving some of the logic into the libraries ---
> > where does that stand?
> 
> Yeah, I haven't started working on this yet. I'll need some guidance
> for this- I have to sketch out a roadmap and ask for access to the
> specified regions or branch; planning is something I'm not used to at
> all :p
> 

:-)

> > > - Make dumpfile v3 the de-facto standard and improve it for optimized
> > >   loading/ generation. The former part was suggested by Stefan.
> > > - Integrate it into svnadmin etc as appropriate. I think there's
> > >   enough work here for a mini-GSoC project?
> > 
> > How would it be integrated into svnadmin?  Do you want to push the logic
> > into the standard 'svnadmin dump' command?
> 
> This is something I haven't given thought either. I brought it up
> because of an earlier discussion in which everyone seemed to be in
> favor of NOT having a new command. It feels like we're stuffing a lot
> of functionality into one tool though.
> 

Personally I also like having svnadmin operate only locally (so it doesn't
even link against libsvn_ra), but that was hashed out already on that
moderately-long thread a few weeks ago.

> > > - GitHub support (?) -- I saw this discussed on IRC somewhere, but I
> > >   didn't understand this myself. Can someone clarify?
> > > 
> > 
> > Joke.  GitHub implemented a mod_dav_svn interface to their repositories [1],
> > so it's now possible (if their implementation is sound) to generate an svn
> > dump of a GitHub git repository.
> 
> Ah, yes. I'm aware. With the infrastructure I've written on the Git
> end (incomplete), the SVN <-> Git bidirectional bridge should be
> seamless and awesome :)
> 
> Note: I'll be visiting home this weekend (that means: mostly
> travelling). I'll be back to hack next week.
> 

> -- Ram

Re: svnrdump: The BIG update

Posted by Ramkumar Ramachandra <ar...@gmail.com>.
Hi Daniel,

Daniel Shahaf writes:
> > It's been a few weeks since I got partial committer access, and ~80
> > commits later, this is what we have:
> > 
> > Firstly, thanks to Daniel for motivating me and driving me to submit
> > the series to the list, and guiding me through everything. Without
> > him, I'd probably not have finished svnrdump to begin with.
> > 
> > The command line interface and argument parsing library is ready-
> > thanks to Bert and lots of others for getting me started with
> > this. The interface is solid and looks like the one used in the other
> > SVN tools.
> > 
> > The dump functionality is also complete- thanks to Stefan's review and
> > MANY others for cleaning it up. It's however hit a brick wall now
> > because of missing headers in the RA layer. Until I (or someone else)
> > figures out how to fix the RA layer, we can't do better than the XFail
> > copy-and-modify test I've committed.
> 
> Part of the diff there is lack of SHA-1 headers --- which is unavoidable
> until editor is revved --- but part of it is a missing Text-copy-source-md5.
> Why don't you output that information --- doesn't the editor give it to you?

Afaik, no. I don't see Text-copy-source-* anywhere in the RA
layer. Maybe I'm not looking hard enough?

> Nitpick: svnrdump_tests 5 6 have the same textual description / docstring as
> each other, could you please change that?  See other test files (e.g.,
> ./commit_tests.py --list) for plenty of examples.

Fixed. Thanks for noticing this.

> > It's quite mature and dumps
> > surprisingly fast though. I'm tempted to run benchmarks, but I haven't
> > done it yet because I fear I might be biased towards the tool :p
> > 
> 
> Just write all the benchmarks before running them?

Hehe, yeah. Will do- I just have to make sure that no external factors
affect the tests (example: variations of network speed, disk speed,
cache with time).
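
One way to follow the "write the benchmarks first" advice is to fix the plan
as data before timing anything, then repeat each run and report the median so
a single slow network or disk moment matters less. The sketch below is only an
illustration: the repository URL and local path are placeholders, and the
helper names are made up for this example.

```python
import statistics
import subprocess
import sys
import time


def time_command(argv, runs=5):
    """Run argv `runs` times; return the wall-clock durations in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        # Discard the dump stream so terminal or disk output is not measured.
        subprocess.run(argv, stdout=subprocess.DEVNULL, check=True)
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Reduce repeated timings to a few robust numbers."""
    return {
        "median": statistics.median(durations),
        "min": min(durations),
        "max": max(durations),
    }


# The plan is written down before any run. Placeholders, not real locations.
PLAN = {
    "svnrdump dump": ["svnrdump", "dump", "http://svn.example.org/repos/project"],
    "svnadmin dump": ["svnadmin", "dump", "/path/to/local/mirror"],
}

# Harmless self-check that does not need Subversion installed.
baseline = summarize(time_command([sys.executable, "-c", "pass"], runs=3))
```

Running `summarize(time_command(argv))` for each entry in `PLAN` would then
give comparable numbers. Note that this does not remove cache effects; a
warm-up run per command is probably still needed.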

> > The load functionality is also quite complete, thanks to Bert et al
> > for helping me debug all the cryptic errors. The code is mostly
> > unreviewed though- there might be plenty of bugs and code cleanup
> > opportunities. Not to say that I've stopped working on it- just that
> > the work has become less challenging, now that all the tests pass :)
> > 
> 
> Okay, good.  Some field testing probably needed here?

Yeah, lots. I've tested against 1000 revisions of the ASF
successfully, but I'll need more time and patience to run more tests.

> > TODO:
> > - Write more tests and start using svnrdump for real! Advertise it,
> >   especially to developers of other versioning systems looking to
> >   communicate with SVN. Remember how this project started out?
> 
> Don't forget to inform users@subversion.apache.org :-)

Oh, okay. I'll write another email for them.

> > - More optimizations. Since svnrdump is already so fast compared to
> >   the other tools, I think we can squeeze some more speed out of it.
> > - Huge documentation effort. svnrdump is a hack- I just did what I
> >   felt like and got it to work somehow. It's very unlike svnmucc,
> >   which does things by the book.
> > - Build more infrastructure around svnrdump- I've mostly used existing
> >   SVN API. Although a lot of new functions were suggested, I never
> >   really got down to writing them.
> 
> Yep.  There was also talk of moving some of the logic into the libraries ---
> where does that stand?

Yeah, I haven't started working on this yet. I'll need some guidance
for this- I have to sketch out a roadmap and ask for access to the
specified regions or branch; planning is something I'm not used to at
all :p

> > - Make dumpfile v3 the de-facto standard and improve it for optimized
> >   loading/ generation. The former part was suggested by Stefan.
> > - Integrate it into svnadmin etc as appropriate. I think there's
> >   enough work here for a mini-GSoC project?
> 
> How would it be integrated into svnadmin?  Do you want to push the logic
> into the standard 'svnadmin dump' command?

This is something I haven't given thought either. I brought it up
because of an earlier discussion in which everyone seemed to be in
favor of NOT having a new command. It feels like we're stuffing a lot
of functionality into one tool though.

> > - GitHub support (?) -- I saw this discussed on IRC somewhere, but I
> >   didn't understand this myself. Can someone clarify?
> > 
> 
> Joke.  GitHub implemented a mod_dav_svn interface to their repositories [1],
> so it's now possible (if their implementation is sound) to generate an svn
> dump of a GitHub git repository.

Ah, yes. I'm aware. With the infrastructure I've written on the Git
end (incomplete), the SVN <-> Git bidirectional bridge should be
seamless and awesome :)

Note: I'll be visiting home this weekend (that means: mostly
travelling). I'll be back to hack next week.

-- Ram

Re: svnrdump: The BIG update

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
Ramkumar Ramachandra wrote on Tue, Aug 10, 2010 at 19:32:34 +0530:
> Hi,
> 
> I've been putting this off for some time now- it's so much easier to
> write code than to write English :p Anyway, here it is- a massive
> status update.
> 

Thanks for the update.

> It's been a few weeks since I got partial committer access, and ~80
> commits later, this is what we have:
> 
> Firstly, thanks to Daniel for motivating me and driving me to submit
> the series to the list, and guiding me through everything. Without
> him, I'd probably not have finished svnrdump to begin with.
> 
> The command line interface and argument parsing library is ready-
> thanks to Bert and lots of others for getting me started with
> this. The interface is solid and looks like the one used in the other
> SVN tools.
> 
> The dump functionality is also complete- thanks to Stefan's review and
> MANY others for cleaning it up. It's however hit a brick wall now
> because of missing headers in the RA layer. Until I (or someone else)
> figures out how to fix the RA layer, we can't do better than the XFail
> copy-and-modify test I've committed.

Part of the diff there is lack of SHA-1 headers --- which is unavoidable
until editor is revved --- but part of it is a missing Text-copy-source-md5.
Why don't you output that information --- doesn't the editor give it to you?
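
As a rough illustration of the kind of diff being described, the following
Python sketch compares the headers of two dumpfile node records and lists the
ones present in the reference record but absent from the other. The two sample
records and their checksum values are invented placeholders, not real output
from svnadmin or svnrdump; only the header names follow the dumpfile format.

```python
def parse_headers(record):
    """Parse the 'Name: value' lines at the top of one dump record."""
    headers = {}
    for line in record.splitlines():
        if not line.strip():
            break  # a blank line ends the header block
        name, sep, value = line.partition(": ")
        if sep:
            headers[name] = value
    return headers


def missing_headers(reference, candidate):
    """Names of headers found in `reference` but not in `candidate`."""
    ref = parse_headers(reference)
    cand = parse_headers(candidate)
    return sorted(name for name in ref if name not in cand)


# Invented sample records for a copy-and-modify node (placeholder values).
SVNADMIN_RECORD = "\n".join([
    "Node-path: trunk/b",
    "Node-kind: file",
    "Node-action: add",
    "Node-copyfrom-rev: 1",
    "Node-copyfrom-path: trunk/a",
    "Text-copy-source-md5: <md5-of-source>",
    "Text-copy-source-sha1: <sha1-of-source>",
    "Text-content-length: 5",
    "Text-content-md5: <md5-of-content>",
    "Text-content-sha1: <sha1-of-content>",
    "Content-length: 5",
    "",
])
SVNRDUMP_RECORD = "\n".join([
    "Node-path: trunk/b",
    "Node-kind: file",
    "Node-action: add",
    "Node-copyfrom-rev: 1",
    "Node-copyfrom-path: trunk/a",
    "Text-content-length: 5",
    "Text-content-md5: <md5-of-content>",
    "Content-length: 5",
    "",
])
```

Calling `missing_headers(SVNADMIN_RECORD, SVNRDUMP_RECORD)` on these samples
reports the SHA-1 headers and Text-copy-source-md5, which is the shape of the
difference the XFail test trips over.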

Nitpick: svnrdump_tests 5 6 have the same textual description / docstring as
each other, could you please change that?  See other test files (e.g.,
./commit_tests.py --list) for plenty of examples.

> It's quite mature and dumps
> surprisingly fast though. I'm tempted to run benchmarks, but I haven't
> done it yet because I fear I might be biased towards the tool :p
> 

Just write all the benchmarks before running them?

> The load functionality is also quite complete, thanks to Bert et al
> for helping me debug all the cryptic errors. The code is mostly
> unreviewed though- there might be plenty of bugs and code cleanup
> opportunities. Not to say that I've stopped working on it- just that
> the work has become less challenging, now that all the tests pass :)
> 

Okay, good.  Some field testing probably needed here?

> TODO:
> - Write more tests and start using svnrdump for real! Advertise it,
>   especially to developers of other versioning systems looking to
>   communicate with SVN. Remember how this project started out?

Don't forget to inform users@subversion.apache.org :-)

> - More optimizations. Since svnrdump is already so fast compared to
>   the other tools, I think we can squeeze some more speed out of it.
> - Huge documentation effort. svnrdump is a hack- I just did what I
>   felt like and got it to work somehow. It's very unlike svnmucc,
>   which does things by the book.
> - Build more infrastructure around svnrdump- I've mostly used existing
>   SVN API. Although a lot of new functions were suggested, I never
>   really got down to writing them.

Yep.  There was also talk of moving some of the logic into the libraries ---
where does that stand?

> - Make dumpfile v3 the de-facto standard and improve it for optimized
>   loading/ generation. The former part was suggested by Stefan.
> - Integrate it into svnadmin etc as appropriate. I think there's
>   enough work here for a mini-GSoC project?

How would it be integrated into svnadmin?  Do you want to push the logic
into the standard 'svnadmin dump' command?

> - GitHub support (?) -- I saw this discussed on IRC somewhere, but I
>   didn't understand this myself. Can someone clarify?
> 

Joke.  GitHub implemented a mod_dav_svn interface to their repositories [1],
so it's now possible (if their implementation is sound) to generate an svn
dump of a GitHub git repository.


[1] http://github.com/blog/626-announcing-svn-support
[1] `svn info http://svn.github.com/artagnon/svnrdump.git`

> -- Ram