Posted to dev@subversion.apache.org by Garrett Rooney <ro...@electricjellyfish.net> on 2005/11/10 01:25:11 UTC

[PATCH] expose svn_repos_replay via the RA API

Over the past week or so I've been working on ways to mirror
Subversion repositories over our RA layers.  Initially, I started out
trying to clone the work that clkao did in his SVN::Mirror Perl
module, with the slight simplification that I was only interested in
copying an entire repository, not a subset of one.

Unfortunately, I came to the conclusion that you can't actually create
an exact copy of a repository via the current RA API.  Since the diff
and update editors don't provide you with copyfrom information, you
have to use the log API to get that information.  Once you have that
you can recreate the copies correctly (although it takes a fair amount
of hairy logic to get it right), but there's still one problem: you
can't tell the difference between a plain copy and a copy that
includes additional textual modifications.

I suppose I could try detecting that case by diffing copied files with
the source, but that was one hoop too many for me to be willing to
jump through, so I broke down and started thinking of what kind of RA
API I'd really like to use for this sort of thing.

What I came up with was the idea of exposing the svn_repos_replay API
(which is what svn_repos_dump uses under the hood), so you can
actually drive an editor through the correct sequence of commands
needed to reproduce the change in a given revision.  This isn't quite
as straightforward as just exposing the existing API, because the
current replay code doesn't actually produce text and property deltas
(dump doesn't really need them, since it calculates them on its own),
but the change to enable that behavior is pretty small.
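
As a rough illustration of the editor-driving idea (this is not the
patch, and none of these names exist in Subversion), here is a minimal
self-contained C sketch.  The toy_* and stats_* types and functions are
hypothetical stand-ins, every change is assumed to be a file addition,
and a text delta is modeled as a plain string.  It only tries to show
that a receiver driven this way sees copyfrom information and text
changes in the same pass, and what a send_deltas-style flag would do.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical, heavily simplified stand-in for a delta editor: a table
   of callbacks that the replay code "drives".  This is NOT the real
   svn_delta_editor_t. */
typedef struct toy_editor_t {
  void (*add_file)(void *baton, const char *path,
                   const char *copyfrom_path, long copyfrom_rev);
  void (*apply_text)(void *baton, const char *path, const char *text);
  void (*close_edit)(void *baton);
} toy_editor_t;

/* One changed path in a revision (hypothetical record layout). */
typedef struct toy_change_t {
  const char *path;
  const char *copyfrom_path;  /* NULL if the path is not a copy */
  long copyfrom_rev;          /* -1 if the path is not a copy */
  const char *new_text;       /* NULL if the text was not modified */
} toy_change_t;

/* Drive EDITOR with the changes of one revision.  When SEND_DELTAS is
   zero, modified files still get an apply_text call, but with an empty
   placeholder instead of the real contents. */
static void toy_replay(const toy_change_t *changes, int n_changes,
                       int send_deltas,
                       const toy_editor_t *editor, void *baton)
{
  int i;

  assert(editor != NULL);
  for (i = 0; i < n_changes; i++) {
    const toy_change_t *c = &changes[i];

    editor->add_file(baton, c->path, c->copyfrom_path, c->copyfrom_rev);
    if (c->new_text != NULL)
      editor->apply_text(baton, c->path, send_deltas ? c->new_text : "");
  }
  editor->close_edit(baton);
}

/* Example receiver: counts what it is told about. */
typedef struct toy_stats_t {
  int adds;          /* add_file calls seen */
  int copies;        /* adds that carried copyfrom information */
  int text_changes;  /* apply_text calls seen */
  size_t bytes;      /* total length of text received */
  int closed;        /* set to 1 once close_edit has been called */
} toy_stats_t;

static void stats_add_file(void *baton, const char *path,
                           const char *copyfrom_path, long copyfrom_rev)
{
  toy_stats_t *s = baton;

  (void)path;
  (void)copyfrom_rev;
  s->adds++;
  if (copyfrom_path != NULL)
    s->copies++;
}

static void stats_apply_text(void *baton, const char *path,
                             const char *text)
{
  toy_stats_t *s = baton;

  (void)path;
  s->text_changes++;
  s->bytes += strlen(text);
}

static void stats_close_edit(void *baton)
{
  toy_stats_t *s = baton;

  s->closed = 1;
}

static const toy_editor_t stats_editor = {
  stats_add_file, stats_apply_text, stats_close_edit
};
```

To try it, build an array of toy_change_t records, zero-initialize a
toy_stats_t, and pass both to toy_replay() along with &stats_editor.  A
plain copy produces only an add_file call with copyfrom set, while a
copy with modifications also produces an apply_text call, which is the
distinction the log API alone doesn't give you.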

I'm kind of hoping that this sort of API will be useful to other
programs that try to copy revision history from one repository to
another, like SVK, but I honestly don't know enough about their exact
needs to say if this is correct.  It does work for the use case I was
working on, though: full repository replication.

Anyway, here's a preliminary patch people can look at.  It doesn't
implement the ra_dav side of things (I only had time to get ra_svn and
ra_local working today), doesn't update the ra_svn protocol document,
and doesn't include the actual tool I'm writing to make use of this
API, because I still need to clean that up a bit, but it should be
able to give people something to comment on.

So what do you think?  Good idea?  Bad idea?

-garrett

Re: [PATCH] expose svn_repos_replay via the RA API

Posted by Garrett Rooney <ro...@electricjellyfish.net>.
On 11/9/05, Philip Martin <ph...@codematters.co.uk> wrote:
> Garrett Rooney <ro...@electricjellyfish.net> writes:
>
> > What I came up with was the idea of exposing the svn_repos_replay API
> > (which is what svn_repos_dump uses under the hood), so you can
> > actually drive an editor through the correct sequence of commands
>
> I think that bypasses the auth stuff, doesn't it?

Yeah, that's the other thing I still need to implement.  Knew I forgot
to mention something ;-)

> >  /**
> > + * Replay the changes from @a revision through @a editor and @a edit_baton.
> > + *
> > + * If @a send_deltas is @c TRUE, the actual text and property changes in
> > + * the revision will be sent, otherwise dummy text deltas and null property
> > + * changes will be sent instead.
> > + *
> > + * @a pool is used for all allocation.
> > + *
> > + * @since New in 1.4.
> > + */
> > +svn_error_t *svn_ra_replay (svn_ra_session_t *session,
> > +                            svn_revnum_t revision,
> > +                            svn_boolean_t send_deltas,
> > +                            const svn_delta_editor_t *editor,
> > +                            void *edit_baton,
> > +                            apr_pool_t *pool);
> > +
>
> That's at least one round trip per revision, isn't it?  Is that going
> to be a problem when people want to replay large numbers of revisions?

Yeah, it's one round trip per revision, but for non-trivial commits
you're spending a fair amount of time getting the actual content, so
I'm not really sure how much the round trips hurt.

> I suppose allowing a single request to replay a large number of
> revisions might be a bit of a denial-of-service problem.  Perhaps we
> would need a replay hook to control it?

That occurred to me, but honestly, this is actually less data being
transferred than you'd end up with from doing diff and update, so I'm
not sure it's that big a deal.  You can likely cause more of a load
problem via the existing APIs than you can with this one.

-garrett

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org


Re: [PATCH] expose svn_repos_replay via the RA API

Posted by Philip Martin <ph...@codematters.co.uk>.
Garrett Rooney <ro...@electricjellyfish.net> writes:

> What I came up with was the idea of exposing the svn_repos_replay API
> (which is what svn_repos_dump uses under the hood), so you can
> actually drive an editor through the correct sequence of commands

I think that bypasses the auth stuff, doesn't it?

>  /**
> + * Replay the changes from @a revision through @a editor and @a edit_baton.
> + *
> + * If @a send_deltas is @c TRUE, the actual text and property changes in
> + * the revision will be sent, otherwise dummy text deltas and null property
> + * changes will be sent instead.
> + *
> + * @a pool is used for all allocation.
> + *
> + * @since New in 1.4.
> + */
> +svn_error_t *svn_ra_replay (svn_ra_session_t *session,
> +                            svn_revnum_t revision,
> +                            svn_boolean_t send_deltas,
> +                            const svn_delta_editor_t *editor,
> +                            void *edit_baton,
> +                            apr_pool_t *pool);
> +

That's at least one round trip per revision, isn't it?  Is that going
to be a problem when people want to replay large numbers of revisions?

I suppose allowing a single request to replay a large number of
revisions might be a bit of a denial-of-service problem.  Perhaps we
would need a replay hook to control it?

-- 
Philip Martin
