Posted to dev@subversion.apache.org by Julian Foad <ju...@btopenworld.com> on 2009/10/05 17:10:27 UTC

Obliterate - call graph, esp. repos and FS layers

These last few days I've been working down through the layers, looking
at a call graph something like this:


"svn obliterate URL@REV"
  calls
svn_cl__obliterate()
  calls
svn_client_obliterate()
  calls
svn_ra_obliterate()
  calls
svn_ra__vtable_t:obliterate_path_rev()
  calls
svn_ra_local__obliterate_path_rev()
  calls
svn_repos_obliterate_path_rev()
  calls
svn_fs_begin_obliteration_txn()
svn_fs_commit_obliteration_txn()
  calls
svn_fs_t::begin_obliteration_txn()
svn_fs_txn_t::commit_obliteration_txn()
  calls
svn_fs_fs__begin_obliteration_txn()
svn_fs_fs__commit_obliteration_txn()


Most of these calls will do little but forward the request on down the
chain. Somewhere in the upper layers will be the logic that decides
which node-revs to obliterate, based on what the client wants, and
decides how many calls to make to the lower layers based on how
"powerful" the lower APIs are.

At the moment I'm not thinking about the high-level logic, but instead
looking at how the repos layer downwards will modify one revision.
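
The forwarding described above, including the dispatch through an RA
vtable, can be sketched roughly as follows. This is only an illustrative
toy: every toy_* name and the int error-return convention are assumptions
made for the example, not the real Subversion types or signatures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not real Subversion types. */
typedef long toy_revnum_t;

/* Bottom of this toy chain: records what the "repos" layer was asked. */
typedef struct toy_repos_t {
  const char *last_path;
  toy_revnum_t last_rev;
  int calls;
} toy_repos_t;

int toy_repos_obliterate_path_rev(toy_repos_t *repos, const char *path,
                                  toy_revnum_t rev)
{
  repos->last_path = path;
  repos->last_rev = rev;
  repos->calls++;
  return 0; /* 0 means success in this sketch */
}

/* RA vtable: each RA implementation supplies its own function pointer. */
typedef struct toy_ra_vtable_t {
  int (*obliterate_path_rev)(void *baton, const char *path,
                             toy_revnum_t rev);
} toy_ra_vtable_t;

typedef struct toy_ra_session_t {
  const toy_ra_vtable_t *vtable;
  void *baton; /* implementation-specific state */
} toy_ra_session_t;

/* "ra_local" flavour: talks to the repos layer directly. */
static int toy_ra_local_obliterate_path_rev(void *baton, const char *path,
                                            toy_revnum_t rev)
{
  return toy_repos_obliterate_path_rev((toy_repos_t *)baton, path, rev);
}

const toy_ra_vtable_t toy_ra_local_vtable = {
  toy_ra_local_obliterate_path_rev
};

/* RA entry point: does nothing but dispatch through the vtable. */
int toy_ra_obliterate(toy_ra_session_t *session, const char *path,
                      toy_revnum_t rev)
{
  assert(session != NULL && session->vtable != NULL);
  return session->vtable->obliterate_path_rev(session->baton, path, rev);
}

/* Client layer: in this sketch it only forwards the request. */
int toy_client_obliterate(toy_ra_session_t *session, const char *path,
                          toy_revnum_t rev)
{
  return toy_ra_obliterate(session, path, rev);
}
```

A caller would build a toy_ra_session_t pointing at toy_ra_local_vtable
with a toy_repos_t as its baton, then call toy_client_obliterate(). Any
logic that chooses which node-revs to obliterate would sit above this
chain and is deliberately left out.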


# repos layer

svn_repos_obliterate_path_rev(path-in-repos, rev)
  // For now, let's just suppose that the API is only to be capable of
  // removing one path from one rev.

# FS layer

svn_fs_begin_obliteration_txn(rev)
svn_fs_commit_obliteration_txn(rev)
  // Special versions of the normal-txn "begin" and "commit" APIs.
  // I've yet to discover whether the standard APIs for modifying
  // the contents of a transaction are sufficient to use between
  // these "begin" and "commit" calls.

svn_fs_t::begin_obliteration_txn(rev)
svn_fs_txn_t::commit_obliteration_txn(rev)

# FSFS

svn_fs_fs__begin_obliteration_txn(rev)
  // Like svn_fs_fs__begin_txn() but with no special rev-props (no need
  // to change the "date" property, for example).

svn_fs_fs__commit_obliteration_txn(rev)
  // Where the bulk of the new work will be.


All of these APIs take a revision number as a parameter, and replace the
specified revision with a modified version of itself.
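
To make "replace the specified revision with a modified version of
itself" concrete, here is a minimal sketch using a toy in-memory
filesystem. All toy_* names, the fixed-size arrays and the return codes
are assumptions for illustration; a real FSFS implementation would look
quite different.

```c
#include <assert.h>
#include <string.h>

#define TOY_MAX_REVS 8
#define TOY_MAX_PATHS 8

/* One revision: just the set of paths that exist in it. */
typedef struct toy_rev_t {
  const char *paths[TOY_MAX_PATHS];
  int n_paths;
} toy_rev_t;

typedef struct toy_fs_t {
  toy_rev_t revs[TOY_MAX_REVS];
  int youngest; /* highest revision number in use */
} toy_fs_t;

typedef struct toy_txn_t {
  toy_fs_t *fs;
  int base_rev;      /* the revision this txn will replace */
  toy_rev_t content; /* private working copy of that revision */
} toy_txn_t;

/* Begin: copy the existing revision into the txn. No rev-props change. */
int toy_begin_obliteration_txn(toy_txn_t *txn, toy_fs_t *fs, int rev)
{
  if (rev < 0 || rev > fs->youngest)
    return -1;
  txn->fs = fs;
  txn->base_rev = rev;
  txn->content = fs->revs[rev];
  return 0;
}

/* Modify: drop one path from the txn's working copy. */
int toy_txn_remove_path(toy_txn_t *txn, const char *path)
{
  int i, j;
  for (i = 0; i < txn->content.n_paths; i++) {
    if (strcmp(txn->content.paths[i], path) == 0) {
      for (j = i + 1; j < txn->content.n_paths; j++)
        txn->content.paths[j - 1] = txn->content.paths[j];
      txn->content.n_paths--;
      return 0;
    }
  }
  return -1; /* path not present in this revision */
}

/* Commit: overwrite the SAME revision number; youngest does not move. */
int toy_commit_obliteration_txn(toy_txn_t *txn)
{
  txn->fs->revs[txn->base_rev] = txn->content;
  return txn->base_rev;
}
```

The contrast with a normal commit is in the last function: it writes back
to base_rev rather than creating youngest + 1, so later revisions keep
their numbers.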

Thoughts?

- Julian

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=2403816

RE: Obliterate - call graph, esp. repos and FS layers

Posted by Bert Huijben <rh...@sharpsvn.net>.

> -----Original Message-----
> From: Ivan Zhakov [mailto:ivan@visualsvn.com]
> Sent: dinsdag 6 oktober 2009 10:52
> To: Julian Foad
> Cc: dev@subversion.tigris.org
> Subject: Re: Obliterate - call graph, esp. repos and FS layers
> 
> On Mon, Oct 5, 2009 at 9:10 PM, Julian Foad
> <ju...@btopenworld.com> wrote:
> > These last few days I've been working down through the layers,
> looking
> > at a call graph something like this:
> >
> >
> > "svn obliterate URL@REV"
> It's slightly off-topic, but I'd like to see obliterate functionality
> as an administrative task only, i.e. "svnadmin obliterate". Currently,
> even if authorization is disabled, administrators can be sure that
> nobody can remove anything from the repository. But if you add
> "svn obliterate", many administrators will be forced to enable
> authorization and check their permissions. Also, you will end up with
> a special permission for obliterating files, which adds unnecessary
> complexity.

I would suggest using the same method as for editing revision properties. 

By default disabled, unless a hook script is installed that allows it for a
specific user.

	Bert

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=2404001

Re: Re: Obliterate - client-side authorization [was: Obliterate - call graph, esp. repos and FS layers]

Posted by Stefan Sperling <st...@elego.de>.

On Mon, Oct 12, 2009 at 07:51:36AM -0700, Andy Bolstridge wrote:
> I can give you another reason to make the obliterate functionality part of the client system (even if it's turned off or the command itself is not exposed), and that's implementing shelving functionality.
> 
> Once obliterate is with us, and it frees up disc space, SVN can get 'temporary commits' that are deleted once the temps are fully committed. The entire system could be implemented as switching to a special hidden path in the repo and committing. When a full commit occurs, the hidden commits can be obliterated in the background.
> 
> I don't know how the client commands would look, but it could be implemented entirely on the client if obliterate was a client-callable option.

Developer talk on IRC indicates that shelving a.k.a. offline commits
(as well as local working copy cloning) is planned to be added to WC-NG
as a pure client-side feature at some point in the future (i.e. not 1.7).

Stefan

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=2406703

RE: Re: Obliterate - client-side authorization [was: Obliterate - call graph, esp. repos and FS layers]

Posted by Andy Bolstridge <an...@bolstridge.plus.com>.

I can give you another reason to make the obliterate functionality part of the client system (even if it's turned off or the command itself is not exposed), and that's implementing shelving functionality.

Once obliterate is with us, and it frees up disc space, SVN can get 'temporary commits' that are deleted once the temps are fully committed. The entire system could be implemented as switching to a special hidden path in the repo and committing. When a full commit occurs, the hidden commits can be obliterated in the background.

I don't know how the client commands would look, but it could be implemented entirely on the client if obliterate was a client-callable option.

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=2406696

Re: Obliterate - client-side authorization [was: Obliterate - call graph, esp. repos and FS layers]

Posted by Branko Cibej <br...@xbc.nu>.

Julian Foad wrote:
> If we do make a client-side "svn obliterate" command, we will need an
> appropriate form of authentication for it. This could be as simple as
> assigning the permission to certain user names, ...

You mean, "authorization." :)

But yes, I agree with what you say about client-side obliteration.

I'd also mention the not so far-fetched use case where a project is
using a public Subversion hosting service. I bet if we had server-side
obliterate only, such services would quickly build a Web interface to it.

-- Brane

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=2404029

Obliterate - client-side authorization [was: Obliterate - call graph, esp. repos and FS layers]

Posted by Julian Foad <ju...@btopenworld.com>.

Re. client-side obliterate.

Mark Phippard wrote:
> Was there a specific use-case or reason that has caused this to be
> pushed down to the client layers?  I thought we were planning for
> obliterate to be an svnadmin function that required physical access to
> the repository.

Certainly there is a real-world reason to want it available client side.
Imagine our source code being hosted in the big Apache repository, and
us asking their admins once a year or so to obliterate some huge mistake
that we've accidentally committed. We rarely make really huge commit
mistakes so the admins would probably tolerate us. Now imagine a similar
shared repository with lots of projects with diverse and non-VC-aware
users, and imagine there are frequent mistakes in which a user commits a
copy of their hard drive root or something. In such a repository it is
not really scalable for the central admins to attend to such requests,
and it should be delegated to per-project admins.

But note that I'm approaching it this way in order to develop a good and
extensible design, and am not necessarily going to release it initially
with a client-side interface (even at the network API level). That's
still under consideration. I realized that if I designed it initially
without consideration for on-line operation and without using the normal
API layers, the work would not be re-usable when we want to extend it to
client-side operation in the future.

>  If we are planning to allow any client to obliterate
> revisions, what sort of access control are we planning?  Just
> something with hook scripts, similar to how we treat revision property
> changes?

and Ivan Zhakov wrote:
> It's slightly off-topic, but I'd like to see obliterate functionality
> as an administrative task only, i.e. "svnadmin obliterate". Currently,
> even if authorization is disabled, administrators can be sure that
> nobody can remove anything from the repository. But if you add
> "svn obliterate", many administrators will be forced to enable
> authorization and check their permissions. Also, you will end up with
> a special permission for obliterating files, which adds unnecessary
> complexity.

If we do make a client-side "svn obliterate" command, we will have a
server-side configuration option to enable it, that will be OFF by
default. If the administrator does not enable it, it will not be
possible for any client (even a specially modified client) to do
obliteration. So we won't force administrators to enable authorization
if they don't want this.

If we do make a client-side "svn obliterate" command, we will need an
appropriate form of authentication for it. This could be as simple as
assigning the permission to certain user names, or much more complex. It
could be built-in or done by a pre-obliterate hook. I have not started
to design this, and ideas are welcome.
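
As one possible shape for the simplest variant (a server-side switch that
is OFF by default, plus a list of permitted user names), consider the
sketch below. The toy_server_config_t structure, its field names and the
function are hypothetical and exist only for this example; no such
Subversion configuration option is implied.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical server-side settings; not a real Subversion structure. */
typedef struct toy_server_config_t {
  int obliterate_enabled;           /* 0 = OFF, which is the default */
  const char *const *allowed_users; /* NULL-terminated list, may be NULL */
} toy_server_config_t;

/* What an administrator gets without touching anything. */
const toy_server_config_t toy_default_config = { 0, NULL };

/* Return 1 if USER may obliterate under CONFIG, else 0.
   The check runs on the server, so a modified client cannot bypass it. */
int toy_obliterate_permitted(const toy_server_config_t *config,
                             const char *user)
{
  const char *const *u;

  if (!config->obliterate_enabled)
    return 0; /* feature switched off: nobody may obliterate */
  if (user == NULL || config->allowed_users == NULL)
    return 0; /* anonymous access, or no one has been granted rights */
  for (u = config->allowed_users; *u != NULL; u++) {
    if (strcmp(*u, user) == 0)
      return 1;
  }
  return 0;
}
```

A pre-obliterate hook could replace the built-in user list: the server
would run the hook and treat a non-zero exit as "denied", much as it
does for revision property changes.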

- Julian

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=2404012

Re: Obliterate - call graph, esp. repos and FS layers

Posted by Ivan Zhakov <iv...@visualsvn.com>.

On Mon, Oct 5, 2009 at 9:10 PM, Julian Foad <ju...@btopenworld.com> wrote:
> These last few days I've been working down through the layers, looking
> at a call graph something like this:
>
>
> "svn obliterate URL@REV"
It's slightly off-topic, but I'd like to see obliterate functionality
as an administrative task only, i.e. "svnadmin obliterate". Currently,
even if authorization is disabled, administrators can be sure that
nobody can remove anything from the repository. But if you add
"svn obliterate", many administrators will be forced to enable
authorization and check their permissions. Also, you will end up with
a special permission for obliterating files, which adds unnecessary
complexity.

-- 
Ivan Zhakov
VisualSVN Team

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=2403986

Re: Obliterate - call graph, esp. repos and FS layers

Posted by Mark Phippard <ma...@gmail.com>.

On Mon, Oct 5, 2009 at 1:10 PM, Julian Foad <ju...@btopenworld.com> wrote:
> These last few days I've been working down through the layers, looking
> at a call graph something like this:
>
>
> "svn obliterate URL@REV"
>  calls
> svn_cl__obliterate()
>  calls
> svn_client_obliterate()
>  calls
> svn_ra_obliterate()
>  calls
> svn_ra__vtable_t:obliterate_path_rev()
>  calls
> svn_ra_local__obliterate_path_rev()
>  calls
> svn_repos_obliterate_path_rev()
>  calls
> svn_fs_begin_obliteration_txn()
> svn_fs_commit_obliteration_txn()
>  calls
> svn_fs_t::begin_obliteration_txn()
> svn_fs_txn_t::commit_obliteration_txn()
>  calls
> svn_fs_fs__begin_obliteration_txn()
> svn_fs_fs__commit_obliteration_txn()
>
>
> Most of these calls will do little but forward the request on down the
> chain. Somewhere in the upper layers will be the logic that decides
> which node-revs to obliterate, based on what the client wants, and
> decides how many calls to make to the lower layers based on how
> "powerful" the lower APIs are.
>
> At the moment I'm not thinking about the high-level logic, but instead
> looking at how the repos layer downwards will modify one revision.
>
>
> # repos layer
>
> svn_repos_obliterate_path_rev(path-in-repos, rev)
>  // For now, let's just suppose that the API is only to be capable of
>  // removing one path from one rev.
>
> # FS layer
>
> svn_fs_begin_obliteration_txn(rev)
> svn_fs_commit_obliteration_txn(rev)
>  // Special versions of the normal-txn "begin" and "commit" APIs.
>  // I've yet to discover whether the standard APIs for modifying
>  // the contents of a transaction are sufficient to use between
>  // these "begin" and "commit" calls.
>
> svn_fs_t::begin_obliteration_txn(rev)
> svn_fs_txn_t::commit_obliteration_txn(rev)
>
> # FSFS
>
> svn_fs_fs__begin_obliteration_txn(rev)
>  // Like svn_fs_fs__begin_txn() but with no special rev-props (no need
>  // to change the "date" property, for example).
>
> svn_fs_fs__commit_obliteration_txn(rev)
>  // Where the bulk of the new work will be.
>
>
> All of these APIs take a revision number as a parameter, and replace the
> specified revision with a modified version of itself.

Was there a specific use-case or reason that has caused this to be
pushed down to the client layers?  I thought we were planning for
obliterate to be an svnadmin function that required physical access to
the repository.  If we are planning to allow any client to obliterate
revisions, what sort of access control are we planning?  Just
something with hook scripts, similar to how we treat revision property
changes?

-- 
Thanks

Mark Phippard
http://markphip.blogspot.com/

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=2403819

Re: Obliterate - call graph, esp. repos and FS layers

Posted by Julian Foad <ju...@btopenworld.com>.

On Mon, 2009-10-05, Branko Čibej wrote:
> Julian Foad wrote:
> > These last few days I've been working down through the layers, looking
> > at a call graph something like this:
> >   
> [...]
> > svn_fs_begin_obliteration_txn()
> >   
> [...]
> 
> > All of these APIs take a revision number as a parameter, and replace the
> > specified revision with a modified version of itself.
> 
> Given that obliterate does everything differently from a normal
> transaction, do you really need an obliteration txn on the API level?
> You could really have a single svn_fs_obliterate(_path_rev) that handles
> the underlying txn management. Unless you're thinking ahead to issuing
> several obliterate commands in a single transaction; but in that case
> your whole call chain would have to be different, at least up to the RA
> layer.

That's an interesting question. Certainly there has to be a
"transaction" in the implementation in terms of "safe concurrent
changes". Does it need to be an "FS transaction" in the sense of
creating a new (replacement) revision? Well, here I'm still exploring
but it looks like this "FS transaction" model is a good fit for at least
part of the problem space. Some obliteration use cases are basically
within one revision, and I think we can serialize an "obliteration
commit" within one revision in the sequence of mostly "normal" commits
and maintain integrity while doing so.

See
<http://svn.collab.net/repos/svn/trunk/notes/obliterate/design-repos.html>. (I'll develop this explanation further.)

Does it need to be exposed in an API? I think the logic that decides
what changes need to be made within one revision should be higher level
than the fundamental "apply a change" API. At the moment it seems likely
that re-using the existing transaction-start and transaction-modify and
transaction-end framework is the most productive route. It depends
partly on how suitable the existing transaction-modify APIs are for an
obliteration transaction.

Is this the right API? The use case it doesn't cover efficiently is
"obliterate PATH from revisions 1 to 500000". One avenue to explore is
"obliterate the path history segment identified by the peg PATH@500000".
That might be a sufficiently large granularity to efficiently accomplish
an obliteration across a large revision range.

So I might well end up with a different form of transaction.

> Another question pops up: how do you obliterate a whole revision? The
> use case being, e.g., to undo the last commit. Would that be "svnadmin
> obliterate / HEAD"? How do we handle obliteration of directories in
> general -- I only recall files being mentioned.

Directories:

<http://svn.collab.net/repos/svn/trunk/notes/obliterate/design-repos.html> should help to explain it. The model of obliteration I'm using is per node, and applies equally to a directory-tree as to a file. It says, "Remove these nodes (files and/or directory trees) ... from revision R1, and these nodes ... from revision R2, and ...".

Obliterating a whole revision:

I am considering two models of obliteration. Both are per node-rev. The
model I call "dd1" is "delete PATH@REV...". The model I call "cc1" is
"undo the changes to PATH@REV", leaving the content of the PATH@REV+1
unchanged so that, after obliterating PATH@50, the changes originally
made in PATH@50 and PATH@51 will be combined in the change PATH@51.

In model "dd1", the way to undo a whole revision HEAD=r50 would be to
create a new revision r51 the way you want it and then obliterate
selected nodes from r50. (This would leave gaps in the history of any
affected nodes that already existed in r49 and still exist in r51.)

In model "cc1", the way to undo a whole revision HEAD=r50 would be
"obliterate ^/@50", which would replace the old r50 with a new r50 whose
file and directory tree content is an identical copy of r49's. (So the
change recorded by the new r50 is a null change.)
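
The difference between the two models can be illustrated with a toy
history of a single path, where each revision's content is reduced to one
integer and a "change" is the difference from the previous revision. The
function names, the TOY_ABSENT marker and the integer representation are
assumptions for this sketch only.

```c
#include <assert.h>

#define TOY_ABSENT (-1) /* the path does not exist in that revision */

/* Model "dd1": delete PATH@REV, leaving a gap in the path's history. */
void toy_obliterate_dd1(int *content, int n_revs, int rev)
{
  assert(rev >= 0 && rev < n_revs);
  content[rev] = TOY_ABSENT;
}

/* Model "cc1": undo the change made in REV, so REV looks like REV-1.
   Later revisions keep their content, so the change recorded in REV+1
   silently absorbs the change that used to be in REV. */
void toy_obliterate_cc1(int *content, int n_revs, int rev)
{
  assert(rev >= 0 && rev < n_revs);
  content[rev] = (rev > 0) ? content[rev - 1] : TOY_ABSENT;
}

/* The change recorded by REV; assumes the path exists in REV-1 and REV. */
int toy_change_in(const int *content, int rev)
{
  return content[rev] - content[rev - 1];
}
```

For example, with contents 10, 15, 18 in three consecutive revisions,
obliterating the middle one under "cc1" leaves 10, 10, 18: the middle
revision records a null change and the last one records +8, the sum of
the original +5 and +3. Under "dd1" the middle entry becomes absent and
its neighbours are untouched.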

I do not intend to provide a way to remove a revision number from the
sequence of revision numbers, not even the head revision, neither by
renumbering subsequent revisions nor leaving a gap in the numbers. There
are several reasons why doing so would be troublesome, and I do not
think there are any use cases in which this would be particularly
useful.

- Julian

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=2404028

Re: Obliterate - call graph, esp. repos and FS layers

Posted by Branko Cibej <br...@xbc.nu>.

Julian Foad wrote:
> These last few days I've been working down through the layers, looking
> at a call graph something like this:
>   
[...]
> svn_fs_begin_obliteration_txn()
>   
[...]

> All of these APIs take a revision number as a parameter, and replace the
> specified revision with a modified version of itself.
>
> Thoughts?
>   

Given that obliterate does everything differently from a normal
transaction, do you really need an obliteration txn on the API level?
You could really have a single svn_fs_obliterate(_path_rev) that handles
the underlying txn management. Unless you're thinking ahead to issuing
several obliterate commands in a single transaction; but in that case
your whole call chain would have to be different, at least up to the RA
layer.

Another question pops up: how do you obliterate a whole revision? The
use case being, e.g., to undo the last commit. Would that be "svnadmin
obliterate / HEAD"? How do we handle obliteration of directories in
general -- I only recall files being mentioned.

--  Brane

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=2403870