Posted to users@subversion.apache.org by "Bolstridge, Andrew" <an...@intergraph.com> on 2009/03/09 11:13:00 UTC

deleting files from the repository

Hi all.

I have a large (several gigabyte) repository and unfortunately a remote
developer has checked in quite a few temporary build objects.

What I'd like to ask is what others do about such a problem. I've tried
dumping the repo and running svndumpfilter on it, but that complained
that I'd dumped the repo with the --deltas option (of course) and refused
to have anything to do with it. I'd try dumping the repo without deltas,
but the resulting dump is unmanageably huge.

Are there any plans for an improved dump/filter process, preferably one
that could remove several files in one go?
Are there any plans to implement svn obliterate? (I'm not sure it
belongs in the client; I think it would be nicer as an admin tool.)

Can I manually edit the dump file? If I wrote a tool that removed the
relevant sections, would it work? These particular files do not have any
branches/moves/renames etc. 


I've since updated the pre-commit hook to prevent this kind of problem,
but I live in fear that someone will upload a TIFF or a database file.
Wouldn't it be a good idea to set such a hook up by default for most
repositories? It wouldn't depend on newbie knowledge: the first time
someone checked in a banned file, they'd get a commit error message
telling them why, and they could alter the hook.

Thanks, Andy

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=1065&dsMessageId=1295413

To unsubscribe from this discussion, e-mail: [users-unsubscribe@subversion.tigris.org].

Re: deleting files from the repository

Posted by Ryan Schmidt <su...@ryandesign.com>.
On Mar 10, 2009, at 09:19, Bolstridge, Andrew wrote:

>> -----Original Message-----
>> From: Ryan Schmidt [mailto:subversion-2009a@ryandesign.com]
>> Sent: Tuesday, March 10, 2009 6:00 AM
>> To: Bolstridge, Andrew
>> Cc: users@subversion.tigris.org
>> Subject: Re: deleting files from the repository
>>
> ...
>>
>> While the dumpfile format is not complex, I'm sure it would take a
>> significant amount of work to write a tool that processes it
>> accurately. I would instead strongly recommend you look into the
>> existing tools that can modify dumpfiles, including svndumpfilter,
>> svndumptool, svndumpfilter2 and svndumpfilter3.
>>
> My problem is that I have roughly 100 files to remove, and my repo is
> very large - dumping it takes several hours and I was under the
> impression that svndumpfilter only excluded 1 (explicitly specified)
> file at a time.
> I'll have another go this weekend with svndumpfilter3.

My reading of the book is that you can exclude as many items as you  
want.

http://svnbook.red-bean.com/en/1.5/svn.ref.svndumpfilter.commands.c.exclude.html
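
A minimal sketch of that reading, in Python: svndumpfilter exclude takes
any number of path prefixes on one command line, so a list of unwanted
files can be turned into a single invocation. The helper name
exclude_command and the example paths are hypothetical and only serve as
an illustration.

```python
def exclude_command(path_prefixes):
    """Build one `svndumpfilter exclude` argv covering every given prefix."""
    # Drop blank entries and duplicates while keeping the original order.
    seen = set()
    cleaned = []
    for prefix in path_prefixes:
        prefix = prefix.strip()
        if prefix and prefix not in seen:
            seen.add(prefix)
            cleaned.append(prefix)
    if not cleaned:
        raise ValueError("at least one path prefix is required")
    return ["svndumpfilter", "exclude"] + cleaned


if __name__ == "__main__":
    # Hypothetical paths standing in for the roughly 100 build objects.
    unwanted = ["trunk/build/app.obj", "trunk/build/app.pdb"]
    print(" ".join(exclude_command(unwanted)))
```

The resulting list can be passed to subprocess as-is, with the dump
stream supplied on standard input.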


>>> Wouldn't it be a good idea to set such a hook up by default for
>>> most repositories - it wouldn't be a newbie-knowledge problem, the
>>> first time someone checked in a banned file, they'd get a commit-
>>> error message telling them why and they could alter the hook.
>>
>> What criteria would you suggest the hook script use to reject files?
>> Are you talking about just matching certain filename extensions,
>> perhaps the same list Subversion uses by default for global-ignores?
>
> Yes. An alternative would be to block every file, so that the first
> thing an admin would do is delete or modify the list of extensions. If
> they forgot, they'd be reminded the first time they tried to add a new
> file. The list could be extended dramatically: the current default set
> doesn't include many Windows development exclusions and will probably
> always be out of date, so exclude practically everything by default.

I don't think the devs will be keen on a modification that makes new  
repositories non-functional by default.

It would probably be fine to include a new hook script in the contrib  
directory showing how to do this kind of check. Then again, maybe the  
enforcer script can already be used to do this.

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=1065&dsMessageId=1306267

RE: deleting files from the repository

Posted by "Bolstridge, Andrew" <an...@intergraph.com>.
> -----Original Message-----
> From: Ryan Schmidt [mailto:subversion-2009a@ryandesign.com]
> Sent: Tuesday, March 10, 2009 6:00 AM
> To: Bolstridge, Andrew
> Cc: users@subversion.tigris.org
> Subject: Re: deleting files from the repository
> 
...
> 
> While the dumpfile format is not complex, I'm sure it would take a
> significant amount of work to write a tool that processes it
> accurately. I would instead strongly recommend you look into the
> existing tools that can modify dumpfiles, including svndumpfilter,
> svndumptool, svndumpfilter2 and svndumpfilter3.
> 
My problem is that I have roughly 100 files to remove, and my repo is
very large - dumping it takes several hours and I was under the
impression that svndumpfilter only excluded 1 (explicitly specified)
file at a time. 
I'll have another go this weekend with svndumpfilter3.


> 
> > I've since updated the pre-commit hook to prevent this kind of
> > problem, but I live in fear that someone will upload a tiff or a
> > database file.
> 
> If it's size in the repository you're concerned with, your pre-commit
> hook could check the size of the committed files and fail if they're
> too large.

I have since done this for file extensions. Restricting by file size
too is a good suggestion.

> 
> > Wouldn't it be a good idea to set such a hook up by default for
> > most repositories - it wouldn't be a newbie-knowledge problem, the
> > first time someone checked in a banned file, they'd get a commit-
> > error message telling them why and they could alter the hook.
> 
> What criteria would you suggest the hook script use to reject files?
> Are you talking about just matching certain filename extensions,
> perhaps the same list Subversion uses by default for global-ignores?
> 

Yes. An alternative would be to block every file, so that the first
thing an admin would do is delete or modify the list of extensions. If
they forgot, they'd be reminded the first time they tried to add a new
file. The list could be extended dramatically: the current default set
doesn't include many Windows development exclusions and will probably
always be out of date, so exclude practically everything by default.

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=1065&dsMessageId=1303250

Re: deleting files from the repository

Posted by Ryan Schmidt <su...@ryandesign.com>.
On Mar 9, 2009, at 06:13, Bolstridge, Andrew wrote:

> I have a large (several gigabyte) repository and unfortunately a  
> remote developer has checked in quite a few temporary build objects.
>
> What I’d like to ask is what others do about such a problem. I’ve  
> tried dumping the repo and running svndumpfilter on it, but that  
> complained that I’d dumped the repo with the --deltas option (of  
> course) and refused to have anything to do with it. I’d try dumping  
> the repo without deltas, but the resulting dump is unmanageably huge.
>
> Are there any plans for an improved dump/filter process, preferably  
> one that could remove several files in 1 go?
>

You don't need to write the huge dumpfile to disk. You can pipe the
output of svnadmin dump directly into svndumpfilter, and pipe its output
directly into svnadmin load to put the filtered history into a new
repository.
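
As a rough sketch of that pipeline in Python (the function name
filter_repository and its run parameter are hypothetical; run exists
only so the process launcher can be swapped out when trying the code
without a real repository). It assumes the destination repository has
already been created with svnadmin create.

```python
import subprocess


def filter_repository(src_repo, dst_repo, path_prefixes, run=subprocess.Popen):
    """Stream src_repo through svndumpfilter into dst_repo without a dump file.

    dst_repo must already exist (created with `svnadmin create`).
    """
    # No --deltas here: svndumpfilter cannot read deltified dumps.
    dump = run(["svnadmin", "dump", "--quiet", src_repo], stdout=subprocess.PIPE)
    filt = run(["svndumpfilter", "exclude"] + list(path_prefixes),
               stdin=dump.stdout, stdout=subprocess.PIPE)
    load = run(["svnadmin", "load", "--quiet", dst_repo], stdin=filt.stdout)
    # Close our copies of the pipe ends so an early exit downstream
    # reaches the upstream process as a broken pipe.
    dump.stdout.close()
    filt.stdout.close()
    # Wait for every stage and report the exit codes in pipeline order.
    return [proc.wait() for proc in (dump, filt, load)]
```

A call such as filter_repository("/svn/old", "/svn/new", ["trunk/build"])
(paths invented for the example) returns three exit codes; anything
other than three zeros means one of the stages failed and the new
repository should not be trusted.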


> Are there any plans for svn obliterate to be implemented (I’m not  
> sure of this as a client command, I think it should be an admin  
> tool, which would be nice)
>
Everything we know about "svn obliterate" is in this ticket:

http://subversion.tigris.org/issues/show_bug.cgi?id=516

> Can I manually edit the dump file? If I wrote a tool that removed  
> the relevant sections, would it work? These particular files do not  
> have any branches/moves/renames etc.
>

While the dumpfile format is not complex, I'm sure it would take a  
significant amount of work to write a tool that processes it  
accurately. I would instead strongly recommend you look into the  
existing tools that can modify dumpfiles, including svndumpfilter,  
svndumptool, svndumpfilter2 and svndumpfilter3.


> I’ve since updated the pre-commit hook to prevent this kind of  
> problem, but I live in fear that someone will upload a tiff or a  
> database file.

If it's size in the repository you're concerned with, your pre-commit  
hook could check the size of the committed files and fail if they're  
too large.
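
One way such a size check might look as a Python pre-commit hook. This
is a sketch under assumptions, not a tested production hook: the 10 MB
limit and all function names are invented for the example, and file
sizes are measured by streaming svnlook cat for each changed file, which
may be slow for very large commits.

```python
import subprocess
import sys

MAX_BYTES = 10 * 1024 * 1024  # hypothetical limit; pick one that suits the repository


def changed_files(svnlook_changed_output):
    """Return paths of non-deleted files listed in `svnlook changed` output."""
    paths = []
    for line in svnlook_changed_output.splitlines():
        # The first four columns hold the status; the path follows.
        status, path = line[:4].strip(), line[4:]
        # Skip deletions and directories (directory paths end with "/").
        if not path or status.startswith("D") or path.endswith("/"):
            continue
        paths.append(path)
    return paths


def oversized(sizes, limit=MAX_BYTES):
    """Return the paths in a {path: size_in_bytes} mapping that exceed limit."""
    return sorted(path for path, size in sizes.items() if size > limit)


def txn_file_size(repos, txn, path):
    """Measure a file in the pending transaction by streaming `svnlook cat`."""
    proc = subprocess.Popen(["svnlook", "cat", "-t", txn, repos, path],
                            stdout=subprocess.PIPE)
    total = 0
    for chunk in iter(lambda: proc.stdout.read(65536), b""):
        total += len(chunk)
    proc.wait()
    return total


def main(repos, txn):
    out = subprocess.check_output(["svnlook", "changed", "-t", txn, repos])
    paths = changed_files(out.decode("utf-8"))
    too_big = oversized({p: txn_file_size(repos, txn, p) for p in paths})
    if too_big:
        # Text on stderr is relayed to the committing client.
        sys.stderr.write("Commit rejected, files larger than %d bytes:\n" % MAX_BYTES)
        sys.stderr.write("".join("  %s\n" % p for p in too_big))
        return 1
    return 0


if __name__ == "__main__" and len(sys.argv) == 3:
    # Subversion calls the hook with the repository path and transaction name.
    sys.exit(main(sys.argv[1], sys.argv[2]))
```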

> Wouldn’t it be a good idea to set such a hook up by default for  
> most repositories – it wouldn’t be a newbie-knowledge problem, the  
> first time someone checked in a banned file, they’d get a commit- 
> error message telling them why and they could alter the hook.

What criteria would you suggest the hook script use to reject files?  
Are you talking about just matching certain filename extensions,  
perhaps the same list Subversion uses by default for global-ignores?
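
To make the filename-matching idea concrete, here is a small Python
sketch that tests file names against glob patterns in the style of
global-ignores. The pattern list is illustrative only and is not
Subversion's actual default; the function name banned_paths is likewise
an assumption made for the example.

```python
import fnmatch
import posixpath

# Illustrative patterns only; not Subversion's actual default global-ignores.
BANNED_PATTERNS = ["*.o", "*.obj", "*.pdb", "*.tiff", "*~"]


def banned_paths(paths, patterns=BANNED_PATTERNS):
    """Return the paths whose file name matches any banned glob pattern."""
    hits = []
    for path in paths:
        # Repository paths use "/" regardless of the server platform.
        name = posixpath.basename(path)
        # fnmatchcase keeps matching case-sensitive on every platform.
        if any(fnmatch.fnmatchcase(name, pat) for pat in patterns):
            hits.append(path)
    return hits
```

A pre-commit hook could feed this the paths reported by svnlook changed
for the pending transaction, then print any hits to stderr and exit
non-zero to reject the commit.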

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=1065&dsMessageId=1301295
