Posted to users@subversion.apache.org by Dave_Thomas mailing lists <da...@peoplemerge.com> on 2006/06/09 02:19:01 UTC

Maintaining a list of files that have this one property

Hi all,

I'm new to the subversion mailing list, sorry in advance if this belongs on
the developer list.

I have a constraint on subversion that files across the repository
containing a certain property must have unique names (this is a business
rule for our app... don't ask).

For example, suppose Trunk/src/foo.txt has the property must-be-unique=*.  I
have a hook in place that prevents Trunk/conf/foo.txt, having the same
property, from being committed.

The problem?  The hook now makes this call:
pysvn.Client().ls(reposurl, recurse = True)
then traverses to find its friends.

Our repository is soon to grow to 20k files, and that call alone will take
minutes.

I couldn't find another function that implemented something like UNIX 'find'
checking properties as well.

The best solution I can come up with is to maintain a list of all files that
already have that property, but I have some questions about how to do this in
a hook.

Here's the idea.  On every commit transaction, I would read in a flatfile
list, or better yet, a pickle containing an array.  Then, if I need to add
an entry, I would somehow add the pickle to the transaction (so changing it
is atomic), and reserialize it.

Would I need to be creating a subtransaction (does that exist)?  Or is this
something as easy as attaching a couple of lines of DIFF to the
transaction?  (I feel like a homeless guy holding a sign "any spare
code appreciated. God bless")

Here's an alternative but I think you're going to say this is a bad
idea.   Doesn't subversion use a database internally?  Isn't it
transactional?  I could keep a table with all files with this property.  Or
maybe manage an index on this property or create a view on the
database? Will I be able to do this within a hook or do I need to go deep
into the subversion API to support this kind of thing?

Your thoughts on the most elegant solution would be appreciated!
Dave Thomas

Re: Maintaining a list of files that have this one property

Posted by Ryan Schmidt <su...@ryandesign.com>.
On Jun 9, 2006, at 04:19, Dave_Thomas mailing lists wrote:

> I have a constraint on subversion that files across the repository  
> containing a certain property must have unique names (this is a  
> business rule for our app... don't ask).
>
> For example, suppose Trunk/src/foo.txt has the property
> must-be-unique=*.  I have a hook in place that prevents
> Trunk/conf/foo.txt, having the same property, from being committed.
>
> The problem?  The hook now makes this call:
> pysvn.Client().ls(reposurl, recurse = True)
> then traverses to find its friends.
>
> Our repository is soon to grow to 20k files, and that call alone  
> will take minutes.

Yes, I can see how that would be a problem.


> I couldn't find another function that implemented something like  
> UNIX 'find' checking properties as well.

Yes, Subversion has no find command (yet; I'm not sure if one is  
planned or, if so, how far along it is).


> The best solution I can come up with is to maintain a list of all
> files that already have that property, but I have some questions
> about how to do this in a hook.
>
> Here's the idea.  On every commit transaction, I would read in a  
> flatfile list, or better yet, a pickle containing an array.  Then,  
> if I need to add an entry, I would somehow add the pickle to the  
> transaction (so changing it is atomic), and reserialize it.

I don't know what a pickle is in this context, but I could understand  
the flatfile solution, and agree that this should be faster than  
having Subversion list all files at each commit.
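
One way to avoid listing the whole repository is for the pre-commit hook to
look only at the paths touched by the incoming transaction. The sketch below
is an assumption about how that could be done with the svnlook command-line
tool rather than pysvn; the function names are hypothetical, and it assumes
that "svnlook propget" exits non-zero when the property is absent.

```python
import subprocess


def parse_changed(output):
    """Parse 'svnlook changed' output into a list of file paths.

    Each line holds three status columns, one space, then the path.
    Deletions and directories (paths ending in '/') are skipped.
    """
    paths = []
    for line in output.splitlines():
        if len(line) < 5:
            continue
        status, path = line[:3], line[4:]
        if status[0] == "D" or path.endswith("/"):
            continue
        paths.append(path)
    return paths


def changed_paths(repos, txn):
    """Return the files added or modified by transaction `txn`."""
    result = subprocess.run(
        ["svnlook", "changed", "-t", txn, repos],
        check=True, capture_output=True, text=True,
    )
    return parse_changed(result.stdout)


def has_property(repos, txn, path, propname="must-be-unique"):
    """True if `path` carries `propname` in the transaction.

    Assumption: svnlook exits non-zero when the property is not set.
    """
    result = subprocess.run(
        ["svnlook", "propget", "-t", txn, repos, propname, path],
        capture_output=True, text=True,
    )
    return result.returncode == 0
```

A pre-commit hook receives the repository path and transaction name as its
two arguments, so it could call changed_paths(repos, txn) and then keep only
the paths for which has_property() is true.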


> Would I need to be creating a subtransaction (does that exist)?  Or  
> is this something as easy as attaching a couple of lines of DIFF to  
> the transaction?  (I feel like a homeless guy holding a sign "any  
> spare code appreciated. God bless")

You should never modify a transaction that's in progress (as in, in a  
pre-commit hook). There is no sub-transaction that I'm aware of. If  
you want to commit a change to a repository (in a post-commit hook,  
for example) you can certainly do so, but your post-commit hook would  
have to maintain its own working copy, update it, make the change in  
it, and commit it, making sure that the post-commit hook then does  
not attempt to act on the transaction it is itself in the process of  
committing.

I'm not sure what you want to commit though. So far, everything  
you've described is just a pre-commit hook that either allows or  
denies the user's commit, via some algorithm, depending on whether  
the filename is unique, given the presence of the must-be-unique  
property. What are you wanting to commit at this point? Are you  
talking about the list of filenames? That could be useful, but it  
should also work (more simply) to just have the flatfile somewhere on  
the repository server's hard drive.
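
As a rough illustration of the flatfile approach, the sketch below keeps one
repository path per line in a file on the server and checks candidate paths
against it by file name. The file layout and all function names are
assumptions made for this example, not anything Subversion provides.

```python
import os
import posixpath
import tempfile


def load_index(index_file):
    """Read one repository path per line; return {file name: path}."""
    index = {}
    if not os.path.exists(index_file):
        return index
    with open(index_file) as fh:
        for line in fh:
            path = line.strip()
            if path:
                index[posixpath.basename(path)] = path
    return index


def find_conflicts(index, candidate_paths):
    """Return (new_path, existing_path) pairs whose file names collide.

    Candidates are also checked against each other, so two new files
    with the same name in one commit are reported as well.
    """
    seen = dict(index)
    conflicts = []
    for path in candidate_paths:
        name = posixpath.basename(path)
        existing = seen.get(name)
        if existing is not None and existing != path:
            conflicts.append((path, existing))
        else:
            seen[name] = path
    return conflicts


def save_index(index_file, index):
    """Write to a temporary file, then rename it over the old index,
    so a concurrent reader never sees a half-written file."""
    directory = os.path.dirname(os.path.abspath(index_file))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as fh:
        for path in sorted(index.values()):
            fh.write(path + "\n")
    os.replace(tmp, index_file)
```

In this arrangement the pre-commit hook would only call load_index() and
find_conflicts() and reject the commit if any conflict is returned, while
the post-commit hook would call save_index() once the commit has succeeded.
Two commits racing each other could still both pass the check, so some form
of locking around the file would be needed in practice.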


> Here's an alternative but I think you're going to say this is a bad  
> idea.   Doesn't subversion use a database internally?  Isn't it  
> transactional?  I could keep a table with all files with this  
> property.  Or maybe manage an index on this property or create a  
> view on the database? Will I be able to do this within a hook or do  
> I need to go deep into the subversion API to support this kind of  
> thing?

Subversion can use BerkeleyDB or FSFS as a backend. BDB is of course
a general-purpose database system, which can be used for anything,
but I'm not sure how good an idea it is to try to use a Subversion  
repository BDB database for your own purposes. FSFS is a format  
developed specifically for Subversion and I wouldn't want to try to  
store arbitrary data in that either.

If a flatfile for these filenames is not fast enough or you dislike  
it for other reasons, there's no reason why you couldn't set up your  
own relational database (MySQL or whatever) and use that. It would  
not be connected to Subversion, except by any connections you devise  
in your hook script, but does that matter?
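
A minimal sketch of the separate-database idea follows. It uses SQLite
through Python's standard sqlite3 module in place of MySQL, purely so the
example is self-contained; the table name, column names, and functions are
hypothetical. Making the file name the primary key lets the database itself
enforce uniqueness.

```python
import posixpath
import sqlite3


def open_index(db_path):
    """Open (and if needed create) the index database."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS unique_files ("
        " name TEXT PRIMARY KEY,"
        " path TEXT NOT NULL)"
    )
    return conn


def register(conn, path):
    """Record `path` under its file name.

    Returns None on success (or if the same path is already recorded),
    otherwise the existing path that already owns that file name.
    """
    name = posixpath.basename(path)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO unique_files (name, path) VALUES (?, ?)",
                (name, path),
            )
        return None
    except sqlite3.IntegrityError:
        row = conn.execute(
            "SELECT path FROM unique_files WHERE name = ?", (name,)
        ).fetchone()
        return None if row[0] == path else row[0]
```

The hook script would remain the only link between this database and the
repository, so it would also have to remove rows when a file is deleted or
loses the property.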


