You are viewing a plain text version of this content. The canonical link for it is here.

Posted to dev@subversion.apache.org by "C. Michael Pilato" <cm...@collab.net> on 2004/08/05 13:17:16 UTC

Re: Database Schema for Commit Logs

Just adding James Henstridge to the Cc: list, who just recently wrote
the ViewCVS/Subversion/MySQL commit database stuffs.  James, can you
field this man's questions?

Mark Phippard <Ma...@softlanding.com> writes:

> I want to create a database schema where I will log all of my commits, via 
> a hook script, and perhaps also capture and store other statistics in the 
> process such as diff stats.  My ultimate goal is to drive a 
> ViewCVS/WebSVN-type web application written in Java/JSP.  Ideally, I would 
> like to have the web app work completely off the database, as opposed to 
> talking to the actual repository.  Exception would be pages that display 
> file contents etc...
> 
> I am curious if someone has already done some of this, particularly the 
> database schema design.  In particular, I am curious as to how best to 
> design the schema to make it easy to build the kind of repository/browse 
> and drill-down features as contained in ViewCVS.  I have looked at it's 
> SQL database but I am unclear as to how it is actually used by the web 
> interface.  I didn't see how the schema could support representing a 
> repository layout, as I saw nothing that indicated the relationships 
> between directories in the repository.
> 
> It would be easy to design a repository that just stored the commit info, 
> but ideally, I would like the file info from the commits to be stored in a 
> way that would make it easy to build a repository browser off of the info.
> 
> So, has anyone done this before?



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: Database Schema for Commit Logs

Posted by James Henstridge <ja...@jamesh.id.au>.

On 08/08/04 02:20, C. Michael Pilato wrote:

>In other words, you'd need to practically duplicate the entire
>repository sans file contents. :-)
>  
>
Yep.  Is there any reason why this repository browser needs to be run 
off a database instead of the repository itself?

With the "fsfs" repository backend, it should be a lot easier to provide 
direct repository access to the web server, since read only access is 
sufficient.  That should greatly reduce the security concerns about 
providing a web based repository browser (if the repository browser 
can't write to the repository, it can't damage it).

James.

-- 
Email: james@jamesh.id.au
WWW:   http://www.jamesh.id.au/

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: Database Schema for Commit Logs

Posted by "C. Michael Pilato" <cm...@collab.net>.

James Henstridge <ja...@jamesh.id.au> writes:

> I suppose if I was designing something like this that needed to be
> able  to get directory listings and file logs, the database schema
> would probably need a table containing entries for each file/directory
> in each revision with at least these fields (you might create unique
> ids for some of these fields and store the actual values in other
> tables):

In other words, you'd need to practically duplicate the entire
repository sans file contents. :-)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: Database Schema for Commit Logs

Posted by James Henstridge <ja...@jamesh.id.au>.

On 05/08/04 21:17, C. Michael Pilato wrote:

>Just adding James Henstridge to the Cc: list, who just recently wrote
>the ViewCVS/Subversion/MySQL commit database stuffs.  James, can you
>field this man's questions?
>  
>
I'll give it a go, although the viewcvs query support really only stores 
enough information to perform its queries.  The Bonsai schema is 
probably not sufficient for a full repository browser though (or at 
least, the subversion repository would provide a better backend).

Also, the Bonsai schema was designed for use with CVS, so doesn't store 
any information about copies or moves.  Also, it doesn't offer an easy 
way to do things like get a directory listing for a particular revision.

>Mark Phippard <Ma...@softlanding.com> writes:
>
>  
>
>>I want to create a database schema where I will log all of my commits, via 
>>a hook script, and perhaps also capture and store other statistics in the 
>>process such as diff stats.  My ultimate goal is to drive a 
>>ViewCVS/WebSVN-type web application written in Java/JSP.  Ideally, I would 
>>like to have the web app work completely off the database, as opposed to 
>>talking to the actual repository.  Exception would be pages that display 
>>file contents etc...
>>
>>I am curious if someone has already done some of this, particularly the 
>>database schema design.  In particular, I am curious as to how best to 
>>design the schema to make it easy to build the kind of repository/browse 
>>and drill-down features as contained in ViewCVS.  I have looked at it's 
>>SQL database but I am unclear as to how it is actually used by the web 
>>interface.  I didn't see how the schema could support representing a 
>>repository layout, as I saw nothing that indicated the relationships 
>>between directories in the repository.
>>    
>>
I suppose if I was designing something like this that needed to be able  
to get directory listings and file logs, the database schema would 
probably need a table containing entries for each file/directory in each 
revision with at least these fields (you might create unique ids for 
some of these fields and store the actual values in other tables):

    * revision number
    * file path
    * type (normal file or directory)
    * parent path (so you can easily get a list of the contents of a
      directory)
    * a boolean field stating whether the file contents changed in this
      revision
    * another boolean, for the properties
    * the name of the file in the previous revision (maybe set to NULL
      if the file wasn't copied).

You'd probably also want a table containing revision information.  It 
could just hold (revision number, date, log message, author).

To list a particular directory, simply select for the appropriate 
revision number and parent path.  To get a file log select for a 
matching file path and records where the text or properties change, and 
join with the revision info table for the log messages.  You'd probably 
want to discard any rows after the first copy though (you could repeat 
the process using the old path name to get the logs from before the copy).

I've probably forgotten one or two things here, but it should get you 
started.  If you don't mind programming, you might want to look at 
viewcvs's "svndbadmin" hook script that extracts information needed by 
the Bonsai db schema for the revision.  It also includes some code to 
work out the "lines added/removed" counts that you might find useful.

James.

-- 
Email: james@jamesh.id.au
WWW:   http://www.jamesh.id.au/

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org