Posted to dev@subversion.apache.org by "Glenn A. Thompson" <gt...@cdr.net> on 2002/05/30 16:34:22 UTC

Re: svnadmin deltify that evil thing (SQL internal deltification branch, Alpha company)

>

Hey,

>
>
> Subversion is complex.
> Lather, rinse repeat. :)
>

Or the newbie variation:
What the hell have I gotten myself into.
Lather, rinse, repeat :-)

A new thread ?!?

OK, so this SQL implementation is going to be a little harder for me than I
thought.  No problemo; I will keep working on it.
Anyway, all these deltification discussions have made me feel I should bring
to light something I've been playing with.
As I'm still a Subversion Feebe, I would have waited until I better
understood Subversion if it weren't for this thread.

Anyway,
For a SQL scheme, it seems that having directories and properties rolled up
into a delta ball makes ad hoc queries difficult at best.
So I'm kicking around the idea of storing list data in one of two
additional tables.  I'm not happy with the names, but for now who cares.

Hanging off the Reps table (perhaps Revs?) are collectionAction (for
properties) and nodeCollectionAction (for directories).  Think of a row in
these tables as a method invocation on a list object.  Two methods would be
supported: "add" and "delete".

The tables look something like this:

collectionAction (
    coll_id     largenumber,
    action      set("add","delete"),
    item_name   text,
    item_value  text
)

nodeCollectionAction (
    coll_id     largenumber,
    action      set("add","delete"),
    chld_nod    largenumber,
    chld_cpy    largenumber,
    chld_txn    largenumber
)
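
To make the two tables above concrete, here is a minimal sketch using Python's
sqlite3 module.  The mapping is an assumption, not part of the proposal:
"largenumber" becomes INTEGER, and because SQLite has no set() type, the
action column is TEXT with a CHECK constraint.  The create_schema helper and
the NOT NULL choices are likewise only illustrative.

```python
import sqlite3

def create_schema(conn):
    """Create the two proposed list-action tables (illustrative mapping)."""
    conn.executescript("""
        CREATE TABLE collectionAction (
            coll_id    INTEGER NOT NULL,   -- "largenumber" in the sketch above
            action     TEXT NOT NULL CHECK (action IN ('add', 'delete')),
            item_name  TEXT NOT NULL,
            item_value TEXT
        );
        CREATE TABLE nodeCollectionAction (
            coll_id    INTEGER NOT NULL,
            action     TEXT NOT NULL CHECK (action IN ('add', 'delete')),
            chld_nod   INTEGER NOT NULL,
            chld_cpy   INTEGER NOT NULL,
            chld_txn   INTEGER NOT NULL
        );
    """)

conn = sqlite3.connect(":memory:")
create_schema(conn)

# One "method invocation" on a property list: add svn:ignore = *.o
conn.execute(
    "INSERT INTO collectionAction VALUES (?, ?, ?, ?)",
    (1, "add", "svn:ignore", "*.o"),
)
```

The CHECK constraint stands in for set("add","delete"): any other action
value is rejected at insert time.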


There also needs to be a flag somewhere which indicates that a particular
node-rev is a complete rev with respect to the list.  The moral equivalent of
full text, if you will.  As such, it would contain only add rows for a given
data_id.  These "baselines" could be manufactured anywhere throughout a
list's history.  I also considered having a "has" action, which would make
this more obvious.

So the current items in a list would be derived by applying "adds" and
"deletes" to a complete rev using SQL statements.  I assumed that I could use
the node's predecessor to go back in time.  However, based on what I'm
reading, that would be badness.  So should I have it explicitly based on a
rep_id and so on and so on?  That seems to track more closely with deltas.
This would also allow me to use a "group by" clause using MAX(coll_id) to
resolve "add" or "delete" conflicts involving the same node-rev for
nodeCollectionAction or item_name in collectionAction.
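
A rough sketch of that derivation, in Python with sqlite3, under assumptions
that are mine and not settled above: every row in the table belongs to one
list, coll_id grows along that list's history, and the baseline is simply the
coll_id whose rows are all "add".  The current_items function and the sample
data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE collectionAction (
        coll_id    INTEGER NOT NULL,
        action     TEXT NOT NULL CHECK (action IN ('add', 'delete')),
        item_name  TEXT NOT NULL,
        item_value TEXT
    )
""")
conn.executemany(
    "INSERT INTO collectionAction VALUES (?, ?, ?, ?)",
    [
        (1, "add", "a", "1"),      # coll_id 1 is the baseline: adds only
        (1, "add", "b", "2"),
        (2, "delete", "a", None),
        (2, "add", "c", "3"),
        (3, "add", "a", "9"),      # "a" comes back with a new value
        (3, "delete", "b", None),
    ],
)

def current_items(conn, baseline_id, target_id):
    """Items in the list at target_id, replaying actions from baseline_id.

    For each item_name the action with the highest coll_id in range wins;
    the item is present only if that winning action is an 'add'.
    """
    return conn.execute(
        """
        SELECT a.item_name, a.item_value
        FROM collectionAction AS a
        JOIN (SELECT item_name, MAX(coll_id) AS last_id
              FROM collectionAction
              WHERE coll_id BETWEEN ? AND ?
              GROUP BY item_name) AS latest
          ON a.item_name = latest.item_name
         AND a.coll_id = latest.last_id
        WHERE a.action = 'add'
        ORDER BY a.item_name
        """,
        (baseline_id, target_id),
    ).fetchall()

print(current_items(conn, 1, 3))
```

The same shape should carry over to nodeCollectionAction by grouping on the
child key columns instead of item_name.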

These hierarchical queries are quite uuuuuuugggly without "connect by".
Does anyone know of any DB besides Oracle that has it?  Can you say temp
tables?
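
For what the temp-table route might look like, here is a small sketch in
Python with sqlite3.  Everything in it is hypothetical: the dir_entry table
(parent_id, child_id) is a stand-in for however parent/child links end up
being stored, and descendants is just an illustrative name.  It walks the
tree one level per statement instead of relying on "connect by".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical parent/child link table, not part of the proposal above.
conn.execute("CREATE TABLE dir_entry (parent_id INTEGER, child_id INTEGER)")
conn.executemany(
    "INSERT INTO dir_entry VALUES (?, ?)",
    [(1, 2), (1, 3), (2, 4), (4, 5)],
)

def descendants(conn, root_id):
    """Return (node_id, depth) for root_id and everything below it."""
    conn.execute(
        "CREATE TEMP TABLE reached (node_id INTEGER PRIMARY KEY, depth INTEGER)"
    )
    conn.execute("INSERT INTO reached VALUES (?, 0)", (root_id,))
    depth = 0
    while True:
        # Add the children of every node found at the current depth.
        # OR IGNORE skips nodes already reached, so a cycle cannot loop forever.
        cur = conn.execute(
            """
            INSERT OR IGNORE INTO reached (node_id, depth)
            SELECT e.child_id, ?
            FROM dir_entry AS e
            JOIN reached AS r ON r.node_id = e.parent_id
            WHERE r.depth = ?
            """,
            (depth + 1, depth),
        )
        if cur.rowcount == 0:   # nothing new at this level: done
            break
        depth += 1
    rows = conn.execute(
        "SELECT node_id, depth FROM reached ORDER BY depth, node_id"
    ).fetchall()
    conn.execute("DROP TABLE reached")
    return rows

print(descendants(conn, 1))
```

It costs one round trip per level of the hierarchy, which is the ugly part.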

Does anyone have any major grief with the above chit?

Thanks,
gat




