Posted to dev@subversion.apache.org by Greg Hudson <gh...@MIT.EDU> on 2004/03/31 20:34:07 UTC

FSFS: Plan of attack

I counted one vote for working on a branch (rooneg) and three for
working on the trunk (kfogel, mbk, and maxb), so barring objections, I
will create a trunk/subversion/libsvn_fs_fs directory soon.  Initially
it will have no integration with the build system and no way to hook
into the rest of the code base, but there will be reference materials
and some temporary standalone code which need a place to live.

I've come up with a high-level plan for the native-fs-backed-fs layer,
which I hope can facilitate incremental commits as well as parallel
work.  It goes thus:

  1. Develop a straw-man revision file format.  It's not critically
     important to get it right on the first try, but it should be
     something that we think can efficiently support the libsvn_fs
     read APIs, including the ones related to node history.

  2. Write code to turn a (standard, not diffy) svn dump into the
     proposed file format.  This code will not survive the project,
     but will allow rapid generation of test cases.  At this point,
     the code should store all node-rev contents as plain text,
     and it can do stupid things like hold the entire directory
     structure of the head revision in memory.  It doesn't need to
     work incrementally.

  3. Write FS code to read, but not write, the file format, under the
     assumption that node-revs contain only plain text.  Use the tool
     from (2) to generate test cases to ensure that the code works.

  4. Develop a format for the directory part of mutable transactions.

  5. Produce some test case transactions, either using an extension of
     the tool from (2) or by hand.

  6. Extend the read code from (3) to be able to read from unfinished
     transactions as well as from revision files.

  7. Write FS code to create and modify unfinished transactions.
     It can store node-rev contents in plain text.

  8. Write FS code to perform the auto-merge of an unfinished
     transaction and any revisions which have occurred since the
     transaction was created.

  9. Write FS code to commit an unfinished transaction by marshalling
     its changed-directory data onto the end of the changed-file data
     and moving the resulting file into place.

  10. Extend the tool from (2) to write out node-rev contents in delta
      form, possibly using the code from (3) to determine the base
      contents to diff against.

  11. Extend the read code from (3) to handle deltas.

  12. Extend the write code from (7) and (9) to generate deltas
      instead of plain text.  (Or do we use deltas for directories in
      the current code?  I forget.  It may be desirable not to.)

Not every step depends on the previous step, so not everything has to
proceed in order.  Also, steps (1) and (2) can happen in parallel with
FS abstraction work.  (Really, the whole project could happen in
parallel with FS abstraction work; we could create a branch which rips
out the current libsvn_fs and replaces it with new stuff.  But it
would be less convenient to work with and harder to integrate at the
end, so it's probably better to do most or all of the abstraction work
first.)
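
To make step (9) concrete, here is a minimal sketch of "append the
directory data, then move the file into place".  It is only an
illustration: the function name commit_txn, the file naming, and the
byte layout are hypothetical and not part of any agreed format, and
Python is used purely for brevity (the real code would be C).  The
idea it shows is that readers never see a half-written revision file,
because the rename is the last step.

```python
import os
import tempfile


def commit_txn(txn_file_data: bytes, txn_dir_data: bytes,
               revs_dir: str, new_rev: int) -> str:
    """Build a revision file and move it into place.

    Hypothetical layout: changed-file data first, with the
    changed-directory data marshalled onto the end.
    """
    # Write to a temporary file in the same directory so that the
    # final rename stays within one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=revs_dir, prefix="txn-")
    with os.fdopen(fd, "wb") as f:
        f.write(txn_file_data)
        f.write(txn_dir_data)
        f.flush()
        os.fsync(f.fileno())

    final_path = os.path.join(revs_dir, str(new_rev))
    # NOTE: this check-then-rename is not atomic; a real
    # implementation would need a write lock around it.
    if os.path.exists(final_path):
        os.unlink(tmp_path)
        raise FileExistsError(f"revision {new_rev} already exists")
    os.rename(tmp_path, final_path)
    return final_path
```

The sketch assumes the auto-merge of step (8) has already happened, so
new_rev is simply the next free revision number.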

Comments?

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: FSFS: Plan of attack

Posted by kf...@collab.net.
Josh Pieper <jp...@andrew.cmu.edu> writes:
> So my proposal is to create a branch, say called
> 'libsvn_fs_fs_abstraction' where this prototype implementation of
> libsvn_fs_fs can live.  Once we have it (or during the process of
> creating it), we decide on a good abstraction layer, then libsvn_fs_fs
> can be re-implemented to meet that abstraction.
> 
> Thoughts?

If no development would be happening on trunk libsvn_fs_fs while this
branch was being worked on, then don't bother to branch.  Just do that
ugly stuff on trunk.  We already know that subdir is prototype code.
If you later discover you *need* to branch, for parallel development,
then you can.

(Also, it'd be weird that the branch named 'libsvn_fs_fs_abstraction'
would in fact start out as a *non*-abstracted implementation :-) ).

Just my $0.02,
-K


Re: FSFS: Plan of attack

Posted by Edmund Horner <ch...@chrysophylax.cjb.net>.
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Greg Hudson wrote:
| On Sat, 2004-04-10 at 20:59, Edmund Horner wrote:

|>1. Doesn't the existence/level of a lower-level abstraction depend on the
|>implementation of a higher-level abstraction?
|
|
| Sorry, that was supposed to be implicit.  If we have abstraction layer
| (1), then layer (2) only applies to the implementations which follow the
| current implementation fork, and likewise if we have distinct
| abstractions at both (2) and (3) (as CMike wishes to avoid), only
| implementations which follow the current implementation fork at (2)
| would be participating in the abstraction at (3).

Ok fair enough then, I should have considered that everyone else might be
seeing it like that too.  But I guess (depending on how many ultimate
implementations there might be) you might get your level (2)
abstraction at a different level from a level (2) abstraction in another
implementation?  (Hmm, I'm getting a bit worried about how repository
format versions are going to be assigned.  Might get a bit like TCP/IP
port numbers.)

| Glenn Thompson wrote:
|
|>I'll finish the HTML formatting as soon as possible so folks can
|>modify the document if they wish.
|
|
| This document has the same problem as your "master patch": it's 31 pages
| long and hard to digest.  We really, really need your contributions to
| this process to be concise, focused, and objective.  If we're arguing
| about a particular aspect of FS abstraction and you have an opinion
| about it, we need you to give your opinion about just that one question,
| not say, "I've been working on this big document with diagrams over here
| and part of it talks about that; please go read it."

Ha ha, yes it is a big document, but then the file system is pretty
complicated.

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (MingW32)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFAegSTEbvImmpUq7gRAgy1AJ9wiX/Yh57X9lYpi1ifrfbMYVVvdQCeLe3m
HwFkEaqjsbr2uo81rk1idN4=
=HKhV
-----END PGP SIGNATURE-----


Re: FSFS: Plan of attack

Posted by "Glenn A. Thompson" <gt...@cdr.net>.
Greg,

First, I'd like to say thanks for working your way through my document.

Greg Hudson wrote:

>On Mon, 2004-04-12 at 00:16, C. Michael Pilato wrote:
>  
>
>>Having read Glenn's document in full in the past, I don't recall
>>feeling that its size was not warranted by the scope of the problem.
>>    
>>
>
>Well, the size wasn't my only criticism; I felt like it wandered a lot. 
>
Heh, you know I've never heard this about my writing before:-)  I've 
been told my writing is easier to follow after meeting me.  I'm a bit 
hyper and as such I shift gears quickly.

>I often had trouble telling when it was talking about the API-level
>vtable or about the FSP-level vtable within the baseline FSAP, and there
>were a lot of personal comments which tended to draw my attention
>astray.
>
I can see both of these points.  I'll take another stab at clarifying some
sections.  I wasn't happy with my discussion of the construction of the 
three primary objects either.  It's crucial to understanding the 
flexibility I'm trying to build in.

As for your remaining comments:  I've started a response but I need to 
sleep on it. 

I'll post back tomorrow with more.

Thanks again,
gat

Re: FSFS: Plan of attack

Posted by "Glenn A. Thompson" <gt...@cdr.net>.
[...]

>I often had trouble telling when it was talking about the API-level
>vtable or about the FSP-level vtable within the baseline FSAP, and there
>were a lot of personal comments which tended to draw my attention
>astray.
>
I can see both of these points.  I'll take another stab at clarifying 
some sections.  I wasn't happy with my discussion of the construction 
and initialization of the three primary objects.  It's crucial to 
understanding the flexibility I'm trying to provide.

Just to clarify a few things:
1.  File System providers (FSPs) should always be concrete.  A FSP 
writer may want a vtable for certain virtual methods, but they should 
attempt to push abstractions into a File System Abstract Provider 
(FSAP).  Any virtual methods that FSPs introduce should be resolved 
during initialization.

2.  FSPs can be a Hybrid multi-level implementation.  A FSP writer may 
choose to override methods at the "Big Three" API level on down.  I'm 
trying to encourage overriding of methods at an appropriate level within 
the FS layer.   I'd like to get rid of the layer (1), (2), (3) mindset 
and instead think of it like so:  The "XYZ" FSP is a concrete 
implementation of  the "DEF" FSAP which extends the "ABC" FSAP.  The XYZ 
FSP may implement and/or override methods of all FSAPs in its
ancestry.  Certain methods could be made "final" by convention.

3.  FSPs do not have to derive from a FSAP.  They can be written to the 
FS API level.  At a minimum they have to use the subversion name 
protection scheme and implement static vtable initializers and a few 
simple constructors.  They can be as flat as they wish.

4.  The approach I'm proposing is a *little* like the IOC containers 
that are gaining popularity in the Java community.
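
As a rough illustration of points 2 and 3 above, the "XYZ is a
concrete implementation of DEF, which extends ABC" relationship could
be sketched as follows.  Everything here is hypothetical: the class
and method names (open_root, read_node, fetch_row) are invented for
the example, and Python classes stand in for what would be C vtables
in the real code.

```python
from abc import ABC, abstractmethod


class AbcFSAP(ABC):
    """Hypothetical baseline abstract provider ("ABC" in the text)."""

    def open_root(self, rev: int) -> str:
        # Shared logic; treated as "final" by convention only.
        return f"root@{rev} via {self.read_node(rev)}"

    @abstractmethod
    def read_node(self, rev: int) -> str:
        """Storage-specific step left to descendants."""


class DefFSAP(AbcFSAP):
    """Hypothetical abstract provider extending AbcFSAP ("DEF")."""

    def read_node(self, rev: int) -> str:
        # Implements the parent's hook in terms of a narrower one.
        return self.fetch_row("nodes", rev)

    @abstractmethod
    def fetch_row(self, table: str, key: int) -> str:
        """Lowest-level hook, left to the concrete provider."""


class XyzFSP(DefFSAP):
    """Concrete provider ("XYZ"): nothing is left abstract."""

    def fetch_row(self, table: str, key: int) -> str:
        return f"{table}[{key}]"


class FlatFSP:
    """Point 3: a provider written straight to the top-level API,
    deriving from no FSAP at all."""

    def open_root(self, rev: int) -> str:
        return f"root@{rev} via flat storage"
```

Both XyzFSP().open_root(11) and FlatFSP().open_root(11) satisfy the
same top-level call, but only the first reuses logic from its
ancestry.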


>
>At any rate, I feel I've absorbed it now.  My comments:
>
>  * Assuming we have an API-level vtable, the general method presented
>in the document (with separate vtables for FS objects, transaction
>objects, and root objects) seems fine.
>
Okay.

>
>  * I'm having trouble fitting the FSP abstraction together with the
>"Don't fall in love with a physical schema" sentiment expressed in
><http://www.contactor.se/~dast/svn/archive-2004-03/1664.shtml>.  The
>FSP-level abstraction allows some flexibility in the physical
>representation, but does assume a particular set of tables.  (This is
>not an area I'm heavily invested in, since I'm not interested in working
>on alternate DB implementations, but I'm still curious.)  
>
Yes.  I can see where I left this too lean.  The plan was to use the 
examples (which  I never put to words) to make this more clear.   I have 
always felt there would be more than one SQL FS implementation.  They 
will most likely have a fair amount of overlap.  This overlap will not 
necessarily be along nice clean "level 1,2,3" boundaries.

I was trying to avoid any SQL bias in my document, but since you opened
the door: IMO, collections (directories and properties) are the biggest
area of concern, and provide the most potential for divergence, in a SQL 
implementation.  In fact, I've already discussed changes to the "inner" 
baseline FSAP vtable which should help in this area.

Putting aside how the tables would be keyed, a typical implementation 
would have one member per row in some table.  Another solution would be 
to plunk a skel equivalent into a blob field.  This would be badness in 
SQL land as you lose selectivity (syntactically speaking).  So using the 
first method, how do we handle directory revisions?  I believe Sander 
was planning on replicating rows for every new revision.  
Programmatically, this is certainly the most straight forward solution.  
But consider the situation where at rev 10 we have a directory with 1000 
entries.  For rev 11 we add 1 new entry.  Now we have 2001 rows for just 
two revisions. This impacts performance in more than one way.  Besides 
the obvious rapid increase in table rows, it can also reduce the 
selectivity (key uniqueness) of a table.  Plus it generates a 
considerable number of inserts.  While most modern DBs can handle this
reasonably well, it's worth exploring alternatives.

An approach I want to explore involves storing only the changes between
revs with complete representations sprinkled throughout revision 
history. So rev 11 mentioned above would have a single "add x" row.  
Seem familiar? This can be viewed as a form of "in DB" deltification.  
It creates *ALL SORTS* of query challenges.  I offer this as an extreme 
example of  "physical schema" variations.  I'd rather not debate this at 
this time.  It's both risky and challenging.  Frankly, I don't know if I 
can make it work until I can make it work.  Implementations like this 
will make good use of the hierarchy mentioned in item 2 above.
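
The row arithmetic above can be checked with a small sketch.  It is
not a schema proposal: the ("add", name) / ("delete", name) change
rows and the function names are made up for illustration, and plain
Python sets stand in for database tables.

```python
def full_copy_rows(revisions):
    """Row count when every revision stores every directory entry."""
    return sum(len(entries) for entries in revisions.values())


def delta_rows(snapshot, changes):
    """Row count for one full snapshot plus one row per change."""
    return len(snapshot) + sum(len(rows) for rows in changes.values())


def directory_at(rev, snapshot_rev, snapshot, changes):
    """Rebuild a directory listing by replaying change rows
    on top of the nearest earlier full snapshot."""
    entries = set(snapshot)
    for r in range(snapshot_rev + 1, rev + 1):
        for op, name in changes.get(r, []):
            if op == "add":
                entries.add(name)
            elif op == "delete":
                entries.discard(name)
    return entries


# Rev 10 has 1000 entries; rev 11 adds one entry named "x".
rev10 = {f"file{i}" for i in range(1000)}
rev11 = rev10 | {"x"}
changes = {11: [("add", "x")]}

print(full_copy_rows({10: rev10, 11: rev11}))          # 2001
print(delta_rows(rev10, changes))                      # 1001
print(directory_at(11, 10, rev10, changes) == rev11)   # True
```

The cost moves from storage to reads: directory_at has to replay
every change row since the last snapshot, which is where the query
challenges come from.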

>Also, the FSP
>abstraction does not appear to have pools in its vtable calls.  That's
>consistent with the current FS code, but seems like a good thing to fix.
>
I'll take a look at this. 

>
>  * I'm a little concerned about the long-term implications of this
>paragraph:
>        
>        Roundtrips kill performance. [...] In a SQL DBMS every call to
>        the DB creates considerable overhead.  I’m not picking on those
>        methods.  I’m just pointing out that many things being done
>        procedurally in  the current FS will be faster if they can be
>        done in a stored procedure or in “mega” queries.  Also, if a lot
>        of interim data is being processed, even temp tables solutions
>        beat application based procedural solutions in many cases.
>
>The implication here is that the ideal point of divergence for an SQL
>implementation might be at a *higher* level than the ideal point of
>divergence for libsvn_fs_fs; for while fs_fs wants to reuse all of the
>DAG logic in tree.c, an SQL implementation might want to use stored
>procedures or caching tables to minimize round trips.
>  
>
Yes, I believe this is exactly gstein's concern.  But again, I view this 
more on a "per method" or "per FSAP" basis.  Not "per FSP".  This is why 
I talk about reviewing the FS API.  The idea is to nail down very 
specific contracts for each and every method in the FS API.  My "Nits" 
section was an attempt to get that process started.

I hope this helps.

Thanks,
gat








Re: FSFS: Plan of attack

Posted by Greg Hudson <gh...@MIT.EDU>.
On Mon, 2004-04-12 at 00:16, C. Michael Pilato wrote:
> Having read Glenn's document in full in the past, I don't recall
> feeling that its size was not warranted by the scope of the problem.

Well, the size wasn't my only criticism; I felt like it wandered a lot. 
I often had trouble telling when it was talking about the API-level
vtable or about the FSP-level vtable within the baseline FSAP, and there
were a lot of personal comments which tended to draw my attention
astray.

At any rate, I feel I've absorbed it now.  My comments:

  * Assuming we have an API-level vtable, the general method presented
in the document (with separate vtables for FS objects, transaction
objects, and root objects) seems fine.

  * I'm having trouble fitting the FSP abstraction together with the
"Don't fall in love with a physical schema" sentiment expressed in
<http://www.contactor.se/~dast/svn/archive-2004-03/1664.shtml>.  The
FSP-level abstraction allows some flexibility in the physical
representation, but does assume a particular set of tables.  (This is
not an area I'm heavily invested in, since I'm not interested in working
on alternate DB implementations, but I'm still curious.)  Also, the FSP
abstraction does not appear to have pools in its vtable calls.  That's
consistent with the current FS code, but seems like a good thing to fix.

  * I'm a little concerned about the long-term implications of this
paragraph:
        
        Roundtrips kill performance. [...] In a SQL DBMS every call to
        the DB creates considerable overhead.  I’m not picking on those
        methods.  I’m just pointing out that many things being done
        procedurally in  the current FS will be faster if they can be
        done in a stored procedure or in “mega” queries.  Also, if a lot
        of interim data is being processed, even temp tables solutions
        beat application based procedural solutions in many cases.

The implication here is that the ideal point of divergence for an SQL
implementation might be at a *higher* level than the ideal point of
divergence for libsvn_fs_fs; for while fs_fs wants to reuse all of the
DAG logic in tree.c, an SQL implementation might want to use stored
procedures or caching tables to minimize round trips.
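
A small sketch of the round-trip concern, using Python's sqlite3
module with an in-memory database.  The entries/nodes tables are
hypothetical and are not the schema from Glenn's document, and
counting executed statements is only a stand-in for network round
trips, which an in-memory database does not have.

```python
import sqlite3


class CountingDB:
    """Wraps a connection and counts statements sent to it."""

    def __init__(self, conn):
        self.conn = conn
        self.calls = 0

    def query(self, sql, params=()):
        self.calls += 1
        return self.conn.execute(sql, params).fetchall()


def make_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE nodes (node_id INTEGER PRIMARY KEY, kind TEXT)")
    conn.execute("CREATE TABLE entries (dir_id INTEGER, name TEXT, node_id INTEGER)")
    conn.executemany("INSERT INTO nodes VALUES (?, ?)",
                     [(i, "file") for i in range(1, 6)])
    conn.executemany("INSERT INTO entries VALUES (1, ?, ?)",
                     [(f"f{i}", i) for i in range(1, 6)])
    return CountingDB(conn)


def list_dir_procedural(db, dir_id):
    """One query for the entries, then one more query per entry."""
    rows = db.query(
        "SELECT name, node_id FROM entries WHERE dir_id = ? ORDER BY name",
        (dir_id,))
    out = []
    for name, node_id in rows:
        kind = db.query("SELECT kind FROM nodes WHERE node_id = ?",
                        (node_id,))[0][0]
        out.append((name, kind))
    return out


def list_dir_joined(db, dir_id):
    """The same answer from a single joined query."""
    return db.query(
        "SELECT e.name, n.kind FROM entries e "
        "JOIN nodes n ON n.node_id = e.node_id "
        "WHERE e.dir_id = ? ORDER BY e.name",
        (dir_id,))
```

For the five-entry directory built by make_db(), the procedural
version issues six statements and the joined version issues one.  A
backend that can only be driven through the procedural path cannot
take advantage of the second form.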




Re: FSFS: Plan of attack

Posted by "C. Michael Pilato" <cm...@collab.net>.
Greg Hudson <gh...@MIT.EDU> writes:

> This document has the same problem as your "master patch": it's 31 pages
> long and hard to digest.  We really, really need your contributions to
> this process to be concise, focused, and objective.  If we're arguing
> about a particular aspect of FS abstraction and you have an opinion
> about it, we need you to give your opinion about just that one question,
> not say, "I've been working on this big document with diagrams over here
> and part of it talks about that; please go read it."

Having read Glenn's document in full in the past, I don't recall
feeling that its size was not warranted by the scope of the problem.
And especially since we're talking about changing one of the most
fundamental layers of Subversion -- the one that talks to the database that
holds all your goodies -- I think all parties who claim to have an
interest in doing this abstraction Right should suck it up and go read
the document if they haven't already.


Re: FSFS: Plan of attack

Posted by Greg Hudson <gh...@MIT.EDU>.
On Sat, 2004-04-10 at 20:59, Edmund Horner wrote:
> Greg Hudson wrote:
> |   (1) At the API level
> |   (2) At a level which lets libsvn_fs_fs reuse the DAG logic (tree.c)
> |   (3) At the level appropriate for pluggable DB support

> By now I realise (only slightly self-pityingly) that I'm not going to
> influence any of you one way or the other, but:

What does this mean?  It seems like as the author of a prototype SQL
implementation, you'd have a unique view on where (3) should lie.

> 1. Doesn't the existence/level of a lower-level abstraction depend on the
> implementation of a higher-level abstraction?

Sorry, that was supposed to be implicit.  If we have abstraction layer
(1), then layer (2) only applies to the implementations which follow the
current implementation fork, and likewise if we have distinct
abstractions at both (2) and (3) (as CMike wishes to avoid), only
implementations which follow the current implementation fork at (2)
would be participating in the abstraction at (3).

Glenn Thompson wrote:
> I'll finish the HTML formatting as soon as possible so folks can
> modify the document if they wish.

This document has the same problem as your "master patch": it's 31 pages
long and hard to digest.  We really, really need your contributions to
this process to be concise, focused, and objective.  If we're arguing
about a particular aspect of FS abstraction and you have an opinion
about it, we need you to give your opinion about just that one question,
not say, "I've been working on this big document with diagrams over here
and part of it talks about that; please go read it."



Re: FSFS: Plan of attack

Posted by "Glenn A. Thompson" <gt...@cdr.net>.
> influence any of you one way or the other, but:
>
> 1. Doesn't the existence/level of a lower-level abstraction depend on the
> implementation of a higher-level abstraction?  E.g. there might be two
> implementations at level (1): 

Yes, as you may remember, I call them File System Abstract Providers
(FSAPs).  Eventually I see around three.

>
>
> ~  - libsvn_fs_fs, where the higher-level FS stuff is closely tied to the
> storage, so there's no need/room for further abstraction.
> ~  - libsvn_fs_default (the existing file system), with abstraction (2)
> below it.
>
> At level (2) in svn_fs_fs there might be two implementations:
>
> ~  - libsvn_fs_bdb (existing BDB storage).
> ~  - libsvn_fs_sql (SQL storage), with abstraction (3) to allow for a
> growing range of SQL client libraries/dialects.

Yup.  Plus tweaked method versions that install methods at level 1.

>
> Obviously a nice tree diagram can be imagined here, with the salient
> feature that *not all leaves are the same distance from root*.


Edmund, I think *I'm* with you.
http://www.cdrguys.com/subversion/pluggable3.pdf
I'll finish the HTML formatting as soon as possible so folks can modify 
the document if they wish.

Gotta run,
gat





Re: FSFS: Plan of attack

Posted by Edmund Horner <ch...@chrysophylax.cjb.net>.
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Greg Hudson wrote:
| So, if I understand all this right, there are three possibly distinct
| places to chop up libsvn_fs:
|
|   (1) At the API level
|   (2) At a level which lets libsvn_fs_fs reuse the DAG logic (tree.c)
|   (3) At the level appropriate for pluggable DB support
|
| As I understand it, you are in favor of chopping at (1) now, (3) when
| someone actually gets around to writing an SQL back end, and (2) never,
| presumably because chopping up the current FS implementation in two
| places would make it hard to maintain.
|
| My question is: how far apart are (2) and (3)?  I think I have seen some
| argument over how much of the schema should be assumed by the
| pluggable-DB abstraction layer.

By now I realise (only slightly self-pityingly) that I'm not going to
influence any of you one way or the other, but:

1. Doesn't the existence/level of a lower-level abstraction depend on the
implementation of a higher-level abstraction?  E.g. there might be two
implementations at level (1):

~  - libsvn_fs_fs, where the higher-level FS stuff is closely tied to the
storage, so there's no need/room for further abstraction.
~  - libsvn_fs_default (the existing file system), with abstraction (2)
below it.

At level (2) in svn_fs_fs there might be two implementations:

~  - libsvn_fs_bdb (existing BDB storage).
~  - libsvn_fs_sql (SQL storage), with abstraction (3) to allow for a
growing range of SQL client libraries/dialects.

Obviously a nice tree diagram can be imagined here, with the salient
feature that *not all leaves are the same distance from root*.
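
The tree being imagined can be written down directly.  This is only a
sketch: the library names are the ones used above, while the two SQL
dialect leaves and the leaf_depths helper are placeholders invented
for the example.

```python
def leaf_depths(name, children, depth=0):
    """Return {leaf name: distance from the root}."""
    if not children:
        return {name: depth}
    depths = {}
    for child_name, grandchildren in children.items():
        depths.update(leaf_depths(child_name, grandchildren, depth + 1))
    return depths


# Children of the root; each level down is one more abstraction layer.
tree = {
    "libsvn_fs_fs": {},
    "libsvn_fs_default": {
        "libsvn_fs_bdb": {},
        "libsvn_fs_sql": {
            "sql_dialect_a": {},   # hypothetical
            "sql_dialect_b": {},   # hypothetical
        },
    },
}

depths = leaf_depths("svn_fs API", tree)
for leaf, d in sorted(depths.items(), key=lambda kv: kv[1]):
    print(d, leaf)
```

The leaves sit at depths 1, 2 and 3, so a lower abstraction layer
exists only along the branches that chose to introduce it.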

Edmund.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (MingW32)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFAeJhWEbvImmpUq7gRAgsKAJ9m9WJuYYTKA4VI2XYtFNOvy+hFmgCgjpkk
l/ZqFKHmoSHZqkj0hTZGBLk=
=91ii
-----END PGP SIGNATURE-----


Re: FSFS: Plan of attack

Posted by "C. Michael Pilato" <cm...@collab.net>.
Greg Hudson <gh...@MIT.EDU> writes:

> So, if I understand all this right, there are three possibly distinct
> places to chop up libsvn_fs:
> 
>   (1) At the API level
>   (2) At a level which lets libsvn_fs_fs reuse the DAG logic (tree.c)
>   (3) At the level appropriate for pluggable DB support
> 
> As I understand it, you are in favor of chopping at (1) now, (3) when
> someone actually gets around to writing an SQL back end, and (2) never,
> presumably because chopping up the current FS implementation in two
> places would make it hard to maintain.

Right.  I just don't see the need for 3 divisions.

> My question is: how far apart are (2) and (3)?  I think I have seen
> some argument over how much of the schema should be assumed by the
> pluggable-DB abstraction layer.

That question isn't easy to answer because today there is too much
bleed of BDB-isms into the upper layers.  Until we can purge dag.c and
tree.c of those BDB-isms, it'll be hard to say where the right break
is.

My gut feeling, though, is that the pluggable-DB abstraction API must
necessarily define at least a basic schema.  There will be notions of
things like revisions and transactions and nodes.  Those things will
have pieces of information associated with them which can be used to
determine the relationships between them.  Functions are going to
accept certain parameters and return certain values.  You see where
I'm going.

I don't know right now what the final API will look like.  It's
possible that (2) and (3) are closer than I think.  We'll try to keep
that API as vague as we can without going mad, and try to define
relationships between first-class schema objects without defining
their storage and retrieval mechanisms, and hopefully without
destroying any hopes of great performance from all the available
backends.  That's the goal of the exercise.


Re: FSFS: Plan of attack

Posted by Greg Hudson <gh...@MIT.EDU>.
On Thu, 2004-04-08 at 21:24, C. Michael Pilato wrote:
> Josh Pieper <jp...@andrew.cmu.edu> writes:
> > If you put a vtable right at the svn_fs.h level, then every single
> > filesystem would have to re-implement tree.c.  That's a big hunk of
> > code that is very error prone.

> I can't say that that reason alone is especially interesting to me.
> What do we do when someone else comes along and says they want to
> write libsvn_fs_gofigure(), and they have no use at all for any of the
> tree.c code?  Our public API doesn't require that there even *be* a
> DAG subsystem -- this is an implementation detail of our initial
> shot at a versioning filesystem library.

> I think the thing to do here is to put the abstraction layer
> immediately below the existing svn_fs.h.

So, if I understand all this right, there are three possibly distinct
places to chop up libsvn_fs:

  (1) At the API level
  (2) At a level which lets libsvn_fs_fs reuse the DAG logic (tree.c)
  (3) At the level appropriate for pluggable DB support

As I understand it, you are in favor of chopping at (1) now, (3) when
someone actually gets around to writing an SQL back end, and (2) never,
presumably because chopping up the current FS implementation in two
places would make it hard to maintain.

My question is: how far apart are (2) and (3)?  I think I have seen some
argument over how much of the schema should be assumed by the
pluggable-DB abstraction layer.



Re: FSFS: Plan of attack

Posted by "C. Michael Pilato" <cm...@collab.net>.
Josh Pieper <jp...@andrew.cmu.edu> writes:

> If you put a vtable right at the svn_fs.h level, then every single
> filesystem would have to re-implement tree.c.  That's a big hunk of
> code that is very error prone.

I can't say that that reason alone is especially interesting to me.
What do we do when someone else comes along and says they want to
write libsvn_fs_gofigure(), and they have no use at all for any of the
tree.c code?  Our public API doesn't require that there even *be* a
DAG subsystem -- this is an implementation detail of our initial
shot at a versioning filesystem library.

I think the thing to do here is to put the abstraction layer
immediately below the existing svn_fs.h.  Remember, if someone doesn't
want to re-write the tree.c code, they can do *exactly* what you plan
to do with libsvn_fs_fs -- copy libsvn_fs.

In fact, I'm surprised that you guys haven't just literally 'svn copy
libsvn_fs libsvn_fs_fs' and then re-added all the docs you've put in.

> > If I'm understanding your proposal correctly, none of this involves
> > changing a single line of code in libsvn_fs, right?
> 
> Correct.

Eeeeeexcellent.


Re: FSFS: Plan of attack

Posted by Josh Pieper <jp...@andrew.cmu.edu>.
On Thu, Apr 08, 2004 at 07:41:54PM -0500, C. Michael Pilato wrote:
>
> > This would be the top level abstraction, i.e. implementing a tree
> > structured filesystem on top of the DAG.  There would still be a
> > need for another abstraction for different database backends.
> 
> Have we reached consensus on whether the top-most FS abstraction would
> be immediately below svn_fs.h or immediately above dag.h?  If the
> answer is, "Yes, immediately above dag.h", then the answer is false,
> because I think that the svn_fs.h should be converted to mostly one
> (or two) big fat vtables and some init functions.  If people want to
> rewrite filesystem libraries, let 'em rewrite the whole thing, because
> next week someone else is going to come along and want the abstraction
> moved up a little higher.

If you put a vtable right at the svn_fs.h level, then every single
filesystem would have to re-implement tree.c.  That's a big hunk of
code that is very error prone.

> > So my proposal is to create a branch, say called
> > 'libsvn_fs_fs_abstraction' where this prototype implementation of
> > libsvn_fs_fs can live.  Once we have it (or during the process of
> > creating it), we decide on a good abstraction layer, then
> > libsvn_fs_fs can be re-implemented to meet that abstraction.
> 
> As Karl noted in his reply, there's no reason to branch unless you
> know you'll be conflicting with other libsvn_fs_fs work.

Ok, that seems to be the consensus, so that's the route I will go.

> If I'm understanding your proposal correctly, none of this involves
> changing a single line of code in libsvn_fs, right?

Correct.



Re: FSFS: Plan of attack

Posted by "C. Michael Pilato" <cm...@collab.net>.
Josh Pieper <jp...@andrew.cmu.edu> writes:

> This would be the top level abstraction, i.e. implementing a tree
> structured filesystem on top of the DAG.  There would still be a
> need for another abstraction for different database backends.

Have we reached consensus on whether the top-most FS abstraction would
be immediately below svn_fs.h or immediately above dag.h?  If the
answer is, "Yes, immediately above dag.h", then the answer is false,
because I think that the svn_fs.h should be converted to mostly one
(or two) big fat vtables and some init functions.  If people want to
rewrite filesystem libraries, let 'em rewrite the whole thing, because
next week someone else is going to come along and want the abstraction
moved up a little higher.
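
For readers who have not seen the pattern, "one big fat vtable and
some init functions" could look roughly like the sketch below.  It is
not the actual svn_fs.h interface: fs_open, fs_youngest_rev,
fs_commit and the two toy backends are invented for illustration, and
a Python dict of functions stands in for a C struct of function
pointers.

```python
from typing import Callable, Dict

# A vtable here is just a mapping from operation name to function.
Vtable = Dict[str, Callable]

# Registry of backend init functions, keyed by backend type name.
_BACKENDS: Dict[str, Callable[[], Vtable]] = {}


def register_backend(fs_type: str, init: Callable[[], Vtable]) -> None:
    _BACKENDS[fs_type] = init


class Fs:
    """Filesystem handle: a path, a vtable, and backend-private data."""

    def __init__(self, path: str, vtable: Vtable):
        self.path = path
        self.vtable = vtable
        self.data = [{}]  # toy state: revision 0 is the empty tree


def fs_open(path: str, fs_type: str) -> Fs:
    # The init function is the only backend-specific entry point.
    return Fs(path, _BACKENDS[fs_type]())


# The public API functions do nothing but dispatch through the vtable.
def fs_youngest_rev(fs: Fs) -> int:
    return fs.vtable["youngest_rev"](fs)


def fs_commit(fs: Fs, tree: dict) -> int:
    return fs.vtable["commit"](fs, tree)


def _memory_init() -> Vtable:
    def youngest_rev(fs):
        return len(fs.data) - 1

    def commit(fs, tree):
        fs.data.append(dict(tree))
        return len(fs.data) - 1

    return {"youngest_rev": youngest_rev, "commit": commit}


def _readonly_init() -> Vtable:
    def commit(fs, tree):
        raise PermissionError("read-only backend")

    # Reuses one operation and replaces the other wholesale.
    return {"youngest_rev": _memory_init()["youngest_rev"], "commit": commit}


register_backend("memory", _memory_init)
register_backend("readonly", _readonly_init)
```

Nothing above the vtable knows whether a DAG layer exists underneath,
which is the property being argued for; the cost is that a backend
that wants tree.c-style logic has to bring its own copy.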

> So my proposal is to create a branch, say called
> 'libsvn_fs_fs_abstraction' where this prototype implementation of
> libsvn_fs_fs can live.  Once we have it (or during the process of
> creating it), we decide on a good abstraction layer, then
> libsvn_fs_fs can be re-implemented to meet that abstraction.

As Karl noted in his reply, there's no reason to branch unless you
know you'll be conflicting with other libsvn_fs_fs work.

If I'm understanding your proposal correctly, none of this involves
changing a single line of code in libsvn_fs, right?


Re: FSFS: Plan of attack

Posted by Josh Pieper <jp...@andrew.cmu.edu>.
I've been working on the fs_fs proposal and the svn_fs abstraction
layer a bit, mostly on point 3 in ghudson's Workplan:

>  3. Write FS code to read, but not write, the file format, under the
>  assumption that node-revs contain only deltas against the empty
>  base.  Use the tool from (2) to generate test cases to ensure that
>  the code works.

What I've found so far is that while a significant part of the code in
libsvn_fs/ is independent of the access method, there are a lot of
assumptions about BDB interspersed and mixed in with the generic
things.

My plan for trying to sort out which parts are generic and which parts
are access specific, (i.e. the abstraction layer) has been to copy all
the relevant files from libsvn_fs into libsvn_fs_fs, hack out what is
unnecessary, and add all fs_fs specific functions separately.  Then
hopefully when it all works, we can use the resulting two
implementations of svn_fs to derive a more permanent abstraction
layer.  This would be the top level abstraction, i.e. implementing a
tree structured filesystem on top of the DAG.  There would still be a
need for another abstraction for different database backends.

So my proposal is to create a branch, say called
'libsvn_fs_fs_abstraction' where this prototype implementation of
libsvn_fs_fs can live.  Once we have it (or during the process of
creating it), we decide on a good abstraction layer, then libsvn_fs_fs
can be re-implemented to meet that abstraction.

Thoughts?

	    
-- 
A modem is a baudy house.


Re: FSFS: Plan of attack

Posted by "C. Michael Pilato" <cm...@collab.net>.
"C. Michael Pilato" <cm...@collab.net> writes:

> Greg Hudson <gh...@MIT.EDU> writes:
> 
> > I counted one vote for working on a branch (rooneg)
> 
> Whoa, there!  Branko and I both said "branch".  I said "I prefer this
> work be done on a branch...", and Branko replied "+1 to the Nth
> power".

Oops.  As you gracefully reminded me in IRC, I was cool with trunk
development after I realized you only needed the FS API abstraction.


Re: FSFS: Plan of attack

Posted by "C. Michael Pilato" <cm...@collab.net>.
Greg Hudson <gh...@MIT.EDU> writes:

> I counted one vote for working on a branch (rooneg)

Whoa, there!  Branko and I both said "branch".  I said "I prefer this
work be done on a branch...", and Branko replied "+1 to the Nth
power".
