Posted to dev@subversion.apache.org by Karl Fogel <kf...@collab.net> on 2001/03/28 23:55:48 UTC

Change #6

Greg Stein <gs...@lyra.org> writes:
> On Wed, Mar 28, 2001 at 06:20:52PM -0500, Greg Hudson wrote:
> > > Euh... I still need Change #7 (from STACK) to enable sending files
> > > up to the server.
> > 
> > I don't understand.  What do copy nodes have to do with enabling
> > sending files up to the server?
> 
> Sorry. #6

Have you read the problem noted in that item?  Here's the relevant bit
from STACK:

      Editor composition becomes more difficult if we use streams.  A
      window is a discrete chunk of data that can be used by several
      consumers, but streams are different: if consumer A reads some
      data off a stream, then when consumer B reads, she'll get
      different results.  You'd have to design your streams in a funky
      way to make this not be a problem.

      In some circumstances, this isn't an issue.  After all, usually
      a set of composed editors is a bunch of lightweight editors,
      that don't do much, surrounding a core editor that does the real
      work.  For example, an editor that prints out filenames wrapped
      with an editor that actually updates those files.  In such
      cases, the lightweight editor simply never reads data off the
      stream, so the core editor is not deprived of anything.

      But other editors (say, a commit guard?) might want to actually
      examine file data.  That could have bad consequences if we
      switch from windows to streams.
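
The difference the excerpt describes can be sketched in a few lines of Python. This is only an illustration, not Subversion code: `io.BytesIO` stands in for a stream, the two reads stand in for two composed editors, and the data is made up.

```python
import io

# Windows: each consumer is handed the same discrete chunk of data.
window = b"hello world"
seen_by_a = window            # consumer A examines the window
seen_by_b = window            # consumer B still sees all of it

# Stream: every read advances a position shared by all consumers.
stream = io.BytesIO(b"hello world")
read_by_a = stream.read(5)    # consumer A takes the first 5 bytes
read_by_b = stream.read()     # consumer B only gets what is left
```

With the window, both consumers see the full data. With the stream, B never sees the bytes A already consumed.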

Re: Change #6

Posted by Greg Stein <gs...@lyra.org>.

On Fri, Mar 30, 2001 at 09:29:38AM -0600, Ben Collins-Sussman wrote:
> Greg Stein <gs...@lyra.org> writes:
> > All right. Let me do this one more time.
> > 
> > 1) if I only receive delta windows, then I can't know whether the
> >    Content-Type is application/octet-stream ("plain text") or
> >    application/vnd.svn-svndiff
> 
> As Karl said, why can't you:
> 
>   * add a field to svn_txdelta_window_t that indicates the content-type
> 
>   * make svn_txdelta_apply() generate window-consumers that understand
>     this field
> 
>   * add a "content-type" argument to svn_txdelta() so it will produce
>     a window stream in the format you like.

ra_dav/commit.c is not the code that sets up the stream, so commit.c has no
way to set the field or to select the format.

Since the delta stream is done by the *driver*, there is nothing the
*editor* can do about it. That sucks.

>...
> > 3) I cannot enable "send plain text all the time"
> > 
> > 4) I cannot choose a different diff format for the wire.
> 
> This tweak should allow you to accomplish your points #1, #3, and #4.

Sorry, see above.

> > 2) the delta window model is "push-to-editor". the network code works best
> >    with a "pull-from-source" model. in fact, we may not even get Neon to be
> >    able to work with the push model, except by buffering everything into
> >    memory first.
> 
> This is the only point that scares me.  Yikes.  Is this a real
> roadblock?

Yes. There is no way to use Neon in a push mode. I create a temporary file
right now, and then let Neon pull from that file.

If we had a pull-style delta interface, then I could set up Neon to pull
from that delta stream.

For systems that want to stick with the push style, we'd just have a
function like:

apply_delta(stream)
{
    /* pull windows until the stream runs dry, pushing each one along */
    while (1) {
      window = read_window_from_stream(stream)
      if (!window)
          break
      old_push_style_function(window)
    }
}

IOW, a pull-style interface allows an editor to use either style, at its
convenience. A push-style API cannot be converted into a pull-style one
unless you buffer, or use two threads, a pipe, and IPC.
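
The buffering workaround can be sketched as follows. This is a minimal Python illustration, not the ra_dav code: the class name `BufferingConsumer`, the helper `pull_all`, and the block size are all made up for the example. Pushed chunks are spooled to a temporary file, which the network code can then read from at its own pace.

```python
import tempfile

class BufferingConsumer:
    """Push-style consumer that spools every chunk to a temporary file."""

    def __init__(self):
        self.spool = tempfile.TemporaryFile()

    def push(self, chunk):
        # Called by the driver once per chunk of data.
        self.spool.write(chunk)

    def as_pull_source(self):
        # Rewind so that a puller can read() the data from the start.
        self.spool.seek(0)
        return self.spool

def pull_all(source, block_size=4):
    """Stand-in for network code that pulls fixed-size blocks."""
    blocks = []
    while True:
        block = source.read(block_size)
        if not block:
            break
        blocks.append(block)
    return blocks

consumer = BufferingConsumer()
for chunk in (b"abc", b"defg", b"hi"):
    consumer.push(chunk)
blocks = pull_all(consumer.as_pull_source())
```

The cost is visible in the sketch: nothing can be pulled until everything has been pushed, so the whole file is stored before the first byte goes out.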

> I'm confused about this, because neon is acting as an HTTP client and
> mod_dav_svn is acting as the HTTP server.  Don't HTTP clients "push"
> requests at servers, and servers respond?  How could mod_dav_svn
> possibly "pull" data from neon?  That seems like a backwards HTTP
> model to me.  Maybe you can clarify.

Yes, the client pushes stuff up, but the logic in Neon is:

- start request
- do:
    - read a block of data to send
    - write the block to the network
- end request

The start/end request is a single function call. It pulls data from us as it
needs it, to send it over the wire.
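
A rough Python sketch of that control flow, under the assumption that the library takes a "give me the next block" callback. The function `send_request` and its parameters are hypothetical and are not Neon's actual API.

```python
import io

def send_request(read_block, write_to_network, block_size=8192):
    """Hypothetical single request call: it owns the loop and pulls data."""
    total = 0
    while True:
        block = read_block(block_size)   # pull the next block from the caller
        if not block:
            break                        # caller has no more data
        write_to_network(block)          # send it over the wire
        total += len(block)
    return total

body = io.BytesIO(b"x" * 20000)          # the data to upload
sent = []                                # stands in for the network
total = send_request(body.read, sent.append)
```

The caller never gets a chance to push: once `send_request` is entered, the only way to supply data is to answer its read calls.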

Want to know something *really* weird? mod_dav_svn *does* pull data from
Neon. Time to put on your heavy-duty thinking cap here:

  Within Apache, when handling the data from a PUT, we read data from the
  network, and throw it at the FS (in a loop). On the client side, it pushes
  data over the network, at Apache. Here is the trick: when the client
  writes to the network, it might *block*. What happens is that the TCP
  stack on the Apache side becomes "full" if Apache doesn't read the data
  (at all, or fast enough). This "fullness" is sent by the magic of TCP back
  to the client side. When it is full, the client will block, waiting for
  the server to signal that it is no longer full. The client then resumes.
  In effect, mod_dav_svn reads from the network, which then signals Neon to
  write more to the network.

  mod_dav_svn pulls from Neon.

:-)
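
The blocking behaviour can be imitated with a bounded queue. This is only an analogy in Python, not real TCP: the queue's `maxsize` stands in for the buffers in the TCP stack, and the names `transfer`, `client`, and `capacity` are made up for the sketch.

```python
import queue
import threading

def transfer(blocks, capacity=2):
    pipe = queue.Queue(maxsize=capacity)   # stands in for the TCP buffers
    received = []

    def client():
        for block in blocks:
            pipe.put(block)    # blocks here while the "buffer" is full
        pipe.put(None)         # marks the end of the request body

    sender = threading.Thread(target=client)
    sender.start()
    while True:                # the server's read loop
        block = pipe.get()     # each read frees room for one more write
        if block is None:
            break
        received.append(block)
    sender.join()
    return received

received = transfer([b"a", b"b", b"c", b"d", b"e"])
```

The writer can never get more than `capacity` blocks ahead of the reader, so the reader's pace is what drives the transfer.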

> >    (this buffering in memory will be happening for M2, btw; don't go commit
> >     100M files unless you can hold that in memory on your client)
> 
> Heh, this is happening anyway within the client *and* fs libraries too.
> It will all be fixed, no worries.  :)

I buffer to disk now. Couldn't do it to memory.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/

Re: Change #6

Posted by Ben Collins-Sussman <su...@newton.ch.collab.net>.

Greg Stein <gs...@lyra.org> writes:

> All right. Let me do this one more time.
> 
> 1) if I only receive delta windows, then I can't know whether the
>    Content-Type is application/octet-stream ("plain text") or
>    application/vnd.svn-svndiff

As Karl said, why can't you:

  * add a field to svn_txdelta_window_t that indicates the content-type

  * make svn_txdelta_apply() generate window-consumers that understand
    this field

  * add a "content-type" argument to svn_txdelta() so it will produce
    a window stream in the format you like.

This is a tweak to the existing system... seems more elegant to me
than a total inversion of the interface.
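
As a rough sketch of the tweak, in Python rather than C: the `Window` class, `make_windows`, and `header_for` below are invented for illustration and are not the real svn_txdelta structures or signatures. Only the two content-type strings come from the discussion.

```python
from dataclasses import dataclass

PLAIN = "application/octet-stream"
SVNDIFF = "application/vnd.svn-svndiff"

@dataclass
class Window:
    data: bytes
    content_type: str      # the proposed extra field (hypothetical layout)

def make_windows(target, content_type, window_size=4):
    """Producer side: stamp every window with the requested content type."""
    return [Window(target[i:i + window_size], content_type)
            for i in range(0, len(target), window_size)]

def header_for(window):
    """Consumer side: take the HTTP header from the window itself."""
    return "Content-Type: " + window.content_type

windows = make_windows(b"hello world", PLAIN)
header = header_for(windows[0])
```

Note that the content type is chosen by whoever calls the producer, which in this sketch is `make_windows`.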

> 
> 3) I cannot enable "send plain text all the time"
> 
> 4) I cannot choose a different diff format for the wire.
> 

This tweak should allow you to accomplish your points #1, #3, and #4.

> 2) the delta window model is "push-to-editor". the network code works best
>    with a "pull-from-source" model. in fact, we may not even get Neon to be
>    able to work with the push model, except by buffering everything into
>    memory first.

This is the only point that scares me.  Yikes.  Is this a real
roadblock?

I'm confused about this, because neon is acting as an HTTP client and
mod_dav_svn is acting as the HTTP server.  Don't HTTP clients "push"
requests at servers, and servers respond?  How could mod_dav_svn
possibly "pull" data from neon?  That seems like a backwards HTTP
model to me.  Maybe you can clarify.

>    (this buffering in memory will be happening for M2, btw; don't go commit
>     100M files unless you can hold that in memory on your client)

Heh, this is happening anyway within the client *and* fs libraries too.
It will all be fixed, no worries.  :)

Re: Change #6

Posted by Greg Stein <gs...@lyra.org>.

On Thu, Mar 29, 2001 at 06:18:39PM -0600, Ben Collins-Sussman wrote:
> Greg Stein <gs...@lyra.org> writes:
> > My comment still stands: I can see a need/use for this change. The counter
> > point seems to be theoretical. So, what is the problem with doing the
> > change?
> 
> It means a *whole* lot of rewriting -- lots of time and effort for
> little benefit.  (Unless I underestimate how hard debugging is for
> you).

I'm not suggesting we do it now. That is exactly why I said that I'll find
a workaround in the meantime. But I am still going to continue to push for
this after M2.

And I've already outlined the "benefits" (in certain cases, "fixes").
Without the change, there are some things that I simply cannot do.

Lastly, the fact that rewriting is called for does not make the change
improper. That something exists does not make it the best or the correct
design. Karl's got a phrase for this.

> Saying that "but we might be able to use sendfile() someday" is just
> as theoretical as saying "but we might break future editor
> compositions".  Neither holds real weight.  :)

Untrue. I am saying we can/will use sendfile. That is a specific use case.
Nobody has specified a use case of an editor composition that would read the
streams, so it remains theoretical.

[ and I could easily build a stream that provides for double-reading in any
  case. read N bytes into a buffer, return the buffer. next reader consumes
  the buffer. ]
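
One possible shape for such a stream, as a minimal Python sketch. The class `ForkedStream` and the reader names are hypothetical, and a real version would need to bound the memory held for a reader that falls behind.

```python
import io

class ForkedStream:
    """Lets several readers each see every byte of one underlying stream."""

    def __init__(self, source, readers):
        self.source = source
        # Bytes already read from the source but not yet seen by each reader.
        self.pending = {name: b"" for name in readers}

    def read(self, reader, n):
        # Touch the source only when this reader has nothing buffered.
        if not self.pending[reader]:
            block = self.source.read(n)
            for name in self.pending:
                self.pending[name] += block
        buffered = self.pending[reader]
        self.pending[reader] = buffered[n:]
        return buffered[:n]

fs = ForkedStream(io.BytesIO(b"hello world"), ["guard", "core"])
guard_first = fs.read("guard", 5)   # reads from the source and buffers
core_first = fs.read("core", 5)     # served from the buffer
guard_rest = fs.read("guard", 6)
core_rest = fs.read("core", 6)
```

Both readers end up with the complete data, regardless of which one reads first.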

> (And, as Karl pointed out, we can still make debugging easier for you
> by tweaking the existing system, rather than inventing a new system.
> This still isn't enough?)

It isn't just for debugging, but for actual operation.

If it was simply for debugging, then I wouldn't bring it up. I'd just slam
my system around locally and leave it at that.

All right. Let me do this one more time.

1) if I only receive delta windows, then I can't know whether the
   Content-Type is application/octet-stream ("plain text") or
   application/vnd.svn-svndiff

   (IMO, sending svndiff over the wire for plain text does not follow the
    HTTP design well; we want to send the appropriate type, and be able to
    label it properly)

2) the delta window model is "push-to-editor". the network code works best
   with a "pull-from-source" model. in fact, we may not even get Neon to be
   able to work with the push model, except by buffering everything into
   memory first.

   (this buffering in memory will be happening for M2, btw; don't go commit
    100M files unless you can hold that in memory on your client)

3) I cannot enable "send plain text all the time"

4) I cannot choose a different diff format for the wire.

-g

-- 
Greg Stein, http://www.lyra.org/

Re: Change #6

Posted by Ben Collins-Sussman <su...@newton.ch.collab.net>.

Greg Stein <gs...@lyra.org> writes:

> My comment still stands: I can see a need/use for this change. The counter
> point seems to be theoretical. So, what is the problem with doing the
> change?

It means a *whole* lot of rewriting -- lots of time and effort for
little benefit.  (Unless I underestimate how hard debugging is for
you).  

Saying that "but we might be able to use sendfile() someday" is just
as theoretical as saying "but we might break future editor
compositions".  Neither holds real weight.  :)

(And, as Karl pointed out, we can still make debugging easier for you
by tweaking the existing system, rather than inventing a new system.
This still isn't enough?)

Re: Change #6

Posted by Karl Fogel <kf...@collab.net>.

Greg Stein <gs...@lyra.org> writes:
> > However, I can see various possible workarounds for that; it
> > may not be a showstopper.)
> 
> As I mentioned to Ben just now, it is relatively straightforward to have a
> forked stream.

Yeah, that's one of the ways I was thinking.  Should be pretty easy.

> > The real issue at the moment [which I think Ben mentioned, but I
> > didn't] is that the change is a huge one, so it would be nice to get
> > M2 done first and then deal with it, that's all.
> 
> Agreed, and why I said that I'll find a workaround. And that workaround is
> to buffer the whole file into memory before delivery.

Cool.

-K

Re: Change #6

Posted by Greg Stein <gs...@lyra.org>.

On Thu, Mar 29, 2001 at 05:52:00PM -0600, Karl Fogel wrote:
>...
> That is to say, +1 on not punting the change, and on discussing it
> further, and probably +1 on finding a way to make it happen.

Sure.

> (I'm not sure how theoretical the editor composition objection is
> right now;

It is until you come up with something :-)

> there's *plenty* of editor composition happening in
> Subversion right now, and it's only a matter of time before some of
> those composees start reading from streams -- for example, in commit
> guards.

"commit guard" ?  And "a matter of time" isn't very persuasive. Yes, we do
composition, but none of them manipulate data in any way. They are simply
tracking what is happening. Thus, I don't see an obvious trend towards data
manipulation.

> However, I can see various possible workarounds for that; it
> may not be a showstopper.)

As I mentioned to Ben just now, it is relatively straightforward to have a
forked stream.

> The real issue at the moment [which I think Ben mentioned, but I
> didn't] is that the change is a huge one, so it would be nice to get
> M2 done first and then deal with it, that's all.

Agreed, and why I said that I'll find a workaround. And that workaround is
to buffer the whole file into memory before delivery.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/

Re: Change #6

Posted by Karl Fogel <kf...@collab.net>.

Greg Stein <gs...@lyra.org> writes:
> It always has been. That is part of why I came up with this change. Also,
> the change is needed so that we can use sendfile() to efficiently deliver
> content to the server. And to let the network layer determine the right
> format for the delivery. And to assist with properly setting the
> Content-Type when talking to the server.
> 
> etc etc.
> 
> I didn't say that I agree the change should be punted. Just that I will work
> around the problem for now.
>
> My comment still stands: I can see a need/use for this change. The counter
> point seems to be theoretical. So, what is the problem with doing the
> change?

+1

That is to say, +1 on not punting the change, and on discussing it
further, and probably +1 on finding a way to make it happen.

(I'm not sure how theoretical the editor composition objection is
right now; there's *plenty* of editor composition happening in
Subversion right now, and it's only a matter of time before some of
those composees start reading from streams -- for example, in commit
guards.  However, I can see various possible workarounds for that; it
may not be a showstopper.)

The real issue at the moment [which I think Ben mentioned, but I
didn't] is that the change is a huge one, so it would be nice to get
M2 done first and then deal with it, that's all.

We'll all talk about it more in person next week.

-K

Re: Change #6

Posted by Greg Stein <gs...@lyra.org>.

On Thu, Mar 29, 2001 at 05:07:38PM -0600, Karl Fogel wrote:
> Greg Stein <gs...@lyra.org> writes:
> > Hmm. That was a bit short.
> > 
> > There may be enough facilities on both sides of the network to enable delta
> > deliveries at this point. I'll poke at it.
> 
> Cool.  Btw, I had a thought:
> 
> If debuggability is a big issue for you right now (i.e., you want to

It always has been. That is part of why I came up with this change. Also,
the change is needed so that we can use sendfile() to efficiently deliver
content to the server. And to let the network layer determine the right
format for the delivery. And to assist with properly setting the
Content-Type when talking to the server.

etc etc.

I didn't say that I agree the change should be punted. Just that I will work
around the problem for now.

My comment still stands: I can see a need/use for this change. The counter
point seems to be theoretical. So, what is the problem with doing the
change?

-g

> see plaintexts flying by on the wire, so you can tell things are
> working right), then it would be pretty easy for us to make the binary
> differ have a special "debug mode".  Under debug mode, a delta between
> 
>   SOURCE  and  TARGET
> 
> would totally ignore SOURCE :-).  It would just generate windows
> containing sequential blocks of TARGET's text, and ops to insert the
> data.  (It would also not use previous bits of TARGET -- just send a
> series of data blocks, as dumb as possible.)
> 
> Just a thought; I haven't investigated implementation, but I can't
> imagine it would be that hard.
> 
> -Karl
> 
> 
> > On Wed, Mar 28, 2001 at 04:11:54PM -0800, Greg Stein wrote:
> > > Ah, I see. A practical requirement for plain text streams is overridden by a
> > > theoretical need of composing editors?
> > > 
> > > Feh.
> > > 
> > > On Wed, Mar 28, 2001 at 05:55:48PM -0600, Karl Fogel wrote:
> > > > Greg Stein <gs...@lyra.org> writes:
> > > > > On Wed, Mar 28, 2001 at 06:20:52PM -0500, Greg Hudson wrote:
> > > > > > > Euh... I still need Change #7 (from STACK) to enable sending files
> > > > > > > up to the server.
> > > > > > 
> > > > > > I don't understand.  What do copy nodes have to do with enabling
> > > > > > sending files up to the server?
> > > > > 
> > > > > Sorry. #6
> > > > 
> > > > Have you read the problem noted in that item?  Here's the relevant bit
> > > > from STACK:
> > > > 
> > > >       Editor composition becomes more difficult if we use streams.  A
> > > >       window is a discrete chunk of data that can be used by several
> > > >       consumers, but streams are different: if consumer A reads some
> > > >       data off a stream, then when consumer B reads, she'll get
> > > >       different results.  You'd have to design your streams in a funky
> > > >       way to make this not be a problem.
> > > > 
> > > >       In some circumstances, this isn't an issue.  After all, usually
> > > >       a set of composed editors is a bunch of lightweight editors,
> > > >       that don't do much, surrounding a core editor that does the real
> > > >       work.  For example, an editor that prints out filenames wrapped
> > > >       with an editor that actually updates those files.  In such
> > > >       cases, the lightweight editor simply never reads data off the
> > > >       stream, so the core editor is not deprived of anything.
> > > > 
> > > >       But other editors (say, a commit guard?) might want to actually
> > > >       examine file data.  That could have bad consequences if we
> > > >       switch from windows to streams.
> > > 
> > > -- 
> > > Greg Stein, http://www.lyra.org/
> > 
> > -- 
> > Greg Stein, http://www.lyra.org/

-- 
Greg Stein, http://www.lyra.org/

Re: Change #6

Posted by Karl Fogel <kf...@collab.net>.

Greg Stein <gs...@lyra.org> writes:
> Hmm. That was a bit short.
> 
> There may be enough facilities on both sides of the network to enable delta
> deliveries at this point. I'll poke at it.

Cool.  Btw, I had a thought:

If debuggability is a big issue for you right now (i.e., you want to
see plaintexts flying by on the wire, so you can tell things are
working right), then it would be pretty easy for us to make the binary
differ have a special "debug mode".  Under debug mode, a delta between

  SOURCE  and  TARGET

would totally ignore SOURCE :-).  It would just generate windows
containing sequential blocks of TARGET's text, and ops to insert the
data.  (It would also not use previous bits of TARGET -- just send a
series of data blocks, as dumb as possible.)

Just a thought; I haven't investigated implementation, but I can't
imagine it would be that hard.
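
A minimal sketch of what such a mode might produce, in Python and with an invented window layout. The dictionary keys, the `"new"` op name, and the functions `debug_delta` and `apply_windows` are assumptions for illustration, not the real differ's structures.

```python
def debug_delta(source, target, window_size=4):
    """Ignore SOURCE; emit one insert-new-data op per block of TARGET."""
    del source  # deliberately unused in this mode
    windows = []
    for i in range(0, len(target), window_size):
        block = target[i:i + window_size]
        # One op per window: insert len(block) bytes of new data.
        windows.append({"ops": [("new", 0, len(block))], "new_data": block})
    return windows

def apply_windows(windows):
    """Rebuild the target by executing the insert ops in order."""
    out = b""
    for w in windows:
        for op, offset, length in w["ops"]:
            assert op == "new"  # the debug mode never copies from SOURCE
            out += w["new_data"][offset:offset + length]
    return out

windows = debug_delta(b"old text", b"new target text")
rebuilt = apply_windows(windows)
```

Concatenating the `new_data` of every window gives back TARGET verbatim, which is what would make the wire traffic readable while debugging.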

-Karl


> On Wed, Mar 28, 2001 at 04:11:54PM -0800, Greg Stein wrote:
> > Ah, I see. A practical requirement for plain text streams is overridden by a
> > theoretical need of composing editors?
> > 
> > Feh.
> > 
> > On Wed, Mar 28, 2001 at 05:55:48PM -0600, Karl Fogel wrote:
> > > Greg Stein <gs...@lyra.org> writes:
> > > > On Wed, Mar 28, 2001 at 06:20:52PM -0500, Greg Hudson wrote:
> > > > > > Euh... I still need Change #7 (from STACK) to enable sending files
> > > > > > up to the server.
> > > > > 
> > > > > I don't understand.  What do copy nodes have to do with enabling
> > > > > sending files up to the server?
> > > > 
> > > > Sorry. #6
> > > 
> > > Have you read the problem noted in that item?  Here's the relevant bit
> > > from STACK:
> > > 
> > >       Editor composition becomes more difficult if we use streams.  A
> > >       window is a discrete chunk of data that can be used by several
> > >       consumers, but streams are different: if consumer A reads some
> > >       data off a stream, then when consumer B reads, she'll get
> > >       different results.  You'd have to design your streams in a funky
> > >       way to make this not be a problem.
> > > 
> > >       In some circumstances, this isn't an issue.  After all, usually
> > >       a set of composed editors is a bunch of lightweight editors,
> > >       that don't do much, surrounding a core editor that does the real
> > >       work.  For example, an editor that prints out filenames wrapped
> > >       with an editor that actually updates those files.  In such
> > >       cases, the lightweight editor simply never reads data off the
> > >       stream, so the core editor is not deprived of anything.
> > > 
> > >       But other editors (say, a commit guard?) might want to actually
> > >       examine file data.  That could have bad consequences if we
> > >       switch from windows to streams.
> > 
> > -- 
> > Greg Stein, http://www.lyra.org/
> 
> -- 
> Greg Stein, http://www.lyra.org/

Re: Change #6

Posted by Greg Stein <gs...@lyra.org>.

Hmm. That was a bit short.

There may be enough facilities on both sides of the network to enable delta
deliveries at this point. I'll poke at it.

-g

On Wed, Mar 28, 2001 at 04:11:54PM -0800, Greg Stein wrote:
> Ah, I see. A practical requirement for plain text streams is overridden by a
> theoretical need of composing editors?
> 
> Feh.
> 
> On Wed, Mar 28, 2001 at 05:55:48PM -0600, Karl Fogel wrote:
> > Greg Stein <gs...@lyra.org> writes:
> > > On Wed, Mar 28, 2001 at 06:20:52PM -0500, Greg Hudson wrote:
> > > > > Euh... I still need Change #7 (from STACK) to enable sending files
> > > > > up to the server.
> > > > 
> > > > I don't understand.  What do copy nodes have to do with enabling
> > > > sending files up to the server?
> > > 
> > > Sorry. #6
> > 
> > Have you read the problem noted in that item?  Here's the relevant bit
> > from STACK:
> > 
> >       Editor composition becomes more difficult if we use streams.  A
> >       window is a discrete chunk of data that can be used by several
> >       consumers, but streams are different: if consumer A reads some
> >       data off a stream, then when consumer B reads, she'll get
> >       different results.  You'd have to design your streams in a funky
> >       way to make this not be a problem.
> > 
> >       In some circumstances, this isn't an issue.  After all, usually
> >       a set of composed editors is a bunch of lightweight editors,
> >       that don't do much, surrounding a core editor that does the real
> >       work.  For example, an editor that prints out filenames wrapped
> >       with an editor that actually updates those files.  In such
> >       cases, the lightweight editor simply never reads data off the
> >       stream, so the core editor is not deprived of anything.
> > 
> >       But other editors (say, a commit guard?) might want to actually
> >       examine file data.  That could have bad consequences if we
> >       switch from windows to streams.
> 
> -- 
> Greg Stein, http://www.lyra.org/

-- 
Greg Stein, http://www.lyra.org/

Re: Change #6

Posted by Greg Stein <gs...@lyra.org>.

Ah, I see. A practical requirement for plain text streams is overridden by a
theoretical need of composing editors?

Feh.

On Wed, Mar 28, 2001 at 05:55:48PM -0600, Karl Fogel wrote:
> Greg Stein <gs...@lyra.org> writes:
> > On Wed, Mar 28, 2001 at 06:20:52PM -0500, Greg Hudson wrote:
> > > > Euh... I still need Change #7 (from STACK) to enable sending files
> > > > up to the server.
> > > 
> > > I don't understand.  What do copy nodes have to do with enabling
> > > sending files up to the server?
> > 
> > Sorry. #6
> 
> Have you read the problem noted in that item?  Here's the relevant bit
> from STACK:
> 
>       Editor composition becomes more difficult if we use streams.  A
>       window is a discrete chunk of data that can be used by several
>       consumers, but streams are different: if consumer A reads some
>       data off a stream, then when consumer B reads, she'll get
>       different results.  You'd have to design your streams in a funky
>       way to make this not be a problem.
> 
>       In some circumstances, this isn't an issue.  After all, usually
>       a set of composed editors is a bunch of lightweight editors,
>       that don't do much, surrounding a core editor that does the real
>       work.  For example, an editor that prints out filenames wrapped
>       with an editor that actually updates those files.  In such
>       cases, the lightweight editor simply never reads data off the
>       stream, so the core editor is not deprived of anything.
> 
>       But other editors (say, a commit guard?) might want to actually
>       examine file data.  That could have bad consequences if we
>       switch from windows to streams.

-- 
Greg Stein, http://www.lyra.org/