You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@subversion.apache.org by Julian Foad <ju...@wandisco.com> on 2010/08/16 10:55:27 UTC

Re: svn commit: r985514 - stream_move_mark()

On Sat, 2010-08-14, stefan2@apache.org wrote:
> Log:
> Extend the stream API by three functions:
> svn_stream_move_mark() to move an existing mark by some delta

Hi Stefan.

Unfortunately this is not a logical extension to the stream API.
Subversion's svn_stream_t streams support arbitrary "filtering" or
"translation" tasks that require a mostly sequential processing of the
stream.  One of the ideas embodied by the "get a mark" and "seek to a
mark" paradigm is that the stream position is not expressible by a
simple number (such as a byte offset from the start), because the stream
may be performing translation whose state at a given position depends on
the preceding content of the stream.  Therefore "get a mark" does not
return a scalar number but instead returns a stream state object which
contains the translation state of the stream as well as the "mark" of
the underlying stream if there is one.

Seeking to an arbitrary new position requires telling the stream all of
its relevant internal state at that position.  "Move by +/- N bytes"
does not seem to be an operation that can be supported.

- Julian


> svn_stream_supports_mark() tells whether getting, setting and moving marks is supported by this stream
> 
> * subversion/include/svn_io.h
>   (svn_io_move_mark_fn_t, svn_io_buffered_fn_t):
>    declare new vtable function pointers
>   (svn_stream_set_move_mark, svn_stream_set_buffered):
>    declare functions to set these vtable pointers
>   (svn_stream_supports_mark, svn_stream_move_mark, svn_stream_buffered):
>    declare new stream API functions
> 
> * subversion/libsvn_subr/stream.c
>   (svn_stream_t): extend the vtable part by the new functions
>   (svn_stream_create): add initialization code for the new vtable entries
>   (svn_stream_set_move_mark, svn_stream_set_buffered):
>    implement new vtable modifiers
>   (svn_stream_supports_mark, svn_stream_buffered, svn_stream_buffered):
>    implement new stream generic API functions
>   (move_mark_handler_empty, buffered_handler_empty, svn_stream_empty):
>    implement support for the new stream API in empty streams
>   (move_mark_handler_disown, buffered_handler_disown, svn_stream_disown):
>    implement support for the new stream API in disowned streams
>   (move_mark_handler_apr, buffered_handler_apr, stream_from_aprfile,
>    svn_stream_from_aprfile_range_readonly):
>    implement support for the new stream API in APR file based streams
>   (move_mark_handler_stringbuf, buffered_handler_stringbuf,
>    svn_stream_from_stringbuf):
>    implement support for the new stream API in stringbuf streams
> 
> * subversion/libsvn_subr/subst.c
>   (translated_stream_move_mark, translated_stream_buffered,
>    svn_subst_stream_translated):
>    implement support for the new stream API in translated streams
> 
> Modified:
>     subversion/branches/performance/subversion/include/svn_io.h
>     subversion/branches/performance/subversion/libsvn_subr/stream.c
>     subversion/branches/performance/subversion/libsvn_subr/subst.c
> 
> Modified: subversion/branches/performance/subversion/include/svn_io.h
> URL: http://svn.apache.org/viewvc/subversion/branches/performance/subversion/include/svn_io.h?rev=985514&r1=985513&r2=985514&view=diff
> ==============================================================================
> --- subversion/branches/performance/subversion/include/svn_io.h (original)
> +++ subversion/branches/performance/subversion/include/svn_io.h Sat Aug 14 15:46:13 2010
> @@ -780,6 +780,22 @@ typedef svn_error_t *(*svn_io_mark_fn_t)
>  typedef svn_error_t *(*svn_io_seek_fn_t)(void *baton,
>                                           svn_stream_mark_t *mark);
>  
> +/** Mark movement handler function for a generic stream. @see svn_stream_t 
> + * and svn_stream_move_mark().
> + *
> + * @since New in 1.7.
> + */
> +typedef svn_error_t *(*svn_io_move_mark_fn_t)(void *baton,
> +                                              svn_stream_mark_t *mark,
> +                                              apr_off_t delta);
> +
> +/** Buffer test handler function for a generic stream. @see svn_stream_t 
> + * and svn_stream_buffered().
> + *
> + * @since New in 1.7.
> + */
> +typedef svn_boolean_t (*svn_io_buffered_fn_t)(void *baton);
> +
>  /** Create a generic stream.  @see svn_stream_t. */
>  svn_stream_t *
>  svn_stream_create(void *baton,
> @@ -829,6 +845,22 @@ void
>  svn_stream_set_seek(svn_stream_t *stream,
>                      svn_io_seek_fn_t seek_fn);
>  
> +/** Set @a stream's move mark function to @a move_mark_fn
> + *
> + * @since New in 1.7.
> + */
> +void
> +svn_stream_set_move_mark(svn_stream_t *stream,
> +                         svn_io_move_mark_fn_t move_mark_fn);
> +
> +/** Set @a stream's buffer test function to @a buffered_fn
> + *
> + * @since New in 1.7.
> + */
> +void
> +svn_stream_set_buffered(svn_stream_t *stream,
> +                        svn_io_buffered_fn_t buffered_fn);
> +
>  /** Create a stream that is empty for reading and infinite for writing. */
>  svn_stream_t *
>  svn_stream_empty(apr_pool_t *pool);
> @@ -1077,6 +1109,14 @@ svn_stream_close(svn_stream_t *stream);
>  svn_error_t *
>  svn_stream_reset(svn_stream_t *stream);
>  
> +/** Returns @c TRUE if the generic @a stream supports svn_stream_mark().
> + *
> + * @see svn_stream_mark()
> + * @since New in 1.7.
> + */
> +svn_boolean_t
> +svn_stream_supports_mark(svn_stream_t *stream);
> +
>  /** Set a @a mark at the current position of a generic @a stream,
>   * which can later be sought back to using svn_stream_seek().
>   * The @a mark is allocated in @a pool.
> @@ -1102,6 +1142,27 @@ svn_stream_mark(svn_stream_t *stream,
>  svn_error_t *
>  svn_stream_seek(svn_stream_t *stream, svn_stream_mark_t *mark);
>  
> +/** Move a @a mark for a generic @a stream by delta.
> + * This function returns the #SVN_ERR_STREAM_SEEK_NOT_SUPPORTED error
> + * if the stream doesn't implement seeking.
> + *
> + * @see svn_stream_mark()
> + * @since New in 1.7.
> + */
> +svn_error_t *
> +svn_stream_move_mark(svn_stream_t *stream, 
> +                     svn_stream_mark_t *mark, 
> +                     apr_off_t delta);
> +
> +/** Return whether this generic @a stream uses internal buffering.
> + * This may be used to work around subtle differences between buffered
> + * an non-buffered APR files.
> + *
> + * @since New in 1.7.
> + */
> +svn_boolean_t 
> +svn_stream_buffered(svn_stream_t *stream);
> +
>  /** Return a writable stream which, when written to, writes to both of the
>   * underlying streams.  Both of these streams will be closed upon closure of
>   * the returned stream; use svn_stream_disown() if this is not the desired
> 
> Modified: subversion/branches/performance/subversion/libsvn_subr/stream.c
> URL: http://svn.apache.org/viewvc/subversion/branches/performance/subversion/libsvn_subr/stream.c?rev=985514&r1=985513&r2=985514&view=diff
> ==============================================================================
> --- subversion/branches/performance/subversion/libsvn_subr/stream.c (original)
> +++ subversion/branches/performance/subversion/libsvn_subr/stream.c Sat Aug 14 15:46:13 2010
> @@ -53,6 +53,8 @@ struct svn_stream_t {
>    svn_io_reset_fn_t reset_fn;
>    svn_io_mark_fn_t mark_fn;
>    svn_io_seek_fn_t seek_fn;
> +  svn_io_move_mark_fn_t move_mark_fn;
> +  svn_io_buffered_fn_t buffered_fn;
>  };
>  
>  
> @@ -71,6 +73,8 @@ svn_stream_create(void *baton, apr_pool_
>    stream->reset_fn = NULL;
>    stream->mark_fn = NULL;
>    stream->seek_fn = NULL;
> +  stream->move_mark_fn = NULL;
> +  stream->buffered_fn = NULL;
>    return stream;
>  }
>  
> @@ -119,6 +123,20 @@ svn_stream_set_seek(svn_stream_t *stream
>    stream->seek_fn = seek_fn;
>  }
>  
> +void
> +svn_stream_set_move_mark(svn_stream_t *stream, 
> +                         svn_io_move_mark_fn_t move_mark_fn)
> +{
> +  stream->move_mark_fn = move_mark_fn;
> +}
> +
> +void
> +svn_stream_set_buffered(svn_stream_t *stream, 
> +                        svn_io_buffered_fn_t buffered_fn)
> +{
> +  stream->buffered_fn = buffered_fn;
> +}
> +
>  svn_error_t *
>  svn_stream_read(svn_stream_t *stream, char *buffer, apr_size_t *len)
>  {
> @@ -144,6 +162,12 @@ svn_stream_reset(svn_stream_t *stream)
>    return stream->reset_fn(stream->baton);
>  }
>  
> +svn_boolean_t
> +svn_stream_supports_mark(svn_stream_t *stream)
> +{
> +  return stream->mark_fn == NULL ? FALSE : TRUE;
> +}
> +
>  svn_error_t *
>  svn_stream_mark(svn_stream_t *stream, svn_stream_mark_t **mark,
>                  apr_pool_t *pool)
> @@ -164,6 +188,26 @@ svn_stream_seek(svn_stream_t *stream, sv
>  }
>  
>  svn_error_t *
> +svn_stream_move_mark(svn_stream_t *stream, 
> +                     svn_stream_mark_t *mark, 
> +                     apr_off_t delta)
> +{
> +  if (stream->move_mark_fn == NULL)
> +    return svn_error_create(SVN_ERR_STREAM_SEEK_NOT_SUPPORTED, NULL, NULL);
> +
> +  return stream->move_mark_fn(stream->baton, mark, delta);
> +}
> +
> +svn_boolean_t 
> +svn_stream_buffered(svn_stream_t *stream)
> +{
> +  if (stream->buffered_fn == NULL)
> +    return FALSE;
> +
> +  return stream->buffered_fn(stream->baton);
> +}
> +
> +svn_error_t *
>  svn_stream_close(svn_stream_t *stream)
>  {
>    if (stream->close_fn == NULL)
> @@ -452,6 +496,18 @@ seek_handler_empty(void *baton, svn_stre
>    return SVN_NO_ERROR;
>  }
>  
> +static svn_error_t *
> +move_mark_handler_empty(void *baton, svn_stream_mark_t *mark, apr_off_t delta)
> +{
> +  return SVN_NO_ERROR;
> +}
> +
> +static svn_boolean_t
> +buffered_handler_empty(void *baton)
> +{
> +  return FALSE;
> +}
> +
>  
>  svn_stream_t *
>  svn_stream_empty(apr_pool_t *pool)
> @@ -464,6 +520,8 @@ svn_stream_empty(apr_pool_t *pool)
>    svn_stream_set_reset(stream, reset_handler_empty);
>    svn_stream_set_mark(stream, mark_handler_empty);
>    svn_stream_set_seek(stream, seek_handler_empty);
> +  svn_stream_set_move_mark(stream, move_mark_handler_empty);
> +  svn_stream_set_buffered(stream, buffered_handler_empty);
>    return stream;
>  }
>  
> @@ -558,6 +616,18 @@ seek_handler_disown(void *baton, svn_str
>    return svn_stream_seek(baton, mark);
>  }
>  
> +static svn_error_t *
> +move_mark_handler_disown(void *baton, svn_stream_mark_t *mark, apr_off_t delta)
> +{
> +  return svn_stream_move_mark(baton, mark, delta);
> +}
> +
> +static svn_boolean_t
> +buffered_handler_disown(void *baton)
> +{
> +  return svn_stream_buffered(baton);
> +}
> +
>  svn_stream_t *
>  svn_stream_disown(svn_stream_t *stream, apr_pool_t *pool)
>  {
> @@ -568,6 +638,8 @@ svn_stream_disown(svn_stream_t *stream, 
>    svn_stream_set_reset(s, reset_handler_disown);
>    svn_stream_set_mark(s, mark_handler_disown);
>    svn_stream_set_seek(s, seek_handler_disown);
> +  svn_stream_set_move_mark(s, move_mark_handler_disown);
> +  svn_stream_set_buffered(s, buffered_handler_disown);
>  
>    return s;
>  }
> @@ -694,6 +766,23 @@ seek_handler_apr(void *baton, svn_stream
>    return SVN_NO_ERROR;
>  }
>  
> +static svn_error_t *
> +move_mark_handler_apr(void *baton, svn_stream_mark_t *mark, apr_off_t delta)
> +{
> +  struct mark_apr *mark_apr;
> +
> +  mark_apr = (struct mark_apr *)mark;
> +  mark_apr->off += delta;
> +  return SVN_NO_ERROR;
> +}
> +
> +static svn_boolean_t
> +buffered_handler_apr(void *baton)
> +{
> +  struct baton_apr *btn = baton;
> +  return apr_file_buffer_size_get(btn->file) != 0;
> +}
> +
>  svn_error_t *
>  svn_stream_open_readonly(svn_stream_t **stream,
>                           const char *path,
> @@ -774,6 +863,8 @@ stream_from_aprfile(struct baton_apr **b
>    svn_stream_set_reset(stream, reset_handler_apr);
>    svn_stream_set_mark(stream, mark_handler_apr);
>    svn_stream_set_seek(stream, seek_handler_apr);
> +  svn_stream_set_move_mark(stream, move_mark_handler_apr);
> +  svn_stream_set_buffered(stream, buffered_handler_apr);
>  
>    /* return structures */
>    *baton = new_baton;
> @@ -896,6 +987,8 @@ svn_stream_from_aprfile_range_readonly(a
>    svn_stream_set_reset(stream, reset_handler_apr);
>    svn_stream_set_mark(stream, mark_handler_apr);
>    svn_stream_set_seek(stream, seek_handler_apr);
> +  svn_stream_set_move_mark(stream, move_mark_handler_apr);
> +  svn_stream_set_buffered(stream, buffered_handler_apr);
>  
>    if (! disown)
>      svn_stream_set_close(stream, close_handler_apr);
> @@ -1454,6 +1547,24 @@ seek_handler_stringbuf(void *baton, svn_
>    return SVN_NO_ERROR;
>  }
>  
> +static svn_error_t *
> +move_mark_handler_stringbuf(void *baton, 
> +                            svn_stream_mark_t *mark, 
> +                            apr_off_t delta)
> +{
> +  struct stringbuf_stream_mark *stringbuf_stream_mark;
> +
> +  stringbuf_stream_mark = (struct stringbuf_stream_mark *)mark;
> +  stringbuf_stream_mark->pos += delta;
> +  return SVN_NO_ERROR;
> +}
> +
> +static svn_boolean_t
> +buffered_handler_stringbuf(void *baton)
> +{
> +  return TRUE;
> +}
> +
>  svn_stream_t *
>  svn_stream_from_stringbuf(svn_stringbuf_t *str,
>                            apr_pool_t *pool)
> @@ -1473,6 +1584,8 @@ svn_stream_from_stringbuf(svn_stringbuf_
>    svn_stream_set_reset(stream, reset_handler_stringbuf);
>    svn_stream_set_mark(stream, mark_handler_stringbuf);
>    svn_stream_set_seek(stream, seek_handler_stringbuf);
> +  svn_stream_set_move_mark(stream, move_mark_handler_stringbuf);
> +  svn_stream_set_buffered(stream, buffered_handler_stringbuf);
>    return stream;
>  }
>  
> 
> Modified: subversion/branches/performance/subversion/libsvn_subr/subst.c
> URL: http://svn.apache.org/viewvc/subversion/branches/performance/subversion/libsvn_subr/subst.c?rev=985514&r1=985513&r2=985514&view=diff
> ==============================================================================
> --- subversion/branches/performance/subversion/libsvn_subr/subst.c (original)
> +++ subversion/branches/performance/subversion/libsvn_subr/subst.c Sat Aug 14 15:46:13 2010
> @@ -1222,6 +1222,42 @@ translated_stream_seek(void *baton, svn_
>    return SVN_NO_ERROR;
>  }
>  
> +/* Implements svn_io_move_mark_fn_t. */
> +static svn_error_t *
> +translated_stream_move_mark(void *baton, 
> +                            svn_stream_mark_t *mark, 
> +                            apr_off_t delta)
> +{
> +  struct translated_stream_baton *b = baton;
> +  mark_translated_t *mt = (mark_translated_t *)mark;
> +
> +  /* Flush output buffer if necessary. */
> +  if (b->written)
> +    SVN_ERR(translate_chunk(b->stream, b->out_baton, NULL, 0, b->iterpool));
> +
> +  SVN_ERR(svn_stream_move_mark(b->stream, mt->mark, delta));
> +
> +  /* Restore translation state, avoiding new allocations. */
> +  *b->in_baton = *mt->saved_baton.in_baton;
> +  *b->out_baton = *mt->saved_baton.out_baton;
> +  b->written = mt->saved_baton.written;
> +  svn_stringbuf_setempty(b->readbuf);
> +  svn_stringbuf_appendbytes(b->readbuf, mt->saved_baton.readbuf->data, 
> +                            mt->saved_baton.readbuf->len);
> +  b->readbuf_off = mt->saved_baton.readbuf_off;
> +  memcpy(b->buf, mt->saved_baton.buf, SVN__TRANSLATION_BUF_SIZE);
> +
> +  return SVN_NO_ERROR;
> +}
> +
> +/* Implements svn_io_buffered_fn_t. */
> +static svn_boolean_t
> +translated_stream_buffered(void *baton)
> +{
> +  struct translated_stream_baton *b = baton;
> +  return svn_stream_buffered(b->stream);
> +}
> +
>  svn_error_t *
>  svn_subst_read_specialfile(svn_stream_t **stream,
>                             const char *path,
> @@ -1324,6 +1360,8 @@ svn_subst_stream_translated(svn_stream_t
>    svn_stream_set_reset(s, translated_stream_reset);
>    svn_stream_set_mark(s, translated_stream_mark);
>    svn_stream_set_seek(s, translated_stream_seek);
> +  svn_stream_set_move_mark(s, translated_stream_move_mark);
> +  svn_stream_set_buffered(s, translated_stream_buffered);
>  
>    return s;
>  }
> 
> 



Re: svn commit: r985514 - stream_move_mark()

Posted by Stefan Fuhrmann <st...@alice-dsl.de>.
Julian Foad wrote:
> On Wed, 2010-08-18, Stefan Sperling wrote:
>   
>> On Tue, Aug 17, 2010 at 11:57:11PM +0200, Stefan Fuhrmann wrote:
>>     
>>> I overgeneralized my use-case here: I actually needed "move forward" only.
>>> The concept of "skip N bytes", however, is perfectly in line with
>>> the general
>>> stream semantics. Because this also fits my needs, I changed the API in
>>> r986485 accordingly.
>>>       
>> But the stream implementation may still need to read all bytes until N
>> anyway, to make sure the internal state is still valid after moving to
>> offset N.
>>
>> Of course, we could say that a stream implementation is free to optimize
>> for this use case if possible, but allow wrapper streams to override
>> the "move forward" implementation.
>>
>> For instance, a stream wrapping an APR file would support "move forward",
>> but as soon as you wrap this APR file stream with a translation stream,
>> the translation stream's semantics take over and the APR file stream
>> will effectively be read byte-per-byte when the "move forward" method
>> is called on the translation stream. This would allow the translation stream
>> to keep its internal state intact during a "move forward" operation.
>>
>> Performance benefits would then depend on the type of stream being used.
>>
>> Does this make sense?
>>     
>
> Yes, that's exactly what I was thinking. Sounds good.
I agree to all the above. The key is that reading a chunk of data
(e.g. 80 bytes) at once from a translated stream is much faster
than reading each bytes individually because most streams buffer
their data - even translated streams use a translation buffer.

Of course, it is even better if the streams can directly skip that
data but that is not a requirement for the optimization of
stream_readline.

-- Stefan^2.

Re: svn commit: r985514 - stream_move_mark()

Posted by Julian Foad <ju...@wandisco.com>.
On Wed, 2010-08-18, Stefan Sperling wrote:
> On Tue, Aug 17, 2010 at 11:57:11PM +0200, Stefan Fuhrmann wrote:
> > I overgeneralized my use-case here: I actually needed "move forward" only.
> > The concept of "skip N bytes", however, is perfectly in line with
> > the general
> > stream semantics. Because this also fits my needs, I changed the API in
> > r986485 accordingly.
> 
> But the stream implementation may still need to read all bytes until N
> anyway, to make sure the internal state is still valid after moving to
> offset N.
> 
> Of course, we could say that a stream implementation is free to optimize
> for this use case if possible, but allow wrapper streams to override
> the "move forward" implementation.
> 
> For instance, a stream wrapping an APR file would support "move forward",
> but as soon as you wrap this APR file stream with a translation stream,
> the translation stream's semantics take over and the APR file stream
> will effectively be read byte-per-byte when the "move forward" method
> is called on the translation stream. This would allow the translation stream
> to keep its internal state intact during a "move forward" operation.
> 
> Performance benefits would then depend on the type of stream being used.
> 
> Does this make sense?

Yes, that's exactly what I was thinking. Sounds good.

- Julian


Re: svn commit: r985514 - stream_move_mark()

Posted by Stefan Sperling <st...@elego.de>.
On Tue, Aug 17, 2010 at 11:57:11PM +0200, Stefan Fuhrmann wrote:
> I overgeneralized my use-case here: I actually needed "move forward" only.
> The concept of "skip N bytes", however, is perfectly in line with
> the general
> stream semantics. Because this also fits my needs, I changed the API in
> r986485 accordingly.

But the stream implementation may still need to read all bytes until N
anyway, to make sure the internal state is still valid after moving to
offset N.

Of course, we could say that a stream implementation is free to optimize
for this use case if possible, but allow wrapper streams to override
the "move forward" implementation.

For instance, a stream wrapping an APR file would support "move forward",
but as soon as you wrap this APR file stream with a translation stream,
the translation stream's semantics take over and the APR file stream
will effectively be read byte-per-byte when the "move forward" method
is called on the translation stream. This would allow the translation stream
to keep its internal state intact during a "move forward" operation.

Performance benefits would then depend on the type of stream being used.

Does this make sense?

> The problem is that a lot of parser code would need to change at least its
> function signatures to string buffers and offsets instead of
> streams. I'm not
> even sure that all streams are based on strings; some may be parsed on
> APR file as well and use the same parser code (e.g. read hash from stream).

Sure, that's what I meant. The patch code for instance uses multiple streams
all mapped onto a subset of a file, and each such stream does special
filtering and transformations on the text read from the file.
See the definition of svn_hunk_t in include/svn_diff.h.

Stefan

Re: svn commit: r985514 - stream_move_mark()

Posted by Stefan Fuhrmann <st...@alice-dsl.de>.
Stefan Sperling wrote:
> On Mon, Aug 16, 2010 at 10:04:31AM +0100, Julian Foad wrote:
>   
>> On Mon, 2010-08-16 at 09:55 +0100, Julian Foad wrote:
>>     
>>> On Sat, 2010-08-14, stefan2@apache.org wrote:
>>>       
>>>> Log:
>>>> Extend the stream API by three functions:
>>>> svn_stream_move_mark() to move an existing mark by some delta
>>>>         
>>> Hi Stefan.
>>>
>>> Unfortunately this is not a logical extension to the stream API.
>>> Subversion's svn_stream_t streams support arbitrary "filtering" or
>>> "translation" tasks that require a mostly sequential processing of the
>>> stream.  One of the ideas embodied by the "get a mark" and "seek to a
>>> mark" paradigm is that the stream position is not expressible by a
>>> simple number (such as a byte offset from the start), because the stream
>>> may be performing translation whose state at a given position depends on
>>> the preceding content of the stream.  Therefore "get a mark" does not
>>> return a scalar number but instead returns a stream state object which
>>> contains the translation state of the stream as well as the "mark" of
>>> the underlying stream if there is one.
>>>
>>> Seeking to an arbitrary new position requires telling the stream all of
>>> its relevant internal state at that position.  "Move by +/- N bytes"
>>> does not seem to be an operation that can be supported.
>>>       
I overgeneralized my use-case here: I actually needed "move forward" only.
The concept of "skip N bytes", however, is perfectly in line with the 
general
stream semantics. Because this also fits my needs, I changed the API in
r986485 accordingly.
>> ... supported by all streams that support mark & seek.  Of course some
>> streams could support it.
>>
>> I don't like to see an API be complicated by lots of functions that may
>> or may not be supported, and functions to find out whether they are
>> supported.  I would ask you to look for another way to realize most of
>> the speed gain that you were hoping to get from this extension.
>>     
The impact of this change is pretty large as all FSFS structures except for
the actual delta info will be processed by parsers using stream_readline().
Reading that data byte-by-byte from the APR file is by far the most 
expensive
part of that parsing process. So, I would like to see that API extension
being accepted.
> If it provides a real performance advantage, maybe we should just switch
> some APIs which use streams to APR file handles where it makes sense.
> Files can be seeked via byte-offsets.
>
> The stream abstraction is nice, but it can get in the way.
> Even mark/seek support as currently implemented is already beyond the
> original idea of the stream abstraction. But we needed it for the patch code
> which is using streams to provide an interface for reading text from patch
> files in various ways. It's a nice abstraction for that purpose and should
> remain, but maybe Stefan Fuhrmann has good reasons for not using the
> stream abstraction elsewhere. 
>   
The problem is that a lot of parser code would need to change at least its
function signatures to string buffers and offsets instead of streams. 
I'm not
even sure that all streams are based on strings; some may be parsed on
APR file as well and use the same parser code (e.g. read hash from stream).

IOW, the least risky alternative approach would probably be the introduction
of a "kind-of-stream" concept that supports the extra functionality.

-- Stefan^2.

Re: svn commit: r985514 - stream_move_mark()

Posted by Stefan Sperling <st...@elego.de>.
On Mon, Aug 16, 2010 at 10:04:31AM +0100, Julian Foad wrote:
> On Mon, 2010-08-16 at 09:55 +0100, Julian Foad wrote:
> > On Sat, 2010-08-14, stefan2@apache.org wrote:
> > > Log:
> > > Extend the stream API by three functions:
> > > svn_stream_move_mark() to move an existing mark by some delta
> > 
> > Hi Stefan.
> > 
> > Unfortunately this is not a logical extension to the stream API.
> > Subversion's svn_stream_t streams support arbitrary "filtering" or
> > "translation" tasks that require a mostly sequential processing of the
> > stream.  One of the ideas embodied by the "get a mark" and "seek to a
> > mark" paradigm is that the stream position is not expressible by a
> > simple number (such as a byte offset from the start), because the stream
> > may be performing translation whose state at a given position depends on
> > the preceding content of the stream.  Therefore "get a mark" does not
> > return a scalar number but instead returns a stream state object which
> > contains the translation state of the stream as well as the "mark" of
> > the underlying stream if there is one.
> > 
> > Seeking to an arbitrary new position requires telling the stream all of
> > its relevant internal state at that position.  "Move by +/- N bytes"
> > does not seem to be an operation that can be supported.
> 
> ... supported by all streams that support mark & seek.  Of course some
> streams could support it.
> 
> I don't like to see an API be complicated by lots of functions that may
> or may not be supported, and functions to find out whether they are
> supported.  I would ask you to look for another way to realize most of
> the speed gain that you were hoping to get from this extension.

If it provides a real performance advantage, maybe we should just switch
some APIs which use streams to APR file handles where it makes sense.
Files can be seeked via byte-offsets.

The stream abstraction is nice, but it can get in the way.
Even mark/seek support as currently implemented is already beyond the
original idea of the stream abstraction. But we needed it for the patch code
which is using streams to provide an interface for reading text from patch
files in various ways. It's a nice abstraction for that purpose and should
remain, but maybe Stefan Fuhrmann has good reasons for not using the
stream abstraction elsewhere. 

Stefan

Re: svn commit: r985514 - stream_move_mark()

Posted by Julian Foad <ju...@wandisco.com>.
On Mon, 2010-08-16 at 09:55 +0100, Julian Foad wrote:
> On Sat, 2010-08-14, stefan2@apache.org wrote:
> > Log:
> > Extend the stream API by three functions:
> > svn_stream_move_mark() to move an existing mark by some delta
> 
> Hi Stefan.
> 
> Unfortunately this is not a logical extension to the stream API.
> Subversion's svn_stream_t streams support arbitrary "filtering" or
> "translation" tasks that require a mostly sequential processing of the
> stream.  One of the ideas embodied by the "get a mark" and "seek to a
> mark" paradigm is that the stream position is not expressible by a
> simple number (such as a byte offset from the start), because the stream
> may be performing translation whose state at a given position depends on
> the preceding content of the stream.  Therefore "get a mark" does not
> return a scalar number but instead returns a stream state object which
> contains the translation state of the stream as well as the "mark" of
> the underlying stream if there is one.
> 
> Seeking to an arbitrary new position requires telling the stream all of
> its relevant internal state at that position.  "Move by +/- N bytes"
> does not seem to be an operation that can be supported.

... supported by all streams that support mark & seek.  Of course some
streams could support it.

I don't like to see an API be complicated by lots of functions that may
or may not be supported, and functions to find out whether they are
supported.  I would ask you to look for another way to realize most of
the speed gain that you were hoping to get from this extension.

- Julian


[...]
> > +/** Returns @c TRUE if the generic @a stream supports svn_stream_mark().
> > + *
> > + * @see svn_stream_mark()
> > + * @since New in 1.7.
> > + */
> > +svn_boolean_t
> > +svn_stream_supports_mark(svn_stream_t *stream);
> > +
> >  /** Set a @a mark at the current position of a generic @a stream,
> >   * which can later be sought back to using svn_stream_seek().
> >   * The @a mark is allocated in @a pool.
> > @@ -1102,6 +1142,27 @@ svn_stream_mark(svn_stream_t *stream,
> >  svn_error_t *
> >  svn_stream_seek(svn_stream_t *stream, svn_stream_mark_t *mark);
> >  
> > +/** Move a @a mark for a generic @a stream by delta.
> > + * This function returns the #SVN_ERR_STREAM_SEEK_NOT_SUPPORTED error
> > + * if the stream doesn't implement seeking.
> > + *
> > + * @see svn_stream_mark()
> > + * @since New in 1.7.
> > + */
> > +svn_error_t *
> > +svn_stream_move_mark(svn_stream_t *stream, 
> > +                     svn_stream_mark_t *mark, 
> > +                     apr_off_t delta);