You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@subversion.apache.org by Greg Hudson <gh...@MIT.EDU> on 2002/05/11 15:30:50 UTC

Re: svn commit: rev 1912 - trunk/subversion/include trunk/subversion/libsvn_subr trunk/subversion/libsvn_repos

On Thu, 2002-05-09 at 00:23, sussman@tigris.org wrote:
> +  while (1)
> +    {
> +      numbytes = 1;
> +      SVN_ERR (svn_stream_read (stream, &c, &numbytes));
> +
> +      if ((c == '\n'))
> +        break;
> +
> +      svn_stringbuf_appendbytes (str, &c, 1);
> +    }

This seems likely to be a real performance killer.

I think you need to create a buffered stream object if you want to
abstract out line-reading.  Or maybe just reevaluate whether you really
need an svn_stream here, instead of something less generic like an APR
file.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: svn commit: rev 1912 - trunk/subversion/include trunk/subversion/libsvn_subr trunk/subversion/libsvn_repos

Posted by Karl Fogel <kf...@newton.ch.collab.net>.
Greg Hudson <gh...@MIT.EDU> writes:
> Sure, there's buffering going on before we hit the actual system call
> layer.  But for each byte in the dump file, you're performing a function
> call (svn_stream_read), a function call through a pointer (the stream's
> read function), and another function call (fread), with a certain amount
> of extra gook in each.  Since dump files are large, I imagine that will
> add up to a lot of cycles.
> 
> Well, feel free to press on as you are, but if reading dumps is slow,
> that would be the first place I'd look.

I think your last paragraph really sums it up -- this is something to
remember for optimization later, but not necessarily to change now.

-K


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: svn commit: rev 1912 - trunk/subversion/include trunk/subversion/libsvn_subr trunk/subversion/libsvn_repos

Posted by Ben Collins-Sussman <su...@collab.net>.
Greg Hudson <gh...@MIT.EDU> writes:

> On Sat, 2002-05-11 at 12:48, Ben Collins-Sussman wrote:
> > Well, I'm certain that we want our dumper and loader to operate on
> > generic svn_stream_t's.  We want maximum flexibility.
> 
> Why?

Purely on faith that gstein had a purpose in mind when he requested
streams instead of apr_file_t's.  I think he imagines that someday we
can do dumps or loads over network pipes or something.  

> 
> > We certainly *could* write a buffered stream_t, but at the moment, I
> > don't think we have a real problem.  The stream being read by the
> > loader is usually a stdio FILE * anyway, which means the operating
> > system is already buffering for us, no?
> 
> Sure, there's buffering going on before we hit the actual system call
> layer.  But for each byte in the dump file, you're performing a function
> call (svn_stream_read), a function call through a pointer (the stream's
> read function), and another function call (fread), with a certain amount
> of extra gook in each.  Since dump files are large, I imagine that will
> add up to a lot of cycles.

Hmmm, true.   

However -- the "slow" readline functionality is really only being used
to read header blocks and property lists.  And these things, I reckon,
only make up maybe 10% of the data in a dumpstream.  The other 90% of
it is fulltexts, which we're slurping up in huge chunks of 102K at a
time.

And indeed, I don't notice any outrageous slowness.  Earlier today, I
tried doing this:

  $ svnadmin dump repo-bkp-1840 | svnadmin load newrepo

And it got about 100 revisions in pretty quickly.  The speed seemed
quite fast.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: svn commit: rev 1912 - trunk/subversion/include trunk/subversion/libsvn_subr trunk/subversion/libsvn_repos

Posted by Greg Hudson <gh...@MIT.EDU>.
On Sat, 2002-05-11 at 12:48, Ben Collins-Sussman wrote:
> Well, I'm certain that we want our dumper and loader to operate on
> generic svn_stream_t's.  We want maximum flexibility.

Why?

> We certainly *could* write a buffered stream_t, but at the moment, I
> don't think we have a real problem.  The stream being read by the
> loader is usually a stdio FILE * anyway, which means the operating
> system is already buffering for us, no?

Sure, there's buffering going on before we hit the actual system call
layer.  But for each byte in the dump file, you're performing a function
call (svn_stream_read), a function call through a pointer (the stream's
read function), and another function call (fread), with a certain amount
of extra gook in each.  Since dump files are large, I imagine that will
add up to a lot of cycles.

Well, feel free to press on as you are, but if reading dumps is slow,
that would be the first place I'd look.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: svn commit: rev 1912 - trunk/subversion/include trunk/subversion/libsvn_subr trunk/subversion/libsvn_repos

Posted by Ben Collins-Sussman <su...@collab.net>.
Greg Hudson <gh...@MIT.EDU> writes:

> On Thu, 2002-05-09 at 00:23, sussman@tigris.org wrote:
> > +  while (1)
> > +    {
> > +      numbytes = 1;
> > +      SVN_ERR (svn_stream_read (stream, &c, &numbytes));
> > +
> > +      if ((c == '\n'))
> > +        break;
> > +
> > +      svn_stringbuf_appendbytes (str, &c, 1);
> > +    }
> 
> This seems likely to be a real performance killer.
> 
> I think you need to create a buffered stream object if you want to
> abstract out line-reading.  Or maybe just reevaluate whether you really
> need an svn_stream here, instead of something less generic like an APR
> file.

Well, I'm certain that we want our dumper and loader to operate on
generic svn_stream_t's.  We want maximum flexibility.

We certainly *could* write a buffered stream_t, but at the moment, I
don't think we have a real problem.  The stream being read by the
loader is usually a stdio FILE * anyway, which means the operating
system is already buffering for us, no?

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org