Posted to dev@subversion.apache.org by km...@rockwellcollins.com on 2007/12/13 22:59:07 UTC

numerous small 512 byte working copy reads and writes

(First, let me mention that I realize it is not advised to keep a working
copy on a NAS.)

While taking a network trace of an update to a working copy stored on a
NAS, we noticed a lot of small 512-byte reads and writes on the network.

Curious, I then fired up the Sysinternals Filemon utility and had it
monitor read and write access for svn.exe during a working copy update of
one file on my local PC.  It showed a large number of these small 512-byte
file accesses.

For example:

READ    C:\docs\.svn\tmp\text-base\svn-book.pdf.svn-base        SUCCESS Offset: 0 Length: 512
WRITE   C:\docs\.svn\tmp\svn-book.pdf.tmp.tmp                   SUCCESS Offset: 0 Length: 512
READ    C:\docs\.svn\tmp\text-base\svn-book.pdf.svn-base        SUCCESS Offset: 512 Length: 512
WRITE   C:\docs\.svn\tmp\svn-book.pdf.tmp.tmp                   SUCCESS Offset: 512 Length: 512
READ    C:\docs\.svn\tmp\text-base\svn-book.pdf.svn-base        SUCCESS Offset: 1024 Length: 512
WRITE   C:\docs\.svn\tmp\svn-book.pdf.tmp.tmp                   SUCCESS Offset: 1024 Length: 512
READ    C:\docs\.svn\tmp\text-base\svn-book.pdf.svn-base        SUCCESS Offset: 1536 Length: 512
WRITE   C:\docs\.svn\tmp\svn-book.pdf.tmp.tmp                   SUCCESS Offset: 1536 Length: 512
READ    C:\docs\.svn\tmp\text-base\svn-book.pdf.svn-base        SUCCESS Offset: 2048 Length: 512
WRITE   C:\docs\.svn\tmp\svn-book.pdf.tmp.tmp                   SUCCESS Offset: 2048 Length: 512
READ    C:\docs\.svn\tmp\text-base\svn-book.pdf.svn-base        SUCCESS Offset: 2560 Length: 512
WRITE   C:\docs\.svn\tmp\svn-book.pdf.tmp.tmp                   SUCCESS Offset: 2560 Length: 512
READ    C:\docs\.svn\tmp\text-base\svn-book.pdf.svn-base        SUCCESS Offset: 3072 Length: 512
WRITE   C:\docs\.svn\tmp\svn-book.pdf.tmp.tmp                   SUCCESS Offset: 3072 Length: 512
...
READ    C:\docs\.svn\tmp\text-base\svn-book.pdf.svn-base        SUCCESS Offset: 1351168 Length: 512
WRITE   C:\docs\.svn\tmp\svn-book.pdf.tmp.tmp                   SUCCESS Offset: 1351168 Length: 512


A few things I noticed:

1) A file is named .tmp.tmp and stored in a "tmp" directory.
   Isn't that a little redundant?

2) It appears to be reading the text-base file 512 bytes at a time
   and then writing the (same???) 512 bytes out to the temp file.
   Isn't that buffer size fairly small and inefficient?
   It is probably not noticeable on a local disk, but it has quite an
   impact on a high-latency network, where each 512-byte packet
   is acknowledged before the next one is transmitted.
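
To put rough numbers on point 2, here is a small sketch that counts the
requests needed for a few buffer sizes. It assumes the last traced line
(Offset: 1351168) is the final chunk of the file, and the 1 ms round-trip
time is a made-up figure for illustration, not a measurement. The function
names are hypothetical.

```python
import math

# Size implied by the trace, ASSUMING the last traced line
# (Offset: 1351168 Length: 512) is the final chunk of the file.
FILE_SIZE = 1351168 + 512


def request_count(file_size, bufsize):
    """Number of read (or write) calls needed to move file_size bytes."""
    return math.ceil(file_size / bufsize)


def estimated_wait_seconds(file_size, bufsize, round_trip_s):
    """Time spent waiting if every read and every write costs one full
    round trip and nothing is pipelined (a deliberate simplification)."""
    return 2 * request_count(file_size, bufsize) * round_trip_s


if __name__ == "__main__":
    HYPOTHETICAL_RTT = 0.001  # 1 ms; an assumed figure, not measured
    for bufsize in (512, 4096, 16384):
        n = request_count(FILE_SIZE, bufsize)
        t = estimated_wait_seconds(FILE_SIZE, bufsize, HYPOTHETICAL_RTT)
        print(f"{bufsize:>6} bytes: {n} reads + {n} writes, ~{t:.2f} s waiting")
```

Under those assumptions the 512-byte buffer needs 2640 reads and 2640
writes for this one file, which is where a high-latency link would hurt.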

I'm hoping someone familiar with the working copy code can comment on
this behavior.  I'd be willing to dig into this a little deeper, provided
someone that knows the code better doesn't give a valid reason for the
small buffer sizes.  (I'll admit I am completely ignorant of the
working copy code, but I'm always willing to learn.)


Command that was run:
  svn up -r HEAD docs\svn-book.pdf
  (After I had run an "svn up -r PREV docs\svn-book.pdf")

Here is the Windows client info (the standard CollabNet Windows one):

C:\>svn --version
svn, version 1.4.3 (r23084)
   compiled Jan 18 2007, 07:47:40

Server is running Subversion 1.4.5 on Windows 2003.

Thanks!
Kevin R.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: numerous small 512 byte working copy reads and writes

Posted by Erik Huelsmann <eh...@gmail.com>.

On Dec 18, 2007 4:54 PM,  <km...@rockwellcollins.com> wrote:
> "Norbert Unterberg" <nu...@gmail.com> wrote on 12/18/2007 06:27:06 AM:
>
> > On Dec 18, 2007 12:05 PM, Erik Huelsmann <eh...@gmail.com> wrote:
> > >
> > > On 12/18/07, Norbert Unterberg <nu...@gmail.com> wrote:
> > > > You are on the wrong track here. The problem is not what BUFSIZ is
> > > > defined on the different platforms but how APR uses it.
> > > > BUFSIZ is the size of the buffer that can be set for buffered streams
> > > > with the setbuf() call. Nothing more. MSDN just says: "BUFSIZ is the
> > > > required user-allocated buffer for the setvbuf routine." BUFSIZ is NOT
> > > > a suggestion for a buffer size when dealing with large binary files.
> > > > So the bug is in APR where BUFSIZ is used for something it was not
> > > > designed for.
> > > >
> > > > This has already been discussed for years on this list, but nothing
> > > > has changed:
> > > > http://subversion.tigris.org/servlets/ReadMsg?listName=dev&msgNo=82087
> > > > http://subversion.tigris.org/servlets/ReadMsg?listName=dev&msgNo=113549
> > > >
> > > > So it is clearly an apr and not a windows issue.
> > >
> > > Providing a patch might help:
> >
> > Thank you for the invitation, but no thank you. Diagnosing if someone
> > is ill and actually performing the operation are two different things,
> > and I currently do not want to jump over the high hurdles of actually
> > creating a politically and stylistically correct patch to an APR
> > library function!
> >
> > > They define a buffer size of their own
> > > (APR_FILE_BUFSIZE, defined to 4096 on Win32) which should probably be
> > > used for binary file copies since it's used for buffered files as
> > > well.
> >
> > As already suggested three years ago, why not call the native
> > CopyFile() function on Windows which will likely have best possible
> > performance without fiddling with different buffer sizes?

Well, that may work for *copying* a file, but the same source file
(copy.c) also provides *appending* to files, leaving the same issues
to users of the APR api, thereby not having solved anything. I think
it would be better to address the problem integrally (for example by
choosing an applicable static buffer size, or better yet, by
identifying a good means of dynamically determining the right size).
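
To make the copy/append point concrete, here is a minimal sketch of one
transfer loop serving both operations, with the buffer size as a single
tunable. It is written in Python for brevity and is not APR's code; the
function name, the `append` flag, and the 16384 default are assumptions
chosen for illustration.

```python
import os
import tempfile


def transfer_contents(src_path, dst_path, append=False, bufsize=16384):
    """Copy src onto dst, or append src to dst, in bufsize chunks.

    Returns the number of non-empty reads performed, so the effect of
    the buffer size on the request count is visible to the caller.
    """
    mode = "ab" if append else "wb"
    reads = 0
    # buffering=0 gives raw, unbuffered files, so each read() below maps
    # to one request to the operating system.
    with open(src_path, "rb", buffering=0) as src, \
         open(dst_path, mode, buffering=0) as dst:
        while True:
            chunk = src.read(bufsize)
            if not chunk:
                break
            reads += 1
            # A raw write may be partial; keep writing until the chunk
            # has been fully handed to the operating system.
            view = memoryview(chunk)
            while view:
                written = dst.write(view)
                view = view[written:]
    return reads
```

Because copy and append share the loop, changing `bufsize` in one place
changes both, which a CopyFile()-only shortcut would not do.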

> Nice to know I've stumbled across a 3 year old problem.  I can't believe
> file copying is implemented so potentially inefficiently.

So it seems, yes. But implying that nobody cares (enough to fix it)
because the problem was discussed here and nothing changed won't fly:
APR is a separate project and we need to make them aware of the
problem. Including a proposed fix usually helps resolve the issue.

> I just checked apr-1.2.12, and it is still using BUFSIZ in the
> apr_file_transfer_contents() routine.

I found the same, yes.

> I may submit a patch to the APR people to see what they think.  However,
> I really wonder if Subversion should just be doing its own thing here
> until/when APR is improved.  Subversion does a lot of file copying and
> this could be a big working copy performance improvement on slower
> access devices.

Nope. Subversion shouldn't do its own thing. We use APR as our
portability layer, and so do others. As soon as we find problems, we
should take them up with APR. If they fix it, everybody benefits. If
they don't, *then* we get to consider doing it ourselves.

> Since I have a unix build environment (not windows where I observed this
> behavior), I may do some adhoc testing over nfs to see if/how it really
> affects working copy performance.

Of course a good deal of performance testing helps in arguing the case
with the APR folks, so please do, and please share the results of
your efforts!

bye,


Erik.

Re: numerous small 512 byte working copy reads and writes

Posted by km...@rockwellcollins.com.

"Norbert Unterberg" <nu...@gmail.com> wrote on 12/18/2007 06:27:06 AM:
> On Dec 18, 2007 12:05 PM, Erik Huelsmann <eh...@gmail.com> wrote:
> >
> > On 12/18/07, Norbert Unterberg <nu...@gmail.com> wrote:
> > > You are on the wrong track here. The problem is not what BUFSIZ is
> > > defined on the different platforms but how APR uses it.
> > > BUFSIZ is the size of the buffer that can be set for buffered streams
> > > with the setbuf() call. Nothing more. MSDN just says: "BUFSIZ is the
> > > required user-allocated buffer for the setvbuf routine." BUFSIZ is NOT
> > > a suggestion for a buffer size when dealing with large binary files.
> > > So the bug is in APR where BUFSIZ is used for something it was not
> > > designed for.
> > >
> > > This has already been discussed for years on this list, but nothing
> > > has changed:
> > > http://subversion.tigris.org/servlets/ReadMsg?listName=dev&msgNo=82087
> > > http://subversion.tigris.org/servlets/ReadMsg?listName=dev&msgNo=113549
> > >
> > > So it is clearly an apr and not a windows issue.
> >
> > Providing a patch might help:
> 
> Thank you for the invitation, but no thank you. Diagnosing if someone
> is ill and actually performing the operation are two different things,
> and I currently do not want to jump over the high hurdles of actually
> creating a politically and stylistically correct patch to an APR
> library function!
> 
> > They define a buffer size of their own
> > (APR_FILE_BUFSIZE, defined to 4096 on Win32) which should probably be
> > used for binary file copies since it's used for buffered files as
> > well.
> 
> As already suggested three years ago, why not call the native
> CopyFile() function on Windows which will likely have best possible
> performance without fiddling with different buffer sizes?

Nice to know I've stumbled across a 3 year old problem.  I can't believe
file copying is implemented so potentially inefficiently.

I just checked apr-1.2.12, and it is still using BUFSIZ in the
apr_file_transfer_contents() routine.

I may submit a patch to the APR people to see what they think.  However,
I really wonder if Subversion should just be doing its own thing here
until APR is improved.  Subversion does a lot of file copying and
this could be a big working copy performance improvement on slower
access devices.

Since I have a Unix build environment (not Windows, where I observed this
behavior), I may do some ad hoc testing over NFS to see if and how it
really affects working copy performance.

Kevin R.

Re: numerous small 512 byte working copy reads and writes

Posted by Norbert Unterberg <nu...@gmail.com>.

On Dec 18, 2007 12:05 PM, Erik Huelsmann <eh...@gmail.com> wrote:
>
> On 12/18/07, Norbert Unterberg <nu...@gmail.com> wrote:
> > You are on the wrong track here. The problem is not what BUFSIZ is
> > defined on the different platforms but how APR uses it.
> > BUFSIZ is the size of the buffer that can be set for buffered streams
> > with the setbuf() call. Nothing more. MSDN just says: "BUFSIZ is the
> > required user-allocated buffer for the setvbuf routine." BUFSIZ is NOT
> > a suggestion for a buffer size when dealing with large binary files.
> > So the bug is in APR where BUFSIZ is used for something it was not designed for.
> >
> > This has already been discussed for years on this list, but nothing has changed:
> > http://subversion.tigris.org/servlets/ReadMsg?listName=dev&msgNo=82087
> > http://subversion.tigris.org/servlets/ReadMsg?listName=dev&msgNo=113549
> >
> > So it is clearly an apr and not a windows issue.
>
> Providing a patch might help:

Thank you for the invitation, but no thank you. Diagnosing if someone
is ill and actually performing the operation are two different things,
and I currently do not want to jump over the high hurdles of actually
creating a politically and stylistically correct patch to an APR
library function!

> They define a buffer size of their own
> (APR_FILE_BUFSIZE, defined to 4096 on Win32) which should probably be
> used for binary file copies since it's used for buffered files as
> well.

As already suggested three years ago, why not call the native
CopyFile() function on Windows which will likely have best possible
performance without fiddling with different buffer sizes?

Norbert

Re: numerous small 512 byte working copy reads and writes

Posted by km...@rockwellcollins.com.

"Erik Huelsmann" <eh...@gmail.com> wrote on 12/18/2007 05:05:01 AM:
> On 12/18/07, Norbert Unterberg <nu...@gmail.com> wrote:
> > On Dec 17, 2007 9:07 PM,  <km...@rockwellcollins.com> wrote:
> > > "Erik Huelsmann" <eh...@gmail.com> wrote on 12/17/2007 04:59:58 AM:
> >
> > > > The buffer size comes from BUFSIZ which is used to copy files in APR.
> > > > They probably use BUFSIZ because the definition of the constant is
> > > > that it should return a buffer size for efficiently doing file IO.
> > > >
> > > > Probably the CRT he's using is defining BUFSIZ to 512... (Which I
> > > > agree, is quite small...)
> > >
> > > Thanks for the pointers.  I was wondering if it could be an APR issue.
> > >
> > > I haven't yet verified, but I assume the windows command line client is
> > > compiled with visual studio, which would use the standard Microsoft C
> > > runtime.
> >
> > You are on the wrong track here. The problem is not what BUFSIZ is
> > defined on the different platforms but how APR uses it.
> > BUFSIZ is the size of the buffer that can be set for buffered streams
> > with the setbuf() call. Nothing more. MSDN just says: "BUFSIZ is the
> > required user-allocated buffer for the setvbuf routine." BUFSIZ is NOT
> > a suggestion for a buffer size when dealing with large binary files.
> > So the bug is in APR where BUFSIZ is used for something it was not
> > designed for.
> >
> > This has already been discussed for years on this list, but nothing
> > has changed:
> > http://subversion.tigris.org/servlets/ReadMsg?listName=dev&msgNo=82087
> > http://subversion.tigris.org/servlets/ReadMsg?listName=dev&msgNo=113549
> >
> > So it is clearly an apr and not a windows issue.
> 
> Providing a patch might help: They define a buffer size of their own
> (APR_FILE_BUFSIZE, defined to 4096 on Win32) which should probably be
> used for binary file copies since it's used for buffered files as
> well.
> 
> 
> Though 16k seemed to perform best on Win32 a year or 2 ago, when
> tested by an svn dev; while no negative impact was detected on Unix
> systems when shrinking from larger sizes to this relatively small
> buffer size. All this was done on local filesystems.

I submitted a bug report to APR asking them to increase the buffer size.
http://issues.apache.org/bugzilla/show_bug.cgi?id=44193

Kevin R.

Re: numerous small 512 byte working copy reads and writes

Posted by Erik Huelsmann <eh...@gmail.com>.

On 12/18/07, Norbert Unterberg <nu...@gmail.com> wrote:
> On Dec 17, 2007 9:07 PM,  <km...@rockwellcollins.com> wrote:
> > "Erik Huelsmann" <eh...@gmail.com> wrote on 12/17/2007 04:59:58 AM:
>
> > > The buffer size comes from BUFSIZ which is used to copy files in APR.
> > > They probably use BUFSIZ because the definition of the constant is
> > > that it should return a buffer size for efficiently doing file IO.
> > >
> > > Probably the CRT he's using is defining BUFSIZ to 512... (Which I
> > > agree, is quite small...)
> >
> > Thanks for the pointers.  I was wondering if it could be an APR issue.
> >
> > I haven't yet verified, but I assume the windows command line client is
> > compiled with visual studio, which would use the standard Microsoft C
> > runtime.
>
> You are on the wrong track here. The problem is not what BUFSIZ is
> defined on the different platforms but how APR uses it.
> BUFSIZ is the size of the buffer that can be set for buffered streams
> with the setbuf() call. Nothing more. MSDN just says: "BUFSIZ is the
> required user-allocated buffer for the setvbuf routine." BUFSIZ is NOT
> a suggestion for a buffer size when dealing with large binary files.
> So the bug is in APR where BUFSIZ is used for something it was not designed for.
>
> This has already been discussed for years on this list, but nothing has changed:
> http://subversion.tigris.org/servlets/ReadMsg?listName=dev&msgNo=82087
> http://subversion.tigris.org/servlets/ReadMsg?listName=dev&msgNo=113549
>
> So it is clearly an apr and not a windows issue.

Providing a patch might help: They define a buffer size of their own
(APR_FILE_BUFSIZE, defined to 4096 on Win32) which should probably be
used for binary file copies since it's used for buffered files as
well.


That said, 16k seemed to perform best on Win32 when an svn dev tested
it a year or two ago, and no negative impact was detected on Unix
systems when shrinking from larger sizes to this relatively small
buffer size. All of this was done on local filesystems.
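
For anyone wanting to repeat that kind of comparison, here is a rough
timing harness. It is a sketch only: the function names, the candidate
sizes, and the repeat count are assumptions, it is not the test the svn
dev ran, and on a local disk the OS cache will dominate the numbers.

```python
import os
import shutil
import tempfile
import time


def timed_copy(src_path, dst_path, bufsize):
    """Copy src to dst with unbuffered reads of bufsize bytes and
    return the elapsed wall-clock time in seconds."""
    start = time.perf_counter()
    # buffering=0 keeps Python's own stream buffer out of the way, so
    # bufsize is the size of each request sent to the operating system.
    with open(src_path, "rb", buffering=0) as src, \
         open(dst_path, "wb", buffering=0) as dst:
        shutil.copyfileobj(src, dst, bufsize)
    return time.perf_counter() - start


def compare_buffer_sizes(src_path, sizes=(512, 4096, 16384, 65536), repeats=3):
    """Return {bufsize: best time in seconds} over `repeats` runs each."""
    results = {}
    fd, dst_path = tempfile.mkstemp()
    os.close(fd)
    try:
        for bufsize in sizes:
            results[bufsize] = min(
                timed_copy(src_path, dst_path, bufsize) for _ in range(repeats)
            )
    finally:
        os.remove(dst_path)
    return results
```

To see the effect discussed in this thread, the source and destination
would need to sit on the NAS or NFS mount rather than on a local disk.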

HTH,

Erik.

Re: numerous small 512 byte working copy reads and writes

Posted by Norbert Unterberg <nu...@gmail.com>.

On Dec 17, 2007 9:07 PM,  <km...@rockwellcollins.com> wrote:
> "Erik Huelsmann" <eh...@gmail.com> wrote on 12/17/2007 04:59:58 AM:

> > The buffer size comes from BUFSIZ which is used to copy files in APR.
> > They probably use BUFSIZ because the definition of the constant is
> > that it should return a buffer size for efficiently doing file IO.
> >
> > Probably the CRT he's using is defining BUFSIZ to 512... (Which I
> > agree, is quite small...)
>
> Thanks for the pointers.  I was wondering if it could be an APR issue.
>
> I haven't yet verified, but I assume the windows command line client is
> compiled with visual studio, which would use the standard Microsoft C
> runtime.

You are on the wrong track here. The problem is not what BUFSIZ is
defined on the different platforms but how APR uses it.
BUFSIZ is the size of the buffer that can be set for buffered streams
with the setbuf() call. Nothing more. MSDN just says: "BUFSIZ is the
required user-allocated buffer for the setvbuf routine." BUFSIZ is NOT
a suggestion for a buffer size when dealing with large binary files.
So the bug is in APR where BUFSIZ is used for something it was not designed for.
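
The distinction between a stream's buffer and the chunk size an
application copies with can be sketched as follows. This is a Python
analogue, not C stdio or APR: `io.BufferedReader`'s `buffer_size` stands
in for the setvbuf() buffer, and the `CountingRaw` class and
`count_raw_reads` function are hypothetical helpers for illustration.

```python
import io


class CountingRaw(io.RawIOBase):
    """In-memory raw stream that counts how often it is asked for data."""

    def __init__(self, data):
        super().__init__()
        self._data = io.BytesIO(data)
        self.calls = 0

    def readable(self):
        return True

    def readinto(self, b):
        self.calls += 1
        return self._data.readinto(b)


def count_raw_reads(total_bytes, stream_buffer, app_chunk):
    """Read total_bytes through a buffered stream in app_chunk pieces.

    Returns (application-level reads, raw reads underneath). The two
    counts are governed by two different knobs: app_chunk is what the
    copy loop asks for, stream_buffer is what the stream layer fetches.
    """
    raw = CountingRaw(bytes(total_bytes))
    reader = io.BufferedReader(raw, buffer_size=stream_buffer)
    app_reads = 0
    while reader.read(app_chunk):
        app_reads += 1
    return app_reads, raw.calls


if __name__ == "__main__":
    app_reads, raw_reads = count_raw_reads(1351680, 16384, 512)
    print(f"application reads: {app_reads}, raw reads: {raw_reads}")
```

A copy loop that works on unbuffered files has no such stream layer
underneath, so the chunk size it picks is the request size the
filesystem sees.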

This has already been discussed for years on this list, but nothing has changed:
http://subversion.tigris.org/servlets/ReadMsg?listName=dev&msgNo=82087
http://subversion.tigris.org/servlets/ReadMsg?listName=dev&msgNo=113549

So it is clearly an apr and not a windows issue.

Norbert

Re: numerous small 512 byte working copy reads and writes

Posted by km...@rockwellcollins.com.

"Erik Huelsmann" <eh...@gmail.com> wrote on 12/17/2007 04:59:58 AM:
> On 12/17/07, Karl Fogel <kf...@red-bean.com> wrote:
> > kmradke@rockwellcollins.com writes:
> > > For example:
> > >
> > > READ    C:\docs\.svn\tmp\text-base\svn-book.pdf.svn-base SUCCESS
> > > Offset: 0 Length: 512
[...]
> > > Offset: 1351168 Length: 512
> > >
> > >
> > > A few things I noticed:
> > >
> > > 1) A file is named .tmp.tmp and stored in a "tmp" directory.
> > >    Isn't that a little redundant?
> > >
> > > 2) It appears to be reading the text-base file 512 bytes at a time
> > >    and then writing the (same???) 512 bytes out to the temp file.
> > >    Isn't that buffer size fairly small and inefficient?
> > >    Probably not noticeable on a local disk, but is quite an
> > >    impact when on a high latency network where the 512 byte packets
> > >    are acknowledged before the next one is transmitted.
> > >
> > > I'm hoping someone familiar with the working copy code can comment on
> > > this behavior.  I'd be willing to dig into this a little deeper, provided
> > > someone that knows the code better doesn't give a valid reason for the
> > > small buffer sizes.  (I'll admit I am completely ignorant of the
> > > working copy code, but I'm always willing to learn.)
> >
> > I don't have time to trace that buffer size down, but if you do,
> > please let us know where it's happening.  I too would expect the code
> > to be using a much larger buffer size than that!
> 
> The buffer size comes from BUFSIZ which is used to copy files in APR.
> They probably use BUFSIZ because the definition of the constant is
> that it should return a buffer size for efficiently doing file IO.
> 
> Probably the CRT he's using is defining BUFSIZ to 512... (Which I
> agree, is quite small...)

Thanks for the pointers.  I was wondering if it could be an APR issue.

I haven't yet verified, but I assume the Windows command-line client is
compiled with Visual Studio, which would use the standard Microsoft C
runtime.

I'll also test the 1.4.5 version as well as TortoiseSVN to see if they
suffer from similar small-buffer issues.

Kevin R.

Re: numerous small 512 byte working copy reads and writes

Posted by Erik Huelsmann <eh...@gmail.com>.

On 12/17/07, Karl Fogel <kf...@red-bean.com> wrote:
> kmradke@rockwellcollins.com writes:
> > For example:
> >
> > READ    C:\docs\.svn\tmp\text-base\svn-book.pdf.svn-base        SUCCESS
> > Offset: 0 Length: 512
> > WRITE   C:\docs\.svn\tmp\svn-book.pdf.tmp.tmp                   SUCCESS
> > Offset: 0 Length: 512
> > READ    C:\docs\.svn\tmp\text-base\svn-book.pdf.svn-base        SUCCESS
> > Offset: 512 Length: 512
> > WRITE   C:\docs\.svn\tmp\svn-book.pdf.tmp.tmp                   SUCCESS
> > Offset: 512 Length: 512
> > READ    C:\docs\.svn\tmp\text-base\svn-book.pdf.svn-base        SUCCESS
> > Offset: 1024 Length: 512
> > WRITE   C:\docs\.svn\tmp\svn-book.pdf.tmp.tmp                   SUCCESS
> > Offset: 1024 Length: 512
> > READ    C:\docs\.svn\tmp\text-base\svn-book.pdf.svn-base        SUCCESS
> > Offset: 1536 Length: 512
> > WRITE   C:\docs\.svn\tmp\svn-book.pdf.tmp.tmp                   SUCCESS
> > Offset: 1536 Length: 512
> > READ    C:\docs\.svn\tmp\text-base\svn-book.pdf.svn-base        SUCCESS
> > Offset: 2048 Length: 512
> > WRITE   C:\docs\.svn\tmp\svn-book.pdf.tmp.tmp                   SUCCESS
> > Offset: 2048 Length: 512
> > READ    C:\docs\.svn\tmp\text-base\svn-book.pdf.svn-base        SUCCESS
> > Offset: 2560 Length: 512
> > WRITE   C:\docs\.svn\tmp\svn-book.pdf.tmp.tmp                   SUCCESS
> > Offset: 2560 Length: 512
> > READ    C:\docs\.svn\tmp\text-base\svn-book.pdf.svn-base        SUCCESS
> > Offset: 3072 Length: 512
> > WRITE   C:\docs\.svn\tmp\svn-book.pdf.tmp.tmp                   SUCCESS
> > Offset: 3072 Length: 512
> > ...
> > READ    C:\docs\.svn\tmp\text-base\svn-book.pdf.svn-base        SUCCESS
> > Offset: 1351168 Length: 512
> > WRITE   C:\docs\.svn\tmp\svn-book.pdf.tmp.tmp                   SUCCESS
> > Offset: 1351168 Length: 512
> >
> >
> > A few things I noticed:
> >
> > 1) A file is named .tmp.tmp and stored in a "tmp" directory.
> >    Isn't that a little redundant?
> >
> > 2) It appears to be reading the text-base file 512 bytes at a time
> >    and then writing the (same???) 512 bytes out to the temp file.
> >    Isn't that buffer size fairly small and inefficient?
> >    Probably not noticeable on a local disk, but is quite an
> >    impact when on a high latency network where the 512 byte packets
> >    are acknowledged before the next one is transmitted.
> >
> > I'm hoping someone familiar with the working copy code can comment on
> > this behavior.  I'd be willing to dig into this a little deeper, provided
> > someone that knows the code better doesn't give a valid reason for the
> > small buffer sizes.  (I'll admit I am completely ignorant of the
> > working copy code, but I'm always willing to learn.)
>
> I don't have time to trace that buffer size down, but if you do,
> please let us know where it's happening.  I too would expect the code
> to be using a much larger buffer size than that!

The buffer size comes from BUFSIZ which is used to copy files in APR.
They probably use BUFSIZ because the definition of the constant is
that it should return a buffer size for efficiently doing file IO.

Probably the CRT he's using is defining BUFSIZ to 512... (Which I
agree, is quite small...)

HTH,


Erik.

Re: numerous small 512 byte working copy reads and writes

Posted by Karl Fogel <kf...@red-bean.com>.

kmradke@rockwellcollins.com writes:
> For example:
>
> READ    C:\docs\.svn\tmp\text-base\svn-book.pdf.svn-base        SUCCESS 
> Offset: 0 Length: 512
> WRITE   C:\docs\.svn\tmp\svn-book.pdf.tmp.tmp                   SUCCESS 
> Offset: 0 Length: 512
> READ    C:\docs\.svn\tmp\text-base\svn-book.pdf.svn-base        SUCCESS 
> Offset: 512 Length: 512
> WRITE   C:\docs\.svn\tmp\svn-book.pdf.tmp.tmp                   SUCCESS 
> Offset: 512 Length: 512
> READ    C:\docs\.svn\tmp\text-base\svn-book.pdf.svn-base        SUCCESS 
> Offset: 1024 Length: 512
> WRITE   C:\docs\.svn\tmp\svn-book.pdf.tmp.tmp                   SUCCESS 
> Offset: 1024 Length: 512
> READ    C:\docs\.svn\tmp\text-base\svn-book.pdf.svn-base        SUCCESS 
> Offset: 1536 Length: 512
> WRITE   C:\docs\.svn\tmp\svn-book.pdf.tmp.tmp                   SUCCESS 
> Offset: 1536 Length: 512
> READ    C:\docs\.svn\tmp\text-base\svn-book.pdf.svn-base        SUCCESS 
> Offset: 2048 Length: 512
> WRITE   C:\docs\.svn\tmp\svn-book.pdf.tmp.tmp                   SUCCESS 
> Offset: 2048 Length: 512
> READ    C:\docs\.svn\tmp\text-base\svn-book.pdf.svn-base        SUCCESS 
> Offset: 2560 Length: 512
> WRITE   C:\docs\.svn\tmp\svn-book.pdf.tmp.tmp                   SUCCESS 
> Offset: 2560 Length: 512
> READ    C:\docs\.svn\tmp\text-base\svn-book.pdf.svn-base        SUCCESS 
> Offset: 3072 Length: 512
> WRITE   C:\docs\.svn\tmp\svn-book.pdf.tmp.tmp                   SUCCESS 
> Offset: 3072 Length: 512
> ...
> READ    C:\docs\.svn\tmp\text-base\svn-book.pdf.svn-base        SUCCESS 
> Offset: 1351168 Length: 512
> WRITE   C:\docs\.svn\tmp\svn-book.pdf.tmp.tmp                   SUCCESS 
> Offset: 1351168 Length: 512
>
>
> A few things I noticed:
>
> 1) A file is named .tmp.tmp and stored in a "tmp" directory.
>    Isn't that a little redundant?
>
> 2) It appears to be reading the text-base file 512 bytes at a time
>    and then writing the (same???) 512 bytes out to the temp file.
>    Isn't that buffer size fairly small and inefficient?
>    Probably not noticeable on a local disk, but is quite an
>    impact when on a high latency network where the 512 byte packets
>    are acknowledged before the next one is transmitted.
>
> I'm hoping someone familiar with the working copy code can comment on
> this behavior.  I'd be willing to dig into this a little deeper, provided
> someone that knows the code better doesn't give a valid reason for the
> small buffer sizes.  (I'll admit I am completely ignorant of the
> working copy code, but I'm always willing to learn.)

I don't have time to trace that buffer size down, but if you do,
please let us know where it's happening.  I too would expect the code
to be using a much larger buffer size than that!

-Karl