Posted to users@subversion.apache.org by ha...@hjackson.org on 2012/06/28 19:53:33 UTC

svn/fs.py and buffering

Hi all,

I just ran into an interesting "feature" using ViewVC's svndbadmin. Basically
the subprocess.Popen call in fs.py is unbuffered, and if you run strace on it
you can see thousands of system calls, each reading one character at a time.
I'm not sure whether this was a deliberate design decision, but it certainly
hurts performance for me when using this command. On a small test repo with
the viewvc database fully purged and no buffering:

time /usr/lib/viewvc/bin/svndbadmin -v update /usr/local/vault/
real    3m47.653s
user    1m6.610s
sys     2m15.869s

With buffering at 4096:
real    1m50.753s
user    0m25.862s
sys     1m1.334s

Note this is on a very small svn repo with only 36 revisions. The following
diff against

subversion/bindings/swig/python/svn/fs.py

shows the simple change needed to get the same buffering as the system
default, 4096 in my case.

117c117
<     p = _subprocess.Popen(cmd, stdout=_subprocess.PIPE,
---
>     p = _subprocess.Popen(cmd, bufsize=-1, stdout=_subprocess.PIPE,
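
To illustrate what the bufsize argument changes, here is a minimal standalone
sketch. It is not the code from fs.py: the child command is a hypothetical
stand-in for "diff", and the explicit read(1) loop is only an assumption used
to mimic the one-character reads seen in strace. In Python 2 the default is
bufsize=0 (unbuffered); a negative value asks for the system default buffer.

```python
import subprocess
import sys


def read_child_output(bufsize):
    """Run a small child process and read its stdout one byte at a time.

    With bufsize=0 the pipe is unbuffered, so each read(1) goes to the
    operating system separately. With bufsize=-1 Python reads from the
    pipe in larger blocks and serves read(1) from its own buffer.
    """
    # Hypothetical stand-in for the "diff" command; it just writes 10000 bytes.
    cmd = [sys.executable, "-c", "import sys; sys.stdout.write('x' * 10000)"]
    p = subprocess.Popen(cmd, bufsize=bufsize, stdout=subprocess.PIPE)
    chunks = []
    while True:
        c = p.stdout.read(1)
        if not c:  # empty result means end of file
            break
        chunks.append(c)
    p.stdout.close()
    p.wait()
    return b"".join(chunks)


unbuffered = read_child_output(0)   # behaviour before the change
buffered = read_child_output(-1)    # behaviour after the change
```

Both calls return identical data; only the number of system calls differs,
which can be checked by running the script under strace.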

Initially the only reason I could think of for leaving it unbuffered was
large binary files with no newline character, i.e. the risk of slurping a
large file into RAM and blowing something up. But this is running "diff", and
for binary files it only reports whether they differ, e.g. two one-meg binary
files that differ give me:

$ diff test.img test2.img
Binary files test.img and test2.img differ

So is it safe to add buffering here?

-- 
Harry

Re: svn/fs.py and buffering

Posted by Philip Martin <ph...@wandisco.com>.
harry@hjackson.org writes:

> 117c117
> <     p = _subprocess.Popen(cmd, stdout=_subprocess.PIPE,
> ---
> >     p = _subprocess.Popen(cmd, bufsize=-1, stdout=_subprocess.PIPE,
>

I've committed this as r1356668.  Thanks!

-- 
Philip