You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@subversion.apache.org by Trent Nelson <tr...@snakebite.org> on 2012/01/23 21:07:20 UTC

[PATCH] Fix memory leak in SWIG Python bindings

    Well, after a week of blood, sweat and tears, I finally tracked down the
    memory leak that's been slowly chipping away at my sanity for the past
    year.

    It's a pretty severe leak that affects the Python SWIG bindings: once
    allocated, an svn_merge_range_t object can never be freed.  This is
    due to a missing Py_DECREF call in swigutil_py.c's convert_rangelist
    method.

    The range lists can be huge, too, which makes this quite a nasty leak.
    I was seeing up to 750KB of RSS being eaten up on each invocation of
    svn_mergeinfo_parse() against non-trivial, real-world svn:mergeinfo
    data (such as that on /subversion/trunk, for example).

    Any of the mergeinfo methods that return a range list (which almost all
    do) are affected.  I haven't looked at the Perl or Ruby bindings in any
    capacity.

    Patch against trunk attached.

    Regards,

        Trent.

[[

Fix a memory leak in convert_rangelist by *always* Py_DECREF'ing the object
returned by convert_to_swigtype (a SWIG wrapper around svn_merge_range_t)
once we've established it's not NULL.

This is required because PyList_Append takes ownership of references passed
in (i.e. calls Py_INCREF).  Prior to this fix, the reference counts of any
svn_merge_range_t objects would always be off-by-one, preventing them from
ever being garbage collected, having dire effects on the memory usage of
long-running programs calling svn_mergeinfo_parse() on real-world data.

Patch by: Trent Nelson <tr...@snakebite.org>
Tested on: FreeBSD, OS X, Windows.

* subversion/bindings/swig/python/libsvn_swig_py/swigutil_py.c:
  (convert_rangelist): Make sure we call Py_DECREF on the object returned
   from convert_to_swigtype.  PyList_New might return NULL; check for that.
   Add 'apr_array_header_t *' and 'swig_type_info *' casts where necessary
   to suppress compiler warnings.

* subversion/bindings/swig/python/tests/mergeinfo.py:
  (get_svn_merge_range_t_objects): New helper method that returns a list of
   any 'svn_merge_range_t' objects being tracked by the garbage collector,
   which we use to detect memory leaks.
  (SubversionMergeinfoTestCase): Add two new tests to aid in detecting memory
   leaks.  They essentially test the same thing in two different ways; if one
   fails, the other *should* fail.  (Of course, if one fails and the other does
   not -- that's just as valuable, diagnostically, and points to an issue with
   the 'automatic pool memory management' logic.)
    (test_mergeinfo_leakage__incorrect_range_t_refcounts): New test.  Verify
     svn_merge_range_t objects returned from svn_mergeinfo_parse() have the
     correct refcounts set.
    (test_mergeinfo_leakage__lingering_range_t_objects_after_del): New test.
     Verify that there are no lingering svn_merge_range_t objects after we
     explicitly delete the results returned from svn_mergeinfo_parse().

]]

[[
Index: subversion/bindings/swig/python/libsvn_swig_py/swigutil_py.c
===================================================================
--- subversion/bindings/swig/python/libsvn_swig_py/swigutil_py.c	(revision 1234786)
+++ subversion/bindings/swig/python/libsvn_swig_py/swigutil_py.c	(working copy)
@@ -653,15 +653,27 @@
 static PyObject *convert_rangelist(void *value, void *ctx, PyObject *py_pool)
 {
   int i;
+  int result;
+  PyObject *obj;
   PyObject *list;
-  apr_array_header_t *array = value;
+  apr_array_header_t *array = (apr_array_header_t *)value;
 
   list = PyList_New(0);
+  if (list == NULL)
+    return NULL;
+
   for (i = 0; i < array->nelts; i++)
     {
       svn_merge_range_t *range = APR_ARRAY_IDX(array, i, svn_merge_range_t *);
-      if (PyList_Append(list, convert_to_swigtype(range, ctx, py_pool)) == -1)
+
+      obj = convert_to_swigtype(range, (swig_type_info *)ctx, py_pool);
+      if (obj == NULL)
         goto error;
+
+      result = PyList_Append(list, obj);
+      Py_DECREF(obj);
+      if (result == -1)
+        goto error;
     }
   return list;
  error:
Index: subversion/bindings/swig/python/tests/mergeinfo.py
===================================================================
--- subversion/bindings/swig/python/tests/mergeinfo.py	(revision 1234786)
+++ subversion/bindings/swig/python/tests/mergeinfo.py	(working copy)
@@ -18,7 +18,7 @@
 # under the License.
 #
 #
-import unittest, os
+import unittest, os, sys, gc, itertools
 from svn import core, repos, fs
 import utils
 
@@ -29,6 +29,15 @@
     self.start = start
     self.end = end
 
+def get_svn_merge_range_t_objects():
+  """Returns a list 'svn_merge_range_t' objects being tracked by the
+     garbage collector, used for detecting memory leaks."""
+  return [
+    o for o in gc.get_objects()
+      if hasattr(o, '__class__') and
+        o.__class__.__name__ == 'svn_merge_range_t'
+  ]
+
 class SubversionMergeinfoTestCase(unittest.TestCase):
   """Test cases for mergeinfo"""
 
@@ -116,6 +125,54 @@
       }
     self.compare_mergeinfo_catalogs(mergeinfo, expected_mergeinfo)
 
+  def test_mergeinfo_leakage__incorrect_range_t_refcounts(self):
+    """Ensure that the ref counts on svn_merge_range_t objects returned by
+       svn_mergeinfo_parse() are correct."""
+    # When reference counting is working properly, each svn_merge_range_t in
+    # the returned mergeinfo will have a ref count of 1...
+    mergeinfo = core.svn_mergeinfo_parse(self.TEXT_MERGEINFO1)
+    for (path, rangelist) in mergeinfo.items():
+      # ....and now 2 (incref during iteration of rangelist)
+
+      for (r, i) in zip(rangelist, itertools.count()):
+        # ....and now 3 (incref during iteration of each range object)
+
+        refcount = sys.getrefcount(r)
+        # ....and finally, 4 (getrefcount() also increfs)
+        expected = 4
+
+        # Note: if path and index are not '/trunk' and 0 respectively, then
+        # only some of the range objects are leaking, which is, as far as
+        # leaks go, even more impressive.
+        self.assertEquals(refcount, expected, (
+          "Memory leak!  Expected a ref count of %d for svn_merge_range_t "
+          "object, but got %d instead (path: %s, index: %d).  Probable "
+          "cause: incorrect Py_INCREF/Py_DECREF usage in libsvn_swig_py/"
+          "swigutil_py.c." % (expected, refcount, path, i)))
+
+    del mergeinfo
+    gc.collect()
+
+  def test_mergeinfo_leakage__lingering_range_t_objects_after_del(self):
+    """Ensure that there are no svn_merge_range_t objects being tracked by
+       the garbage collector after we explicitly `del` the results returned
+       by svn_mergeinfo_parse().  We disable gc before the svn_mergeinfo_parse
+       call and force an explicit collection cycle straight after the `del`;
+       if our reference counts are correct, the allocated svn_merge_range_t
+       objects will be garbage collected and thus, not appear in the list of
+       objects returned by gc.get_objects()."""
+    gc.disable()
+    mergeinfo = core.svn_mergeinfo_parse(self.TEXT_MERGEINFO1)
+    del mergeinfo
+    gc.collect()
+    lingering = get_svn_merge_range_t_objects()
+    self.assertEquals(lingering, list(), (
+      "Memory leak!  Found lingering svn_merge_range_t objects left over from "
+      "our call to svn_mergeinfo_parse(), even though we explicitly deleted "
+      "the returned mergeinfo object.  Probable cause: incorrect Py_INCREF/"
+      "Py_DECREF usage in libsvn_swig_py/swigutil_py.c.  Lingering objects:\n"
+      "%s" % lingering))
+
   def inspect_mergeinfo_dict(self, mergeinfo, merge_source, nbr_rev_ranges):
     rangelist = mergeinfo.get(merge_source)
     self.inspect_rangelist_tuple(rangelist, nbr_rev_ranges)
]]

Re: [PATCH] Fix memory leak in SWIG Python bindings

Posted by Daniel Shahaf <da...@elego.de>.
Trent Nelson wrote on Tue, Jan 24, 2012 at 10:17:24 -0500:
> On Tue, Jan 24, 2012 at 03:44:04AM -0800, Daniel Shahaf wrote:
> > Trent Nelson wrote on Mon, Jan 23, 2012 at 15:07:20 -0500:
> > > Index: subversion/bindings/swig/python/tests/mergeinfo.py
> > > ===================================================================
> > > --- subversion/bindings/swig/python/tests/mergeinfo.py	(revision 1234786)
> > > +++ subversion/bindings/swig/python/tests/mergeinfo.py	(working copy)
> > > +  def test_mergeinfo_leakage__lingering_range_t_objects_after_del(self):
> > > +    """Ensure that there are no svn_merge_range_t objects being tracked by
> > > +       the garbage collector after we explicitly `del` the results returned
> > > +       by svn_mergeinfo_parse().  We disable gc before the svn_mergeinfo_parse
> > > +       call and force an explicit collection cycle straight after the `del`;
> > > +       if our reference counts are correct, the allocated svn_merge_range_t
> > > +       objects will be garbage collected and thus, not appear in the list of
> > > +       objects returned by gc.get_objects()."""
> > > +    gc.disable()
> > > +    mergeinfo = core.svn_mergeinfo_parse(self.TEXT_MERGEINFO1)
> > > +    del mergeinfo
> > > +    gc.collect()
> >
> > It seems you need a gc.enable() call as well, to prevent the test from
> > having effects on global state.
> 
>     You know what, you can get rid of that gc.disable() call.  I put it
>     in by habit; when I was trying to find the leak, I had breakpoints
>     set within the CPython garbage collection routines -- if I didn't
>     disable automatic collection, they'd trigger at extremely annoying
>     times ;-)
> 
>     The gc.collect() call is sufficient.
> 

Okay.  Given that, and that it seems a good idea to add a regression
test for a bug that can cost many MB of memory if it repeats, I've
committed the testsuite part too: r1235296, and will nominate for
backport.

> > Do the swig-py tests support parallelized running?  The testsuite for
> > 'svn' does, and if the testsuite for swig-py does too then the 'gc' call
> > have effects on other threads.
> 
>     I have... no idea :-)  Given that check-swig-py basically just calls
>     `python subversion/bindings/swig/python/tests/run_all.py', and that
>     file just uses the standard Python unit test facilities... I'd say
>     no, they're not currently designed to support parallel running.
> 
>     I'm sure we could totally shave that 40s test run time down a few
>     notches if they, though ;-)
> 

I think the ROI will be better on optimizing the test suite of 'svn'
itself (build/run_tests.py and subversion/tests/cmdline/) :-)

>         Trent.
> 

Thanks,

Daniel

Re: [PATCH] Fix memory leak in SWIG Python bindings

Posted by Trent Nelson <tr...@snakebite.org>.
On Tue, Jan 24, 2012 at 03:44:04AM -0800, Daniel Shahaf wrote:
> Trent Nelson wrote on Mon, Jan 23, 2012 at 15:07:20 -0500:
> > Index: subversion/bindings/swig/python/tests/mergeinfo.py
> > ===================================================================
> > --- subversion/bindings/swig/python/tests/mergeinfo.py	(revision 1234786)
> > +++ subversion/bindings/swig/python/tests/mergeinfo.py	(working copy)
> > +  def test_mergeinfo_leakage__lingering_range_t_objects_after_del(self):
> > +    """Ensure that there are no svn_merge_range_t objects being tracked by
> > +       the garbage collector after we explicitly `del` the results returned
> > +       by svn_mergeinfo_parse().  We disable gc before the svn_mergeinfo_parse
> > +       call and force an explicit collection cycle straight after the `del`;
> > +       if our reference counts are correct, the allocated svn_merge_range_t
> > +       objects will be garbage collected and thus, not appear in the list of
> > +       objects returned by gc.get_objects()."""
> > +    gc.disable()
> > +    mergeinfo = core.svn_mergeinfo_parse(self.TEXT_MERGEINFO1)
> > +    del mergeinfo
> > +    gc.collect()
>
> It seems you need a gc.enable() call as well, to prevent the test from
> having effects on global state.

    You know what, you can get rid of that gc.disable() call.  I put it
    in by habit; when I was trying to find the leak, I had breakpoints
    set within the CPython garbage collection routines -- if I didn't
    disable automatic collection, they'd trigger at extremely annoying
    times ;-)

    The gc.collect() call is sufficient.

> Do the swig-py tests support parallelized running?  The testsuite for
> 'svn' does, and if the testsuite for swig-py does too then the 'gc' call
> have effects on other threads.

    I have... no idea :-)  Given that check-swig-py basically just calls
    `python subversion/bindings/swig/python/tests/run_all.py', and that
    file just uses the standard Python unit test facilities... I'd say
    no, they're not currently designed to support parallel running.

    I'm sure we could totally shave that 40s test run time down a few
    notches if they, though ;-)

        Trent.


Re: [PATCH] Fix memory leak in SWIG Python bindings

Posted by Daniel Shahaf <da...@elego.de>.
A few comments on the testsuite part of the patch.

First, I intend to test the patch using the attached pastebin script by
Trent, in addition to 'make check'.

Second:

Trent Nelson wrote on Mon, Jan 23, 2012 at 15:07:20 -0500:
> Index: subversion/bindings/swig/python/tests/mergeinfo.py
> ===================================================================
> --- subversion/bindings/swig/python/tests/mergeinfo.py	(revision 1234786)
> +++ subversion/bindings/swig/python/tests/mergeinfo.py	(working copy)
> +  def test_mergeinfo_leakage__lingering_range_t_objects_after_del(self):
> +    """Ensure that there are no svn_merge_range_t objects being tracked by
> +       the garbage collector after we explicitly `del` the results returned
> +       by svn_mergeinfo_parse().  We disable gc before the svn_mergeinfo_parse
> +       call and force an explicit collection cycle straight after the `del`;
> +       if our reference counts are correct, the allocated svn_merge_range_t
> +       objects will be garbage collected and thus, not appear in the list of
> +       objects returned by gc.get_objects()."""
> +    gc.disable()
> +    mergeinfo = core.svn_mergeinfo_parse(self.TEXT_MERGEINFO1)
> +    del mergeinfo
> +    gc.collect()

It seems you need a gc.enable() call as well, to prevent the test from
having effects on global state.

Do the swig-py tests support parallelized running?  The testsuite for
'svn' does, and if the testsuite for swig-py does too then the 'gc' call
have effects on other threads.

> +    lingering = get_svn_merge_range_t_objects()
> +    self.assertEquals(lingering, list(), (
> +      "Memory leak!  Found lingering svn_merge_range_t objects left over from "
> +      "our call to svn_mergeinfo_parse(), even though we explicitly deleted "
> +      "the returned mergeinfo object.  Probable cause: incorrect Py_INCREF/"
> +      "Py_DECREF usage in libsvn_swig_py/swigutil_py.c.  Lingering objects:\n"
> +      "%s" % lingering))

Re: [PATCH] Fix memory leak in SWIG Python bindings

Posted by Hyrum K Wright <hy...@wandisco.com>.
On Tue, Jan 24, 2012 at 5:58 AM, Trent Nelson <tr...@snakebite.org> wrote:
>
> On Jan 24, 2012, at 12:11 AM, Daniel Shahaf wrote:
>
>> Trent Nelson wrote on Mon, Jan 23, 2012 at 15:07:20 -0500:
>>> Fix a memory leak in convert_rangelist by *always* Py_DECREF'ing the object
>>> returned by convert_to_swigtype (a SWIG wrapper around svn_merge_range_t)
>>> once we've established it's not NULL.
>>>
>>> This is required because PyList_Append takes ownership of references passed
>>> in (i.e. calls Py_INCREF).  Prior to this fix, the reference counts of any
>>
>> Is svn_swig_py_c_strings_to_list() affected too?  It appends a string to
>> a list but doesn't Py_DECREF() the string.
>>
>
> Yeah, I saw that, but scheduled it for another day, as I didn't want to clutter my main patch.  Also, nothing calls that method -- which also factored into my decision to ignore it for this patch.
>
>>> svn_merge_range_t objects would always be off-by-one, preventing them from
>>> ever being garbage collected, having dire effects on the memory usage of
>>> long-running programs calling svn_mergeinfo_parse() on real-world data.
>>>
>>> Patch by: Trent Nelson <tr...@snakebite.org>
>>> Tested on: FreeBSD, OS X, Windows.
>>>
>>> * subversion/bindings/swig/python/libsvn_swig_py/swigutil_py.c:
>>>  (convert_rangelist): Make sure we call Py_DECREF on the object returned
>>>   from convert_to_swigtype.  PyList_New might return NULL; check for that.
>>>   Add 'apr_array_header_t *' and 'swig_type_info *' casts where necessary
>>>   to suppress compiler warnings.
>>
>> Those casts shouldn't be needed -- casts from 'void *' to other
>> pointer types are valid C.  What warnings were you seeing?
>
> I'll double-check in the morning.  Side bar: the bindings generate loads of warnings when compiled with the target platform's equivalent of -Wall.

Yes, this is partly due to the bindings themselves, and partly due to
SWIG.  Annoying either way.

> I found a bunch of other things whilst traipsing through the SWIG/Python stuff that I'll send follow up patches for this week.  Nothing as bad as the main memory leak, though.

Great!  Keep 'em coming!

Though I don't feel qualified to effectively review the patches
(though I suppose I could give it a rough go in a pinch), I am glad
somebody is looking at these problems, and you're making progress.
Thanks for the effort.

-Hyrum


-- 

uberSVN: Apache Subversion Made Easy
http://www.uberSVN.com/

Re: [PATCH] Fix memory leak in SWIG Python bindings

Posted by Trent Nelson <tr...@snakebite.org>.
On Jan 24, 2012, at 12:11 AM, Daniel Shahaf wrote:

> Trent Nelson wrote on Mon, Jan 23, 2012 at 15:07:20 -0500:
>> Fix a memory leak in convert_rangelist by *always* Py_DECREF'ing the object
>> returned by convert_to_swigtype (a SWIG wrapper around svn_merge_range_t)
>> once we've established it's not NULL.
>> 
>> This is required because PyList_Append takes ownership of references passed
>> in (i.e. calls Py_INCREF).  Prior to this fix, the reference counts of any
> 
> Is svn_swig_py_c_strings_to_list() affected too?  It appends a string to
> a list but doesn't Py_DECREF() the string.
> 

Yeah, I saw that, but scheduled it for another day, as I didn't want to clutter my main patch.  Also, nothing calls that method -- which also factored into my decision to ignore it for this patch. 

>> svn_merge_range_t objects would always be off-by-one, preventing them from
>> ever being garbage collected, having dire effects on the memory usage of
>> long-running programs calling svn_mergeinfo_parse() on real-world data.
>> 
>> Patch by: Trent Nelson <tr...@snakebite.org>
>> Tested on: FreeBSD, OS X, Windows.
>> 
>> * subversion/bindings/swig/python/libsvn_swig_py/swigutil_py.c:
>>  (convert_rangelist): Make sure we call Py_DECREF on the object returned
>>   from convert_to_swigtype.  PyList_New might return NULL; check for that.
>>   Add 'apr_array_header_t *' and 'swig_type_info *' casts where necessary
>>   to suppress compiler warnings.
> 
> Those casts shouldn't be needed -- casts from 'void *' to other
> pointer types are valid C.  What warnings were you seeing?

I'll double-check in the morning.  Side bar: the bindings generate loads of warnings when compiled with the target platform's equivalent of -Wall.

I found a bunch of other things whilst traipsing through the SWIG/Python stuff that I'll send follow up patches for this week.  Nothing as bad as the main memory leak, though.


> 
>> Index: subversion/bindings/swig/python/libsvn_swig_py/swigutil_py.c
>> ===================================================================
>> --- subversion/bindings/swig/python/libsvn_swig_py/swigutil_py.c	(revision 1234786)
>> +++ subversion/bindings/swig/python/libsvn_swig_py/swigutil_py.c	(working copy)
>> @@ -653,15 +653,27 @@
>> static PyObject *convert_rangelist(void *value, void *ctx, PyObject *py_pool)
>> {
>>   int i;
>> +  int result;
>> +  PyObject *obj;
>>   PyObject *list;
>> -  apr_array_header_t *array = value;
>> +  apr_array_header_t *array = (apr_array_header_t *)value;
>> 
>>   list = PyList_New(0);
>> +  if (list == NULL)
>> +    return NULL;
>> +
>>   for (i = 0; i < array->nelts; i++)
>>     {
>>       svn_merge_range_t *range = APR_ARRAY_IDX(array, i, svn_merge_range_t *);
>> -      if (PyList_Append(list, convert_to_swigtype(range, ctx, py_pool)) == -1)
>> +
>> +      obj = convert_to_swigtype(range, (swig_type_info *)ctx, py_pool);
>> +      if (obj == NULL)
>>         goto error;
>> +
>> +      result = PyList_Append(list, obj);
>> +      Py_DECREF(obj);
>> +      if (result == -1)
>> +        goto error;
>>     }
>>   return list;
>>  error:


Re: [PATCH] Fix memory leak in SWIG Python bindings

Posted by Daniel Shahaf <da...@elego.de>.
Trent Nelson wrote on Mon, Jan 23, 2012 at 15:07:20 -0500:
> Fix a memory leak in convert_rangelist by *always* Py_DECREF'ing the object
> returned by convert_to_swigtype (a SWIG wrapper around svn_merge_range_t)
> once we've established it's not NULL.
> 
> This is required because PyList_Append takes ownership of references passed
> in (i.e. calls Py_INCREF).  Prior to this fix, the reference counts of any

Is svn_swig_py_c_strings_to_list() affected too?  It appends a string to
a list but doesn't Py_DECREF() the string.

> svn_merge_range_t objects would always be off-by-one, preventing them from
> ever being garbage collected, having dire effects on the memory usage of
> long-running programs calling svn_mergeinfo_parse() on real-world data.
> 
> Patch by: Trent Nelson <tr...@snakebite.org>
> Tested on: FreeBSD, OS X, Windows.
> 
> * subversion/bindings/swig/python/libsvn_swig_py/swigutil_py.c:
>   (convert_rangelist): Make sure we call Py_DECREF on the object returned
>    from convert_to_swigtype.  PyList_New might return NULL; check for that.
>    Add 'apr_array_header_t *' and 'swig_type_info *' casts where necessary
>    to suppress compiler warnings.

Those casts shouldn't be needed -- casts from 'void *' to other
pointer types are valid C.  What warnings were you seeing?

> Index: subversion/bindings/swig/python/libsvn_swig_py/swigutil_py.c
> ===================================================================
> --- subversion/bindings/swig/python/libsvn_swig_py/swigutil_py.c	(revision 1234786)
> +++ subversion/bindings/swig/python/libsvn_swig_py/swigutil_py.c	(working copy)
> @@ -653,15 +653,27 @@
>  static PyObject *convert_rangelist(void *value, void *ctx, PyObject *py_pool)
>  {
>    int i;
> +  int result;
> +  PyObject *obj;
>    PyObject *list;
> -  apr_array_header_t *array = value;
> +  apr_array_header_t *array = (apr_array_header_t *)value;
>  
>    list = PyList_New(0);
> +  if (list == NULL)
> +    return NULL;
> +
>    for (i = 0; i < array->nelts; i++)
>      {
>        svn_merge_range_t *range = APR_ARRAY_IDX(array, i, svn_merge_range_t *);
> -      if (PyList_Append(list, convert_to_swigtype(range, ctx, py_pool)) == -1)
> +
> +      obj = convert_to_swigtype(range, (swig_type_info *)ctx, py_pool);
> +      if (obj == NULL)
>          goto error;
> +
> +      result = PyList_Append(list, obj);
> +      Py_DECREF(obj);
> +      if (result == -1)
> +        goto error;
>      }
>    return list;
>   error:

Re: [PATCH] Fix memory leak in SWIG Python bindings

Posted by Daniel Shahaf <da...@elego.de>.
Daniel Shahaf wrote on Tue, Jan 24, 2012 at 13:36:07 +0200:
> Trent Nelson wrote on Mon, Jan 23, 2012 at 15:07:20 -0500:
> > Fix a memory leak in convert_rangelist by *always* Py_DECREF'ing the object
> > returned by convert_to_swigtype (a SWIG wrapper around svn_merge_range_t)
> > once we've established it's not NULL.
> > 
> > This is required because PyList_Append takes ownership of references passed
> > in (i.e. calls Py_INCREF).  Prior to this fix, the reference counts of any
> 
> Nice log message.
> 
> > svn_merge_range_t objects would always be off-by-one, preventing them from
> > ever being garbage collected, having dire effects on the memory usage of
> > long-running programs calling svn_mergeinfo_parse() on real-world data.
> > 
> > Index: subversion/bindings/swig/python/libsvn_swig_py/swigutil_py.c
> > ===================================================================
> > --- subversion/bindings/swig/python/libsvn_swig_py/swigutil_py.c	(revision 1234786)
> > +++ subversion/bindings/swig/python/libsvn_swig_py/swigutil_py.c	(working copy)
> > @@ -653,15 +653,27 @@
> >  static PyObject *convert_rangelist(void *value, void *ctx, PyObject *py_pool)
> >  {
> >    int i;
> > +  int result;
> > +  PyObject *obj;
> >    PyObject *list;
> > -  apr_array_header_t *array = value;
> > +  apr_array_header_t *array = (apr_array_header_t *)value;
> >  
> 
> As before: this cast shouldn't need to be explicit.  Also, I've moved
> the two new declarations to a more inner scope.
> 
> >    list = PyList_New(0);
> > +  if (list == NULL)
> > +    return NULL;
> > +
> >    for (i = 0; i < array->nelts; i++)
> >      {
> >        svn_merge_range_t *range = APR_ARRAY_IDX(array, i, svn_merge_range_t *);
> > -      if (PyList_Append(list, convert_to_swigtype(range, ctx, py_pool)) == -1)
> > +
> > +      obj = convert_to_swigtype(range, (swig_type_info *)ctx, py_pool);
> > +      if (obj == NULL)
> >          goto error;
> > +
> > +      result = PyList_Append(list, obj);
> > +      Py_DECREF(obj);
> > +      if (result == -1)
> > +        goto error;
> >      }
> >    return list;
> >   error:
> 
> I then ended up with the following patch:
> 
> +      if (result == -1);
>          goto error;

My version of the patch contained a bug.

When fixed, the memory usage in the pastebin test remained constant, so
I went ahead and applied the C part of the patch in r1235264.

I'll look at the py part next.

Thanks,

Daniel

Re: [PATCH] Fix memory leak in SWIG Python bindings

Posted by Daniel Shahaf <da...@elego.de>.
Trent Nelson wrote on Mon, Jan 23, 2012 at 15:07:20 -0500:
> Fix a memory leak in convert_rangelist by *always* Py_DECREF'ing the object
> returned by convert_to_swigtype (a SWIG wrapper around svn_merge_range_t)
> once we've established it's not NULL.
> 
> This is required because PyList_Append takes ownership of references passed
> in (i.e. calls Py_INCREF).  Prior to this fix, the reference counts of any

Nice log message.

> svn_merge_range_t objects would always be off-by-one, preventing them from
> ever being garbage collected, having dire effects on the memory usage of
> long-running programs calling svn_mergeinfo_parse() on real-world data.
> 
> Index: subversion/bindings/swig/python/libsvn_swig_py/swigutil_py.c
> ===================================================================
> --- subversion/bindings/swig/python/libsvn_swig_py/swigutil_py.c	(revision 1234786)
> +++ subversion/bindings/swig/python/libsvn_swig_py/swigutil_py.c	(working copy)
> @@ -653,15 +653,27 @@
>  static PyObject *convert_rangelist(void *value, void *ctx, PyObject *py_pool)
>  {
>    int i;
> +  int result;
> +  PyObject *obj;
>    PyObject *list;
> -  apr_array_header_t *array = value;
> +  apr_array_header_t *array = (apr_array_header_t *)value;
>  

As before: this cast shouldn't need to be explicit.  Also, I've moved
the two new declarations to a more inner scope.

>    list = PyList_New(0);
> +  if (list == NULL)
> +    return NULL;
> +
>    for (i = 0; i < array->nelts; i++)
>      {
>        svn_merge_range_t *range = APR_ARRAY_IDX(array, i, svn_merge_range_t *);
> -      if (PyList_Append(list, convert_to_swigtype(range, ctx, py_pool)) == -1)
> +
> +      obj = convert_to_swigtype(range, (swig_type_info *)ctx, py_pool);
> +      if (obj == NULL)
>          goto error;
> +
> +      result = PyList_Append(list, obj);
> +      Py_DECREF(obj);
> +      if (result == -1)
> +        goto error;
>      }
>    return list;
>   error:

I then ended up with the following patch:

[[[
Index: subversion/bindings/swig/python/libsvn_swig_py/swigutil_py.c
===================================================================
--- subversion/bindings/swig/python/libsvn_swig_py/swigutil_py.c	(revision 1235202)
+++ subversion/bindings/swig/python/libsvn_swig_py/swigutil_py.c	(working copy)
@@ -657,10 +657,22 @@ static PyObject *convert_rangelist(void *value, vo
   apr_array_header_t *array = value;
 
   list = PyList_New(0);
+  if (list == NULL)
+    return NULL;
+
   for (i = 0; i < array->nelts; i++)
     {
       svn_merge_range_t *range = APR_ARRAY_IDX(array, i, svn_merge_range_t *);
-      if (PyList_Append(list, convert_to_swigtype(range, ctx, py_pool)) == -1)
+      PyObject *obj;
+      int result;
+
+      obj = convert_to_swigtype(range, ctx, py_pool);
+      if (obj == NULL)
+        goto error;
+
+      result = PyList_Append(list, obj);
+      Py_DECREF(obj);
+      if (result == -1);
         goto error;
     }
   return list;
]]]

However, with that patch, 'make check-swig-py' fails:

[[[
======================================================================
ERROR: test_mergeinfo_get (mergeinfo.SubversionMergeinfoTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/daniel/src/svn/t1/subversion/bindings/swig/python/tests/mergeinfo.py", line 109, in test_mergeinfo_get
    False, None, None)
  File "/home/daniel/src/svn/t1/subversion/bindings/swig/python/libsvn/repos.py", line 650, in svn_repos_fs_get_mergeinfo
    return _repos.svn_repos_fs_get_mergeinfo(*args)
SystemError: error return without exception set

======================================================================
ERROR: Test svn_mergeinfo_merge()
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/daniel/src/svn/t1/subversion/bindings/swig/python/tests/mergeinfo.py", line 75, in test_mergeinfo_merge
    mergeinfo1 = core.svn_mergeinfo_parse(self.TEXT_MERGEINFO1)
  File "/home/daniel/src/svn/t1/subversion/bindings/swig/python/libsvn/core.py", line 4550, in svn_mergeinfo_parse
    return _core.svn_mergeinfo_parse(*args)
SystemError: error return without exception set

======================================================================
ERROR: Test svn_mergeinfo_parse()
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/daniel/src/svn/t1/subversion/bindings/swig/python/tests/mergeinfo.py", line 61, in test_mergeinfo_parse
    mergeinfo = core.svn_mergeinfo_parse(self.TEXT_MERGEINFO1)
  File "/home/daniel/src/svn/t1/subversion/bindings/swig/python/libsvn/core.py", line 4550, in svn_mergeinfo_parse
    return _core.svn_mergeinfo_parse(*args)
SystemError: error return without exception set

======================================================================
ERROR: test_mergeinfo_sort (mergeinfo.SubversionMergeinfoTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/daniel/src/svn/t1/subversion/bindings/swig/python/tests/mergeinfo.py", line 93, in test_mergeinfo_sort
    mergeinfo = core.svn_mergeinfo_parse(self.TEXT_MERGEINFO1)
  File "/home/daniel/src/svn/t1/subversion/bindings/swig/python/libsvn/core.py", line 4550, in svn_mergeinfo_parse
    return _core.svn_mergeinfo_parse(*args)
SystemError: error return without exception set

======================================================================
ERROR: test_rangelist_merge (mergeinfo.SubversionMergeinfoTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/daniel/src/svn/t1/subversion/bindings/swig/python/tests/mergeinfo.py", line 66, in test_rangelist_merge
    mergeinfo1 = core.svn_mergeinfo_parse(self.TEXT_MERGEINFO1)
  File "/home/daniel/src/svn/t1/subversion/bindings/swig/python/libsvn/core.py", line 4550, in svn_mergeinfo_parse
    return _core.svn_mergeinfo_parse(*args)
SystemError: error return without exception set

======================================================================
ERROR: test_rangelist_reverse (mergeinfo.SubversionMergeinfoTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/daniel/src/svn/t1/subversion/bindings/swig/python/tests/mergeinfo.py", line 82, in test_rangelist_reverse
    mergeinfo = core.svn_mergeinfo_parse(self.TEXT_MERGEINFO1)
  File "/home/daniel/src/svn/t1/subversion/bindings/swig/python/libsvn/core.py", line 4550, in svn_mergeinfo_parse
    return _core.svn_mergeinfo_parse(*args)
SystemError: error return without exception set

----------------------------------------------------------------------
]]]

and the failure disappears once I revert the patch.

Could you look into that, please?  I suspect the C code is buggy, but
I'm not sure whether it's due to your patch or to my mangling of it.

Thanks,

Daniel