Posted to dev@subversion.apache.org by Ivan Zhakov <ch...@gmail.com> on 2006/10/10 15:54:49 UTC

Subversion 1.4.0 crashes in libapr.dll on Windows

We have several user reports of crashes in libapr.dll on Windows:
1. Crash on svn export and svn checkout.
Mail: http://svn.haxx.se/dev/archive-2006-09/0738.shtml
Minidump can be found here: http://mysite.verizon.net/jodys7/svn.dmp
2. Another crash on svn export, reported to me directly.
Minidump can be found here: http://chemodax.googlepages.com/svn.dmp
3. Crash in the same place when executing svn merge.
Mail: http://subversion.tigris.org/servlets/ReadMsg?listName=dev&msgNo=120663
I've asked for a minidump to look at, but we already have a stack trace.

All three crashes occur in the same place:
        libapr.dll!find_entry
        libapr.dll!apr_hash_get
        svn.exe!get_xlate_handle_node
        svn.exe!svn_cmdline_cstring_from_utf8
        svn.exe!svn_cmdline_printf

At least cases (1) and (2) can be worked around by renaming
TortoiseSVN's libapr_tsvn.dll to libapr.dll. So my initial idea was
some kind of DLL hell, but uninstalling TortoiseSVN doesn't help!

I looked at the source code of get_xlate_handle_node
(libsvn_subr/utf.c):
  if (userdata_key)
    {
      if (xlate_handle_hash)
        {
#if APR_HAS_THREADS
          apr_err = apr_thread_mutex_lock(xlate_handle_mutex);
          if (apr_err != APR_SUCCESS)
            return svn_error_create(apr_err, NULL,
                                    _("Can't lock charset translation mutex"));
#endif
          old_node_p = apr_hash_get(xlate_handle_hash, userdata_key,
                                    APR_HASH_KEY_STRING);
          if (old_node_p)
            old_node = *old_node_p;


Notice that one minidump has xlate_handle_hash == 0, but another has
xlate_handle_hash == 0x00af00d0. How is that possible if we check it
a few lines earlier? Maybe it's a problem with the minidump format.
I'm continuing to investigate this problem and would be glad of any
suggestions/pointers.

-- 
Ivan Zhakov

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: Subversion 1.4.0 crashes in libapr.dll on Windows

Posted by Malcolm Rowe <ma...@farside.org.uk>.
On Thu, Oct 26, 2006 at 06:24:51PM -0700, Kenneth Porter wrote:
> Could this be related to use of the MS gflags utility on Windows to force 
> page frees to happen immediately?

I'm not familiar with the utility, but it seems unlikely - Subversion
uses APR pools, and so doesn't actually return any memory to the OS
until it exits.

Regards,
Malcolm

Re: Subversion 1.4.0 crashes in libapr.dll on Windows

Posted by Kenneth Porter <sh...@sewingwitch.com>.
--On Thursday, October 26, 2006 5:30 PM -0700 Kenneth Porter 
<sh...@sewingwitch.com> wrote:

> I also chased the next pointer and looked at its key, and it has the same
> issue. So the block holding the keys for this table seems to have been
> released from VM.

Could this be related to use of the MS gflags utility on Windows to force 
page frees to happen immediately? I'd used this utility on some other apps 
to detect late access to released memory and I thought it marked what apps 
it would apply to. I don't think it automatically affects every app used on 
the system. It's possible this bug occurs on other systems but they don't 
release the freed pages right away, masking an invalid read to released 
memory.


Re: Subversion 1.4.0 crashes in libapr.dll on Windows

Posted by Kenneth Porter <sh...@sewingwitch.com>.
--On Friday, October 27, 2006 10:04 AM -0400 Garrett Rooney 
<ro...@electricjellyfish.net> wrote:

> Could you possibly get a backtrace?  The fact that the error occurs in
> the apr hash code is interesting, but in order for that information to
> be really useful we'd need to know which hash it's talking about, what
> is going on when it gets to this point, etc.

I just shut down the debugger so I could get work done. ;)

I had imported a large directory tree and the "committed" message was about 
to be printed. This is happening in the code that gets a translation handle 
to convert from UTF8 to CP1252 to print the message to the console. The key 
pointer in the hash entry was pointing to unallocated memory after the hash 
table. From what I could discover statically, the key is allocated from the 
same pool using the APR strcat function. I wasn't able to discover where it 
would get freed and thought perhaps it was garbage-collected with the pool 
destruction.
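
To make the lifetime question concrete, here is a minimal, self-contained
sketch of pool-style allocation. It is not APR code: toy_pool, toy_palloc,
toy_pstrcat, toy_pool_owns and toy_pool_destroy are hypothetical names
invented for this illustration, and the toy pool is a single fixed block
that never grows. What it tries to show is that a string built in a pool is
never freed on its own; it goes away only when the whole pool does, so
anything still pointing at it afterwards would be left dangling.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical toy pool: one fixed block, bump allocation, no per-object free. */
typedef struct toy_pool {
    char  *block;   /* single backing allocation */
    size_t size;    /* capacity in bytes */
    size_t used;    /* bytes handed out so far */
} toy_pool;

/* Returns nonzero on success. */
int toy_pool_create(toy_pool *p, size_t size)
{
    p->block = malloc(size);
    p->size = p->block ? size : 0;
    p->used = 0;
    return p->block != NULL;
}

/* Hand out n bytes (rounded up to a multiple of 8); NULL when the block is full. */
void *toy_palloc(toy_pool *p, size_t n)
{
    size_t rounded = (n + 7u) & ~(size_t)7u;
    void *mem;

    if (rounded < n || rounded > p->size - p->used)
        return NULL;
    mem = p->block + p->used;
    p->used += rounded;
    return mem;
}

/* Concatenate two strings into memory owned by the pool. */
char *toy_pstrcat(toy_pool *p, const char *a, const char *b)
{
    size_t la = strlen(a), lb = strlen(b);
    char *s = toy_palloc(p, la + lb + 1);

    if (s == NULL)
        return NULL;
    memcpy(s, a, la);
    memcpy(s + la, b, lb + 1);   /* copies the terminating NUL as well */
    return s;
}

/* Nonzero if ptr lies inside the part of the block already handed out. */
int toy_pool_owns(const toy_pool *p, const void *ptr)
{
    uintptr_t lo = (uintptr_t)p->block;
    uintptr_t at = (uintptr_t)ptr;

    return p->block != NULL && at >= lo && at < lo + p->used;
}

/* Releases every allocation at once; all pointers into the pool become invalid. */
void toy_pool_destroy(toy_pool *p)
{
    free(p->block);
    p->block = NULL;
    p->size = 0;
    p->used = 0;
}
```

In this sketch there is deliberately no way to free one string: a key made
with toy_pstrcat stays valid until toy_pool_destroy runs, and not a moment
longer.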

Meanwhile, I used gflags.exe to request pageheap processing on svn.exe, so 
guard pages will get put around kernel-level allocations and kernel-level 
frees will happen immediately. This may help to uncover where the premature 
free is happening.



Re: Subversion 1.4.0 crashes in libapr.dll on Windows

Posted by "D.J. Heap" <dj...@gmail.com>.
On 10/27/06, Garrett Rooney <ro...@electricjellyfish.net> wrote:
> On 10/26/06, Kenneth Porter <sh...@sewingwitch.com> wrote:
[snip]
> > The failure occurs in the memcmp (which is inlined with a REPE CMPS).
> > he->key (0xFAD6B8) is pointing to non-existent memory. It seems to be
> > pointing to just after the hash entry memory. (he is 0xFAB8D0.) The rest of
> > the hash entry looks sane (has correct hash value and key length). I'm
> > wondering if the key was prematurely freed and the page was released.
> >
> > I also chased the next pointer and looked at its key, and it has the same
> > issue. So the block holding the keys for this table seems to have been
> > released from VM.
>
> Could you possibly get a backtrace?  The fact that the error occurs in
> the apr hash code is interesting, but in order for that information to
> be really useful we'd need to know which hash it's talking about, what
> is going on when it gets to this point, etc.


Even more helpful would be a reproduction recipe or at least the data
(and possibly the repository if it doesn't fail with a fresh one) that
causes the crash...is that possible?

DJ


Re: Subversion 1.4.0 crashes in libapr.dll on Windows

Posted by Garrett Rooney <ro...@electricjellyfish.net>.
On 10/26/06, Kenneth Porter <sh...@sewingwitch.com> wrote:
> --On Tuesday, October 10, 2006 7:06 PM -0600 "D.J. Heap" <dj...@gmail.com>
> wrote:
>
> > Has this been recreated with a trunk build?  I provided a trunk build
> > to the first reporter (I think that is who it was) and they said it
> > started working fine.  Maybe we just need to identify a fix to
> > backport.
> >
> > I haven't been able to get far with the minidumps.  We really need to
> > reproduce it or have someone who can be willing to debug it
> > themselves.
>
> I just encountered it again on an svn import. I let VS2005 load it for
> post-mortem. I grabbed the 1.4.0 source zips and pointed the debugger at
> those. It has a little trouble lining up the symbols but I think I see
> what's happening from looking at the disassembly and the structure
> definitions.
>
> In this case the failure happens after the commit has completed and when
> the committed message is about to be sent to stdout.
>
> Around line 242 of apr_hash.c is this loop:
>
>     for (hep = &ht->array[hash & ht->max], he = *hep;
>          he; hep = &he->next, he = *hep) {
>         if (he->hash == hash
>             && he->klen == klen
>             && memcmp(he->key, key, klen) == 0)
>             break;
>     }
>
> The failure occurs in the memcmp (which is inlined with a REPE CMPS).
> he->key (0xFAD6B8) is pointing to non-existent memory. It seems to be
> pointing to just after the hash entry memory. (he is 0xFAB8D0.) The rest of
> the hash entry looks sane (has correct hash value and key length). I'm
> wondering if the key was prematurely freed and the page was released.
>
> I also chased the next pointer and looked at its key, and it has the same
> issue. So the block holding the keys for this table seems to have been
> released from VM.

Could you possibly get a backtrace?  The fact that the error occurs in
the apr hash code is interesting, but in order for that information to
be really useful we'd need to know which hash it's talking about, what
is going on when it gets to this point, etc.

-garrett


Re: Subversion 1.4.0 crashes in libapr.dll on Windows

Posted by Kenneth Porter <sh...@sewingwitch.com>.
--On Tuesday, October 10, 2006 7:06 PM -0600 "D.J. Heap" <dj...@gmail.com> 
wrote:

> Has this been recreated with a trunk build?  I provided a trunk build
> to the first reporter (I think that is who it was) and they said it
> started working fine.  Maybe we just need to identify a fix to
> backport.
>
> I haven't been able to get far with the minidumps.  We really need to
> reproduce it or have someone who can be willing to debug it
> themselves.

I just encountered it again on an svn import. I let VS2005 load it for 
post-mortem. I grabbed the 1.4.0 source zips and pointed the debugger at 
those. It has a little trouble lining up the symbols but I think I see 
what's happening from looking at the disassembly and the structure 
definitions.

In this case the failure happens after the commit has completed and when 
the committed message is about to be sent to stdout.

Around line 242 of apr_hash.c is this loop:

    for (hep = &ht->array[hash & ht->max], he = *hep;
         he; hep = &he->next, he = *hep) {
        if (he->hash == hash
            && he->klen == klen
            && memcmp(he->key, key, klen) == 0)
            break;
    }

The failure occurs in the memcmp (which is inlined with a REPE CMPS). 
he->key (0xFAD6B8) is pointing to non-existent memory. It seems to be 
pointing to just after the hash entry memory. (he is 0xFAB8D0.) The rest of 
the hash entry looks sane (has correct hash value and key length). I'm 
wondering if the key was prematurely freed and the page was released.

I also chased the next pointer and looked at its key, and it has the same 
issue. So the block holding the keys for this table seems to have been 
released from VM.
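
The loop above reads the key through he->key, so it depends on the key
bytes staying valid for as long as the entry exists. As far as the APR
documentation describes it, the hash table stores the key pointer as given
rather than copying the key. The sketch below is a toy table, not the APR
implementation (toy_hash, toy_hashfunc, toy_hash_set and toy_hash_get are
made-up names, and the hash function is an arbitrary choice), meant to show
that kind of borrowed-key behavior without touching freed memory: changing
the caller's buffer changes what the lookup sees.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_BUCKETS 8   /* power of two, so (hash & (TOY_BUCKETS - 1)) picks a bucket */

typedef struct toy_entry {
    struct toy_entry *next;
    unsigned int      hash;
    const void       *key;   /* borrowed pointer: the key bytes are NOT copied */
    size_t            klen;
    const void       *val;
} toy_entry;

typedef struct toy_hash {
    toy_entry *array[TOY_BUCKETS];   /* bucket heads */
    toy_entry  slots[TOY_BUCKETS];   /* fixed entry storage, so no malloc is needed */
    size_t     count;                /* entries in use */
} toy_hash;

/* Simple multiplicative string hash, chosen only for the illustration. */
unsigned int toy_hashfunc(const char *key, size_t klen)
{
    unsigned int h = 0;
    size_t i;

    for (i = 0; i < klen; i++)
        h = h * 33 + (unsigned char)key[i];
    return h;
}

/* Insert without checking for duplicates; returns 0 when the table is full. */
int toy_hash_set(toy_hash *ht, const char *key, const void *val)
{
    toy_entry *he;

    if (ht->count == TOY_BUCKETS)
        return 0;
    he = &ht->slots[ht->count++];
    he->klen = strlen(key);
    he->hash = toy_hashfunc(key, he->klen);
    he->key = key;            /* only the pointer is kept */
    he->val = val;
    he->next = ht->array[he->hash & (TOY_BUCKETS - 1)];
    ht->array[he->hash & (TOY_BUCKETS - 1)] = he;
    return 1;
}

/* Lookup with the same three-part test as the loop quoted from apr_hash.c. */
const void *toy_hash_get(const toy_hash *ht, const char *key)
{
    size_t klen = strlen(key);
    unsigned int hash = toy_hashfunc(key, klen);
    const toy_entry *he;

    for (he = ht->array[hash & (TOY_BUCKETS - 1)]; he; he = he->next) {
        if (he->hash == hash
            && he->klen == klen
            && memcmp(he->key, key, klen) == 0)   /* reads through the borrowed pointer */
            return he->val;
    }
    return NULL;
}
```

If a caller inserts a value under a writable buffer holding "UTF-8" and then
overwrites the first byte of that buffer, a later lookup for "UTF-8" still
matches the stored hash and key length but fails in memcmp. With a key whose
memory has actually been released, that same memcmp would read invalid
memory instead, which is consistent with the crash site described above.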



Re: Subversion 1.4.0 crashes in libapr.dll on Windows

Posted by "D.J. Heap" <dj...@gmail.com>.
On 10/10/06, Ivan Zhakov <ch...@gmail.com> wrote:
> On 10/10/06, Ivan Zhakov <ch...@gmail.com> wrote:
> > We have several user reports of crashes in libapr.dll on Windows:
> > 1. Crash on svn export and svn checkout.
> > Mail: http://svn.haxx.se/dev/archive-2006-09/0738.shtml
> > Minidump can be found here: http://mysite.verizon.net/jodys7/svn.dmp
> > 2. Another crash on svn export, reported to me directly.
> > Minidump can be found here: http://chemodax.googlepages.com/svn.dmp
> > 3. Crash in the same place when executing svn merge.
> > Mail: http://subversion.tigris.org/servlets/ReadMsg?listName=dev&msgNo=120663
> > I've asked for a minidump to look at, but we already have a stack trace.
> Kenneth has sent me the minidump. I've uploaded it to:
> http://chemodax.googlepages.com/svn-merge-crash-libapr.dmp
>
> BTW: in this minidump xlate_handle_hash == 0
>


Has this been recreated with a trunk build?  I provided a trunk build
to the first reporter (I think that is who it was) and they said it
started working fine.  Maybe we just need to identify a fix to
backport.

I haven't been able to get far with the minidumps.  We really need to
reproduce it, or have someone who can reproduce it be willing to debug
it themselves.

DJ


Re: Subversion 1.4.0 crashes in libapr.dll on Windows

Posted by Ivan Zhakov <ch...@gmail.com>.
On 10/10/06, Ivan Zhakov <ch...@gmail.com> wrote:
> We have several user reports of crashes in libapr.dll on Windows:
> 1. Crash on svn export and svn checkout.
> Mail: http://svn.haxx.se/dev/archive-2006-09/0738.shtml
> Minidump can be found here: http://mysite.verizon.net/jodys7/svn.dmp
> 2. Another crash on svn export, reported to me directly.
> Minidump can be found here: http://chemodax.googlepages.com/svn.dmp
> 3. Crash in the same place when executing svn merge.
> Mail: http://subversion.tigris.org/servlets/ReadMsg?listName=dev&msgNo=120663
> I've asked for a minidump to look at, but we already have a stack trace.
Kenneth has sent me the minidump. I've uploaded it to:
http://chemodax.googlepages.com/svn-merge-crash-libapr.dmp

BTW: in this minidump xlate_handle_hash == 0

>
> All three crashes occur in the same place:
>         libapr.dll!find_entry
>         libapr.dll!apr_hash_get
>         svn.exe!get_xlate_handle_node
>         svn.exe!svn_cmdline_cstring_from_utf8
>         svn.exe!svn_cmdline_printf
>
> At least cases (1) and (2) can be worked around by renaming
> TortoiseSVN's libapr_tsvn.dll to libapr.dll. So my initial idea was
> some kind of DLL hell, but uninstalling TortoiseSVN doesn't help!
>
> I looked at the source code of get_xlate_handle_node
> (libsvn_subr/utf.c):
>   if (userdata_key)
>     {
>       if (xlate_handle_hash)
>         {
> #if APR_HAS_THREADS
>           apr_err = apr_thread_mutex_lock(xlate_handle_mutex);
>           if (apr_err != APR_SUCCESS)
>             return svn_error_create(apr_err, NULL,
>                                     _("Can't lock charset translation mutex"));
> #endif
>           old_node_p = apr_hash_get(xlate_handle_hash, userdata_key,
>                                     APR_HASH_KEY_STRING);
>           if (old_node_p)
>             old_node = *old_node_p;
>
>
> Notice that one minidump has xlate_handle_hash == 0, but another has
> xlate_handle_hash == 0x00af00d0. How is that possible if we check it
> a few lines earlier? Maybe it's a problem with the minidump format.
> I'm continuing to investigate this problem and would be glad of any
> suggestions/pointers.
>
> --
> Ivan Zhakov
>


-- 
Ivan Zhakov
