Posted to dev@subversion.apache.org by Jack Repenning <jr...@collab.net> on 2009/03/19 20:58:46 UTC

Possible recrudescence of issue 3113 (ra_serf update "APR does not understand this error code")

In Issue 3113, lgo reported intermittent failures in update-like  
operations over ra_serf, including the message "APR does not  
understand this error code." He investigated, but was not able to  
reproduce it in the trunk of that time. He and Kamesh each postulated  
some revision that might have fixed it, and it was resolved on "unable  
to reproduce" grounds, as much as anything else.

I think it just happened to me, in 1.6.0-rc3 (the downloaded tarballs).

Complete with the annoying "and then it went away" feature.

So, maybe there really is a problem here, and it's intermittent based  
on some variable none of us has thought to try. Or maybe not. And  
anyway, none of us yet has thought what to try, so it's still at most  
"irreproducible." I know, I know: this report is rapidly approaching  
"useless."

Here are the details of my experience, for what they're worth. Maybe  
they'll click in someone's head?

I built on OS X 10.5.6, under Xcode 3.1.2, from the downloaded  
tarballs subversion-1.6.0-rc3.tar.gz and subversion-deps-1.6.0-rc3.tar.gz.
I also download, build, and incorporate BDB db-4.2.52
(with patches 1-5) and expat-1.95.8. I unpack the deps into the  
subversion build tree, and build them all together. Configure stati  
are attached for the truly curious. My build produces both the command  
line and C libraries; my principal interest is to bind the libraries  
to my application, SCPlugin.

Having built SCPlugin, I found (with 100% reproducibility) that  
checkout and update failed, reporting "APR does not understand this  
error code." It's notable, however, that the command-line binary built  
in the same "make" did not display this problem. I used otool to  
confirm that my program was not mixing APR versions, did a little  
debugging, came up dry, and knocked off for the day.

Today, I fire up the debugger again, and ... everything works just  
fine. No rebuilds of any kind in between. No reboots of any kind in  
between. I am, today, working on a new launch of my app, but I had  
killed and relaunched it several times, last night. All runs,  
successful and failing, were under the Xcode IDE and gdb debugger, and  
I am also, today, working on a new launch of the IDE, and I don't  
recall restarting it yesterday, so it's possible the variation is the  
IDE's fault.

Today (but not yesterday, sadly), I have used Activity Monitor's list  
of open files to confirm that my app has open only APR 1.x libraries.


-==-
Jack Repenning
Chief Technology Officer
CollabNet, Inc.
8000 Marina Boulevard, Suite 600
Brisbane, California 94005
office: +1 650.228.2562
mobile: +1 408.835.8090
raindance: +1 877.326.2337, x844.7461
aim: jackrepenning
skype: jackrepenning
twitter: http://twitter.com/jrep

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=1357891

RE: Possible recrudescence of issue 3113 (ra_serf update "APR does not understand this error code")

Posted by Bert Huijben <rh...@sharpsvn.net>.
> -----Original Message-----
> From: Branko Cibej [mailto:brane@xbc.nu]
> Sent: Friday, March 20, 2009 1:01 AM
> To: Jack Repenning
> Cc: dev@subversion.tigris.org
> Subject: Re: Possible recrudescence of issue 3113 (ra_serf update "APR
> does not understand this error code")
> 
> Sounds like something somewhere is sometimes stomping on the heap or
> stack. What happens if you run the magic checkout or update with
> valgrind?
> 
> -- Brane
> 

	Hi,

I saw this error once yesterday when running the test set over serf in
parallel mode, against my local httpd. (I usually don't run the tests in
parallel mode, since running them serially considerably reduces the chance
of file-in-use errors.)

But like you said: I couldn't reproduce it later.

	Bert

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=1359218

Re: Possible recrudescence of issue 3113 (ra_serf update "APR does not understand this error code")

Posted by Branko Cibej <br...@xbc.nu>.
Sounds like something somewhere is sometimes stomping on the heap or
stack. What happens if you run the magic checkout or update with valgrind?

-- Brane


------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=1359118

Re: Possible recrudescence of issue 3113 (ra_serf update "APR does not understand this error code")

Posted by Jack Repenning <jr...@collab.net>.
On Mar 19, 2009, at 5:45 PM, Konstantin Kolinko wrote:

> Was it rainy yesterday? Can it be some network problems?

Heheh. Not here, no, but I was using our company WiFi, which gets very  
flaky in the afternoons. You might be on to something.

> Whoever called that apr_strerror() should know the context where the
> issue occurred and the actual code.  Were those included in the message,
> or did it just rely on the standard message being available?


No additional info for the one case I did dig into. libsvn_client  
calls lower levels to do an update, the operation completes, "error"  
is non-zero, so it's handed to apr, who says this helpful thing.

I've spent some time this evening on this. Haven't seen the mystery  
error again. I don't know if it's related or not, but I have found  
that these same operations are intermittently leaking teeny bits of  
memory (64 bytes). Not every time, maybe once in four, tonight. That  
"not every time" sounds rather like this APR thing. And code (probably  
mine) that nondeterministically leaks is doing SOMETHING wrong, dang  
it, and it seems possible it otherwhen uses a bad pointer from the  
same botch.

But Bert's case is surely not related to my leaks.


------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=1361997

Re: Possible recrudescence of issue 3113 (ra_serf update "APR does not understand this error code")

Posted by Konstantin Kolinko <kn...@gmail.com>.
2009/3/19 Jack Repenning <jr...@collab.net>:
>(...)
>
> Today, I fire up the debugger again, and ... everything works just
> fine. No rebuilds of any kind in between. No reboots of any kind in
> between. I am, today, working on a new launch of my app, but I had
> killed and relaunched it several times, last night. All runs,
> successful and failing, were under the Xcode IDE and gdb debugger, and
> I am also, today, working on a new launch of the IDE, and I don't
> recall restarting it yesterday, so it's possible the variation is the
> IDE's fault.
>
> Today (but not yesterday, sadly), I have used Activity Monitor's list
> of open files to confirm that my app has open only APR 1.x libraries.
>

Was it rainy yesterday? Can it be some network problems?

"APR does not understand this error code" is (looking into APR
sources, /misc/unix/errorcodes.c) just a wordier equivalent
of "unknown error code". The apr_strerror() function (or the
native_strerror() it calls) returns this message.

Whoever called that apr_strerror() should know the context where the
issue occurred and the actual code.  Were those included in the message,
or did it just rely on the standard message being available?

And without knowing the code it is impossible to know in what error code
range it falls, and whether it is the same every time.
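
To make the point about ranges concrete, here is a minimal sketch of how
a caller might log the numeric code next to the message. The helper names
are hypothetical (they are not APR functions), and the boundaries are the
APR_OS_START_* values as I recall them from apr_errno.h, so please check
them against your own headers. If memory serves, apr_strerror() has no
text of its own for the user-defined range, which would explain the
generic message.

```c
#include <stdio.h>

/* Range boundaries as recalled from APR's apr_errno.h. Treat them as
 * assumptions and check them against the headers you actually build with. */
#define SKETCH_START_ERROR    20000  /* APR_OS_START_ERROR   */
#define SKETCH_START_STATUS   70000  /* APR_OS_START_STATUS  */
#define SKETCH_START_USERERR 120000  /* APR_OS_START_USERERR */
#define SKETCH_START_EAIERR  670000  /* APR_OS_START_EAIERR  */
#define SKETCH_START_SYSERR  720000  /* APR_OS_START_SYSERR  */

/* Hypothetical helper (not an APR function): name the range of a code. */
static const char *status_range(int code)
{
    if (code == 0)                    return "success";
    if (code < SKETCH_START_ERROR)    return "native errno";
    if (code < SKETCH_START_STATUS)   return "APR error";
    if (code < SKETCH_START_USERERR)  return "APR status";
    if (code < SKETCH_START_EAIERR)   return "user-defined";
    if (code < SKETCH_START_SYSERR)   return "getaddrinfo";
    return "OS-specific";
}

/* Hypothetical helper: build a log line that keeps the numeric code. */
static int format_status(int code, char *buf, size_t len)
{
    return snprintf(buf, len, "status=%d (%s range)", code, status_range(code));
}
```

With something like this in the error path, two failing runs can at least
be compared by code, even when the text is the same generic message.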

For reference:
The following thread came up on the Apache Tomcat Users' list
yesterday. It caught my eye because the title mentions this APR message,
though I think it has no other relation to this particular problem.
http://www.mail-archive.com/users@tomcat.apache.org/msg58623.html


Best regards,
Konstantin Kolinko

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=1359460