You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@subversion.apache.org by Christian Daudt <cs...@daudt.org> on 2003/06/19 16:56:34 UTC

core dump with svn r6178

Hi,
 I'm starting to roll out subversion for a number of people in my group and 
I've run into many of them getting coredumps on checkouts (of course I never 
got one...). Anyways, I've tracked it down to parsing this piece of xml being 
received: <S:open-directory rev=\"-1\">
 And with that, start_element in fetch.c (in libsvn_ra_dav) is running this 
piece of code:
          name = get_attr(atts, "name");
          /* ### verify we got it. punt on error. */
          svn_stringbuf_set(rb->namestr, name);

Since there is no "name" attribute, name (the variable) gets set to null and 
svn_stringbuf_set is being called with its value param == null which it is 
not set up to handle. The attached patch should fix the coredump (by 
protecting svn_stringbuf_set) but I'm only addressing the consequence, not 
the cause. I don't know the protocol to know whether that xml snippet is 
wrong, or if it is valid and the handler isn't handling it properly so I'll 
defer that part to someone that has a clue of what's going there.

 thanks,
   csd

Re: core dump with svn r6178

Posted by cm...@collab.net.
Christian Daudt <cs...@daudt.org> writes:

> I have zero knowledge of the svn code right now (from your comments
> I take it the server is misbehaving, I didn't even know if the
> client or the server was the one at fault up to know). And I haven't
> seen it firsthand, it's just been reported to me by a couple of
> users (who of course haven't seen it since). I was hoping for
> someone to look at my email and say "ah-ha ! I know what's going
> on". Guess that isn't going to happen :-( I'll try to set up some
> debugging on the server to try to keep tabs on what's going on. If
> anyone wants more info (or the coredump) let me know. I'll also file
> a bug report on this one.

If you are familiar with the likes of 'ethereal', and can locate one of
your users who can consistently reproduce the problem, an ethereal
network trace would be quite helpful in determining what might be
going on.

Again, thanks for understanding.  And sorry we had to swap so many
bytes just to get here.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: core dump with svn r6178

Posted by Christian Daudt <cs...@daudt.org>.
On Thursday 19 June 2003 13:43, cmpilato@collab.net wrote:
> Christian Daudt <cs...@daudt.org> writes:
> > So someone else doesn't spend a morning trying to figure why they're
> > getting coredumps all of the sudden...
>
> Do I detect (in your patch) a note of sarcasm? :-)
>

 Only a bit ;-) Seriously, you can generate a core dump and still tell people 
that aren't familiar with the code that you meant to do just that 
(printf/abort maybe). I understand that the intent is to catch errors there 
but if you want to extract more useful cycles from people that are more 
marginally involved in the development process (i.e. me) it's not a bad idea 
to give hints someone debugging an error that they are barking up the wrong 
tree.

> Dude, I'm sorry you wasted time looking at the wrong thing.  But the
> clues are present.  In the very libsvn_ra_dav code you pasted, there
> is a comment indicating that that section of code expects to get that
> name= attribute.  

Unfortunately I have a very deep distrust of comments in code from chasing red 
herrings because of out-of-date comments one too many times (I know that goes 
against the last patch I sent - ok I admit it. It was a sarcastic one ;-).

> Now, I dunno why you're seeing a case where this
> attribute isn't present, but if I could be so daring as to risk
> annoying you -- would you mind trying to solved *that* problem?  I
> mean, if you have a Subversion server that's sending your client a
> bogus REPORT response, by golly we want to know about it.

I have zero knowledge of the svn code right now (from your comments I take it 
the server is misbehaving, I didn't even know if the client or the server was 
the one at fault up to know). And I haven't seen it firsthand, it's just been 
reported to me by a couple of users (who of course haven't seen it since). I 
was hoping for someone to look at my email and say "ah-ha ! I know what's 
going on". Guess that isn't going to happen :-( I'll try to set up some 
debugging on the server to try to keep tabs on what's going on. If anyone 
wants more info (or the coredump) let me know. I'll also file a bug report on 
this one.

> Please help us, and please don't be offended if your alternative patch
> falls by the wayside.  (Our entire svn_string.c function set does not
> try to protect against NULL input, not just that one function).

I'll try (to help, not to be offended ;-)

 cheers,
   csd


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org


Re: core dump with svn r6178

Posted by Christian Daudt <cs...@netbox.com>.
On June 19, 2003 06:22 pm, Greg Stein wrote:
> On Fri, Jun 20, 2003 at 12:34:44AM +0200, Sander Striker wrote:
> >...
> >
> > > mean, if you have a Subversion server that's sending your client a
> > > bogus REPORT response, by golly we want to know about it.
> >
> > In any case, we should not segfault on it.  It is bad mojo if the server
> > can let the client crash (or the other way around) by bad data.
>
> Well, yah. That is an entirely different situation. We're talking about
> functions being passed bad parameters. If the system gets bad *data* from
> somewhere, then yup: we should gracefully error.
>

But isn't the bad XML string that caused the segfault (i.e. "<S:open-directory 
rev=\"-1\">") received from the server ? If so, who should have gracefully 
errored and didn't ? 


csd

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org


Re: core dump with svn r6178

Posted by Greg Stein <gs...@lyra.org>.
On Fri, Jun 20, 2003 at 12:34:44AM +0200, Sander Striker wrote:
>...
> > mean, if you have a Subversion server that's sending your client a
> > bogus REPORT response, by golly we want to know about it.  
> 
> In any case, we should not segfault on it.  It is bad mojo if the server
> can let the client crash (or the other way around) by bad data.

Well, yah. That is an entirely different situation. We're talking about
functions being passed bad parameters. If the system gets bad *data* from
somewhere, then yup: we should gracefully error.

Thus, the ### comments in that parsing code. But that is about the parsing
out of the data. If we pass NULL to svn_string_FOO, then those ought
segfault before you can say "who's your daddy!?!"

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: core dump with svn r6178

Posted by Christian Daudt <cs...@daudt.org>.
This can certainly be what happened in my case also. We were having nfs 
problems that day so checkouts (which write to an nfs mounted drive) were 
terribly slow. I'm sure timeouts could have ocurred there and this would 
explain why it was only seen by a few during that day.

 csd

On June 19, 2003 09:47 pm, Patrick Mayweg wrote:
> Hi,
> I had a NULL-pointer access at the same place last week. I think I have it
> tracked down to where it realy went wrong.
> In the file subversion/libsvn_ra_dav/fetch.c is the function
> static svn_error_t * reporter_finish_report(void *report_baton)
>
> It contains the following code:
>
>   err = svn_ra_dav__get_vcc(&vcc, rb->ras->sess, rb->ras->url,
> rb->ras->pool); if (err)
>     {
>       (void) apr_file_close(rb->tmpfile);
>       return err;
>     }
>
>   err = svn_ra_dav__parsed_request(rb->ras->sess, "REPORT", vcc,
>                                    NULL, rb->tmpfile, NULL,
>                                    report_elements, validate_element,
>                                    start_element, end_element, rb,
>                                    NULL, rb->ras->pool);
>
>   if (err)
>     {
>       /* If running the update-report on the VCC failed, it's probably
>          an older server.  Fall back to the old-style, by requesting
>          the report on the src-url itself.  This runs a risk of
>          choking on issue #891 ("REPORT fails on item not in HEAD"),
>          but hey, it's better than nothing.  */
>
>       /* ### someday, remove this fallback code. */
>
>       svn_error_clear (err);
>       err = svn_ra_dav__parsed_request(rb->ras->sess, "REPORT",
>                                        rb->ras->root.path,
>                                        NULL, rb->tmpfile, NULL,
>                                        report_elements, validate_element,
>                                        start_element, end_element, rb,
>                                        NULL, rb->ras->pool);
>     }
>
> The retry after the error is the problem. In my case the REPORT request was
> send ok by the server but the client network code returned "connection
> reset" error after most of the report request was processed. In this case
> you cannot restart the request on the same report_beacon, because there is
> state in the beacon. Like that rb->dirs->nelts is not 0. The code in
> static int start_element(void *userdata, const struct ne_xml_elm *elm,
>                          const char **atts)
> assummes that only on the first OPEN-DIRECTORY element there is no "name"
> attribute. Unfortunately I do not known enough about the report_beacon, to
> provide a patch which tests if it has already be used.
> Regrads,
> Patrick
>
> Christian Daudt wrote:
> > On Thursday 19 June 2003 14:49, Greg Stein wrote:
> > > On Thu, Jun 19, 2003 at 03:43:41PM -0500, cmpilato@collab.net wrote:
> > > >...
> > > > mean, if you have a Subversion server that's sending your client a
> > > > bogus REPORT response, by golly we want to know about it.
> > >
> > > Yup. What version of the server do you have and which client? What
> > > version of Neon is being used?
> >
> > server/client (svn command-line client) at r6178, neon @ 0.23.7. All
> > running on Linux 2.4.19 (Suse 8.1)
> >
> > csd
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
> > For additional commands, e-mail: dev-help@subversion.tigris.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org


Re: core dump with svn r6178

Posted by Patrick Mayweg <ma...@qint.de>.
Hi Ben,

Ben Collins-Sussman wrote:

> Patrick Mayweg <ma...@qint.de> writes:
>
> > Hi,
> > I had a NULL-pointer access at the same place last week. I think I have it
> > tracked down to where it realy went wrong.
> > In the file subversion/libsvn_ra_dav/fetch.c is the function
> > static svn_error_t * reporter_finish_report(void *report_baton)
> >
> > [...]
> >
> > The retry after the error is the problem. In my case the REPORT
> > request was send ok by the server but the client network code
> > returned "connection reset" error after most of the report request
> > was processed. In this case you cannot restart the request on the
> > same report_beacon, because there is state in the beacon.  Like that
> > rb->dirs->nelts is not 0. The code in static int start_element(void
> > *userdata, const struct ne_xml_elm *elm, const char **atts) assummes
> > that only on the first OPEN-DIRECTORY element there is no "name"
> > attribute. Unfortunately I do not known enough about the
> > report_beacon, to provide a patch which tests if it has already be
> > used.  Regrads, Patrick
>
> Patrick, I think this bug is now fixed in r6377.

that looks much better. I am not when I will be able to test it.
Thanks,
Patrick


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: core dump with svn r6178

Posted by Ben Collins-Sussman <su...@collab.net>.
Patrick Mayweg <ma...@qint.de> writes:

> Hi,
> I had a NULL-pointer access at the same place last week. I think I have it
> tracked down to where it realy went wrong.
> In the file subversion/libsvn_ra_dav/fetch.c is the function
> static svn_error_t * reporter_finish_report(void *report_baton) 
> 
> [...]
> 
> The retry after the error is the problem. In my case the REPORT
> request was send ok by the server but the client network code
> returned "connection reset" error after most of the report request
> was processed. In this case you cannot restart the request on the
> same report_beacon, because there is state in the beacon.  Like that
> rb->dirs->nelts is not 0. The code in static int start_element(void
> *userdata, const struct ne_xml_elm *elm, const char **atts) assummes
> that only on the first OPEN-DIRECTORY element there is no "name"
> attribute. Unfortunately I do not known enough about the
> report_beacon, to provide a patch which tests if it has already be
> used.  Regrads, Patrick

Patrick, I think this bug is now fixed in r6377.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: core dump with svn r6178

Posted by Patrick Mayweg <ma...@qint.de>.
Hi,
I had a NULL-pointer access at the same place last week. I think I have it
tracked down to where it realy went wrong.
In the file subversion/libsvn_ra_dav/fetch.c is the function
static svn_error_t * reporter_finish_report(void *report_baton)

It contains the following code:

  err = svn_ra_dav__get_vcc(&vcc, rb->ras->sess, rb->ras->url, rb->ras->pool);
  if (err)
    {
      (void) apr_file_close(rb->tmpfile);
      return err;
    }

  err = svn_ra_dav__parsed_request(rb->ras->sess, "REPORT", vcc,
                                   NULL, rb->tmpfile, NULL,
                                   report_elements, validate_element,
                                   start_element, end_element, rb,
                                   NULL, rb->ras->pool);

  if (err)
    {
      /* If running the update-report on the VCC failed, it's probably
         an older server.  Fall back to the old-style, by requesting
         the report on the src-url itself.  This runs a risk of
         choking on issue #891 ("REPORT fails on item not in HEAD"),
         but hey, it's better than nothing.  */

      /* ### someday, remove this fallback code. */

      svn_error_clear (err);
      err = svn_ra_dav__parsed_request(rb->ras->sess, "REPORT",
                                       rb->ras->root.path,
                                       NULL, rb->tmpfile, NULL,
                                       report_elements, validate_element,
                                       start_element, end_element, rb,
                                       NULL, rb->ras->pool);
    }

The retry after the error is the problem. In my case the REPORT request was send
ok by the server but the client network code returned "connection reset" error
after most of the report request was processed. In this case you cannot restart
the request on the same report_beacon, because there is state in the beacon.
Like that rb->dirs->nelts is not 0. The code in
static int start_element(void *userdata, const struct ne_xml_elm *elm,
                         const char **atts)
assummes that only on the first OPEN-DIRECTORY element there is no "name"
attribute. Unfortunately I do not known enough about the report_beacon, to
provide a patch which tests if it has already be used.
Regrads,
Patrick



Christian Daudt wrote:

> On Thursday 19 June 2003 14:49, Greg Stein wrote:
> > On Thu, Jun 19, 2003 at 03:43:41PM -0500, cmpilato@collab.net wrote:
> > >...
> > > mean, if you have a Subversion server that's sending your client a
> > > bogus REPORT response, by golly we want to know about it.
> >
> > Yup. What version of the server do you have and which client? What version
> > of Neon is being used?
> >
>
> server/client (svn command-line client) at r6178, neon @ 0.23.7. All running
> on Linux 2.4.19 (Suse 8.1)
>
> csd
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
> For additional commands, e-mail: dev-help@subversion.tigris.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: core dump with svn r6178

Posted by Christian Daudt <cs...@daudt.org>.
On Thursday 19 June 2003 14:49, Greg Stein wrote:
> On Thu, Jun 19, 2003 at 03:43:41PM -0500, cmpilato@collab.net wrote:
> >...
> > mean, if you have a Subversion server that's sending your client a
> > bogus REPORT response, by golly we want to know about it.
>
> Yup. What version of the server do you have and which client? What version
> of Neon is being used?
>

server/client (svn command-line client) at r6178, neon @ 0.23.7. All running 
on Linux 2.4.19 (Suse 8.1)

csd

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org


Interface strictness (was Re: core dump with svn r6178)

Posted by Greg Hudson <gh...@MIT.EDU>.
On Thu, 2003-06-19 at 19:43, Christian Daudt wrote:
> it's only that I grew to a different mantra: "be liberal in what you receive
> and strict in what you send"

Unfortunately, this mantra has been used to justify some subtle, and
awful mistakes.  If you try to write an RFC822 (or 2822) parser today,
you'll find that the paper standards are of limited help, because every
MUA author was "liberal in what they accepted" in arbitrary and
undocumented ways.

The more important principle is "document what you accept," whether
that's liberal or conservative.  I'll give an example of a place where
we violate this principle: svn_path_uri_encode is documented as:

/** Return a URI-encoded copy of @a path, allocated in @a pool. */

From this documentation (and the fact that PATH is a const char *),
you'd expect that PATH should be a zero-terminated C string.  But it
turns out that PATH can also be NULL, in which case the function returns
NULL, due to an explicit check.

By supporting this laxity and not documenting it, the Subversion API is
placed into an uncomfortable state.  Code may come to rely on this
undocumented behavior, perhaps by accident.  We won't know.  If
svn_path_uri_encode is ever rewritten for some reason, the author of the
new code may not notice that the old function accepted a NULL argument
and may change that behavior, breaking the code which relied on it.

As with all good principles, there are compromises to be made.  The svn
editor interface documents certain restrictions on the editor call
sequence, but many editors do not enforce these restrictions.  They work
just fine if you (for example) open and manipulate two sibling
directories at once.  And we've run into situations where callers did
the wrong thing and got away with it until CMike noticed and fixed them,
but it would be too expensive to check and enforce the ordering
constraints in every editor.  (The editor interface could be changed to
eliminate the directory handles, eliminating this problem, but there are
other examples of complex constraints which simply can't be checked.)

Nevertheless, if you've specifically gone out of your way to be lax--for
example, to write "if (path == NULL) return NULL;" in
svn_path_uri_encode, there is no excuse for not documenting this
behavior.  (And one should first consider whether there is any value in
accepting NULL; would any non-broken code ever want to pass in a NULL
path?)

</treatise>


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: core dump with svn r6178

Posted by Michael <ma...@yahoo.com>.
--- Christian Daudt <cs...@daudt.org> wrote:
>  It also requires that anyone getting into an existing project learn
> all of  the contracts between all of the subsystems before they can
> touch any code. It also requires multiple teams working on multiple
> parts of the system to have much more intimate knowledge of all the
> interfaces. That is a very costly thing.

I'm assuming you haven't read the book or you'd know its all in eiffel
and that the preconditions and postconditions are checked unless turned
off for production code.

In any event, if you do follow the "check for every possible input and
then some" mantra then the only thing that happens differently is that
your programmers introduce bugs into the code that fail nicely rather
than poorly. They then go and acquire an intimate knowledge of the
interfaces (which they should have done in the first place).

Michael


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: core dump with svn r6178

Posted by Christian Daudt <cs...@daudt.org>.
On Thursday 19 June 2003 14:59, Michael wrote:
> --- Greg Stein <gs...@lyra.org> wrote:
> > It isn't just Subversion. APR, APRUTIL, and Apache httpd also use this
> > convention. "Give me good inputs, or I'll crash." The crash is actually
> > quite nice because we can use a traceback to figure out the codepath
> > which
> > contains the problem.
>
> Indeed. Some people (like Bertrand Meyer) go to great lengths to convince
> people to code this way. You have his design by contract -- if you
> promise to call my routine with the precondition satisfied then I, in
> return, promise to deliver a final state in which the postcondition is
> satisfied. Then you have his non-redundancy principle -- under no
> circumstances shall the body of a routine ever test for the routine's
> precondition.

 I don't think that the code mentioned here was written as you are stating it 
was. The precondition of (name!=null) *is* being test for and being acted on. 
Only thing is that it is done implicitly (by letting the strlen call blow up 
on the derefence) by design.
 I didn't mean to start a long-winded thread on this subject (it started 
strictly from my ignorance of design decisions made for subversion), it's 
only that I grew to a different mantra: "be liberal in what you receive and 
strict in what you send". I tend to catch everything in my code (even if only 
to spew a message and then blow up) because that makes the intent 
unambiguous. As it stands, if one only looks at the svn_stringbuf_set 
function one can't determine if NULL is not being handled by design or error. 
Given my coding practices I assumed error when it was design in this 
particular case. 

> If you violate the precondition then get can expect bad things -- like
> core dumps.

 If you're not testing (explictly or not) for the preconditions I'd say that 
you're very lucky if you get a core dump. Testing preconditions has avoided 
many sleepless nights looking for bugs that exhibit random behaviour in my 
life. The reason I know that is that I have lost many sleeples nights looking 
for bugs that would have been avoided by testing for preconditions ;-)

>
> Of course all this assumes the contract with the user is clearly spelled
> out.
>

 It also requires that anyone getting into an existing project learn all of 
the contracts between all of the subsystems before they can touch any code. 
It also requires multiple teams working on multiple parts of the system to 
have much more intimate knowledge of all the interfaces. That is a very 
costly thing.


 csd



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org


Re: core dump with svn r6178

Posted by Michael <ma...@yahoo.com>.
--- Greg Stein <gs...@lyra.org> wrote:
> It isn't just Subversion. APR, APRUTIL, and Apache httpd also use this
> convention. "Give me good inputs, or I'll crash." The crash is actually
> quite nice because we can use a traceback to figure out the codepath
> which
> contains the problem.

Indeed. Some people (like Bertrand Meyer) go to great lengths to convince
people to code this way. You have his design by contract -- if you
promise to call my routine with the precondition satisfied then I, in
return, promise to deliver a final state in which the postcondition is
satisfied. Then you have his non-redundancy principle -- under no
circumstances shall the body of a routine ever test for the routine's
precondition.

If you violate the precondition then get can expect bad things -- like
core dumps.

Of course all this assumes the contract with the user is clearly spelled
out.

Michael

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: core dump with svn r6178

Posted by Greg Stein <gs...@lyra.org>.
On Thu, Jun 19, 2003 at 03:43:41PM -0500, cmpilato@collab.net wrote:
>...
> mean, if you have a Subversion server that's sending your client a
> bogus REPORT response, by golly we want to know about it.

Yup. What version of the server do you have and which client? What version
of Neon is being used?

> Please help us, and please don't be offended if your alternative patch
> falls by the wayside.  (Our entire svn_string.c function set does not
> try to protect against NULL input, not just that one function).

It isn't just Subversion. APR, APRUTIL, and Apache httpd also use this
convention. "Give me good inputs, or I'll crash." The crash is actually
quite nice because we can use a traceback to figure out the codepath which
contains the problem.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

RE: core dump with svn r6178

Posted by rb...@rkbloom.net.

On Fri, 20 Jun 2003, Sander Striker wrote:

> > From: cmpilato@collab.net [mailto:cmpilato@collab.net]
> > Sent: Thursday, June 19, 2003 10:44 PM
>
> > Christian Daudt <cs...@daudt.org> writes:
> >
> > > So someone else doesn't spend a morning trying to figure why they're getting
> > > coredumps all of the sudden...
> >
> > Do I detect (in your patch) a note of sarcasm? :-)
> >
> > Dude, I'm sorry you wasted time looking at the wrong thing.  But the
> > clues are present.  In the very libsvn_ra_dav code you pasted, there
> > is a comment indicating that that section of code expects to get that
> > name= attribute.  Now, I dunno why you're seeing a case where this
> > attribute isn't present, but if I could be so daring as to risk
> > annoying you -- would you mind trying to solved *that* problem?  I
> > mean, if you have a Subversion server that's sending your client a
> > bogus REPORT response, by golly we want to know about it.
>
> In any case, we should not segfault on it.  It is bad mojo if the server
> can let the client crash (or the other way around) by bad data.
>
> cvs is not hostile data proof and it's a pain in the butt.

As a guy from the side-lines, I agree fully with Sander here.  The whole
"don't check errors, segfault!" logic is great... in some situations.  The
whole point of that type of coding is to find errors as early as possible
and to make it so that you can't ignore them when you do find them.  That
works great for some things, but not for all possible cases.  I always
think of it this way:

There are three types of errors in code, and three appropriate ways to
resolve them.

1)  Syntax errors.
Resolution:  The compiler had better print an error and quit.

2)  Semantic errors.  ie the syntax is right, but the arguments are wrong
or your understanding of how the arguments are used is wrong.
Resolution:  Segfault.  This type of error should happen often enough that
the code shouldn't be committed without somebody seeing the segfault
during testing.

3)  Logic errors.  The code is using all of the APIs correctly, but there
is a logic error and something is being filled out wrong.
Resolution:  Print an error message and fail gracefully.  This type of
error is the rarest type, and is likely to sneak through testing.

The biggest problem is identifying the difference between type 2 and 3
when you are writing code.   :-)

As proof that even Apache follows this model, try this sometime:

telnet apache_server 80
GET / HTTP/1.1
Host:


Apache doesn't segfault, even though the data it is receiving is bad.
This is the same type of error that you are discussing in this thread.
The syntax is right, the understanding of the API is right, but the logic
is wrong someplace.

Ryan



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

RE: core dump with svn r6178

Posted by Sander Striker <st...@apache.org>.
> From: cmpilato@collab.net [mailto:cmpilato@collab.net]
> Sent: Thursday, June 19, 2003 10:44 PM

> Christian Daudt <cs...@daudt.org> writes:
> 
> > So someone else doesn't spend a morning trying to figure why they're getting 
> > coredumps all of the sudden...
> 
> Do I detect (in your patch) a note of sarcasm? :-)
> 
> Dude, I'm sorry you wasted time looking at the wrong thing.  But the
> clues are present.  In the very libsvn_ra_dav code you pasted, there
> is a comment indicating that that section of code expects to get that
> name= attribute.  Now, I dunno why you're seeing a case where this
> attribute isn't present, but if I could be so daring as to risk
> annoying you -- would you mind trying to solved *that* problem?  I
> mean, if you have a Subversion server that's sending your client a
> bogus REPORT response, by golly we want to know about it.  

In any case, we should not segfault on it.  It is bad mojo if the server
can let the client crash (or the other way around) by bad data.

cvs is not hostile data proof and it's a pain in the butt.
 
> Please help us, and please don't be offended if your alternative patch
> falls by the wayside.  (Our entire svn_string.c function set does not
> try to protect against NULL input, not just that one function).

In this case the location with the comment should do an actual check
and throw an appropiate error.


Sander

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: core dump with svn r6178

Posted by cm...@collab.net.
Christian Daudt <cs...@daudt.org> writes:

> So someone else doesn't spend a morning trying to figure why they're getting 
> coredumps all of the sudden...

Do I detect (in your patch) a note of sarcasm? :-)

Dude, I'm sorry you wasted time looking at the wrong thing.  But the
clues are present.  In the very libsvn_ra_dav code you pasted, there
is a comment indicating that that section of code expects to get that
name= attribute.  Now, I dunno why you're seeing a case where this
attribute isn't present, but if I could be so daring as to risk
annoying you -- would you mind trying to solved *that* problem?  I
mean, if you have a Subversion server that's sending your client a
bogus REPORT response, by golly we want to know about it.  

Please help us, and please don't be offended if your alternative patch
falls by the wayside.  (Our entire svn_string.c function set does not
try to protect against NULL input, not just that one function).

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: core dump with svn r6178

Posted by Christian Daudt <cs...@daudt.org>.
On Thursday 19 June 2003 11:42, cmpilato@collab.net wrote:
> Christian Daudt <cs...@daudt.org> writes:
> > But the fact is that, regardless of the problem that caused the code
> > to call stringbuf_set with null (which remains to be fixed) ,
> > svn_stringbuf_set shouldn't coredump because it received a value it
> > doesn't like.
>
> On the contrary, we specifically allow that function to segfault on
> bad input because it helps us to find *real* bugs in the code.  If we
> simply papered over the problem, we'd have a much harder time tracking
> down, for example, the bug you're seeing.  And for what gain?
>

So someone else doesn't spend a morning trying to figure why they're getting 
coredumps all of the sudden...
 Here is an alternative patch.

 csd

Re: core dump with svn r6178

Posted by Branko Čibej <br...@xbc.nu>.
Kevin Pilch-Bisson wrote:

>On Thu, Jun 19, 2003 at 10:58:15PM -0400, Brandon Ehle wrote:
>  
>
>>>Different projects have different conventions for this.  In
>>>Subversion, we seem to have decided that it's okay for code to seg
>>>fault in "can't happen" situations, because if the can't-happen ever
>>>*does* happen, it's a bug we must fix anyway :-).  A verbose error and
>>>exit would only be marginally friendlier to the user -- it wouldn't
>>>really get them any closer to a completed operation than the seg fault
>>>does.
>>>
>>>
>>>      
>>>
>>The only problem with a seg fault is that many users won't be running on 
>>OS's that generate core dumps and don't have the time or resources to 
>>track down an "Exception Occured at 0xABACADBA", but they could easily 
>>cut and paste a verbose error message into an email pretty easily.
>>
>>    
>>
>Hmm, this conversation has made me wonder what collab.net needs to do in order
>to get Watson reports for svn reported to them.
>
Hehe, how come you started thinking of Windows here? :-)

>I know its possible, but I don't know what the procedure is.  
>
The very first prerequisite is to get the debug symbols installed with
the binaries. That should be trivial, since my precognitive abilities
caused me to upload debug symbols for the binaries. :-)

>I'll try to look into it.
>
Would the fact that the Subversion URL is mentioned in the .exe's and
.so's version info help?


-- 
Brane Čibej   <br...@xbc.nu>   http://www.xbc.nu/brane/


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: core dump with svn r6178

Posted by Kevin Pilch-Bisson <ke...@pilchie.homeip.net>.
On Thu, Jun 19, 2003 at 10:58:15PM -0400, Brandon Ehle wrote:
> 
> >
> >Different projects have different conventions for this.  In
> >Subversion, we seem to have decided that it's okay for code to seg
> >fault in "can't happen" situations, because if the can't-happen ever
> >*does* happen, it's a bug we must fix anyway :-).  A verbose error and
> >exit would only be marginally friendlier to the user -- it wouldn't
> >really get them any closer to a completed operation than the seg fault
> >does.
> > 
> >
> The only problem with a seg fault is that many users won't be running on 
> OS's that generate core dumps and don't have the time or resources to 
> track down an "Exception Occured at 0xABACADBA", but they could easily 
> cut and paste a verbose error message into an email pretty easily.
> 
Hmm, this conversation has made me wonder what collab.net needs to do in order
to get Watson reports for svn reported to them.  I know its possible, but I
don't know what the procedure is.  

I'll try to look into it.
-- 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Kevin Pilch-Bisson                    http://www.pilch-bisson.net
     "Historically speaking, the presences of wheels in Unix
     has never precluded their reinvention." - Larry Wall
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: core dump with svn r6178

Posted by Brandon Ehle <az...@yahoo.com>.
>
>Different projects have different conventions for this.  In
>Subversion, we seem to have decided that it's okay for code to seg
>fault in "can't happen" situations, because if the can't-happen ever
>*does* happen, it's a bug we must fix anyway :-).  A verbose error and
>exit would only be marginally friendlier to the user -- it wouldn't
>really get them any closer to a completed operation than the seg fault
>does.
>  
>
The only problem with a seg fault is that many users won't be running on 
OS's that generate core dumps and don't have the time or resources to 
track down an "Exception Occured at 0xABACADBA", but they could easily 
cut and paste a verbose error message into an email pretty easily.



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: core dump with svn r6178

Posted by kf...@collab.net.
Christian Daudt <cs...@daudt.org> writes:
> I agree completely - that's why I said my patch was only addressing the 
> consequence, not the cause . But the fact is that, regardless of the problem 
> that caused the code to call stringbuf_set with null (which remains to be 
> fixed) , svn_stringbuf_set shouldn't coredump because it received a value it 
> doesn't like. It should either set the string to empty (which is what my 
> patch did) or return an error and then get the callers to test for that 
> error.

Different projects have different conventions for this.  In
Subversion, we seem to have decided that it's okay for code to seg
fault in "can't happen" situations, because if the can't-happen ever
*does* happen, it's a bug we must fix anyway :-).  A verbose error and
exit would only be marginally friendlier to the user -- it wouldn't
really get them any closer to a completed operation than the seg fault
does.

-K

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: core dump with svn r6178

Posted by cm...@collab.net.
Christian Daudt <cs...@daudt.org> writes:

> But the fact is that, regardless of the problem that caused the code
> to call stringbuf_set with null (which remains to be fixed) ,
> svn_stringbuf_set shouldn't coredump because it received a value it
> doesn't like.

On the contrary, we specifically allow that function to segfault on
bad input because it helps us to find *real* bugs in the code.  If we
simply papered over the problem, we'd have a much harder time tracking
down, for example, the bug you're seeing.  And for what gain?

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: core dump with svn r6178

Posted by Christian Daudt <cs...@daudt.org>.
On Thursday 19 June 2003 10:54, cmpilato@collab.net wrote:
> Christian Daudt <cs...@daudt.org> writes:
> > Hi,
> >  I'm starting to roll out subversion for a number of people in my group
> > and I've run into many of them getting coredumps on checkouts (of course
> > I never got one...). Anyways, I've tracked it down to parsing this piece
> > of xml being received: <S:open-directory rev=\"-1\">
> >  And with that, start_element in fetch.c (in libsvn_ra_dav) is running
> > this piece of code:
> >           name = get_attr(atts, "name");
> >           /* ### verify we got it. punt on error. */
> >           svn_stringbuf_set(rb->namestr, name);
>
> While I appreciate the patch attempt (omitted for brevity), ultimately
> I think the problem should be solved a little more directly.  That is,
> we should figure out why we getting an open-directory with no "name"
> attribute.  The code obviously thinks this case shouldn't occur.
>

I agree completely - that's why I said my patch was only addressing the 
consequence, not the cause . But the fact is that, regardless of the problem 
that caused the code to call stringbuf_set with null (which remains to be 
fixed) , svn_stringbuf_set shouldn't coredump because it received a value it 
doesn't like. It should either set the string to empty (which is what my 
patch did) or return an error and then get the callers to test for that 
error.

 csd

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org


Re: core dump with svn r6178

Posted by cm...@collab.net.
Christian Daudt <cs...@daudt.org> writes:

> Hi,
>  I'm starting to roll out subversion for a number of people in my group and 
> I've run into many of them getting coredumps on checkouts (of course I never 
> got one...). Anyways, I've tracked it down to parsing this piece of xml being 
> received: <S:open-directory rev=\"-1\">
>  And with that, start_element in fetch.c (in libsvn_ra_dav) is running this 
> piece of code:
>           name = get_attr(atts, "name");
>           /* ### verify we got it. punt on error. */
>           svn_stringbuf_set(rb->namestr, name);

While I appreciate the patch attempt (omitted for brevity), ultimately
I think the problem should be solved a little more directly.  That is,
we should figure out why we getting an open-directory with no "name"
attribute.  The code obviously thinks this case shouldn't occur.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org