You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@subversion.apache.org by Daniel Berlin <db...@dberlin.org> on 2005/02/18 22:49:59 UTC

[PATCH]: Fix off-by-one error in xdelta

If we had a source_len of 1, and a target_len of >1, and source didn't
match target, we would insert copy operations anyway (even though it was
invalid to do so), due to a check that was < instead of <= 
(we checked apos + alen < asize, which fails when apos == 0, alen == 1,
and asize == 1).

You only get matches this small when source length % blocksize ==
something small, but >0.

Rather than try to keep that complicated logic in there, which was
basically trying to say "if the copy size is too small, it's not
profitable to use a copy operation", i've simply made it always insert
copies when it finds a match, regardless of match size.
This both fixes the bug, and simplifies the logic.

We can always add some more complex logic back later to decide whether
to insert copy if we find this doesn't work as well as we want in some
way.  It's highly unlikely it makes any difference in practice. :)

I also converted the apr_uint32_t's to apr_size_t to match what
svn_txdelta_op actually uses.

I'll commit this in a few hours, giving someone time to pick nits if
they want :)

(Changelog included in patch)
--Dan




Re: Boolean tests etc. [was: [PATCH]: Fix off-by-one error in xdelta]

Posted by Daniel Berlin <db...@dberlin.org>.
>> 
>> I'll edit it
>
> Thanks.

I've edited both the file, and the log message, and committed the changed 
file, with an approriate log message (including a summary).


>> of habit.  Especially since svn_boolean_type isn't a real boolean type, 
>
> (I agree it isn't.  You have to move to C'99 to get a real Boolean type in 
> C.)
>> you'd want to know if it accidently took on some non-TRUE/non-FALSE 
>
> Huh?  Unless you write "if (b == TRUE) ...; else if (b == FALSE) ...; else 
> error();" how would you know if it accidentally took on some 
> non-TRUE/non-FALSE  value?

Either that or

if (b == TRUE); else ... assert (b == FALSE);

Anything other than this or the formulation you wrote would silently 
error.

I'll also note that you are more likely to catch semi-uninitialized (IE 
along one path) use of a variable that the compiler doesn't notice if you 
assert you are getting the values you expect.
In fact, that's the other reason it was written the way it was. The 
compiler didn't notice that, along the way to fixing it, i had accidently 
returned an uninitialized value into match along one path in 
find_match.[1]  The assert caught it though.

GCC's code, in most places, has a "real" boolean type accessible that the 
compiler will bitch about if you mess around with in bad ways, so we don't 
have to do this except in a small number of places. This only happened in 
the past year or so though, and old habits die hard :).


Anyway, let me know if there is anything else you want me to remember to 
do in the future that isn't in the HACKING file or obvious from most other 
code. (or, add it to HACKING :P)

[1]  I personally know the limitations of gcc and xlc's uninitialized 
variable analysis. It's a war with users you can
never win. The better you are at detection, the more they expect you to
not miss other cases, even though you can't possibly detect every 
"possibly uninitialized" variable without whole-program analysis.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Boolean tests etc. [was: [PATCH]: Fix off-by-one error in xdelta]

Posted by Julian Foad <ju...@btopenworld.com>.
Daniel Berlin wrote:
> On Fri, 18 Feb 2005, Julian Foad wrote:
>> Of course you were intending to start the log message with a summary, 
>> and give it some indentation :-)
> 
> I'll edit it

Thanks.

>> Please may I beg you not to compare Boolean values with TRUE or FALSE, 
>> but just write "if (match)" or in this case "if (! match)".
> 
> GCC tends to require the exact opposite in almost all cases, so it's out 

(Meaning "The GCC developers require ... in their source code".)

> of habit.  Especially since svn_boolean_type isn't a real boolean type, 

(I agree it isn't.  You have to move to C'99 to get a real Boolean type in C.)

> you'd want to know if it accidently took on some non-TRUE/non-FALSE 

Huh?  Unless you write "if (b == TRUE) ...; else if (b == FALSE) ...; else 
error();" how would you know if it accidentally took on some non-TRUE/non-FALSE 
  value?  Also, values other than 1 are commonly generated to mean TRUE by 
various functions (typically those that use bitwise AND, e.g. isdigit), because 
"non-zero means true" is a fundamental part of the C language.  (For this 
reason, comparison with FALSE is safe, whereas comparison with TRUE is dangerous.)

> value (I was raised in the "if you simulate boolean with int or char, 
> make sure you compare it against true or false" school :P)

I can't blame you for doing what you've been taught, but I'd like to read about 
the rationale for that school of thought, as I'd like to understand it.

> However, i'll remember not to do it in the future.  It certainly seems 
> odd to me.

Thanks.

>> Oh, I see you've just committed it.  (That wasn't even one hour, was 
>> it?  :-) 
> 
> I sent this email roughly 4 hours beforehand.  It's possible my mail 
> didn't make it to my mail server until i reconnected, but it didn't 
> appear to be sitting in my outbox

Understand that it really isn't a problem; I'm only continuing this discussion 
for amusement and interest ... so, this is from your original post:

> Received: (qmail 17101 invoked from network); 18 Feb 2005 22:50:39 -0000
> From: Daniel Berlin <db...@dberlin.org>
> To: dev@subversion.tigris.org
> Content-Type: multipart/mixed; boundary="=-zkvFo21q4ftSaNEJ9a5n"
> Date: Fri, 18 Feb 2005 17:49:59 -0500

... showing that your mail client dated the message just a few minutes before 
it reached the list.  It must have been sitting in some sort of local "out box" 
until you reconnected.

- Julian

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: [PATCH]: Fix off-by-one error in xdelta

Posted by Daniel Berlin <db...@dberlin.org>.
On Fri, 18 Feb 2005, Julian Foad wrote:

> Daniel Berlin wrote:
>> I'll commit this in a few hours, giving someone time to pick nits if
>> they want :)
>
> Oh, me, me, please let it be me!
>
>> ------------------------------------------------------------------------
>> 
>> * subversion/libsvn_delta/xdelta.c (init_matches_table): use apr_size_t 
>> instead
>> of apr_uint32_t to match what svn_txdelta_op does.
>> (find_match): Return an svn_boolean_t now.
>> (compute_delta): Use apr_size_t here too. Also use
>> return value of find_match, instead of size of match, to determine
>> whether to copy or not.
>
> Of course you were intending to start the log message with a summary, and 
> give it some indentation :-)

I'll edit it

>
>
>> Index: subversion/libsvn_delta/xdelta.c
>> ===================================================================
>> --- subversion/libsvn_delta/xdelta.c	(revision 13030)
>> +++ subversion/libsvn_delta/xdelta.c	(working copy)
> [...]
>>        /* If we didn't find a real match, insert the byte at the target
>>           position into the pending insert.  */
>> -      if (alen < MATCH_BLOCKSIZE &&
>> -          (apos + alen < asize))
>> +      if (match == FALSE)
>
> Please may I beg you not to compare Boolean values with TRUE or FALSE, but 
> just write "if (match)" or in this case "if (! match)".

GCC tends to require the exact opposite in almost all cases, so it's out 
of habit.  Especially since svn_boolean_type isn't a real boolean type, 
you'd want to know if it accidently took on some non-TRUE/non-FALSE value 
(I was raised in the "if you simulate boolean with int or char, make sure 
you compare it against true or false" school :P)
However, i'll remember not to do it in the future.  It certainly seems odd 
to me.

>
>
> Oh, I see you've just committed it.  (That wasn't even one hour, was it?  :-) 
I sent this email roughly 4 hours beforehand.  It's possible my mail 
didn't make it to my mail server until i reconnected, but it didn't appear 
to be sitting in my outbox
  :)

> Did you know you can edit your log message with "svn propedit --revprop 
> -r13058 svn:log"?

I'll edit it


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: [PATCH]: Fix off-by-one error in xdelta

Posted by Julian Foad <ju...@btopenworld.com>.
Daniel Berlin wrote:
> I'll commit this in a few hours, giving someone time to pick nits if
> they want :)

Oh, me, me, please let it be me!

> ------------------------------------------------------------------------
> 
> * subversion/libsvn_delta/xdelta.c (init_matches_table): use apr_size_t instead
> of apr_uint32_t to match what svn_txdelta_op does.
> (find_match): Return an svn_boolean_t now.
> (compute_delta): Use apr_size_t here too. Also use
> return value of find_match, instead of size of match, to determine
> whether to copy or not.

Of course you were intending to start the log message with a summary, and give 
it some indentation :-)


> Index: subversion/libsvn_delta/xdelta.c
> ===================================================================
> --- subversion/libsvn_delta/xdelta.c	(revision 13030)
> +++ subversion/libsvn_delta/xdelta.c	(working copy)
[...]
>        /* If we didn't find a real match, insert the byte at the target
>           position into the pending insert.  */
> -      if (alen < MATCH_BLOCKSIZE &&
> -          (apos + alen < asize))
> +      if (match == FALSE)

Please may I beg you not to compare Boolean values with TRUE or FALSE, but just 
write "if (match)" or in this case "if (! match)".

>          {
> +
>            if (pending_insert != NULL)
>              svn_stringbuf_appendbytes (pending_insert, b + lo, 1);
>            else
> @@ -285,6 +288,12 @@
>          }
>        else
>          {
> +          /* The only legal way to end up here is to find a match.  If we
> +             didn't find a match, we are going to generate a copy instruction
> +             when we should have generated an insert, so something about the
> +             condition above, or what the match routine did, is wrong.  */
> +          assert (match == TRUE);

and here

> +
>            if (pending_insert)
>              {
>                svn_txdelta__insert_op (build_baton, svn_txdelta_new,
> 

Oh, I see you've just committed it.  (That wasn't even one hour, was it?  :-) 
Did you know you can edit your log message with "svn propedit --revprop -r13058 
svn:log"?

Thanks,

- Julian

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org