You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@subversion.apache.org by Stefan Sperling <st...@elego.de> on 2010/12/05 10:42:29 UTC

Re: svn commit: r1042294 - /subversion/trunk/subversion/libsvn_subr/checksum.c

On Sun, Dec 05, 2010 at 09:16:53AM -0000, peters@apache.org wrote:
> Author: peters
> Date: Sun Dec  5 09:15:42 2010
> New Revision: 1042294
> 
> URL: http://svn.apache.org/viewvc?rev=1042294&view=rev
> Log:
> Optimize and simplify svn_checksum_parse_hex().  In my tests, the new
> version is 20% to 30% faster.  The EBCDIC port would probably need an
> #ifdef here.
> 
> Also happens to fix the case of input with uppercase A-F characters.
> We neither parsed them correctly nor rejected them as invalid.
> Now we accept _and_ parse them correctly.
> 
> * subversion/libsvn_subr/checksum.c
>   (svn_checksum_parse_hex): Rewrite to do a single table lookup to find
>    the validity and value of a hex byte, rather than two lookups
>    (svn_ctype_isxdigit, svn_ctype_isalpha) plus some ASCII math.
>    Also, don't bother to qualify char as unsigned except where actually
>    needed, and rename is_zeros to is_nonzero to match its meaning.
> 
> Modified:
>     subversion/trunk/subversion/libsvn_subr/checksum.c
 
> -      is_zeros |= (*checksum)->digest[i];
> +      is_nonzero |= ((char *)(*checksum)->digest)[i] = x1 << 4 | x2;

At the very least, this needs parenthesis unless you want everyone
to pull out their copy of K&R to check operator precedence rules.

Can we do one assignment per line instead? Maybe that's easier to parse.

Stefan

Re: svn commit: r1042294 - /subversion/trunk/subversion/libsvn_subr/checksum.c

Posted by Stefan Sperling <st...@elego.de>.
On Sun, Dec 05, 2010 at 05:14:20AM -0600, Peter Samuelson wrote:
> 
> [Stefan Sperling]
> > > -      is_zeros |= (*checksum)->digest[i];
> > > +      is_nonzero |= ((char *)(*checksum)->digest)[i] = x1 << 4 | x2;
> > 
> > At the very least, this needs parenthesis unless you want everyone
> > to pull out their copy of K&R to check operator precedence rules.
> 
> Can do.  x1 << 4 | x2 didn't seem too ambiguous to me, but I can see
> how it might be.
> 
> > Can we do one assignment per line instead? Maybe that's easier to parse.
> 
> Hmmm.
> 
>       ((char *)(*checksum)->digest)[i] = x1 << 4 | x2;
>       is_nonzero |= ((char *)(*checksum)->digest)[i];
> 
> ...The repeated expression is complex, so you have to visually verify
> that it is indeed identical.  That seems harder, to me.  Or did you
> mean:
> 
>       is_nonzero |= 
>         ((char *)(*checksum)->digest)[i] = x1 << 4 | x2;
> 
> ...Which doesn't seem much easier to parse either.

What about something like this (pseudo code):

      x = x1 << 4 | x2;
      ((char *)(*checksum)->digest)[i] = x;
      is_nonzero |= x;

Stefan

Re: svn commit: r1042294 - /subversion/trunk/subversion/libsvn_subr/checksum.c

Posted by Peter Samuelson <pe...@p12n.org>.
[Daniel Shahaf]
>     char *digest = (char *) checksum->digest;
>     for (i = 0; ...)
>       {
>       	is_nonzero |= (x1 | x2);
>       	digest[i] = (x1 << 4) | x2;
>       }

r1042319.  I left both lines as (x1 << 4) | x2, even though is_nonzero
doesn't need the << 4, so that the compiler will only calculate a
single expression.

Re: svn commit: r1042294 - /subversion/trunk/subversion/libsvn_subr/checksum.c

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
Peter Samuelson wrote on Sun, Dec 05, 2010 at 05:14:20 -0600:
> 
> [Stefan Sperling]
> > > -      is_zeros |= (*checksum)->digest[i];
> > > +      is_nonzero |= ((char *)(*checksum)->digest)[i] = x1 << 4 | x2;
> > 
> > At the very least, this needs parenthesis unless you want everyone
> > to pull out their copy of K&R to check operator precedence rules.
> 
> Can do.  x1 << 4 | x2 didn't seem too ambiguous to me, but I can see
> how it might be.
> 
> > Can we do one assignment per line instead? Maybe that's easier to parse.
> 
> Hmmm.
> 
>       ((char *)(*checksum)->digest)[i] = x1 << 4 | x2;
>       is_nonzero |= ((char *)(*checksum)->digest)[i];
> 
> ...The repeated expression is complex, so you have to visually verify
> that it is indeed identical.  That seems harder, to me.  Or did you
> mean:
> 
>       is_nonzero |= 
>         ((char *)(*checksum)->digest)[i] = x1 << 4 | x2;
> 

    char *digest = (char *) checksum->digest;
    for (i = 0; ...)
      {
      	is_nonzero |= (x1 | x2);
      	digest[i] = (x1 << 4) | x2;
      }

> ...Which doesn't seem much easier to parse either.
> 
> Peter

Re: svn commit: r1042294 - /subversion/trunk/subversion/libsvn_subr/checksum.c

Posted by Peter Samuelson <pe...@p12n.org>.
[Stefan Sperling]
> > -      is_zeros |= (*checksum)->digest[i];
> > +      is_nonzero |= ((char *)(*checksum)->digest)[i] = x1 << 4 | x2;
> 
> At the very least, this needs parenthesis unless you want everyone
> to pull out their copy of K&R to check operator precedence rules.

Can do.  x1 << 4 | x2 didn't seem too ambiguous to me, but I can see
how it might be.

> Can we do one assignment per line instead? Maybe that's easier to parse.

Hmmm.

      ((char *)(*checksum)->digest)[i] = x1 << 4 | x2;
      is_nonzero |= ((char *)(*checksum)->digest)[i];

...The repeated expression is complex, so you have to visually verify
that it is indeed identical.  That seems harder, to me.  Or did you
mean:

      is_nonzero |= 
        ((char *)(*checksum)->digest)[i] = x1 << 4 | x2;

...Which doesn't seem much easier to parse either.

Peter