You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@subversion.apache.org by pe...@apache.org on 2010/12/05 10:16:53 UTC
svn commit: r1042294 - /subversion/trunk/subversion/libsvn_subr/checksum.c
Author: peters
Date: Sun Dec 5 09:15:42 2010
New Revision: 1042294
URL: http://svn.apache.org/viewvc?rev=1042294&view=rev
Log:
Optimize and simplify svn_checksum_parse_hex(). In my tests, the new
version is 20% to 30% faster. The EBCDIC port would probably need an
#ifdef here.
Also happens to fix the case of input with uppercase A-F characters.
We neither parsed them correctly nor rejected them as invalid.
Now we accept _and_ parse them correctly.
* subversion/libsvn_subr/checksum.c
(svn_checksum_parse_hex): Rewrite to do a single table lookup to find
the validity and value of a hex byte, rather than two lookups
(svn_ctype_isxdigit, svn_ctype_isalpha) plus some ASCII math.
Also, don't bother to qualify char as unsigned except where actually
needed, and rename is_zeros to is_nonzero to match its meaning.
Modified:
subversion/trunk/subversion/libsvn_subr/checksum.c
Modified: subversion/trunk/subversion/libsvn_subr/checksum.c
URL: http://svn.apache.org/viewvc/subversion/trunk/subversion/libsvn_subr/checksum.c?rev=1042294&r1=1042293&r2=1042294&view=diff
==============================================================================
--- subversion/trunk/subversion/libsvn_subr/checksum.c (original)
+++ subversion/trunk/subversion/libsvn_subr/checksum.c Sun Dec 5 09:15:42 2010
@@ -193,9 +193,27 @@ svn_checksum_parse_hex(svn_checksum_t **
const char *hex,
apr_pool_t *pool)
{
- int len;
- int i;
- unsigned char is_zeros = '\0';
+ int i, len;
+ char is_nonzero = '\0';
+ static const char xdigitval[256] =
+ {
+ -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,
+ -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,
+ -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,
+ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9,-1,-1,-1,-1,-1,-1, /* 0-9 */
+ -1,10,11,12,13,14,15,-1,-1,-1,-1,-1,-1,-1,-1,-1, /* A-F */
+ -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,
+ -1,10,11,12,13,14,15,-1,-1,-1,-1,-1,-1,-1,-1,-1, /* a-f */
+ -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,
+ -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,
+ -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,
+ -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,
+ -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,
+ -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,
+ -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,
+ -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,
+ -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,
+ };
if (hex == NULL)
{
@@ -210,19 +228,15 @@ svn_checksum_parse_hex(svn_checksum_t **
for (i = 0; i < len; i++)
{
- if ((! svn_ctype_isxdigit(hex[i * 2])) ||
- (! svn_ctype_isxdigit(hex[i * 2 + 1])))
+ char x1 = xdigitval[(unsigned char)hex[i * 2]];
+ char x2 = xdigitval[(unsigned char)hex[i * 2 + 1]];
+ if (x1 == (char)-1 || x2 == (char)-1)
return svn_error_create(SVN_ERR_BAD_CHECKSUM_PARSE, NULL, NULL);
- ((unsigned char *)(*checksum)->digest)[i] =
- ((svn_ctype_isalpha(hex[i*2]) ? hex[i*2] - 'a' + 10
- : hex[i*2] - '0') << 4) |
- (svn_ctype_isalpha(hex[i*2+1]) ? hex[i*2+1] - 'a' + 10
- : hex[i*2+1] - '0');
- is_zeros |= (*checksum)->digest[i];
+ is_nonzero |= ((char *)(*checksum)->digest)[i] = x1 << 4 | x2;
}
- if (is_zeros == '\0')
+ if (!is_nonzero)
*checksum = NULL;
return SVN_NO_ERROR;
Re: svn commit: r1042294 -
/subversion/trunk/subversion/libsvn_subr/checksum.c
Posted by Stefan Sperling <st...@elego.de>.
On Sun, Dec 05, 2010 at 05:14:20AM -0600, Peter Samuelson wrote:
>
> [Stefan Sperling]
> > > - is_zeros |= (*checksum)->digest[i];
> > > + is_nonzero |= ((char *)(*checksum)->digest)[i] = x1 << 4 | x2;
> >
> > At the very least, this needs parenthesis unless you want everyone
> > to pull out their copy of K&R to check operator precedence rules.
>
> Can do. x1 << 4 | x2 didn't seem too ambiguous to me, but I can see
> how it might be.
>
> > Can we do one assignment per line instead? Maybe that's easier to parse.
>
> Hmmm.
>
> ((char *)(*checksum)->digest)[i] = x1 << 4 | x2;
> is_nonzero |= ((char *)(*checksum)->digest)[i];
>
> ...The repeated expression is complex, so you have to visually verify
> that it is indeed identical. That seems harder, to me. Or did you
> mean:
>
> is_nonzero |=
> ((char *)(*checksum)->digest)[i] = x1 << 4 | x2;
>
> ...Which doesn't seem much easier to parse either.
What about something like this (pseudo code):
x = x1 << 4 | x2;
((char *)(*checksum)->digest)[i] = x;
is_nonzero |= x;
Stefan
Re: svn commit: r1042294 -
/subversion/trunk/subversion/libsvn_subr/checksum.c
Posted by Peter Samuelson <pe...@p12n.org>.
[Daniel Shahaf]
> char *digest = (char *) checksum->digest;
> for (i = 0; ...)
> {
> is_nonzero |= (x1 | x2);
> digest[i] = (x1 << 4) | x2;
> }
r1042319. I left both lines as (x1 << 4) | x2, even though is_nonzero
doesn't need the << 4, so that the compiler will only calculate a
single expression.
Re: svn commit: r1042294 -
/subversion/trunk/subversion/libsvn_subr/checksum.c
Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
Peter Samuelson wrote on Sun, Dec 05, 2010 at 05:14:20 -0600:
>
> [Stefan Sperling]
> > > - is_zeros |= (*checksum)->digest[i];
> > > + is_nonzero |= ((char *)(*checksum)->digest)[i] = x1 << 4 | x2;
> >
> > At the very least, this needs parenthesis unless you want everyone
> > to pull out their copy of K&R to check operator precedence rules.
>
> Can do. x1 << 4 | x2 didn't seem too ambiguous to me, but I can see
> how it might be.
>
> > Can we do one assignment per line instead? Maybe that's easier to parse.
>
> Hmmm.
>
> ((char *)(*checksum)->digest)[i] = x1 << 4 | x2;
> is_nonzero |= ((char *)(*checksum)->digest)[i];
>
> ...The repeated expression is complex, so you have to visually verify
> that it is indeed identical. That seems harder, to me. Or did you
> mean:
>
> is_nonzero |=
> ((char *)(*checksum)->digest)[i] = x1 << 4 | x2;
>
char *digest = (char *) checksum->digest;
for (i = 0; ...)
{
is_nonzero |= (x1 | x2);
digest[i] = (x1 << 4) | x2;
}
> ...Which doesn't seem much easier to parse either.
>
> Peter
Re: svn commit: r1042294 -
/subversion/trunk/subversion/libsvn_subr/checksum.c
Posted by Peter Samuelson <pe...@p12n.org>.
[Stefan Sperling]
> > - is_zeros |= (*checksum)->digest[i];
> > + is_nonzero |= ((char *)(*checksum)->digest)[i] = x1 << 4 | x2;
>
> At the very least, this needs parenthesis unless you want everyone
> to pull out their copy of K&R to check operator precedence rules.
Can do. x1 << 4 | x2 didn't seem too ambiguous to me, but I can see
how it might be.
> Can we do one assignment per line instead? Maybe that's easier to parse.
Hmmm.
((char *)(*checksum)->digest)[i] = x1 << 4 | x2;
is_nonzero |= ((char *)(*checksum)->digest)[i];
...The repeated expression is complex, so you have to visually verify
that it is indeed identical. That seems harder, to me. Or did you
mean:
is_nonzero |=
((char *)(*checksum)->digest)[i] = x1 << 4 | x2;
...Which doesn't seem much easier to parse either.
Peter
Re: svn commit: r1042294 -
/subversion/trunk/subversion/libsvn_subr/checksum.c
Posted by Stefan Sperling <st...@elego.de>.
On Sun, Dec 05, 2010 at 09:16:53AM -0000, peters@apache.org wrote:
> Author: peters
> Date: Sun Dec 5 09:15:42 2010
> New Revision: 1042294
>
> URL: http://svn.apache.org/viewvc?rev=1042294&view=rev
> Log:
> Optimize and simplify svn_checksum_parse_hex(). In my tests, the new
> version is 20% to 30% faster. The EBCDIC port would probably need an
> #ifdef here.
>
> Also happens to fix the case of input with uppercase A-F characters.
> We neither parsed them correctly nor rejected them as invalid.
> Now we accept _and_ parse them correctly.
>
> * subversion/libsvn_subr/checksum.c
> (svn_checksum_parse_hex): Rewrite to do a single table lookup to find
> the validity and value of a hex byte, rather than two lookups
> (svn_ctype_isxdigit, svn_ctype_isalpha) plus some ASCII math.
> Also, don't bother to qualify char as unsigned except where actually
> needed, and rename is_zeros to is_nonzero to match its meaning.
>
> Modified:
> subversion/trunk/subversion/libsvn_subr/checksum.c
> - is_zeros |= (*checksum)->digest[i];
> + is_nonzero |= ((char *)(*checksum)->digest)[i] = x1 << 4 | x2;
At the very least, this needs parenthesis unless you want everyone
to pull out their copy of K&R to check operator precedence rules.
Can we do one assignment per line instead? Maybe that's easier to parse.
Stefan