Posted to commits@subversion.apache.org by st...@apache.org on 2012/09/22 17:14:25 UTC
svn commit: r1388816 - in /subversion/branches/10Gb/subversion:
include/private/svn_pseudo_md5.h libsvn_subr/cache-membuffer.c
libsvn_subr/pseudo_md5.c tests/libsvn_subr/checksum-test.c
Author: stefan2
Date: Sat Sep 22 15:14:25 2012
New Revision: 1388816
URL: http://svn.apache.org/viewvc?rev=1388816&view=rev
Log:
On the 10Gb branch: Introduce MD5-based hash functions optimized
for short input lengths. Use these to speed up membuffer access.
* subversion/include/private/svn_pseudo_md5.h: new header
(svn__pseudo_md5_15,
svn__pseudo_md5_31,
svn__pseudo_md5_63): declare new private API
* subversion/libsvn_subr/pseudo_md5.c: new source
(svn__pseudo_md5_15,
svn__pseudo_md5_31,
svn__pseudo_md5_63): implement new private API
* subversion/libsvn_subr/cache-membuffer.c
(combine_key): use the new hash API
* subversion/tests/libsvn_subr/checksum-test.c
(test_pseudo_md5): new test case
(test_funcs): register it
Added:
subversion/branches/10Gb/subversion/include/private/svn_pseudo_md5.h
subversion/branches/10Gb/subversion/libsvn_subr/pseudo_md5.c
Modified:
subversion/branches/10Gb/subversion/libsvn_subr/cache-membuffer.c
subversion/branches/10Gb/subversion/tests/libsvn_subr/checksum-test.c
Added: subversion/branches/10Gb/subversion/include/private/svn_pseudo_md5.h
URL: http://svn.apache.org/viewvc/subversion/branches/10Gb/subversion/include/private/svn_pseudo_md5.h?rev=1388816&view=auto
==============================================================================
--- subversion/branches/10Gb/subversion/include/private/svn_pseudo_md5.h (added)
+++ subversion/branches/10Gb/subversion/include/private/svn_pseudo_md5.h Sat Sep 22 15:14:25 2012
@@ -0,0 +1,83 @@
+/**
+ * @copyright
+ * ====================================================================
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ * ====================================================================
+ * @endcopyright
+ *
+ * @file svn_pseudo_md5.h
+ * @brief Subversion hash sum calculation for runtime data (only)
+ */
+
+#ifndef SVN_PSEUDO_MD5_H
+#define SVN_PSEUDO_MD5_H
+
+#include <apr.h> /* for apr_uint32_t */
+
+#ifdef __cplusplus
+extern "C" {
+#endif /* __cplusplus */
+
+
+/**
+ * Calculates a hash sum for 15 bytes in @a x and returns it in @a digest.
+ * The most significant byte in @a x must be 0 (independent of being on a
+ * little or big endian machine).
+ *
+ * @note Use for runtime data hashing only.
+ *
+ * @note The output is NOT an MD5 digest but shares the same basic
+ * cryptographic properties. Collisions with proper MD5 on the same
+ * or other input data are as unlikely as any MD5 collision.
+ */
+void svn__pseudo_md5_15(apr_uint32_t digest[4],
+ const apr_uint32_t x[4]);
+
+/**
+ * Calculates a hash sum for 31 bytes in @a x and returns it in @a digest.
+ * The most significant byte in @a x must be 0 (independent of being on a
+ * little or big endian machine).
+ *
+ * @note Use for runtime data hashing only.
+ *
+ * @note The output is NOT an MD5 digest but shares the same basic
+ * cryptographic properties. Collisions with proper MD5 on the same
+ * or other input data are as unlikely as any MD5 collision.
+ */
+void svn__pseudo_md5_31(apr_uint32_t digest[4],
+ const apr_uint32_t x[8]);
+
+/**
+ * Calculates a hash sum for 63 bytes in @a x and returns it in @a digest.
+ * The most significant byte in @a x must be 0 (independent of being on a
+ * little or big endian machine).
+ *
+ * @note Use for runtime data hashing only.
+ *
+ * @note The output is NOT an MD5 digest but shares the same basic
+ * cryptographic properties. Collisions with proper MD5 on the same
+ * or other input data are as unlikely as any MD5 collision.
+ */
+void svn__pseudo_md5_63(apr_uint32_t digest[4],
+ const apr_uint32_t x[16]);
+
+#ifdef __cplusplus
+}
+#endif /* __cplusplus */
+
+#endif /* SVN_PSEUDO_MD5_H */
Modified: subversion/branches/10Gb/subversion/libsvn_subr/cache-membuffer.c
URL: http://svn.apache.org/viewvc/subversion/branches/10Gb/subversion/libsvn_subr/cache-membuffer.c?rev=1388816&r1=1388815&r2=1388816&view=diff
==============================================================================
--- subversion/branches/10Gb/subversion/libsvn_subr/cache-membuffer.c (original)
+++ subversion/branches/10Gb/subversion/libsvn_subr/cache-membuffer.c Sat Sep 22 15:14:25 2012
@@ -33,6 +33,7 @@
#include "svn_string.h"
#include "private/svn_dep_compat.h"
#include "private/svn_mutex.h"
+#include "private/svn_pseudo_md5.h"
/*
* This svn_cache__t implementation actually consists of two parts:
@@ -1713,7 +1714,31 @@ combine_key(svn_membuffer_cache_t *cache
if (key_len == APR_HASH_KEY_STRING)
key_len = strlen((const char *) key);
- apr_md5((unsigned char*)cache->combined_key, key, key_len);
+ if (key_len < 16)
+ {
+ apr_uint32_t data[4] = { 0 };
+ memcpy(data, key, key_len);
+
+ svn__pseudo_md5_15((apr_uint32_t *)cache->combined_key, data);
+ }
+ else if (key_len < 32)
+ {
+ apr_uint32_t data[8] = { 0 };
+ memcpy(data, key, key_len);
+
+ svn__pseudo_md5_31((apr_uint32_t *)cache->combined_key, data);
+ }
+ else if (key_len < 64)
+ {
+ apr_uint32_t data[16] = { 0 };
+ memcpy(data, key, key_len);
+
+ svn__pseudo_md5_63((apr_uint32_t *)cache->combined_key, data);
+ }
+ else
+ {
+ apr_md5((unsigned char*)cache->combined_key, key, key_len);
+ }
cache->combined_key[0] ^= cache->prefix[0];
cache->combined_key[1] ^= cache->prefix[1];
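
As an illustrative aside (not part of the commit): the size selection that combine_key performs above can be sketched in isolation. The helper names pick_block_words and pad_key are hypothetical, and plain uint32_t stands in for apr_uint32_t. The sketch is meant to show that, because key_len is strictly smaller than the chosen block, the zero-initialised tail always leaves the final byte of the block at 0, which is the precondition the svn__pseudo_md5_* functions document.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: number of 32-bit words in the block that a key of
 * key_len bytes would be padded to, or 0 when the key is too long and a
 * regular MD5 (apr_md5 in the commit) would be used instead. */
static size_t pick_block_words(size_t key_len)
{
  if (key_len < 16)
    return 4;
  if (key_len < 32)
    return 8;
  if (key_len < 64)
    return 16;
  return 0;
}

/* Hypothetical helper: zero-pad KEY into BLOCK, which has room for 16
 * words. Returns the number of words in use, or 0 if the key is too long
 * (in which case BLOCK is left all zero). */
static size_t pad_key(uint32_t block[16], const void *key, size_t key_len)
{
  size_t words = pick_block_words(key_len);

  memset(block, 0, 16 * sizeof(block[0]));
  if (words != 0)
    memcpy(block, key, key_len);

  return words;
}
```

A caller would pass the first `words` entries of the block to the matching hash function; keys of 64 bytes or more take the unchanged apr_md5 path.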
Added: subversion/branches/10Gb/subversion/libsvn_subr/pseudo_md5.c
URL: http://svn.apache.org/viewvc/subversion/branches/10Gb/subversion/libsvn_subr/pseudo_md5.c?rev=1388816&view=auto
==============================================================================
--- subversion/branches/10Gb/subversion/libsvn_subr/pseudo_md5.c (added)
+++ subversion/branches/10Gb/subversion/libsvn_subr/pseudo_md5.c Sat Sep 22 15:14:25 2012
@@ -0,0 +1,422 @@
+/*
+ * This work is derived from material Copyright RSA Data Security, Inc.
+ *
+ * The RSA copyright statement and Licence for that original material is
+ * included below. This is followed by the Apache copyright statement and
+ * licence for the modifications made to that material.
+ */
+
+/* MD5C.C - RSA Data Security, Inc., MD5 message-digest algorithm
+ */
+
+/* Copyright (C) 1991-2, RSA Data Security, Inc. Created 1991. All
+ rights reserved.
+
+ License to copy and use this software is granted provided that it
+ is identified as the "RSA Data Security, Inc. MD5 Message-Digest
+ Algorithm" in all material mentioning or referencing this software
+ or this function.
+
+ License is also granted to make and use derivative works provided
+ that such works are identified as "derived from the RSA Data
+ Security, Inc. MD5 Message-Digest Algorithm" in all material
+ mentioning or referencing the derived work.
+
+ RSA Data Security, Inc. makes no representations concerning either
+ the merchantability of this software or the suitability of this
+ software for any particular purpose. It is provided "as is"
+ without express or implied warranty of any kind.
+
+ These notices must be retained in any copies of any part of this
+ documentation and/or software.
+ */
+
+/* Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+/*
+ * The apr_md5_encode() routine uses much code obtained from the FreeBSD 3.0
+ * MD5 crypt() function, which is licenced as follows:
+ * ----------------------------------------------------------------------------
+ * "THE BEER-WARE LICENSE" (Revision 42):
+ * <ph...@login.dknet.dk> wrote this file. As long as you retain this notice you
+ * can do whatever you want with this stuff. If we meet some day, and you think
+ * this stuff is worth it, you can buy me a beer in return. Poul-Henning Kamp
+ * ----------------------------------------------------------------------------
+ */
+
+/*
+ * pseudo_md5.c: md5-esque hash sum calculation for short data blocks.
+ * Code taken and adapted from the APR (see licenses above).
+ */
+#include "svn_checksum.h"
+
+/* Constants for MD5 calculation.
+ */
+
+#define S11 7
+#define S12 12
+#define S13 17
+#define S14 22
+#define S21 5
+#define S22 9
+#define S23 14
+#define S24 20
+#define S31 4
+#define S32 11
+#define S33 16
+#define S34 23
+#define S41 6
+#define S42 10
+#define S43 15
+#define S44 21
+
+/* F, G, H and I are basic MD5 functions.
+ */
+#define F(x, y, z) (((x) & (y)) | ((~x) & (z)))
+#define G(x, y, z) (((x) & (z)) | ((y) & (~z)))
+#define H(x, y, z) ((x) ^ (y) ^ (z))
+#define I(x, y, z) ((y) ^ ((x) | (~z)))
+
+/* ROTATE_LEFT rotates x left n bits.
+ */
+#if defined(_MSC_VER) && _MSC_VER >= 1310
+#pragma intrinsic(_rotl)
+#define ROTATE_LEFT(x, n) (_rotl(x,n))
+#else
+#define ROTATE_LEFT(x, n) (((x) << (n)) | ((x) >> (32-(n))))
+#endif
+
+/* FF, GG, HH, and II transformations for rounds 1, 2, 3, and 4.
+ * Rotation is separate from addition to prevent recomputation.
+ */
+#define FF(a, b, c, d, x, s, ac) { \
+ (a) += F ((b), (c), (d)) + (x) + (apr_uint32_t)(ac); \
+ (a) = ROTATE_LEFT ((a), (s)); \
+ (a) += (b); \
+ }
+#define GG(a, b, c, d, x, s, ac) { \
+ (a) += G ((b), (c), (d)) + (x) + (apr_uint32_t)(ac); \
+ (a) = ROTATE_LEFT ((a), (s)); \
+ (a) += (b); \
+ }
+#define HH(a, b, c, d, x, s, ac) { \
+ (a) += H ((b), (c), (d)) + (x) + (apr_uint32_t)(ac); \
+ (a) = ROTATE_LEFT ((a), (s)); \
+ (a) += (b); \
+ }
+#define II(a, b, c, d, x, s, ac) { \
+ (a) += I ((b), (c), (d)) + (x) + (apr_uint32_t)(ac); \
+ (a) = ROTATE_LEFT ((a), (s)); \
+ (a) += (b); \
+ }
+
+/* The idea of the functions below is as follows:
+ *
+ * - The core MD5 algorithm does not assume that the "important" data
+ * is at the beginning of the encryption block, followed by e.g. 0.
+ * Instead, all bits are equally relevant.
+ *
+ * - If some bytes in the input are known to be 0, we may hard-code them.
+ * With the previous property, it is safe to move them to the upper end
+ * of the encryption block to maximize the number of steps that can be
+ * pre-calculated.
+ *
+ * - Variable-length streams will use the upper 8 bytes of the last
+ * encryption block to store the stream length in bits (to make 0, 00,
+ * 000, ... etc. produce different hash sums).
+ *
+ * - We will hash at most 63 bytes, i.e. 504 bits. In the standard stream
+ * implementation, the upper 6 bytes of the last encryption block would
+ * be 0. We will put at least one non-zero value in the last 4 bytes.
+ * Therefore, our input will always be different to a standard MD5 stream
+ * implementation in either block count, content or both.
+ *
+ * - Our length indicator also varies with the number of bytes in the input.
+ * Hence, different pseudo-MD5 input lengths produce different output
+ * (with "cryptographic probability") even if the content is all 0 or
+ * otherwise identical.
+ *
+ * - Collisions between pseudo-MD5 and pseudo-MD5 as well as pseudo-MD5
+ * and standard MD5 are as likely as any other MD5 collision.
+ */
+
+void svn__pseudo_md5_15(apr_uint32_t digest[4],
+ const apr_uint32_t x[4])
+{
+ apr_uint32_t a = 0x67452301;
+ apr_uint32_t b = 0xefcdab89;
+ apr_uint32_t c = 0x98badcfe;
+ apr_uint32_t d = 0x10325476;
+
+ /* make sure byte 63 gets the marker independently of BE / LE */
+ apr_uint32_t x3n = x[3] ^ 0xffffffff;
+
+ /* Round 1 */
+ FF(a, b, c, d, 0, S11, 0xd76aa478); /* 1 */
+ FF(d, a, b, c, 0, S12, 0xe8c7b756); /* 2 */
+ FF(c, d, a, b, 0, S13, 0x242070db); /* 3 */
+ FF(b, c, d, a, 0, S14, 0xc1bdceee); /* 4 */
+ FF(a, b, c, d, 0, S11, 0xf57c0faf); /* 5 */
+ FF(d, a, b, c, 0, S12, 0x4787c62a); /* 6 */
+ FF(c, d, a, b, 0, S13, 0xa8304613); /* 7 */
+ FF(b, c, d, a, 0, S14, 0xfd469501); /* 8 */
+ FF(a, b, c, d, 0, S11, 0x698098d8); /* 9 */
+ FF(d, a, b, c, 0, S12, 0x8b44f7af); /* 10 */
+ FF(c, d, a, b, 0, S13, 0xffff5bb1); /* 11 */
+ FF(b, c, d, a, 0, S14, 0x895cd7be); /* 12 */
+ FF(a, b, c, d, x[0], S11, 0x6b901122); /* 13 */
+ FF(d, a, b, c, x[1], S12, 0xfd987193); /* 14 */
+ FF(c, d, a, b, x[2], S13, 0xa679438e); /* 15 */
+ FF(b, c, d, a, x3n, S14, 0x49b40821); /* 16 */
+
+ /* Round 2 */
+ GG(a, b, c, d, 0, S21, 0xf61e2562); /* 17 */
+ GG(d, a, b, c, 0, S22, 0xc040b340); /* 18 */
+ GG(c, d, a, b, 0, S23, 0x265e5a51); /* 19 */
+ GG(b, c, d, a, 0, S24, 0xe9b6c7aa); /* 20 */
+ GG(a, b, c, d, 0, S21, 0xd62f105d); /* 21 */
+ GG(d, a, b, c, 0, S22, 0x2441453); /* 22 */
+ GG(c, d, a, b, x3n, S23, 0xd8a1e681); /* 23 */
+ GG(b, c, d, a, 0, S24, 0xe7d3fbc8); /* 24 */
+ GG(a, b, c, d, 0, S21, 0x21e1cde6); /* 25 */
+ GG(d, a, b, c, x[2], S22, 0xc33707d6); /* 26 */
+ GG(c, d, a, b, 0, S23, 0xf4d50d87); /* 27 */
+ GG(b, c, d, a, 0, S24, 0x455a14ed); /* 28 */
+ GG(a, b, c, d, x[1], S21, 0xa9e3e905); /* 29 */
+ GG(d, a, b, c, 0, S22, 0xfcefa3f8); /* 30 */
+ GG(c, d, a, b, 0, S23, 0x676f02d9); /* 31 */
+ GG(b, c, d, a, x[0], S24, 0x8d2a4c8a); /* 32 */
+
+ /* Round 3 */
+ HH(a, b, c, d, 0, S31, 0xfffa3942); /* 33 */
+ HH(d, a, b, c, 0, S32, 0x8771f681); /* 34 */
+ HH(c, d, a, b, 0, S33, 0x6d9d6122); /* 35 */
+ HH(b, c, d, a, x[2], S34, 0xfde5380c); /* 36 */
+ HH(a, b, c, d, 0, S31, 0xa4beea44); /* 37 */
+ HH(d, a, b, c, 0, S32, 0x4bdecfa9); /* 38 */
+ HH(c, d, a, b, 0, S33, 0xf6bb4b60); /* 39 */
+ HH(b, c, d, a, 0, S34, 0xbebfbc70); /* 40 */
+ HH(a, b, c, d, x[1], S31, 0x289b7ec6); /* 41 */
+ HH(d, a, b, c, 0, S32, 0xeaa127fa); /* 42 */
+ HH(c, d, a, b, 0, S33, 0xd4ef3085); /* 43 */
+ HH(b, c, d, a, 0, S34, 0x4881d05); /* 44 */
+ HH(a, b, c, d, 0, S31, 0xd9d4d039); /* 45 */
+ HH(d, a, b, c, x[0], S32, 0xe6db99e5); /* 46 */
+ HH(c, d, a, b, x3n, S33, 0x1fa27cf8); /* 47 */
+ HH(b, c, d, a, 0, S34, 0xc4ac5665); /* 48 */
+
+ /* Round 4 */
+ II(a, b, c, d, 0, S41, 0xf4292244); /* 49 */
+ II(d, a, b, c, 0, S42, 0x432aff97); /* 50 */
+ II(c, d, a, b, x[2], S43, 0xab9423a7); /* 51 */
+ II(b, c, d, a, 0, S44, 0xfc93a039); /* 52 */
+ II(a, b, c, d, x[0], S41, 0x655b59c3); /* 53 */
+ II(d, a, b, c, 0, S42, 0x8f0ccc92); /* 54 */
+ II(c, d, a, b, 0, S43, 0xffeff47d); /* 55 */
+ II(b, c, d, a, 0, S44, 0x85845dd1); /* 56 */
+ II(a, b, c, d, 0, S41, 0x6fa87e4f); /* 57 */
+ II(d, a, b, c, x3n, S42, 0xfe2ce6e0); /* 58 */
+ II(c, d, a, b, 0, S43, 0xa3014314); /* 59 */
+ II(b, c, d, a, x[1], S44, 0x4e0811a1); /* 60 */
+ II(a, b, c, d, 0, S41, 0xf7537e82); /* 61 */
+ II(d, a, b, c, 0, S42, 0xbd3af235); /* 62 */
+ II(c, d, a, b, 0, S43, 0x2ad7d2bb); /* 63 */
+ II(b, c, d, a, 0, S44, 0xeb86d391); /* 64 */
+
+ digest[0] = a;
+ digest[1] = b;
+ digest[2] = c;
+ digest[3] = d;
+}
+
+void svn__pseudo_md5_31(apr_uint32_t digest[4],
+ const apr_uint32_t x[8])
+{
+ apr_uint32_t a = 0x67452301;
+ apr_uint32_t b = 0xefcdab89;
+ apr_uint32_t c = 0x98badcfe;
+ apr_uint32_t d = 0x10325476;
+
+ /* make sure byte 63 gets the marker independently of BE / LE */
+ apr_uint32_t x7n = x[7] ^ 0xfefefefe;
+
+ /* Round 1 */
+ FF(a, b, c, d, 0, S11, 0xd76aa478); /* 1 */
+ FF(d, a, b, c, 0, S12, 0xe8c7b756); /* 2 */
+ FF(c, d, a, b, 0, S13, 0x242070db); /* 3 */
+ FF(b, c, d, a, 0, S14, 0xc1bdceee); /* 4 */
+ FF(a, b, c, d, 0, S11, 0xf57c0faf); /* 5 */
+ FF(d, a, b, c, 0, S12, 0x4787c62a); /* 6 */
+ FF(c, d, a, b, 0, S13, 0xa8304613); /* 7 */
+ FF(b, c, d, a, 0, S14, 0xfd469501); /* 8 */
+ FF(a, b, c, d, x[0], S11, 0x698098d8); /* 9 */
+ FF(d, a, b, c, x[1], S12, 0x8b44f7af); /* 10 */
+ FF(c, d, a, b, x[2], S13, 0xffff5bb1); /* 11 */
+ FF(b, c, d, a, x[3], S14, 0x895cd7be); /* 12 */
+ FF(a, b, c, d, x[4], S11, 0x6b901122); /* 13 */
+ FF(d, a, b, c, x[5], S12, 0xfd987193); /* 14 */
+ FF(c, d, a, b, x[6], S13, 0xa679438e); /* 15 */
+ FF(b, c, d, a, x7n, S14, 0x49b40821); /* 16 */
+
+ /* Round 2 */
+ GG(a, b, c, d, 0, S21, 0xf61e2562); /* 17 */
+ GG(d, a, b, c, 0, S22, 0xc040b340); /* 18 */
+ GG(c, d, a, b, x[3], S23, 0x265e5a51); /* 19 */
+ GG(b, c, d, a, 0, S24, 0xe9b6c7aa); /* 20 */
+ GG(a, b, c, d, 0, S21, 0xd62f105d); /* 21 */
+ GG(d, a, b, c, x[2], S22, 0x2441453); /* 22 */
+ GG(c, d, a, b, x7n, S23, 0xd8a1e681); /* 23 */
+ GG(b, c, d, a, 0, S24, 0xe7d3fbc8); /* 24 */
+ GG(a, b, c, d, x[1], S21, 0x21e1cde6); /* 25 */
+ GG(d, a, b, c, x[6], S22, 0xc33707d6); /* 26 */
+ GG(c, d, a, b, 0, S23, 0xf4d50d87); /* 27 */
+ GG(b, c, d, a, x[0], S24, 0x455a14ed); /* 28 */
+ GG(a, b, c, d, x[5], S21, 0xa9e3e905); /* 29 */
+ GG(d, a, b, c, 0, S22, 0xfcefa3f8); /* 30 */
+ GG(c, d, a, b, 0, S23, 0x676f02d9); /* 31 */
+ GG(b, c, d, a, x[4], S24, 0x8d2a4c8a); /* 32 */
+
+ /* Round 3 */
+ HH(a, b, c, d, 0, S31, 0xfffa3942); /* 33 */
+ HH(d, a, b, c, x[0], S32, 0x8771f681); /* 34 */
+ HH(c, d, a, b, x[3], S33, 0x6d9d6122); /* 35 */
+ HH(b, c, d, a, x[6], S34, 0xfde5380c); /* 36 */
+ HH(a, b, c, d, 0, S31, 0xa4beea44); /* 37 */
+ HH(d, a, b, c, 0, S32, 0x4bdecfa9); /* 38 */
+ HH(c, d, a, b, 0, S33, 0xf6bb4b60); /* 39 */
+ HH(b, c, d, a, x[2], S34, 0xbebfbc70); /* 40 */
+ HH(a, b, c, d, x[5], S31, 0x289b7ec6); /* 41 */
+ HH(d, a, b, c, 0, S32, 0xeaa127fa); /* 42 */
+ HH(c, d, a, b, 0, S33, 0xd4ef3085); /* 43 */
+ HH(b, c, d, a, 0, S34, 0x4881d05); /* 44 */
+ HH(a, b, c, d, x[1], S31, 0xd9d4d039); /* 45 */
+ HH(d, a, b, c, x[4], S32, 0xe6db99e5); /* 46 */
+ HH(c, d, a, b, x7n, S33, 0x1fa27cf8); /* 47 */
+ HH(b, c, d, a, 0, S34, 0xc4ac5665); /* 48 */
+
+ /* Round 4 */
+ II(a, b, c, d, 0, S41, 0xf4292244); /* 49 */
+ II(d, a, b, c, 0, S42, 0x432aff97); /* 50 */
+ II(c, d, a, b, x[6], S43, 0xab9423a7); /* 51 */
+ II(b, c, d, a, 0, S44, 0xfc93a039); /* 52 */
+ II(a, b, c, d, x[4], S41, 0x655b59c3); /* 53 */
+ II(d, a, b, c, 0, S42, 0x8f0ccc92); /* 54 */
+ II(c, d, a, b, x[2], S43, 0xffeff47d); /* 55 */
+ II(b, c, d, a, 0, S44, 0x85845dd1); /* 56 */
+ II(a, b, c, d, x[0], S41, 0x6fa87e4f); /* 57 */
+ II(d, a, b, c, x7n, S42, 0xfe2ce6e0); /* 58 */
+ II(c, d, a, b, 0, S43, 0xa3014314); /* 59 */
+ II(b, c, d, a, x[5], S44, 0x4e0811a1); /* 60 */
+ II(a, b, c, d, 0, S41, 0xf7537e82); /* 61 */
+ II(d, a, b, c, x[3], S42, 0xbd3af235); /* 62 */
+ II(c, d, a, b, 0, S43, 0x2ad7d2bb); /* 63 */
+ II(b, c, d, a, x[1], S44, 0xeb86d391); /* 64 */
+
+ digest[0] = a;
+ digest[1] = b;
+ digest[2] = c;
+ digest[3] = d;
+}
+
+void svn__pseudo_md5_63(apr_uint32_t digest[4],
+ const apr_uint32_t x[16])
+{
+ apr_uint32_t a = 0x67452301;
+ apr_uint32_t b = 0xefcdab89;
+ apr_uint32_t c = 0x98badcfe;
+ apr_uint32_t d = 0x10325476;
+
+ /* make sure byte 63 gets the marker independently of BE / LE */
+ apr_uint32_t x15n = x[15] ^ 0xfcfcfcfc;
+
+ /* Round 1 */
+ FF(a, b, c, d, x[0], S11, 0xd76aa478); /* 1 */
+ FF(d, a, b, c, x[1], S12, 0xe8c7b756); /* 2 */
+ FF(c, d, a, b, x[2], S13, 0x242070db); /* 3 */
+ FF(b, c, d, a, x[3], S14, 0xc1bdceee); /* 4 */
+ FF(a, b, c, d, x[4], S11, 0xf57c0faf); /* 5 */
+ FF(d, a, b, c, x[5], S12, 0x4787c62a); /* 6 */
+ FF(c, d, a, b, x[6], S13, 0xa8304613); /* 7 */
+ FF(b, c, d, a, x[7], S14, 0xfd469501); /* 8 */
+ FF(a, b, c, d, x[8], S11, 0x698098d8); /* 9 */
+ FF(d, a, b, c, x[9], S12, 0x8b44f7af); /* 10 */
+ FF(c, d, a, b, x[10], S13, 0xffff5bb1); /* 11 */
+ FF(b, c, d, a, x[11], S14, 0x895cd7be); /* 12 */
+ FF(a, b, c, d, x[12], S11, 0x6b901122); /* 13 */
+ FF(d, a, b, c, x[13], S12, 0xfd987193); /* 14 */
+ FF(c, d, a, b, x[14], S13, 0xa679438e); /* 15 */
+ FF(b, c, d, a, x15n, S14, 0x49b40821); /* 16 */
+
+ /* Round 2 */
+ GG(a, b, c, d, x[1], S21, 0xf61e2562); /* 17 */
+ GG(d, a, b, c, x[6], S22, 0xc040b340); /* 18 */
+ GG(c, d, a, b, x[11], S23, 0x265e5a51); /* 19 */
+ GG(b, c, d, a, x[0], S24, 0xe9b6c7aa); /* 20 */
+ GG(a, b, c, d, x[5], S21, 0xd62f105d); /* 21 */
+ GG(d, a, b, c, x[10], S22, 0x2441453); /* 22 */
+ GG(c, d, a, b, x15n, S23, 0xd8a1e681); /* 23 */
+ GG(b, c, d, a, x[4], S24, 0xe7d3fbc8); /* 24 */
+ GG(a, b, c, d, x[9], S21, 0x21e1cde6); /* 25 */
+ GG(d, a, b, c, x[14], S22, 0xc33707d6); /* 26 */
+ GG(c, d, a, b, x[3], S23, 0xf4d50d87); /* 27 */
+ GG(b, c, d, a, x[8], S24, 0x455a14ed); /* 28 */
+ GG(a, b, c, d, x[13], S21, 0xa9e3e905); /* 29 */
+ GG(d, a, b, c, x[2], S22, 0xfcefa3f8); /* 30 */
+ GG(c, d, a, b, x[7], S23, 0x676f02d9); /* 31 */
+ GG(b, c, d, a, x[12], S24, 0x8d2a4c8a); /* 32 */
+
+ /* Round 3 */
+ HH(a, b, c, d, x[5], S31, 0xfffa3942); /* 33 */
+ HH(d, a, b, c, x[8], S32, 0x8771f681); /* 34 */
+ HH(c, d, a, b, x[11], S33, 0x6d9d6122); /* 35 */
+ HH(b, c, d, a, x[14], S34, 0xfde5380c); /* 36 */
+ HH(a, b, c, d, x[1], S31, 0xa4beea44); /* 37 */
+ HH(d, a, b, c, x[4], S32, 0x4bdecfa9); /* 38 */
+ HH(c, d, a, b, x[7], S33, 0xf6bb4b60); /* 39 */
+ HH(b, c, d, a, x[10], S34, 0xbebfbc70); /* 40 */
+ HH(a, b, c, d, x[13], S31, 0x289b7ec6); /* 41 */
+ HH(d, a, b, c, x[0], S32, 0xeaa127fa); /* 42 */
+ HH(c, d, a, b, x[3], S33, 0xd4ef3085); /* 43 */
+ HH(b, c, d, a, x[6], S34, 0x4881d05); /* 44 */
+ HH(a, b, c, d, x[9], S31, 0xd9d4d039); /* 45 */
+ HH(d, a, b, c, x[12], S32, 0xe6db99e5); /* 46 */
+ HH(c, d, a, b, x15n, S33, 0x1fa27cf8); /* 47 */
+ HH(b, c, d, a, x[2], S34, 0xc4ac5665); /* 48 */
+
+ /* Round 4 */
+ II(a, b, c, d, x[0], S41, 0xf4292244); /* 49 */
+ II(d, a, b, c, x[7], S42, 0x432aff97); /* 50 */
+ II(c, d, a, b, x[14], S43, 0xab9423a7); /* 51 */
+ II(b, c, d, a, x[5], S44, 0xfc93a039); /* 52 */
+ II(a, b, c, d, x[12], S41, 0x655b59c3); /* 53 */
+ II(d, a, b, c, x[3], S42, 0x8f0ccc92); /* 54 */
+ II(c, d, a, b, x[10], S43, 0xffeff47d); /* 55 */
+ II(b, c, d, a, x[1], S44, 0x85845dd1); /* 56 */
+ II(a, b, c, d, x[8], S41, 0x6fa87e4f); /* 57 */
+ II(d, a, b, c, x15n, S42, 0xfe2ce6e0); /* 58 */
+ II(c, d, a, b, x[6], S43, 0xa3014314); /* 59 */
+ II(b, c, d, a, x[13], S44, 0x4e0811a1); /* 60 */
+ II(a, b, c, d, x[4], S41, 0xf7537e82); /* 61 */
+ II(d, a, b, c, x[11], S42, 0xbd3af235); /* 62 */
+ II(c, d, a, b, x[2], S43, 0x2ad7d2bb); /* 63 */
+ II(b, c, d, a, x[9], S44, 0xeb86d391); /* 64 */
+
+ digest[0] = a;
+ digest[1] = b;
+ digest[2] = c;
+ digest[3] = d;
+}
Modified: subversion/branches/10Gb/subversion/tests/libsvn_subr/checksum-test.c
URL: http://svn.apache.org/viewvc/subversion/branches/10Gb/subversion/tests/libsvn_subr/checksum-test.c?rev=1388816&r1=1388815&r2=1388816&view=diff
==============================================================================
--- subversion/branches/10Gb/subversion/tests/libsvn_subr/checksum-test.c (original)
+++ subversion/branches/10Gb/subversion/tests/libsvn_subr/checksum-test.c Sat Sep 22 15:14:25 2012
@@ -24,6 +24,7 @@
#include <apr_pools.h>
#include "svn_error.h"
+#include "private/svn_pseudo_md5.h"
#include "../svn_test.h"
@@ -80,6 +81,38 @@ test_checksum_empty(apr_pool_t *pool)
return SVN_NO_ERROR;
}
+static svn_error_t *
+test_pseudo_md5(apr_pool_t *pool)
+{
+ apr_uint32_t input[16] = { 0 };
+ apr_uint32_t digest_15[4] = { 0 };
+ apr_uint32_t digest_31[4] = { 0 };
+ apr_uint32_t digest_63[4] = { 0 };
+ svn_checksum_t *checksum;
+
+ /* input is all 0s but the hash shall be different
+ (due to different input sizes)*/
+ svn__pseudo_md5_15(digest_15, input);
+ svn__pseudo_md5_31(digest_31, input);
+ svn__pseudo_md5_63(digest_63, input);
+
+ SVN_TEST_ASSERT(memcmp(digest_15, digest_31, sizeof(digest_15)));
+ SVN_TEST_ASSERT(memcmp(digest_15, digest_63, sizeof(digest_15)));
+ SVN_TEST_ASSERT(memcmp(digest_31, digest_63, sizeof(digest_15)));
+
+ /* the checksums shall also be different from "proper" MD5 */
+ SVN_ERR(svn_checksum(&checksum, svn_checksum_md5, input, 15, pool));
+ SVN_TEST_ASSERT(memcmp(digest_15, checksum->digest, sizeof(digest_15)));
+
+ SVN_ERR(svn_checksum(&checksum, svn_checksum_md5, input, 31, pool));
+ SVN_TEST_ASSERT(memcmp(digest_31, checksum->digest, sizeof(digest_15)));
+
+ SVN_ERR(svn_checksum(&checksum, svn_checksum_md5, input, 63, pool));
+ SVN_TEST_ASSERT(memcmp(digest_63, checksum->digest, sizeof(digest_15)));
+
+ return SVN_NO_ERROR;
+}
+
/* An array of all test functions */
struct svn_test_descriptor_t test_funcs[] =
{
@@ -88,5 +121,7 @@ struct svn_test_descriptor_t test_funcs[
"checksum parse"),
SVN_TEST_PASS2(test_checksum_empty,
"checksum emptiness"),
+ SVN_TEST_PASS2(test_pseudo_md5,
+ "pseudo-md5 compatibility"),
SVN_TEST_NULL
};
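
Editorial sketch (not part of the commit): the comment block in pseudo_md5.c says that a length-dependent marker is placed in the last bytes of the block, and the three functions do this by XORing the final word with 0xffffffff, 0xfefefefe or 0xfcfcfcfc. Because every byte of each pattern is identical, the byte that the caller left at 0 ends up holding the marker byte on both little- and big-endian machines. The helper name last_byte_with_marker is hypothetical, uint32_t stands in for apr_uint32_t, and only the 16-byte block case is shown.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: pad KEY (KEY_LEN < 16) into a 16-byte block, apply
 * MARKER to the last word the same way the commit does, and return the
 * byte that ends up at offset 15 of the block in memory. */
static unsigned char last_byte_with_marker(const void *key,
                                           size_t key_len,
                                           uint32_t marker)
{
  uint32_t block[4] = { 0 };
  unsigned char bytes[4];
  uint32_t last;

  memcpy(block, key, key_len);      /* byte 15 stays 0: key_len < 16 */
  last = block[3] ^ marker;         /* same operation as x3n in the diff */
  memcpy(bytes, &last, sizeof(bytes));

  return bytes[3];                  /* byte 15 of the 16-byte block */
}
```

Since that byte was 0 before the XOR, the value returned is simply the marker byte, independent of the key contents and of the host byte order.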
Re: svn commit: r1388816 - in /subversion/branches/10Gb/subversion:
include/private/svn_pseudo_md5.h libsvn_subr/cache-membuffer.c
libsvn_subr/pseudo_md5.c tests/libsvn_subr/checksum-test.c
Posted by Stefan Fuhrmann <st...@wandisco.com>.
On Sun, Sep 23, 2012 at 2:58 PM, Stefan Sperling <st...@elego.de> wrote:
> On Sun, Sep 23, 2012 at 02:49:16PM +0200, Stefan Fuhrmann wrote:
> > I downloaded the original version from some admittedly
> > obscure location in the rather shady part of the interwebs:
> >
> >
> https://svn.apache.org/repos/asf/apr/apr-util/tags/1.3.12/crypto/apr_md5.c
> >
> > -- Stefan^2.
>
> To comply with the licence of the code you've added you'll need to
> update the NOTICE file on your branch as well:
> https://svn.apache.org/repos/asf//subversion/branches/10Gb/NOTICE
>
> See https://svn.apache.org/repos/asf/apr/apr-util/tags/1.3.12/NOTICE
>
Thanks for noticing. Committed in r1389044.
-- Stefan^2.
Re: svn commit: r1388816 - in /subversion/branches/10Gb/subversion:
include/private/svn_pseudo_md5.h libsvn_subr/cache-membuffer.c
libsvn_subr/pseudo_md5.c tests/libsvn_subr/checksum-test.c
Posted by Stefan Sperling <st...@elego.de>.
On Sun, Sep 23, 2012 at 02:49:16PM +0200, Stefan Fuhrmann wrote:
> I downloaded the original version from some admittedly
> obscure location in the rather shady part of the interwebs:
>
> https://svn.apache.org/repos/asf/apr/apr-util/tags/1.3.12/crypto/apr_md5.c
>
> -- Stefan^2.
To comply with the licence of the code you've added you'll need to
update the NOTICE file on your branch as well:
https://svn.apache.org/repos/asf//subversion/branches/10Gb/NOTICE
See https://svn.apache.org/repos/asf/apr/apr-util/tags/1.3.12/NOTICE
Re: svn commit: r1388816 - in /subversion/branches/10Gb/subversion:
include/private/svn_pseudo_md5.h libsvn_subr/cache-membuffer.c
libsvn_subr/pseudo_md5.c tests/libsvn_subr/checksum-test.c
Posted by Stefan Fuhrmann <st...@wandisco.com>.
On Sat, Sep 22, 2012 at 8:12 PM, Blair Zajac <bl...@orcaware.com> wrote:
> On 09/22/2012 08:14 AM, stefan2@apache.org wrote:
>
>> Author: stefan2
>> Date: Sat Sep 22 15:14:25 2012
>> New Revision: 1388816
>>
>> URL: http://svn.apache.org/viewvc?rev=1388816&view=rev
>> Log:
>> On the 10Gb branch: Introduce MD5-based hash functions optimized
>> for short input lengths. Use these to speed up membuffer access.
>>
>
> How much faster is it than a plain MD5?
>
The 16-byte version is twice as fast as the MD5 core
due to the fact that we know much of the input to be 0
and can hard-code it as such. In addition to that, the
APR implementation has a ~100% overhead (setting
up the context etc.) for strings that fit into a single
encoding block. In total, the pseudo-MD5 code is
3..4 times as fast as apr_md5.
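
Editorial sketch (not part of the original mail): the "hard-code it as such" point can be illustrated with the first four round-1 steps of MD5. When the message words are literal zeros and the initial state is the fixed MD5 start value, every operand is a constant, so a compiler can in principle evaluate those steps at build time. The names ff, first_steps and first_steps_zero are hypothetical; the shift amounts and additive constants are the ones used in pseudo_md5.c, and uint32_t stands in for apr_uint32_t.

```c
#include <stdint.h>

typedef struct { uint32_t a, b, c, d; } md5_state_t;

static uint32_t rotl32(uint32_t x, unsigned n)
{
  return (x << n) | (x >> (32 - n));
}

/* One round-1 MD5 step, written as a function instead of the FF macro. */
static uint32_t ff(uint32_t a, uint32_t b, uint32_t c, uint32_t d,
                   uint32_t x, unsigned s, uint32_t ac)
{
  a += ((b & c) | (~b & d)) + x + ac;
  return rotl32(a, s) + b;
}

/* General form: the first four steps read four message words. */
static md5_state_t first_steps(const uint32_t x[4])
{
  md5_state_t s = { 0x67452301, 0xefcdab89, 0x98badcfe, 0x10325476 };

  s.a = ff(s.a, s.b, s.c, s.d, x[0],  7, 0xd76aa478);
  s.d = ff(s.d, s.a, s.b, s.c, x[1], 12, 0xe8c7b756);
  s.c = ff(s.c, s.d, s.a, s.b, x[2], 17, 0x242070db);
  s.b = ff(s.b, s.c, s.d, s.a, x[3], 22, 0xc1bdceee);
  return s;
}

/* Specialised form: the message words are known to be 0, so nothing is
 * read from memory and the result does not depend on any input. */
static md5_state_t first_steps_zero(void)
{
  md5_state_t s = { 0x67452301, 0xefcdab89, 0x98badcfe, 0x10325476 };

  s.a = ff(s.a, s.b, s.c, s.d, 0,  7, 0xd76aa478);
  s.d = ff(s.d, s.a, s.b, s.c, 0, 12, 0xe8c7b756);
  s.c = ff(s.c, s.d, s.a, s.b, 0, 17, 0x242070db);
  s.b = ff(s.b, s.c, s.d, s.a, 0, 22, 0xc1bdceee);
  return s;
}
```

In svn__pseudo_md5_15 the first twelve steps have this input-independent shape; whether and how far a particular compiler folds them is not measured in this thread.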
> If we only need it for hashing, did you look at using a more well known
> hashing function, e.g. FNV [1] or murmur [2]?
>
FNV-128 would be just as slow as pseudo-MD5 as it
takes one iteration per byte and about 17(?) operations
per iteration.
murmur is not exactly "well-known" as it is quite new.
However, none of that really matters. The key is that we
need cryptographic strength for the hashes we use in
membuffer because they are the *only* identification for
any object stored therein. Basically the same scheme
as the SHA-1 usage in our working copy.
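
Editorial sketch (not part of the original mail): the "one iteration per byte" structure of FNV looks as follows. This is the 32-bit FNV-1a variant with its published offset basis and prime, chosen only for brevity; the mail talks about FNV-128, which follows the same loop shape with 128-bit arithmetic. The function name fnv1a_32 is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* 32-bit FNV-1a: one XOR and one multiply per input byte. */
static uint32_t fnv1a_32(const void *data, size_t len)
{
  const unsigned char *p = data;
  uint32_t h = 2166136261u;       /* FNV-1a 32-bit offset basis */
  size_t i;

  for (i = 0; i < len; i++)
    {
      h ^= p[i];
      h *= 16777619u;             /* FNV 32-bit prime */
    }

  return h;
}
```

A hash of this kind is fast and well distributed but makes no collision-resistance claim, which is the property the mail says the membuffer cache depends on.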
> Also, can you include URLs where you downloaded the code from in the log
> message and code.
>
So, you did not read the code ;) Simply read the first
65 lines of pseudo_md5.c.
I downloaded the original version from some admittedly
obscure location in the rather shady part of the interwebs:
https://svn.apache.org/repos/asf/apr/apr-util/tags/1.3.12/crypto/apr_md5.c
-- Stefan^2.
Re: svn commit: r1388816 - in /subversion/branches/10Gb/subversion:
include/private/svn_pseudo_md5.h libsvn_subr/cache-membuffer.c libsvn_subr/pseudo_md5.c
tests/libsvn_subr/checksum-test.c
Posted by Blair Zajac <bl...@orcaware.com>.
On 09/22/2012 08:14 AM, stefan2@apache.org wrote:
> Author: stefan2
> Date: Sat Sep 22 15:14:25 2012
> New Revision: 1388816
>
> URL: http://svn.apache.org/viewvc?rev=1388816&view=rev
> Log:
> On the 10Gb branch: Introduce MD5-based hash functions optimized
> for short input lengths. Use these to speed up membuffer access.
How much faster is it than a plain MD5?
If we only need it for hashing, did you look at using a more well known
hashing function, e.g. FNV [1] or murmur [2]?
Also, can you include URLs where you downloaded the code from in the log
message and code.
Blair
[1] http://isthe.com/chongo/tech/comp/fnv/
[2] https://sites.google.com/site/murmurhash/