You are viewing a plain text version of this content. The canonical link for it is here.

Posted to dev@subversion.apache.org by Philip Martin <ph...@wandisco.com> on 2014/03/02 02:54:23 UTC

Byte order issue in FSFS 7

There is a problem with the FNV1a checksums in format 7: the on-disk
representation for big-endian systems, like SPARC, is different from
that of little-endian systems, like x86.  Both systems calculate the
same checksum value, however the checksum code calls htonl() before
returning the value to the caller.  On SPARC this has no effect and on
x86 it reverses the byte order, changing the value.  If we were to write
the resulting memory directly to disk as a block of data this would be
good because the disk representations would be the same, but that is not
what happens. The value is passed to encode_uint() which uses arithmetic
to write out a representation of the bytes.  Since the htonl() call
changes the x86 value this means the disk representation changes.

To get the same disk representation on both systems we do one of:
  1/ add a swap on little-endian systems
  2/ remove a swap on little-endian systems
  3/ add a swap on big-endian systems.

Since there is no ready-made swap for big-endian systems I have
discounted 3/.  I've written patches for 1/ and 2/; they are more or
less equivalent the difference is the in-memory representation.  I
prefer 2/.

[If we ignore exotic systems then htonl() and ntohl() are the same
function, both are noops on SPARC and both reverse on x86, which name to
use where is really just documentation.]

Here's 1/ calling ntohl() before encode_uint() and htonl() on the value
retrieved from disk:

Index: subversion/libsvn_fs_fs/index.c
===================================================================
--- subversion/libsvn_fs_fs/index.c	(revision 1573204)
+++ subversion/libsvn_fs_fs/index.c	(working copy)
@@ -1691,7 +1691,8 @@ svn_fs_fs__p2l_index_create(svn_fs_t *fs,
                                   encode_int(encoded, rev_diff),
                                   iter_pool));
       SVN_ERR(svn_spillbuf__write(buffer, (const char *)encoded,
-                                  encode_uint(encoded, entry.fnv1_checksum),
+                                  encode_uint(encoded,
+                                              ntohl(entry.fnv1_checksum)),
                                   iter_pool));
 
       last_entry_end = entry_end;
@@ -2018,7 +2019,7 @@ read_entry(svn_fs_fs__packed_number_stream_t *stre
   entry.item.revision = *last_revision;
 
   SVN_ERR(packed_stream_get(&value, stream));
-  entry.fnv1_checksum = (apr_uint32_t)value;
+  entry.fnv1_checksum = htonl((apr_uint32_t)value);
 
   APR_ARRAY_PUSH(result, svn_fs_fs__p2l_entry_t) = entry;
   *item_offset += entry.size;

Here's 2/ removing htonl() from the checksum return and reversing the
in-memory representation (there is a still an htonl() call inside the
32x4 checksum but that is correct):

Index: subversion/libsvn_subr/checksum.c
===================================================================
--- subversion/libsvn_subr/checksum.c	(revision 1573204)
+++ subversion/libsvn_subr/checksum.c	(working copy)
@@ -54,12 +54,12 @@ static const unsigned char sha1_empty_string_diges
 
 /* The FNV-1a digest for the empty string. */
 static const unsigned char fnv1a_32_empty_string_digest_array[] = {
-  0x81, 0x1c, 0x9d, 0xc5
+  0xc5, 0x9d, 0x1c, 0x81
 };
 
 /* The FNV-1a digest for the empty string. */
 static const unsigned char fnv1a_32x4_empty_string_digest_array[] = {
-  0xcd, 0x6d, 0x9a, 0x85
+  0x85, 0x9a, 0x6d, 0xcd
 };
 
 /* Digests for an empty string, indexed by checksum type */
Index: subversion/libsvn_subr/fnv1a.c
===================================================================
--- subversion/libsvn_subr/fnv1a.c	(revision 1573204)
+++ subversion/libsvn_subr/fnv1a.c	(working copy)
@@ -109,15 +109,15 @@ finalize_fnv1a_32x4(apr_uint32_t hashes[SCALING],
   if (len)
     memcpy(final_data + sizeof(apr_uint32_t) * SCALING, input, len);
 
-  return htonl(fnv1a_32(FNV1_BASE_32,
-                        final_data,
-                        sizeof(apr_uint32_t) * SCALING + len));
+  return fnv1a_32(FNV1_BASE_32,
+                  final_data,
+                  sizeof(apr_uint32_t) * SCALING + len);
 }
 
 apr_uint32_t
 svn__fnv1a_32(const void *input, apr_size_t len)
 {
-  return htonl(fnv1a_32(FNV1_BASE_32, input, len));
+  return fnv1a_32(FNV1_BASE_32, input, len);
 }
 
 apr_uint32_t
@@ -157,7 +157,7 @@ svn_fnv1a_32__update(svn_fnv1a_32__context_t *cont
 apr_uint32_t
 svn_fnv1a_32__finalize(svn_fnv1a_32__context_t *context)
 {
-  return htonl(context->hash);
+  return context->hash;
 }
 
In both cases we need to change the hard-coded representation of r0
written when the repository is created:

Index: subversion/libsvn_fs_fs/fs_fs.c
===================================================================
--- subversion/libsvn_fs_fs/fs_fs.c	(revision 1573204)
+++ subversion/libsvn_fs_fs/fs_fs.c	(working copy)
@@ -1274,9 +1274,9 @@ write_revision_zero(svn_fs_t *fs)
                   "\x80\x80\4\1\x1D"    /* 64k pages, 1 page using 29 bytes */
                   "\0"                  /* offset entry 0 page 1 */
                                         /* len, item & type, rev, checksum */
-                  "\x11\x34\0\xe0\xc6\xac\xa9\x07"
-                  "\x59\x09\0\xc0\xfa\xf8\xc5\x04"
-                  "\1\x0d\0\xf2\x95\xbe\xea\x01"
+                  "\x11\x34\0\xf5\xd6\x8c\x81\x06"
+                  "\x59\x09\0\xc8\xfc\xf6\x81\x04"
+                  "\1\x0d\0\x9d\x9e\xa9\x94\x0f"
                   "\x95\xff\3\x1b\0\0", /* last entry fills up 64k page */
                   38,
                   fs->pool));

-- 
Philip Martin | Subversion Committer
WANdisco // *Non-Stop Data*

Re: Disk format change for trunk's FSFS version 7

Posted by Branko Čibej <br...@wandisco.com>.

On 03.03.2014 14:03, Philip Martin wrote:
> Stefan Fuhrmann <st...@wandisco.com> writes:
>
>> On Sun, Mar 2, 2014 at 2:54 AM, Philip Martin <ph...@wandisco.com>wrote:
>>
>>> There is a problem with the FNV1a checksums in format 7: the on-disk
>>> representation for big-endian systems, like SPARC, is different from
>>> that of little-endian systems, like x86.  Both systems calculate the
>>> same checksum value, however the checksum code calls htonl() before
>>> returning the value to the caller.  On SPARC this has no effect and on
>>> x86 it reverses the byte order, changing the value.  If we were to write
>>> the resulting memory directly to disk as a block of data this would be
>>> good because the disk representations would be the same, but that is not
>>> what happens. The value is passed to encode_uint() which uses arithmetic
>>> to write out a representation of the bytes.  Since the htonl() call
>>> changes the x86 value this means the disk representation changes.
>> Committed with a few addition as r1573371 and r1573375.
> This changes the on-disk representation for the unreleased FSFS format 7
> on little-endian systems, that's probably most systems.  Anyone with
> repositories in this format needs to dump/load.

That's fine ... it'll teach them not to use unreleased versions. :)

-- 
Branko Čibej | Director of Subversion
WANdisco // Non-Stop Data
e. brane@wandisco.com

Disk format change for trunk's FSFS version 7

Posted by Philip Martin <ph...@wandisco.com>.

Stefan Fuhrmann <st...@wandisco.com> writes:

> On Sun, Mar 2, 2014 at 2:54 AM, Philip Martin <ph...@wandisco.com>wrote:
>
>> There is a problem with the FNV1a checksums in format 7: the on-disk
>> representation for big-endian systems, like SPARC, is different from
>> that of little-endian systems, like x86.  Both systems calculate the
>> same checksum value, however the checksum code calls htonl() before
>> returning the value to the caller.  On SPARC this has no effect and on
>> x86 it reverses the byte order, changing the value.  If we were to write
>> the resulting memory directly to disk as a block of data this would be
>> good because the disk representations would be the same, but that is not
>> what happens. The value is passed to encode_uint() which uses arithmetic
>> to write out a representation of the bytes.  Since the htonl() call
>> changes the x86 value this means the disk representation changes.

> Committed with a few addition as r1573371 and r1573375.

This changes the on-disk representation for the unreleased FSFS format 7
on little-endian systems, that's probably most systems.  Anyone with
repositories in this format needs to dump/load.

-- 
Philip Martin | Subversion Committer
WANdisco // *Non-Stop Data*

Re: Byte order issue in FSFS 7

Posted by Stefan Fuhrmann <st...@wandisco.com>.

On Sun, Mar 2, 2014 at 2:54 AM, Philip Martin <ph...@wandisco.com>wrote:

> There is a problem with the FNV1a checksums in format 7: the on-disk
> representation for big-endian systems, like SPARC, is different from
> that of little-endian systems, like x86.  Both systems calculate the
> same checksum value, however the checksum code calls htonl() before
> returning the value to the caller.  On SPARC this has no effect and on
> x86 it reverses the byte order, changing the value.  If we were to write
> the resulting memory directly to disk as a block of data this would be
> good because the disk representations would be the same, but that is not
> what happens. The value is passed to encode_uint() which uses arithmetic
> to write out a representation of the bytes.  Since the htonl() call
> changes the x86 value this means the disk representation changes.
>

Yikes! encode_int always writes in little endian
which means we get big endian on x86 and little
endian on Sparc.

To get the same disk representation on both systems we do one of:
>   1/ add a swap on little-endian systems
>   2/ remove a swap on little-endian systems
>   3/ add a swap on big-endian systems.
>

Well, I looked into a 4th option: storing the checksum
without further encoding.  That would even reduce the
index file size by ~10%. But it would add substantial
complexity the the integer stream object.

Since there is no ready-made swap for big-endian systems I have
> discounted 3/.  I've written patches for 1/ and 2/; they are more or
> less equivalent the difference is the in-memory representation.  I
> prefer 2/.
>

I agree - after playing with 1/ for a bit. Together with
the rev 0 template update, so it's not much smaller
than 2/. And the documentation updates would be
horrible because it is hard to explain what bye order
is used when.


> [If we ignore exotic systems then htonl() and ntohl() are the same
> function, both are noops on SPARC and both reverse on x86, which name to
> use where is really just documentation.]
>
> Here's 1/ calling ntohl() before encode_uint() and htonl() on the value
> retrieved from disk:
>
> Index: subversion/libsvn_fs_fs/index.c
> ===================================================================
> --- subversion/libsvn_fs_fs/index.c     (revision 1573204)
> +++ subversion/libsvn_fs_fs/index.c     (working copy)
> @@ -1691,7 +1691,8 @@ svn_fs_fs__p2l_index_create(svn_fs_t *fs,
>                                    encode_int(encoded, rev_diff),
>                                    iter_pool));
>        SVN_ERR(svn_spillbuf__write(buffer, (const char *)encoded,
> -                                  encode_uint(encoded,
> entry.fnv1_checksum),
> +                                  encode_uint(encoded,
> +                                              ntohl(entry.fnv1_checksum)),
>                                    iter_pool));
>
>        last_entry_end = entry_end;
> @@ -2018,7 +2019,7 @@ read_entry(svn_fs_fs__packed_number_stream_t *stre
>    entry.item.revision = *last_revision;
>
>    SVN_ERR(packed_stream_get(&value, stream));
> -  entry.fnv1_checksum = (apr_uint32_t)value;
> +  entry.fnv1_checksum = htonl((apr_uint32_t)value);
>
>    APR_ARRAY_PUSH(result, svn_fs_fs__p2l_entry_t) = entry;
>    *item_offset += entry.size;
>
> Here's 2/ removing htonl() from the checksum return and reversing the
> in-memory representation (there is a still an htonl() call inside the
> 32x4 checksum but that is correct):
>
> Index: subversion/libsvn_subr/checksum.c
> ===================================================================
> --- subversion/libsvn_subr/checksum.c   (revision 1573204)
> +++ subversion/libsvn_subr/checksum.c   (working copy)
> @@ -54,12 +54,12 @@ static const unsigned char sha1_empty_string_diges
>
>  /* The FNV-1a digest for the empty string. */
>  static const unsigned char fnv1a_32_empty_string_digest_array[] = {
> -  0x81, 0x1c, 0x9d, 0xc5
> +  0xc5, 0x9d, 0x1c, 0x81
>  };
>
>  /* The FNV-1a digest for the empty string. */
>  static const unsigned char fnv1a_32x4_empty_string_digest_array[] = {
> -  0xcd, 0x6d, 0x9a, 0x85
> +  0x85, 0x9a, 0x6d, 0xcd
>  };
>

I think it is necessary to have a platform independent
byte order in our digests. The above one would be
little endian and we would need to make sure that e.g.
Sparc writes them in the same order.

However, with the same effort, r1573371 always writes
big endian now - keeping the empty string digests unchanged.


>  /* Digests for an empty string, indexed by checksum type */
> Index: subversion/libsvn_subr/fnv1a.c
> ===================================================================
> --- subversion/libsvn_subr/fnv1a.c      (revision 1573204)
> +++ subversion/libsvn_subr/fnv1a.c      (working copy)
> @@ -109,15 +109,15 @@ finalize_fnv1a_32x4(apr_uint32_t hashes[SCALING],
>    if (len)
>      memcpy(final_data + sizeof(apr_uint32_t) * SCALING, input, len);
>
> -  return htonl(fnv1a_32(FNV1_BASE_32,
> -                        final_data,
> -                        sizeof(apr_uint32_t) * SCALING + len));
> +  return fnv1a_32(FNV1_BASE_32,
> +                  final_data,
> +                  sizeof(apr_uint32_t) * SCALING + len);
>  }
>
>  apr_uint32_t
>  svn__fnv1a_32(const void *input, apr_size_t len)
>  {
> -  return htonl(fnv1a_32(FNV1_BASE_32, input, len));
> +  return fnv1a_32(FNV1_BASE_32, input, len);
>  }
>
>  apr_uint32_t
> @@ -157,7 +157,7 @@ svn_fnv1a_32__update(svn_fnv1a_32__context_t *cont
>  apr_uint32_t
>  svn_fnv1a_32__finalize(svn_fnv1a_32__context_t *context)
>  {
> -  return htonl(context->hash);
> +  return context->hash;
>  }
>

Oops - missed that one. r1573375.

In both cases we need to change the hard-coded representation of r0
> written when the repository is created:
>
> Index: subversion/libsvn_fs_fs/fs_fs.c
> ===================================================================
> --- subversion/libsvn_fs_fs/fs_fs.c     (revision 1573204)
> +++ subversion/libsvn_fs_fs/fs_fs.c     (working copy)
> @@ -1274,9 +1274,9 @@ write_revision_zero(svn_fs_t *fs)
>                    "\x80\x80\4\1\x1D"    /* 64k pages, 1 page using 29
> bytes */
>                    "\0"                  /* offset entry 0 page 1 */
>                                          /* len, item & type, rev,
> checksum */
> -                  "\x11\x34\0\xe0\xc6\xac\xa9\x07"
> -                  "\x59\x09\0\xc0\xfa\xf8\xc5\x04"
> -                  "\1\x0d\0\xf2\x95\xbe\xea\x01"
> +                  "\x11\x34\0\xf5\xd6\x8c\x81\x06"
> +                  "\x59\x09\0\xc8\xfc\xf6\x81\x04"
> +                  "\1\x0d\0\x9d\x9e\xa9\x94\x0f"
>                    "\x95\xff\3\x1b\0\0", /* last entry fills up 64k page */
>                    38,
>                    fs->pool));
>

Those are horrible to put in by hand.
Thanks for doing all the work!
Committed with a few addition as r1573371 and r1573375.

-- Stefan^2.