Posted to users@subversion.apache.org by Ronald Taneza <ro...@gmail.com> on 2017/11/21 14:43:43 UTC

Error E140001: Sum of subblock sizes larger than total block content length

Hi,

I got the error below while running "svnadmin load -M 0" to load a dump
file created by "svnrdump dump".

  svnadmin: E140001: Sum of subblock sizes larger than total block content
length

This error was reported when "svnadmin load" was loading a big file (around
2 GB) from a revision in the dump file.
I checked the dump file produced by svnrdump, and noticed that the
Content-length for the 2GB file is a negative value! (see below)

I finally got it working by first creating a local dump file using
"svnadmin dump", then loading this dump file using the same "svnadmin load"
command used above.

More details below:
We are migrating our Subversion server from Ubuntu Linux to Windows Server
2012.
* Current server (Linux): svn, version 1.9.3 (r1718519)
* New server (Windows): svn, version 1.8.19 (r1800620)

We are actually using an older svn version in the new Windows server,
because we are using CollabNet Subversion Edge, which still uses the svn
1.8 series.

To migrate data, we have a script running in the new server that does:
* svnrdump dump https://current-server/repo --incremental > dumpfile
* svnadmin load -M 0 local-path\repo < dumpfile

The "svnadmin load" command failed with the error message above, when
loading a revision containing a big file (around 2 GB).
Checking the dump file produced by svnrdump (svn version 1.8.19), I noticed
that the Content-length for the 2GB file is a negative value!
The expected Content-length value is (Prop-content-length +
Text-content-length) = 59 + 2238208329 = 2238208388.
But the actual value is -2056758908, which is what you get when you try to
interpret 2238208388 as a signed 32-bit integer (max 2147483647).

  SVN-fs-dump-format-version: 3

  Node-path: TheBigFile
  Node-kind: file
  Node-action: add
  Prop-delta: true
  Prop-content-length: 59
  Text-delta: true
  Text-content-length: 2238208329
  Text-content-md5: d2f79377abb0db99b37460c8156727fb
  Content-length: -2056758908
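
The wraparound described above can be reproduced with a few lines of
standard C. This is only an illustrative sketch: the helper name
wrap_to_int32 is hypothetical and is not part of Subversion or APR. It keeps
the low 32 bits of a 64-bit length and reads them as a signed
two's-complement value, which is the effect a 32-bit 'long' would have.

```c
#include <stdint.h>

/* Hypothetical helper (not part of Subversion or APR): keep only the low
   32 bits of a 64-bit length and read them as a signed two's-complement
   32-bit number. */
int32_t wrap_to_int32(int64_t value)
{
  uint32_t low = (uint32_t)value;   /* reduction modulo 2^32 is well defined */

  if (low <= (uint32_t)INT32_MAX)
    return (int32_t)low;            /* fits: the value is unchanged */

  /* Top bit set: the signed reading is low - 2^32. */
  return (int32_t)((int64_t)low - INT64_C(4294967296));
}
```

With the numbers from this report, wrap_to_int32(2238208388) evaluates
2238208388 - 4294967296 = -2056758908, which matches the Content-length
header in the svnrdump 1.8.19 output.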

When the same revision is dumped using "svnadmin dump" in the Linux server
(svn version 1.9.3), the content-length is a positive value, as expected:

  SVN-fs-dump-format-version: 2

  Node-path: TheBigFile
  Node-kind: file
  Node-action: add
  Text-content-md5: d2f79377abb0db99b37460c8156727fb
  Text-content-sha1: 5f5c06f734f394697f90cfab26dee5d820216ee5
  Prop-content-length: 59
  Text-content-length: 2237924210
  Content-length: 2237924269

Lastly, I tried "svnrdump dump" in the Linux server (svn version 1.9.3),
and the content-length is also correct:

  SVN-fs-dump-format-version: 3

  Node-path: TheBigFile
  Node-kind: file
  Node-action: add
  Prop-delta: true
  Text-delta: true
  Text-content-md5: d2f79377abb0db99b37460c8156727fb
  Prop-content-length: 59
  Text-content-length: 2238208329
  Content-length: 2238208388

So this seems to be a problem with svnrdump version 1.8.x?

Of course, we would prefer to use the latest svn 1.9.x version, but we find
the CollabNet Subversion Edge package quite useful (apache2 and svn in one
installer package; provides web-based management console).

Regards,
Ronald

Re: Error E140001: Sum of subblock sizes larger than total block content length

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
Ronald Taneza wrote on Tue, 21 Nov 2017 15:43 +0100:
> Checking the dump file produced by svnrdump (svn version 1.8.19), I noticed
> that the Content-length for the 2GB file is a negative value!
> The expected Content-length value is (prop-content length +
> text-content-length) = 2238208388.
> But the actual value is -2056758908, which is what you get when you try to
> interpret 2238208388 as a signed 32-bit integer (max 2147483647).

Subversion stores file sizes as svn_filesize_t which is an alias to
apr_int64_t which ought not to wrap until 2**63-1.  Are apr_int64_t and
APR_INT64_T_FMT correct on your systems?

It's possible that we're casting 64bit->32bit somewhere --- that'd be a
bug --- but let's first rule out the above (heeding Occam's razor).

Cheers,

Daniel

Re: Error E140001: Sum of subblock sizes larger than total block content length

Posted by Ronald Taneza <ro...@gmail.com>.
Hi Julian,

> This bug occurred because the "%ld" format and/or the 'long' data type
> were 32-bit. This is indeed the case:
> On 64-bit Windows, 'long' is a 32-bit type (and 'long long' is 64-bit; on
> most other platforms both are 64-bit.)

Ok, that explains it. I did not know that 'long' is 32 bits on 64-bit
Windows. Most of my experience with C is on embedded systems where we use
specific integer sizes (like C99 stdint.h types).

Thanks!

Regards,
Ronald



Re: Error E140001: Sum of subblock sizes larger than total block content length

Posted by Julian Foad <ju...@apache.org>.
Ronald Taneza wrote:
> Hi Julian,
> 
> Thank you for your quick response and patch. I hope that this is fixed 
> in the next 1.8.x release and that CollabNet will also release an update 
> to Subversion Edge.

It should be.

> I'm still not so clear how this is exactly a problem with svnrdump 
> 1.8.19 (from Subversion Edge). This is my first time browsing through 
> the svn source code, and I hope you'll indulge me with my questions below.
> 
> We are using Subversion Edge 5.2.2 (Windows 64-bit) on a Windows Server 
> 2012 (64-bit) OS.

This bug occurred because the "%ld" format and/or the 'long' data type
were 32-bit. This is indeed the case:

On 64-bit Windows, 'long' is a 32-bit type and 'long long' is 64-bit;
on most other 64-bit platforms both are 64-bit.

(https://stackoverflow.com/questions/384502/what-is-the-bit-size-of-long-on-64-bit-windows#384672)
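
A quick way to confirm this for a given build is to ask the compiler. The
sketch below uses only standard C, and the function names are hypothetical.
A Windows build that follows the LLP64 model should report 32 bits for
'long'; a typical 64-bit Linux build (LP64) should report 64.

```c
#include <limits.h>
#include <stdint.h>

/* Size of 'long' in bits for the current compiler and target. */
int long_bits(void)
{
  return (int)(sizeof(long) * CHAR_BIT);
}

/* Returns 1 if a 64-bit length can pass through 'long' without loss,
   0 otherwise. */
int fits_in_long(int64_t value)
{
  return value >= LONG_MIN && value <= LONG_MAX;
}
```

When long_bits() is 32, fits_in_long() rejects the 2238208388-byte
Content-length from this report, which is the condition under which "%ld"
cannot print it correctly.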

> I also verified that the httpd.exe, svn.exe, and svnrdump.exe binaries 
> are 64-bit. (I ran the "file xxx" command in Cygwin and also checked 
> their PE signatures as described here: 
> https://superuser.com/questions/358434/how-to-check-if-a-binary-is-32-or-64-bit-on-windows).
> 
> You mentioned:
> 
> > SVN_ERR(svn_stream_printf(eb->stream, pool,
> >                           SVN_REPOS_DUMPFILE_CONTENT_LENGTH
> >                           ": %ld\n\n",
> >                           (unsigned long)info->size +
> >                             propstring->len));
>
> > info->size is apr_off_t ... probably 64 bits on most systems.
> > propstring->len is apr_size_t ... probably 64 bits on most systems.

I was wrong about that: apr_size_t would be 32-bit on 32-bit 
architectures. File sizes (apr_off_t) are more likely to be 64-bit, 
although there may be some platforms where they are 32-bit, maybe 
depending on compile-time options or something that is configurable.

Therefore my initial patch (changing to use APR_SIZE_T_FMT) was wrong 
(would not have worked on 32-bit architectures). I have now corrected 
that to use APR_OFF_T_FMT.
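
The idea behind the corrected patch is to let the format string follow the
real width of the value instead of assuming that 'long' is wide enough. The
sketch below shows the same idea in standard C, with int64_t and PRId64
standing in for apr_off_t and APR_OFF_T_FMT. It is not the actual Subversion
patch, and format_content_length is a hypothetical name.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the svnrdump header output: formats the
   Content-length header from a text length and a property length.
   Returns what snprintf returns (the number of characters needed,
   not counting the terminating NUL). */
int format_content_length(char *buf, size_t bufsize,
                          int64_t text_len, size_t prop_len)
{
  /* Do the addition in a 64-bit signed type, then print it with the
     matching format macro rather than "%ld". */
  int64_t total = text_len + (int64_t)prop_len;

  return snprintf(buf, bufsize, "Content-length: %" PRId64 "\n\n", total);
}
```

Because the format macro always matches the type of the argument, the
result no longer depends on whether the platform's 'long' is 32 or 64 bits.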

> > It uses "%lu" for the text content, which thus works OK up to 4 GB,
> > and "%ld" for the overall content length which thus only works up to 2 GB.

When I wrote that, I was assuming a 32-bit architecture.

> On a 64-bit system where unsigned long is uint64:

On a standard Windows-64 system, it's not.

I hope that clears up the issue.

- Julian

Re: Error E140001: Sum of subblock sizes larger than total block content length

Posted by Ronald Taneza <ro...@gmail.com>.
Hi Julian,

Thank you for your quick response and patch. I hope that this is fixed in
the next 1.8.x release and that CollabNet will also release an update to
Subversion Edge.

I'm still not so clear how this is exactly a problem with svnrdump 1.8.19
(from Subversion Edge). This is my first time browsing through the svn
source code, and I hope you'll indulge me with my questions below.

We are using Subversion Edge 5.2.2 (Windows 64-bit) on a Windows Server
2012 (64-bit) OS.
I also verified that the httpd.exe, svn.exe, and svnrdump.exe binaries are
64-bit. (I ran the "file xxx" command in Cygwin and also checked their PE
signatures as described here:
https://superuser.com/questions/358434/how-to-check-if-a-binary-is-32-or-64-bit-on-windows
).

You mentioned:

> SVN_ERR(svn_stream_printf(eb->stream, pool,
>                           SVN_REPOS_DUMPFILE_CONTENT_LENGTH
>                           ": %ld\n\n",
>                           (unsigned long)info->size +
>                             propstring->len));

> info->size is apr_off_t ... probably 64 bits on most systems.
> propstring->len is apr_size_t ... probably 64 bits on most systems.

> It uses "%lu" for the text content, which thus works OK up to 4 GB, and
> "%ld" for the overall content length which thus only works up to 2 GB.

On a 64-bit system where unsigned long is uint64:
If info->size is int64 and propstring->len is uint64, then the expression
"(unsigned long)info->size + propstring->len" will produce uint64. This
result should be printed with "%lu" (as in your patch). However, printing
the result with "%ld" will still print a positive value (signed int64), I
believe.

If for some reason, info->size is just int32, then it can only store max
int32 (2 GB). So the earlier call to apr_file_info_get should already have
failed if the file size is greater than 2 GB.

      err = apr_file_info_get(info, APR_FINFO_SIZE, eb->delta_file);
      if (err)
        SVN_ERR(svn_error_wrap_apr(err, NULL));

However, apr_file_info_get did not return an error; so it just
somehow returned a negative value in info->size?
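
One way to reason about this question is to model the expression with
fixed-width types for a target where 'unsigned long' is 32 bits and
apr_size_t is 64 bits. The sketch below is such a model and rests on
assumptions: the function name is hypothetical, and passing a 64-bit
argument to "%ld" is formally undefined behaviour in C, so the last step
only mimics the commonly observed outcome of reading the low 32 bits.

```c
#include <stdint.h>

/* Models the 1.8.x expression with fixed-width types, assuming
   'unsigned long' -> uint32_t, apr_size_t -> uint64_t, and the value that
   "%ld" reads -> int32_t.  The function name is hypothetical. */
int32_t modelled_ld_output(int64_t file_size, uint64_t prop_len)
{
  /* (unsigned long)info->size: truncates to 32 bits (no change below 4 GB). */
  uint32_t cast_size = (uint32_t)file_size;

  /* Adding a 64-bit size promotes the sum to 64 bits, so the sum is
     still positive and correct at this point. */
  uint64_t sum = (uint64_t)cast_size + prop_len;

  /* "%ld" expects a 32-bit long here: only the low 32 bits are
     interpreted, as a signed value. */
  uint32_t low = (uint32_t)sum;
  return low <= (uint32_t)INT32_MAX
           ? (int32_t)low
           : (int32_t)((int64_t)low - INT64_C(4294967296));
}
```

Under this model, a positive file size of 2238208329 and a property length
of 59 already yield -2056758908, so info->size itself would not need to be
negative for the header to come out wrong.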

Regards,
Ronald



Re: Error E140001: Sum of subblock sizes larger than total block content length

Posted by Julian Foad <ju...@apache.org>.
Julian Foad wrote:
> (dropping users@)
> 
> Julian Foad wrote:
>> The attached patch should fix it; not yet tested.

Proposed for backport to 1.8.x.

- Julian

> I have opened https://issues.apache.org/jira/browse/SVN-4707
> and attached the patch there. (Patch v2 is same as v1 but with tweaked 
> log message.)
> 
> We briefly discussed testing. It's a pretty obvious fix, but ideally we 
> would write a regression test. We don't want to store 2 GB* of temporary 
> data during a test run (that would make testing onerous) so we would 
> need to write a test that generates data, streams it to the rdump 'dump' 
> function, pipes that into the rdump 'load', and checks that it parses 
> without throwing errors, without storing the loaded data.
> 
> Anyone interested in writing such a test?
> 
>    * Actually we should test with over 4 GB because as well as the "%ld" 
> bug I found and fixed a "%lu" bug in nearby code at the same time.
> 
> - Julian

Re: Error E140001: Sum of subblock sizes larger than total block content length

Posted by Julian Foad <ju...@apache.org>.
(dropping users@)

Julian Foad wrote:
> The attached patch should fix it; not yet tested.

I have opened https://issues.apache.org/jira/browse/SVN-4707
and attached the patch there. (Patch v2 is same as v1 but with tweaked 
log message.)

We briefly discussed testing. It's a pretty obvious fix, but ideally we 
would write a regression test. We don't want to store 2 GB* of temporary 
data during a test run (that would make testing onerous) so we would 
need to write a test that generates data, streams it to the rdump 'dump' 
function, pipes that into the rdump 'load', and checks that it parses 
without throwing errors, without storing the loaded data.

Anyone interested in writing such a test?

   * Actually we should test with over 4 GB because as well as the "%ld" 
bug I found and fixed a "%lu" bug in nearby code at the same time.
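
The "generate data without storing it" part of such a test could look
roughly like the sketch below. It is not written against the Subversion
test suite or the svnrdump editor API: sink_fn, generate_stream,
counting_sink and count_generated are hypothetical names, and a real test
would replace the counting sink with the dump and load code paths.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sink callback: receives each generated chunk. */
typedef void (*sink_fn)(void *baton, const char *data, size_t len);

/* Feed 'total' bytes to 'sink' in fixed-size chunks, reusing one small
   buffer so that memory use stays constant regardless of 'total'. */
void generate_stream(uint64_t total, sink_fn sink, void *baton)
{
  static const char chunk[65536] = { 0 };   /* 64 KiB of zero bytes */

  while (total > 0)
    {
      size_t n = total < sizeof(chunk) ? (size_t)total : sizeof(chunk);
      sink(baton, chunk, n);
      total -= n;
    }
}

/* Example sink that only counts bytes and discards the data. */
void counting_sink(void *baton, const char *data, size_t len)
{
  (void)data;
  *(uint64_t *)baton += len;
}

/* Convenience wrapper: returns how many bytes the sink received. */
uint64_t count_generated(uint64_t total)
{
  uint64_t seen = 0;
  generate_stream(total, counting_sink, &seen);
  return seen;
}
```

Calling count_generated() with a value above 4 GB exercises lengths that
overflow both 32-bit signed and 32-bit unsigned counters while never
holding more than one 64 KiB buffer in memory.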

- Julian

Re: Error E140001: Sum of subblock sizes larger than total block content length

Posted by Julian Foad <ju...@apache.org>.
> *From:* Ronald Taneza [mailto:ronald.taneza@gmail.com]
> *Sent:* Tuesday 21 November 2017 15:44
> *To:* users@subversion.apache.org
> 
> I got the error below while running "svnadmin load -M 0" to load a dump 
> file created by "svnrdump dump".
> 
>    svnadmin: E140001: Sum of subblock sizes larger than total block 
> content length
> 
> This error was reported when "svnadmin load" was loading a big file 
> (around 2 GB) from a revision in the dump file.
[...]
> Checking the dump file produced by svnrdump (svn version 1.8.19), I
> noticed that the Content-length for the 2GB file is a negative value!
[...]
>    SVN-fs-dump-format-version: 3
>    Node-path: TheBigFile
>    Node-kind: file
>    Node-action: add
>    Prop-delta: true
>    Prop-content-length: 59
>    Text-delta: true
>    Text-content-length: 2238208329
>    Text-content-md5: d2f79377abb0db99b37460c8156727fb
>    Content-length: -2056758908

Thank you for finding this!

I can see this bug existed in svnrdump up to 1.8.19. (For 1.9 I refactored this to use common code shared with 'svnadmin dump' which does not have this bug.)

In 1.8.19, subversion/svnrdump/svnrdump.c:close_file() contains:

  if (fb->dump_text)
  ...
      SVN_ERR(svn_stream_printf(eb->stream, pool,
                                SVN_REPOS_DUMPFILE_TEXT_CONTENT_LENGTH
                                ": %lu\n",
                                (unsigned long)info->size));
  ...
  if (fb->dump_props)
    SVN_ERR(svn_stream_printf(eb->stream, pool,
                              SVN_REPOS_DUMPFILE_CONTENT_LENGTH
                              ": %ld\n\n",
                              (unsigned long)info->size +
                                propstring->len));
  else if (fb->dump_text)
    SVN_ERR(svn_stream_printf(eb->stream, pool,
                              SVN_REPOS_DUMPFILE_CONTENT_LENGTH
                              ": %ld\n\n",
                              (unsigned long)info->size));
  ...


info->size is apr_off_t ... probably 64 bits on most systems.
propstring->len is apr_size_t ... probably 64 bits on most systems.

It uses "%lu" for the text content, which thus works OK up to 4 GB where 'long' is 32 bits, and "%ld" for the overall content length, which thus only works up to 2 GB.
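
These two limits explain why, in the reported dump, Text-content-length was
printed correctly while Content-length was not. The sketch below checks a
length against both limits, assuming a 32-bit 'long'; the function names
are hypothetical.

```c
#include <stdint.h>

/* Returns 1 if 'len' survives a cast to a 32-bit 'unsigned long'
   (the "%lu" case), 0 otherwise. */
int fits_unsigned32(int64_t len)
{
  return len >= 0 && len <= (int64_t)UINT32_MAX;
}

/* Returns 1 if 'len' survives being read as a 32-bit 'long'
   (the "%ld" case), 0 otherwise. */
int fits_signed32(int64_t len)
{
  return len >= INT32_MIN && len <= INT32_MAX;
}
```

For the file in this report, fits_unsigned32(2238208329) holds, so the text
length is printed intact, while fits_signed32(2238208388) fails, so the
overall content length wraps to a negative number.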

Earlier in this file, the property content length is printed correctly:

  buf = apr_psprintf(pool, SVN_REPOS_DUMPFILE_CONTENT_LENGTH
                     ": %" APR_SIZE_T_FMT "\n", len);

The attached patch should fix it; not yet tested.

- Julian
