Posted to dev@subversion.apache.org by Evgeny Kotkov <ev...@visualsvn.com> on 2014/01/23 11:00:18 UTC

[RFC/PATCH] svnadmin: recover/hotcopy erroring out for old FSFS repositories

Hi,

I've noticed that svnadmin recover and hotcopy commands are broken for old
repositories around trunk@1560210.  The problem can be reproduced by creating
a new --compatible-version=1.3 / 1.2 / 1.1 FSFS repository, doing something
simple with it (say, running svn mkdir or svn checkout / add file / commit) and
attempting to recover or hotcopy it:
[[[
  (svnadmin-trunk recover REPOS)
   svnadmin: E200002: Serialized hash missing terminator

  (svnadmin-trunk hotcopy REPOS REPOS_BACKUP)
   svnadmin: E160006: No such revision 1
]]]

Alternatively, this can be reproduced by running svnadmin_tests.py with, for
example, --server-minor-version=3.  This problem exists in both 1.8.x and
trunk.  In 1.8.x, however, the errors for both hotcopy and recover are visually
identical:
[[[
  (svnadmin-1.8.x recover REPOS)
   svnadmin: E200002: Serialized hash missing terminator

  (svnadmin-1.8.x hotcopy REPOS REPOS_BACKUP)
   svnadmin: E200002: Serialized hash missing terminator
]]]

Overall, this looks like a regression since 1.7.x, where hotcopy works just
fine.  The 1.7.x recover also fails for these repositories with an
"svnadmin: E000002: Can't open file 'X/db/revs/2': No such file or directory"
error, but this is a well-known issue 4213 [1], which should be fixed by
r1367683 [2].

Being unable to recover and hotcopy old repositories with the most recent
svnadmin version seems like an important issue, so I think it's worth a couple
of tests.  [See the attached patch #1]

I did a quick investigation and spotted several odd things in the
related code:

1) First of all, there is a reason for the different error messages in trunk
   and in the 1.8.x branch.

   This is a regression introduced in r1503742 [3]: calling the
   svn_fs_fs__find_max_ids() routine during hotcopy now creates a cyclic
   dependency.  Hotcopy needs to update the db/current file, so it uses this
   routine to get the NEXT_NODE_ID / NEXT_COPY_ID for old repositories.
   However, this currently behaves the following way:

   hotcopy_update_current()
     ...
     (wants to update the db/current file, but doesn't know the next ids)

      svn_fs_fs__find_max_ids()
        svn_fs_fs__rev_get_root()
          svn_fs_fs__ensure_revision_exists()
          ...
          (reads the most recent db/current file in order to check the presence
           of this revision. it is reported missing, because we're right in the
           middle of updating the db/current file and cannot do that yet,
           because we still do not know the next ids)

   So, basically, we're calling a routine that assumes the db/current file is
   up-to-date in order to get the information on how to update the outdated
   db/current file.  This ends with an "E160006: No such revision 1" error.
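
   To make the cycle concrete, here is a minimal sketch of the check that
   fails (simplified; get_youngest() is an illustrative stand-in for the
   code that reads db/current, not the literal trunk code):
   [[[
     static svn_error_t *
     ensure_revision_exists(svn_fs_t *fs,
                            svn_revnum_t rev,
                            apr_pool_t *pool)
     {
       svn_revnum_t youngest;

       /* Reads the destination's db/current -- the very file that
          hotcopy is still in the middle of updating. */
       SVN_ERR(get_youngest(&youngest, fs, pool));

       if (rev > youngest)
         return svn_error_createf(SVN_ERR_FS_NO_SUCH_REVISION, NULL,
                                  "No such revision %ld", rev);

       return SVN_NO_ERROR;
     }
   ]]]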

2) The reason for the "hash missing terminator" error is a bit trickier.

   The NEXT_NODE_ID / NEXT_COPY_ID recovery code recursively crawls
   through the dir representations in the db; however, it assumes that the
   EXPANDED_SIZE in the rep description is always non-zero.  According to
   the docstring in subversion/libsvn_fs_fs/fs.h, this assumption is incorrect:
   [[[
     /* The size of the representation in bytes as seen in the revision
        file. */
     svn_filesize_t size;

     /* The size of the fulltext of the representation. If this is 0,
      * the fulltext size is equal to representation size in the rev file, */
     svn_filesize_t expanded_size;
   ]]]

   I did a small experiment with different svn (1.7 / 1.8 / trunk) versions
   writing dir reps to repositories with different --compatible-versions...
   [[[
     # text: REVISION OFFSET SIZE EXPANDED_SIZE
     svn 1.7, --compatible-version=1.3 repository (db/revs/1)
      text: 1 57 28 28
     svn 1.8, --compatible-version=1.3 repository (db/revs/1)
      text: 1 57 28 0
   ]]]

   ...and it looks like svn 1.7.x did have the EXPANDED_SIZE equal to SIZE, but
   this behavior changed in 1.8.x.  In 1.8.x, EXPANDED_SIZE is set to 0 for
   these representations, so attempting to read them ends with an
   "E200002: Serialized hash missing terminator" error.

   See the attached 'rep-description-dumps' file.

   NOTE: This code is part of the recover_find_max_ids() routine and, worth
   mentioning, has some history.  r1404675 [4] added the "pick the right one
   from EXPANDED_SIZE and SIZE" logic, but it was reverted in r1404708 [5].
   Unfortunately, I do not yet understand the reasoning behind this revert,
   so I guess I'll just have to keep investigating.
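
   For reference, a minimal sketch of what the crawler seems to need,
   assuming the fs.h docstring above holds (REP is the parsed
   representation; the surrounding read logic is elided):
   [[[
     /* Treat EXPANDED_SIZE == 0 as "the fulltext length equals the size
        of the rep as stored in the rev file", per the fs.h docstring. */
     svn_filesize_t fulltext_len
       = rep->expanded_size ? rep->expanded_size : rep->size;

     /* ...then read exactly FULLTEXT_LEN bytes of the PLAIN dir rep
        before expecting the serialized hash terminator. */
   ]]]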

3) There is a "uniquifier" concept for the representations that makes the rep
   caching possible.  Each uniquifier consumes a new NODE_ID, so, for example,
   a transaction that requires 5 new NODE_IDs ends up taking 10 of them.
   This assumption is not shared by the recovery procedure, which recovers the
   NEXT_NODE_ID / NEXT_COPY_ID for old repositories by finding the maximum
   NODE_ID / COPY_ID and bumping them by one.

   For example, for an old-format Greek tree repository with db/current
   containing "1 14 1", the recovery procedure will replace the db/current
   contents with "1 13 1".  '13' in this example is the recovered NEXT_NODE_ID
   (but it is also the NODE_ID reserved for the uniquifier of the last "real"
   NODE_ID).  As a consequence, hotcopying this repository (which uses the
   recover_find_max_ids() routine) will end up with *mismatching db/current*
   files in the hotcopy source and destination.

   It looks like generating these uniquifiers for old repositories is
   unnecessary in the first place (they are only required for rep caching,
   which is only available for the newer, >= SVN_FS_FS__MIN_REP_SHARING_FORMAT
   repositories).

   See set_uniquifier() in subversion/libsvn_fs_fs/transaction.c
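
   A hypothetical guard illustrating the point (a sketch, not an actual
   change; the FFD/REP names are only assumed to match the surrounding
   code):
   [[[
     /* Only spend a NODE_ID on a uniquifier when the format supports
        rep-sharing; older formats never read it back. */
     if (ffd->format >= SVN_FS_FS__MIN_REP_SHARING_FORMAT)
       SVN_ERR(set_uniquifier(fs, rep, pool));
   ]]]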

4) hotcopy_update_current() does something quite odd for old repositories.

   It grabs the MAX_NODE_ID / MAX_COPY_ID from svn_fs_fs__find_max_ids(),
   claims them to be the NEXT_NODE_ID / NEXT_COPY_ID and writes them into the
   db/current file.  The maximum values found by the svn_fs_fs__find_max_ids()
   routine should be bumped by one (the same way it is done in recover_body(),
   subversion/libsvn_fs_fs/recovery.c).

   Not doing so will result in the hotcopy destination having incorrect
   NEXT_NODE_ID and NEXT_COPY_ID in the db/current file (pointing to
   already taken ids).
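
   A sketch of the missing step (next_key() and MAX_KEY_SIZE are
   illustrative stand-ins for the base-36 successor logic that
   recover_body() already uses):
   [[[
     char next_node_id[MAX_KEY_SIZE];
     char next_copy_id[MAX_KEY_SIZE];

     /* Turn the maxima found in the revision files into the *next*
        available ids before writing them out. */
     next_key(max_node_id, next_node_id);
     next_key(max_copy_id, next_copy_id);

     /* db/current for old formats then contains:
        "<youngest-rev> <next-node-id> <next-copy-id>\n" */
   ]]]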

In case it helps to understand what is going on, I've made a quick-and-dirty
patch for the 1.8.x branch, which explicitly pinpoints the problems and rushes
toward a green test suite.  Doing this for 1.8.x allows postponing problem 1)
and focusing on 2)-4), because, as I suspect, fixing 1) might get a bit
messy.  [See the attached patch #2]

A proper fix for this issue undoubtedly requires a lot more work.
I think I might be able to come up with a patch for trunk in the near
future (unless someone fixes it earlier).

[1] http://subversion.tigris.org/issues/show_bug.cgi?id=4213
[2] http://svn.apache.org/viewvc?view=revision&revision=r1367683
[3] http://svn.apache.org/viewvc?view=revision&revision=r1503742
[4] http://svn.apache.org/viewvc?view=revision&revision=r1404675
[5] http://svn.apache.org/viewvc?view=revision&revision=r1404708


Thanks and regards,
Evgeny Kotkov

Re: [RFC/PATCH] svnadmin: recover/hotcopy erroring out for old FSFS repositories

Posted by Ivan Zhakov <iv...@visualsvn.com>.
On 31 January 2014 14:57, Ivan Zhakov <iv...@visualsvn.com> wrote:
> On 31 January 2014 05:50, Evgeny Kotkov <ev...@visualsvn.com> wrote:
>> <snip>
> That's what I suspected. In this case I think r1560723 should not be
> backported to 1.8.x for the following reasons:
> 1. Significant performance degradation for 'svnadmin hotcopy' with
> the '--incremental' flag.
I meant "without the '--incremental' flag".


-- 
Ivan Zhakov
CTO | VisualSVN | http://www.visualsvn.com

Re: [RFC/PATCH] svnadmin: recover/hotcopy erroring out for old FSFS repositories

Posted by Ivan Zhakov <iv...@visualsvn.com>.
On 7 April 2014 20:38, Stefan Fuhrmann <st...@wandisco.com> wrote:
> <snip>
> I don't have strong feelings either way. As it makes
> laggards happy, I'm fine with the revert.
>
> Reverted in r1585516.
>
Thanks a lot!

-- 
Ivan Zhakov

Re: [RFC/PATCH] svnadmin: recover/hotcopy erroring out for old FSFS repositories

Posted by Stefan Fuhrmann <st...@wandisco.com>.
On Sun, Apr 6, 2014 at 5:36 PM, Ivan Zhakov <iv...@visualsvn.com> wrote:

> <snip>
> Yes, I know. But the svnadmin hotcopy performance regression was not fixed,
> as far as I know.
>

Oh - sorry!
When I read your post, I didn't expand the quote.

I don't have strong feelings either way. As it makes
laggards happy, I'm fine with the revert.

Reverted in r1585516.

-- Stefan^2.

Re: [RFC/PATCH] svnadmin: recover/hotcopy erroring out for old FSFS repositories

Posted by Ivan Zhakov <iv...@visualsvn.com>.
On 6 April 2014 18:31, Stefan Fuhrmann <st...@wandisco.com> wrote:
> <snip>
> Hi Ivan,
>
> Evgeny applied his patch a month ago as r1573744.
>
Hi Stefan,

Yes, I know. But the svnadmin hotcopy performance regression was not fixed,
as far as I know.


-- 
Ivan Zhakov
CTO | VisualSVN | http://www.visualsvn.com

Re: [RFC/PATCH] svnadmin: recover/hotcopy erroring out for old FSFS repositories

Posted by Stefan Fuhrmann <st...@wandisco.com>.
On Wed, Apr 2, 2014 at 12:13 PM, Ivan Zhakov <iv...@visualsvn.com> wrote:

> <snip>
> Hi Stefan,
>
> Is this problem fixed, or would it be better to revert r1560723 for the
> reasons I stated above?
>

Hi Ivan,

Evgeny applied his patch a month ago as r1573744.

-- Stefan^2.

Re: [RFC/PATCH] svnadmin: recover/hotcopy erroring out for old FSFS repositories

Posted by Ivan Zhakov <iv...@visualsvn.com>.
On 31 January 2014 14:57, Ivan Zhakov <iv...@visualsvn.com> wrote:
> <snip>
> That's what I suspected. In this case I think r1560723 should not be
> backported to 1.8.x for the following reasons:
> 1. Significant performance degradation for 'svnadmin hotcopy' with
> the '--incremental' flag.
> 2. The change doesn't fix the original circular dependency problem in FSFS.
> 3. 'svnadmin recover' writes a fake value to the CURRENT file, which is very
> bad practice IMHO and could lead to undefined consequences.
>
Hi Stefan,

Is this problem fixed, or would it be better to revert r1560723 for the
reasons I stated above?

-- 
Ivan Zhakov

Re: [RFC/PATCH] svnadmin: recover/hotcopy erroring out for old FSFS repositories

Posted by Ivan Zhakov <iv...@visualsvn.com>.
On 31 January 2014 05:50, Evgeny Kotkov <ev...@visualsvn.com> wrote:
> <snip>
> I am not aware of how many people reshard or dump/load their old repositories
> after upgrade, but I did a quick benchmark for a real-world repository and on my
> machine it shows 7x performance degradation with checkpointing enabled.  Is it
> worth the ability to re-run the backup from a checkpoint upon cancellation?
>
That's what I suspected. In this case I think r1560723 should not be
backported to 1.8.x for the following reasons:
1. Significant performance degradation for 'svnadmin hotcopy' with
the '--incremental' flag.
2. The change doesn't fix the original circular dependency problem in FSFS.
3. 'svnadmin recover' writes a fake value to the CURRENT file, which is very
bad practice IMHO and could lead to undefined consequences.


-- 
Ivan Zhakov
CTO | VisualSVN | http://www.visualsvn.com

Re: [RFC/PATCH] svnadmin: recover/hotcopy erroring out for old FSFS repositories

Posted by Evgeny Kotkov <ev...@visualsvn.com>.
> This only affects non-sharded repositories with rev-local IDs,
> i.e. those in SVN 1.4 format. For those, it is writing 3 files
> instead of 2 per rev now.

I assume you are talking about r1560723 [1].  If I read the code correctly...
[[[
      if (   (!max_files_per_dir || rev % max_files_per_dir == 0)
          && dst_ffd->format >= SVN_FS_FS__MIN_NO_GLOBAL_IDS_FORMAT)
        SVN_ERR(hotcopy_update_current(&dst_youngest, dst_fs, rev, iterpool));
]]]

...hotcopy now checkpoints after every revision for all non-sharded
repositories with FSFS format >= 3.  So, checkpointing happens for all FSFS
format 1/2 repositories upgraded via 'svnadmin upgrade' (which keep the
linear layout) that were not fsfs-reshard'ed or dump/loaded into a repository
with a newer format.
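
For illustration, a narrower condition that would skip the per-revision
db/current rewrite for linear layouts might look like this (a sketch of the
idea only, not a tested change): checkpoint only at shard boundaries.
[[[
      if (   max_files_per_dir && rev % max_files_per_dir == 0
          && dst_ffd->format >= SVN_FS_FS__MIN_NO_GLOBAL_IDS_FORMAT)
        SVN_ERR(hotcopy_update_current(&dst_youngest, dst_fs, rev, iterpool));
]]]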

I am not aware of how many people reshard or dump/load their old repositories
after upgrade, but I did a quick benchmark for a real-world repository and on my
machine it shows 7x performance degradation with checkpointing enabled.  Is it
worth the ability to re-run the backup from a checkpoint upon cancellation?

[[[
  # svnrdump http://googletest.googlecode.com/svn/trunk/
  # load the dump into a --compatible-version=1.3 repository
  # upgrade the repository with the most recent svnadmin

  # disable checkpointing, benchmark making 100 hotcopy backups:
  real 0m18.741s
  user 0m2.552s
  sys 0m13.432s

  # now enable checkpointing and repeat the benchmark:
  real 2m5.793s
  user 0m0.836s
  sys 0m34.840s
]]]

[1] http://svn.apache.org/viewvc?view=revision&revision=r1560723


Regards,
Evgeny Kotkov

Re: [RFC/PATCH] svnadmin: recover/hotcopy erroring out for old FSFS repositories

Posted by Stefan Fuhrmann <st...@wandisco.com>.
On Mon, Jan 27, 2014 at 10:25 AM, Ivan Zhakov <iv...@visualsvn.com> wrote:

> Hi Stefan,
>
>
> On 27 January 2014 11:21, Stefan Fuhrmann <st...@wandisco.com> wrote:
> > <snip>
> >
> > Fixed in /trunk and proposed for 1.8.x. The problem was that
> > you had to bump the youngest rev before you could access
> > the latest rev in the destination repo.
> >
> Are you sure about this fix? Bumping the youngest rev after copying every
> revision will slow down 'svnadmin hotcopy' significantly. Did we bump the
> youngest rev in 1.7.x?
>

This only affects non-sharded repositories with rev-local IDs,
i.e. those in SVN 1.4 format. For those, it is writing 3 files
instead of 2 per rev now.

The question is why people would not upgrade (shard) those
repositories while they are still in use (thus, the need for hotcopy).
The only use-case I can think of is a repo with a moderate number
of revs but those revs being quite large. In that case, updating
one more file per rev shouldn't be significant.

On trunk, we get an extra benefit for that use-case: we can
interrupt the hotcopy at any time and continue with --incremental.

Do you think it is worth adding another patch to 1.8.x that skips
the revnum bump for 1.4 repos?

-- Stefan^2.

Re: [RFC/PATCH] svnadmin: recover/hotcopy erroring out for old FSFS repositories

Posted by Ivan Zhakov <iv...@visualsvn.com>.
Hi Stefan,


On 27 January 2014 11:21, Stefan Fuhrmann <st...@wandisco.com> wrote:
> On Thu, Jan 23, 2014 at 11:00 AM, Evgeny Kotkov <ev...@visualsvn.com> wrote:
>> <snip>
>
>
> Fixed in /trunk and proposed for 1.8.x. The problem was that
> you had to bump the youngest rev before you could access
> the latest rev in the destination repo.
>
Are you sure about this fix? Bumping the youngest rev after copying every
revision will slow down 'svnadmin hotcopy' significantly. Did we bump the
youngest rev in 1.7.x?


-- 
Ivan Zhakov
CTO | VisualSVN | http://www.visualsvn.com

Re: [RFC/PATCH] svnadmin: recover/hotcopy erroring out for old FSFS repositories

Posted by Stefan Fuhrmann <st...@wandisco.com>.
On Thu, Jan 23, 2014 at 11:00 AM, Evgeny Kotkov <evgeny.kotkov@visualsvn.com> wrote:

> <snip>
>    So, basically, we're calling a routine that assumes the db/current file
>    is up-to-date in order to get the information on how to update the
>    outdated db/current file.  This ends with an "E160006: No such
>    revision 1" error.
>

Fixed in /trunk and proposed for 1.8.x. The problem was that
you had to bump the youngest rev before you could access
the latest rev in the destination repo.

> 2) The reason for the "hash missing terminator" error is a bit trickier.
>
>    The NEXT_NODE_ID / NEXT_COPY_ID recovery code recursively crawls
>    through the dir representations in the db; however, it assumes that
>    the EXPANDED_SIZE in the rep description is always non-zero.
> <snip>
>

That one gave me quite a headache - the fix was simple
but I had to check when EXPANDED_SIZE would be set
to 0 in old releases and whether old releases would work
with the new cases introduced in 1.8.x.

As it turned out, this only happens for non-deltified directories,
and those use the same reader logic as properties, i.e.
they happen to work (recovery being the only broken part).

Recovery has been fixed and we also never write 0 for
EXPANDED_SIZE anymore - both on /trunk and as a
backport for 1.8.x.
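
For what it's worth, the write-side invariant boils down to something like
this sketch (illustrative, not the literal commit):
[[[
  /* Never record 0 for EXPANDED_SIZE; for PLAIN reps the fulltext
     length equals the on-disk length anyway. */
  if (rep->expanded_size == 0)
    rep->expanded_size = rep->size;
]]]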

> 3) There is a "uniquifier" concept for the representations that makes
>    rep caching possible.
> <snip>
>    See set_uniquifier() in subversion/libsvn_fs_fs/transaction.c
>

A simple oversight. Fixed on /trunk & proposed for 1.8.x.


> 4) hotcopy_update_current() does something quite odd for old repositories.
> <snip>
>    Not doing so will result in the hotcopy destination having incorrect
>    NEXT_NODE_ID and NEXT_COPY_ID in the db/current file (pointing to
>    already taken ids).
>

Yup. Off-by-one. Also fixed.

> <snip>
>
Thanks for the keen eye!

-- Stefan^2.

RE: [RFC/PATCH] svnadmin: recover/hotcopy erroring out for old FSFS repositories

Posted by Bert Huijben <be...@qqmail.nl>.

> -----Original Message-----
> From: Evgeny Kotkov [mailto:evgeny.kotkov@visualsvn.com]
> Sent: Thursday, 23 January 2014 11:00
> To: dev@subversion.apache.org
> Subject: [RFC/PATCH] svnadmin: recover/hotcopy erroring out for old FSFS
> repositories

<snip>

> In case it helps to understand what is going on, I've made a quick-and-dirty
> patch for the 1.8.x branch, which explicitly pinpoints the problems and
> rushes toward a green test suite.  Doing this for 1.8.x allows postponing
> problem 1) and focusing on 2)-4), because, as I suspect, fixing 1) might
> get a bit messy.  [See the attached patch #2]

Thanks for this detailed error report. I committed your test cases to make it
very clear that we have some problems to solve.

I'll wait for others to respond in depth, as I'm not really familiar enough
with this code yet to choose the right approach here. (I see multiple kinds
of regressions in your report.)

> <snip>

Thanks,

	Bert


Re: [RFC/PATCH] svnadmin: recover/hotcopy erroring out for old FSFS repositories

Posted by Stefan Fuhrmann <st...@wandisco.com>.
On Thu, Jan 23, 2014 at 11:00 AM, Evgeny Kotkov <evgeny.kotkov@visualsvn.com> wrote:

> <snip>

Hi Evgeny,

Thanks for the patches. I've been digging into the code
to figure out how and why the size==0 came about. But
I'll not be able to finish that bit before Saturday.

-- Stefan^2.