Posted to users@subversion.apache.org by Alfred von Campe <al...@von-campe.com> on 2014/10/03 23:05:06 UTC

Strange problem with filtering dump files

We have a repo that we want to split up into multiple repos.  The strategy to do this is fairly simple:

  1. Dump repo using svnadmin dump
  2. Construct a list of paths we want to exclude by grepping for Node-path: in the dump file and grepping that output for the directory names to exclude
  3. Use svndumpsanitizer (svndumpfilter has too many issues) to either include or exclude the directories from the list generated in the previous step, creating a new dump file
  4. Create new repos and use svnadmin load from new dump files
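
As a rough illustration of step 2, the sketch below does in Python what the two grep passes do: it scans the dump for Node-path: header lines and keeps the directory prefix of every path that has a component matching one of the names to exclude. The function name paths_to_exclude and the file name repo.dump are assumptions made for this example, not part of the original workflow. Like grep, it scans line by line, so a file whose contents happen to contain a line starting with "Node-path: " would produce a false match.

```python
import sys
from typing import Iterable, List

PREFIX = b"Node-path: "


def paths_to_exclude(dump_lines: Iterable[bytes], dir_names: Iterable[str]) -> List[str]:
    """Return sorted, unique directory prefixes (e.g. 'trunk/foo') for every
    Node-path that contains a component equal to one of dir_names."""
    wanted = set(dir_names)
    found = set()
    for raw in dump_lines:
        if not raw.startswith(PREFIX):
            continue
        path = raw[len(PREFIX):].rstrip(b"\r\n").decode("utf-8", "replace")
        parts = path.split("/")
        for i, part in enumerate(parts):
            if part in wanted:
                # Keep the path up to and including the matching directory.
                found.add("/".join(parts[: i + 1]))
                break
    return sorted(found)


if __name__ == "__main__" and len(sys.argv) > 2:
    # Hypothetical usage: python list_paths.py repo.dump foo
    with open(sys.argv[1], "rb") as dump:
        for p in paths_to_exclude(dump, sys.argv[2:]):
            print(p)
```

Matching on whole path components, rather than substrings, keeps a directory such as trunk/food from being picked up when the name to exclude is foo.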

This mostly works except for one very bizarre problem.  One particular directory I am trying to filter out consistently only gets partially excluded.  For example, imagine a subdirectory named foo that contains a few subdirectories and a few dozen source files.  I use trunk/foo (and branches/branch1/foo, branches/branch2/foo, etc.) as the paths to exclude, yet the directory foo and *one* of its subdirectories with *some* of its source files are not filtered from the trunk and all branches.  I can’t figure out what’s different or special about these files.

What is the best way to debug this issue?  This is my first time trying to split a repository.  In case it matters, we are using Subversion 1.7.5 on the server (svnadmin dump) and I am doing all my testing with a 1.8.10 client.

Alfred

Re: Strange problem with filtering dump files

Posted by Ryan Schmidt <su...@ryandesign.com>.
On Oct 3, 2014, at 4:05 PM, Alfred von Campe wrote:

> We have a repo that we want to split up into multiple repos.  The strategy to do this is fairly simple:
> 
>  1. Dump repo using svnadmin dump
>  2. Construct a list of paths we want to exclude by grepping for Node-path: in the dump file and grepping that output for the directory names to exclude
>  3. Use svndumpsanitizer (svndumpfilter has too many issues) to either include or exclude the directories from the list generated in the previous step, creating a new dump file
>  4. Create new repos and use svnadmin load from new dump files
> 
> This mostly works except for one very bizarre problem.  One particular directory I am trying to filter out consistently only gets partially excluded.  For example, imagine a subdirectory named foo that contains a few subdirectories and a few dozen source files.  I use trunk/foo (and branches/branch1/foo, branches/branch2/foo, etc.) as the paths to exclude, yet the directory foo and *one* of its subdirectories with *some* of its source files are not filtered from the trunk and all branches.  I can’t figure out what’s different or special about these files.
> 
> What is the best way to debug this issue?  This is my first time trying to split a repository.  In case it matters, we are using Subversion 1.7.5 on the server (svnadmin dump) and I am doing all my testing with a 1.8.10 client.

Were these files/directories ever anywhere else in the repository (i.e. were they renamed or moved to their present location at some point)? Try filtering out those paths as well, even if they don't exist anymore in the head of the repository.
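
One way to check for such earlier locations is to look at the copy history recorded in the dump itself. The sketch below assumes the standard dump layout (a block of "Key: value" headers, a blank line, then Content-length bytes of body) and reports every node under an excluded path that was copied from somewhere outside the excluded paths. The function names read_records and copy_sources are invented for this example, and this is only a diagnostic aid, not a description of how svndumpsanitizer works internally.

```python
import sys
from typing import BinaryIO, Dict, Iterable, Iterator, List, Tuple


def read_records(stream: BinaryIO) -> Iterator[Dict[str, str]]:
    """Yield the header block of each record in a dump stream, skipping bodies."""
    while True:
        line = stream.readline()
        if not line:
            return  # end of file
        if line == b"\n":
            continue  # blank separator between records
        headers: Dict[str, str] = {}
        while line not in (b"", b"\n"):
            text = line.rstrip(b"\n").decode("utf-8", "replace")
            key, _, value = text.partition(": ")
            headers[key] = value
            line = stream.readline()
        # Read past the property/text body so file contents are never
        # mistaken for headers.
        stream.read(int(headers.get("Content-length", "0")))
        yield headers


def copy_sources(stream: BinaryIO, excluded: Iterable[str]) -> List[Tuple[str, int, str]]:
    """Return (copyfrom_path, copyfrom_rev, node_path) for each node under an
    excluded prefix that was copied from outside every excluded prefix."""
    prefixes = [p.strip("/") for p in excluded]

    def under(path: str) -> bool:
        return any(path == p or path.startswith(p + "/") for p in prefixes)

    hits = []
    for h in read_records(stream):
        path = h.get("Node-path")
        src = h.get("Node-copyfrom-path")
        if path is None or src is None:
            continue
        src = src.lstrip("/")  # tolerate a leading slash, just in case
        if under(path) and not under(src):
            hits.append((src, int(h["Node-copyfrom-rev"]), path))
    return hits


if __name__ == "__main__" and len(sys.argv) > 2:
    # Hypothetical usage: python copy_sources.py repo.dump trunk/foo branches/branch1/foo
    with open(sys.argv[1], "rb") as dump:
        for src, rev, dst in copy_sources(dump, sys.argv[2:]):
            print(f"{dst} was copied from {src}@{rev}")
```

Any source path this prints is a candidate to add to the exclude list, even if it no longer exists in the head of the repository.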