Posted to users@subversion.apache.org by Rajesh Kumar <Ra...@jda.com> on 2015/03/08 05:57:51 UTC

Copy and Reduce the size of SVN repos

I have one huge SVN repository which is around 1TB in size. I have two requirements as follows, and I would like to know the best approach to follow to save time and effort.
1.    Duplicating the whole 1TB repository in a short span of time to create another SVN repository.
2.    How can I reduce the repository size drastically without impacting the integrity and versions of the files? My repository is 1TB and I want to make it smaller without deleting any files. What are the ways of doing so?

-Rajesh


Re: Copy and Reduce the size of SVN repos

Posted by Les Mikesell <le...@gmail.com>.
On Wed, Mar 11, 2015 at 2:00 AM, Stümpfig, Thomas
<th...@siemens.com> wrote:
> Actually splitting projects is not a solution for something that eliminates old data.

Correct, but if we give up on getting a working obliterate, we are
left with dump/filter/load as the only way to administer content.  And
as a practical matter, how many dump/filter/load cycles do you want to
do on repositories after they go over a few hundred gigs with all of
your development teams waiting for you to get the filters right to
match all the distributed cruft?    Also, in many cases over the years
whole projects become obsolete so getting rid of or archiving that
part would be easy if you had used the 'directory of repositories'
approach instead of 'repository of projects' and everything would have
worked about the same.
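For readers who have not run the cycle Les describes, a dump/filter/load pass looks roughly like the sketch below; the repository paths and the excluded path are hypothetical examples.

```shell
# Sketch of a dump/filter/load cycle (all paths are hypothetical).
# 1. Dump the entire history of the existing repository.
svnadmin dump /srv/svn/oldrepo > repo.dump

# 2. Filter out an unwanted path; --drop-empty-revs and --renumber-revs
#    keep the resulting history tidy.
svndumpfilter exclude bulky-binaries \
    --drop-empty-revs --renumber-revs < repo.dump > filtered.dump

# 3. Load the filtered history into a freshly created repository.
svnadmin create /srv/svn/newrepo
svnadmin load /srv/svn/newrepo < filtered.dump
```

This also illustrates the pain point above: if anything was ever copied between a filtered path and a kept path, the load fails, and the filter list has to be adjusted and the whole cycle rerun.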

-- 
   Les Mikesell
     lesmikesell@gmail.com

RE: Copy and Reduce the size of SVN repos

Posted by Stümpfig, Thomas <th...@siemens.com>.
Hi all,
Actually splitting projects is not a solution for something that eliminates old data. Think of a project with one file only. Legally one might be forced to keep the file for at least 5 or 10 years, but after this period the very same old revisions of the file must be destroyed because of other legal or contractual obligations. There is reason enough for the final deletion of old data. I appreciate very much the work of open source programmers, and as a matter of fact we deal with svn's limitations, as we use it with much success for our purposes. That said, one of the most wanted features is obliteration.

Regards
Thomas

-----Original Message-----
From: Les Mikesell [mailto:lesmikesell@gmail.com]
Sent: Tuesday, 10 March 2015 23:37
To: Nico Kadel-Garcia
Cc: Branko Čibej; Subversion
Subject: Re: Copy and Reduce the size of SVN repos

On Sun, Mar 8, 2015 at 8:27 PM, Nico Kadel-Garcia <nk...@gmail.com> wrote:
> >>
>> Heh, I have to ask, where did you find that doctrine? There's no such
>> thing. It's all a lot more mundane: First, you have to get people to
>
> I've had to deal with that doctrine personally and professionally
> since first working with Subversion in 2006. It comes up again every
> so often, for example in
> http://subversion.tigris.org/issues/show_bug.cgi?id=516 and is
> relevant to the original poster's request.
>
> There can be both software and legal reasons to ensure that the
> history is pristine and never forgets a single byte. But in most
> shops, for any lengthy project, *someone* is going to submit
> unnecessary bulky binaries, and *someone* is going to create spurious
> branches, tags, or other subdirectories that should go the way of the
> passenger pigeon.
>
>> agree what "obliterate" actually means; there are about five meanings
>> that I know of. And second, all five are insanely hard to implement
>> with our current repository design (just ask Julian, he spent about a
>> year trying to come up with a sane, moderately backwards-compatible solution).
>>
>> -- Brane
>
> I appreciate that it's been awkward. The only workable method
> now is the sort of "svn export; svn import to new repo and discard old
> repo" that I described, or a potentially dangerous and often fragile
> dump, filter, and reload approach that preserves the original URLs
> for the repo, but it's really not the same repo.
>
> It remains messy as heck. This is, in fact, one of the places where
> git or other systems' more gracious exclusion or garbage collection
> tools do better. Even CVS had the ability to simply delete a
> directory on the main fileserver to discard old debris: it's one of
> the risks of the more database based approach of Subversion to
> managing the entire repository history.

Maybe it is time to change the request from 'obliterate' to _any_
reasonable way to fix a repository that has accumulated cruft.  And a
big warning to new users to put separate projects in separate
repositories from the start because they are too hard to untangle
later.  I've considered dumping ours and trying to split by project,
but I'm not even sure that is possible because many were imported from
CVS then subsequently moved to improve the layout.  So I can't really
filter by path.

--
   Les Mikesell
     lesmikesell@gmail.com

Re: Copy and Reduce the size of SVN repos

Posted by Les Mikesell <le...@gmail.com>.
On Sun, Mar 8, 2015 at 8:27 PM, Nico Kadel-Garcia <nk...@gmail.com> wrote:
> >>
>> Heh, I have to ask, where did you find that doctrine? There's no such
>> thing. It's all a lot more mundane: First, you have to get people to
>
> I've had to deal with that doctrine personally and professionally
> since first working with Subversion in 2006. It comes up again every
> so often, for example in
> http://subversion.tigris.org/issues/show_bug.cgi?id=516 and is
> relevant to the original poster's request.
>
> There can be both software and legal reasons to ensure that the
> history is pristine and never forgets a single byte. But in most
> shops, for any lengthy project, *someone* is going to submit
> unnecessary bulky binaries, and *someone* is going to create spurious
> branches, tags, or other subdirectories that should go the way of the
> passenger pigeon.
>
>> agree what "obliterate" actually means; there are about five meanings
>> that I know of. And second, all five are insanely hard to implement with
>> our current repository design (just ask Julian, he spent about a year
>> trying to come up with a sane, moderately backwards-compatible solution).
>>
>> -- Brane
>
> I appreciate that it's been awkward. The only workable method
> now is the sort of "svn export; svn import to new repo and discard old
> repo" that I described, or a potentially dangerous and often fragile
> dump, filter, and reload approach that preserves the original URLs
> for the repo, but it's really not the same repo.
>
> It remains messy as heck. This is, in fact, one of the places where
> git or other systems' more gracious exclusion or garbage collection
> tools do better. Even CVS had the ability to simply delete a
> directory on the main fileserver to discard old debris: it's one of
> the risks of the more database based approach of Subversion to
> managing the entire repository history.

Maybe it is time to change the request from 'obliterate' to _any_
reasonable way to fix a repository that has accumulated cruft.   And a
big warning to new users to put separate projects in separate
repositories from the start because they are too hard to untangle
later.  I've considered dumping ours and trying to split by project,
but I'm not even sure that is possible because many were imported from
CVS then subsequently moved to improve the layout.  So I can't really
filter by path.

-- 
   Les Mikesell
     lesmikesell@gmail.com

Re: Copy and Reduce the size of SVN repos

Posted by Nico Kadel-Garcia <nk...@gmail.com>.
On Sun, Mar 8, 2015 at 12:42 PM, Branko Čibej <br...@wandisco.com> wrote:
> On 08.03.2015 09:35, Nico Kadel-Garcia wrote:
>> On Sat, Mar 7, 2015 at 11:57 PM, Rajesh Kumar <Ra...@jda.com> wrote:
>>> I have one huge SVN repository which is around 1TB in size. I have two
>>> requirements as follows, and I would like to know the best approach to
>>> follow to save time and effort.
>> According to the doctrine of "there shall be no obliterate command,
>> the record must be kept absolutely pristine at all costs, praise the
>> gospel of all history matters!",
>
> Heh, I have to ask, where did you find that doctrine? There's no such
> thing. It's all a lot more mundane: First, you have to get people to

I've had to deal with that doctrine personally and professionally
since first working with Subversion in 2006. It comes up again every
so often, for example in
http://subversion.tigris.org/issues/show_bug.cgi?id=516 and is
relevant to the original poster's request.

There can be both software and legal reasons to ensure that the
history is pristine and never forgets a single byte. But in most
shops, for any lengthy project, *someone* is going to submit
unnecessary bulky binaries, and *someone* is going to create spurious
branches, tags, or other subdirectories that should go the way of the
passenger pigeon.

> agree what "obliterate" actually means; there are about five meanings
> that I know of. And second, all five are insanely hard to implement with
> our current repository design (just ask Julian, he spent about a year
> trying to come up with a sane, moderately backwards-compatible solution).
>
> -- Brane

I appreciate that it's been awkward. The only workable method
now is the sort of "svn export; svn import to new repo and discard old
repo" that I described, or a potentially dangerous and often fragile
dump, filter, and reload approach that preserves the original URLs
for the repo, but it's really not the same repo.

It remains messy as heck. This is, in fact, one of the places where
git or other systems' more gracious exclusion or garbage collection
tools do better. Even CVS had the ability to simply delete a
directory on the main fileserver to discard old debris: it's one of
the risks of the more database based approach of Subversion to
managing the entire repository history.

Re: Copy and Reduce the size of SVN repos

Posted by Les Mikesell <le...@gmail.com>.
On Sun, Mar 8, 2015 at 3:31 PM, Tony Sweeney <sw...@addr.com> wrote:
>
> As I recall, this was feature request #13 after Perforce was released, and was implemented the best part of 15 years ago.  As near as I can tell it's architecturally impossible to implement in Subversion as a consequence of some of the initial design choices.  Subversion has served me well, but this has been a glaring misfeature since its inception:
>

I have to agree.  I can't imagine anyone using Subversion for any
length of time without having some things committed that shouldn't be
there.  It probably would still be the main topic of conversation
here if everyone had not simply given up hope long ago.

-- 
  Les Mikesell
      lesmikesell@gmail.com

Re: Copy and Reduce the size of SVN repos

Posted by Tony Sweeney <sw...@addr.com>.
On 03/08/15 16:42, Branko Čibej wrote:
> On 08.03.2015 09:35, Nico Kadel-Garcia wrote:
>> On Sat, Mar 7, 2015 at 11:57 PM, Rajesh Kumar <Ra...@jda.com> wrote:
>>> I have one huge SVN repository which is around 1TB in size. I have two
>>> requirements as follows, and I would like to know the best approach to
>>> follow to save time and effort.
>> According to the doctrine of "there shall be no obliterate command,
>> the record must be kept absolutely pristine at all costs, praise the
>> gospel of all history matters!",
> 
> Heh, I have to ask, where did you find that doctrine? There's no such
> thing. It's all a lot more mundane: First, you have to get people to
> agree what "obliterate" actually means; there are about five meanings
> that I know of. And second, all five are insanely hard to implement with
> our current repository design (just ask Julian, he spent about a year
> trying to come up with a sane, moderately backwards-compatible solution).
> 
> -- Brane
> 
> 
root@fractal:~ # p4 help obliterate

    obliterate -- Remove files and their history from the depot

    p4 obliterate [-y -A -b -a -h] file[revRange] ...

	Obliterate permanently removes files and their history from the server.
	(See 'p4 delete' for the non-destructive way to delete a file.)
	Obliterate retrieves the disk space used by the obliterated files
	in the archive and clears the files from the metadata that is
	maintained by the server.  Files in client workspaces are not
	physically affected, but they are no longer under Perforce control.

	Obliterate is aware of lazy copies made when 'p4 integrate' creates
	a branch, and does not remove copies that are still in use. Because
	of this, obliterating files does not guarantee that the corresponding
	files in the archive will be removed.

	If the file argument has a revision, the specified revision is
	obliterated.  If the file argument has a revision range, the
	revisions in that range are obliterated.  See 'p4 help revisions'
	for help.

	By default, obliterate displays a preview of the results. To execute
	the operation, you must specify the -y flag.

	By default, obliterate will not process a revision which has been
	archived. To include such revisions, you must specify the -A flag.

	Obliterate has three flags that can improve performance:

	The '-b' flag restricts files in the argument range to those that
	are branched and are both the first revision and the head revision
	This flag is useful for removing old branches while keeping files
	of interest (files that were modified).

	The '-a' flag skips the archive search and removal phase.  This
	phase of obliterate can take a very long time for sites with big
	archive maps (db.archmap).  However, file content is not removed;
	if the file was a branch, then it's most likely that the archival
	search is not necessary.  This option is safe to use with the '-b'
	option.

	The '-h' flag instructs obliterate not to search db.have for all
	possible matching records to delete.  Usually, db.have is one of the
	largest tables in a repository and consequently this search takes
	a long time.  Do not use this flag when obliterating branches or
	namespaces for reuse,  because the old content on any client
	will not match the newly-added repository files.  Note that use of
	the -h flag has the side-effect of cleaning the obliterated files
	from client workspaces when they are synced.

	If you are obliterating files in order to entirely remove a depot
	from the server, and files in that depot have been integrated to
	other depots, run 'p4 snap' first to break those linkages, so that
	obliterate can remove the unreferenced archive files. If, instead,
	you specify '-a' to skip the archive removal phase, then you will
	need to specify '-f' when deleting the depot, since the presence
	of the archive files will prevent the depot deletion.

	'p4 obliterate' requires 'admin' access, which is granted by 'p4
	protect'.

root@fractal:~ # 

As I recall, this was feature request #13 after Perforce was released, and was implemented the best part of 15 years ago.  As near as I can tell it's architecturally impossible to implement in Subversion as a consequence of some of the initial design choices.  Subversion has served me well, but this has been a glaring misfeature since its inception:

http://svn.haxx.se/dev/archive-2003-01/0364.shtml

Tony.


Re: Copy and Reduce the size of SVN repos

Posted by Branko Čibej <br...@wandisco.com>.
On 08.03.2015 09:35, Nico Kadel-Garcia wrote:
> On Sat, Mar 7, 2015 at 11:57 PM, Rajesh Kumar <Ra...@jda.com> wrote:
>> I have one huge SVN repository which is around 1TB in size. I have two
>> requirements as follows, and I would like to know the best approach to
>> follow to save time and effort.
> According to the doctrine of "there shall be no obliterate command,
> the record must be kept absolutely pristine at all costs, praise the
> gospel of all history matters!",

Heh, I have to ask, where did you find that doctrine? There's no such
thing. It's all a lot more mundane: First, you have to get people to
agree what "obliterate" actually means; there are about five meanings
that I know of. And second, all five are insanely hard to implement with
our current repository design (just ask Julian, he spent about a year
trying to come up with a sane, moderately backwards-compatible solution).

-- Brane

Re: Copy and Reduce the size of SVN repos

Posted by Nico Kadel-Garcia <nk...@gmail.com>.
On Sat, Mar 7, 2015 at 11:57 PM, Rajesh Kumar <Ra...@jda.com> wrote:
> I have one huge SVN repository which is around 1TB in size. I have two
> requirements as follows, and I would like to know the best approach to
> follow to save time and effort.

According to the doctrine of "there shall be no obliterate command,
the record must be kept absolutely pristine at all costs, praise the
gospel of all history matters!", you don't. In theory, history is kept
pristine and cannot be discarded. Sometimes there are even good
historical or legal reasons to do so. Personally, I consider it like
"cleaning your plate". It's a good idea when you're 5 years old,
because your folks want you to eat your vegetables instead of candy
later and they know better than you that you *will* get hungry again
quite soon. But for grownups, with the big pile of starchy empty
content from abandoned branches, and fatty binaries that will just
clog your backups and your workflow and give you a coronary when you
realize *how much* cruft is in the failover backup system, I find it
OK to say "no: we've had enough history" and send some of it to the
wastebasket.

In practice, when your repository has reached a full Terabyte, it's
out of hand and has probably wound up cluttered with unnecessary
binary content, such as jar files, RPMs, or ISO images. If it's where
you keep a year of binary releases of a big project, OK, but
otherwise, I think not.

> 1.    Duplicating the whole 1TB repository in a short span of time to
> create another SVN repository.

This is straightforward and usually the way to do it if you're in a rush.

> 2.    How can I reduce the repository size drastically without impacting
> the integrity and versions of the files? My repository is 1TB and I want
> to make it smaller without deleting any files. What are the ways of doing
> so?

You can't reduce it much without cutting out history. You *can* set a
final tag, do an export of *that*, and import it into a new
repository, or dump the tag and load it as the trunk of a much, much
smaller new repository, *lock the old repository permanently*, and
*make people check out new clean working copies from the new
repository*. I've done that very effectively in a number of
professional environments, when individual products could and should
have been forked off to separate repositories.
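A sketch of that export/import approach; all URLs, paths, and tag names here are hypothetical examples.

```shell
# Export a clean, history-free snapshot of the final tag.
svn export https://svn.example.com/oldrepo/tags/final-1.0 snapshot

# Create a fresh repository and import the snapshot as its trunk.
svnadmin create /srv/svn/newrepo
svn import snapshot file:///srv/svn/newrepo/trunk \
    -m "Import final snapshot; full history stays in the locked old repo"
```

Locking the old repository can be done with a pre-commit hook that unconditionally rejects commits, so its history stays readable but frozen.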

> -Rajesh

Re: Copy and Reduce the size of SVN repos

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
Ryan Schmidt wrote on Sun, Mar 08, 2015 at 23:33:14 -0500:
> 
> On Mar 7, 2015, at 10:57 PM, Rajesh Kumar wrote:
> 
> > I have one huge SVN repository which is around 1TB in size. I have two requirements as follows, and I would like to know the best approach to follow to save time and effort.
> > 
> > 1.    Duplicating the whole 1TB repository in a short span of time to create another SVN repository.
> > 
> > 2.    How can I reduce the repository size drastically without impacting the integrity and versions of the files? My repository is 1TB and I want to make it smaller without deleting any files. What are the ways of doing so?
> 
> How long has your repository been in operation? With what version of Subversion did you create it originally?

I don't think there's a good way to answer the latter question.
However, our documentation expects people to be able to answer it, e.g.:
https://subversion.apache.org/docs/release-notes/1.9#format7-comparison

Perhaps the FS backends should start recording their SVN_VER_NUMBER as
an immutable part of the filesystem at creation time, and 'svnadmin
info' could show it?

Daniel

> I ask because newer versions of Subversion store revisions more efficiently
> than older versions. If your repository was created with, say, Subversion
> 1.4, and you dump it and load the dump into a new repository created by
> Subversion 1.8, it will probably be smaller on disk, while containing exactly
> the same data. There may also be settings you can set in the new repository
> (before loading) that would make it even smaller.

Re: Copy and Reduce the size of SVN repos

Posted by Ryan Schmidt <su...@ryandesign.com>.
On Mar 7, 2015, at 10:57 PM, Rajesh Kumar wrote:

> I have one huge SVN repository which is around 1TB in size. I have two requirements as follows, and I would like to know the best approach to follow to save time and effort.
> 
> 1.    Duplicating the whole 1TB repository in a short span of time to create another SVN repository.
> 
> 2.    How can I reduce the repository size drastically without impacting the integrity and versions of the files? My repository is 1TB and I want to make it smaller without deleting any files. What are the ways of doing so?

How long has your repository been in operation? With what version of Subversion did you create it originally?

I ask because newer versions of Subversion store revisions more efficiently than older versions. If your repository was created with, say, Subversion 1.4, and you dump it and load the dump into a new repository created by Subversion 1.8, it will probably be smaller on disk, while containing exactly the same data. There may also be settings you can set in the new repository (before loading) that would make it even smaller.
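In outline, the dump/load cycle Ryan refers to looks like this sketch (paths are hypothetical; the new repository is created with the newer svnadmin so it gets the newer on-disk format):

```shell
# Dump the full history of the old repository...
svnadmin dump /srv/svn/repo > repo.dump

# ...then create and load a new repository using a newer svnadmin,
# which lays revisions out in the newer, more space-efficient format.
svnadmin create /srv/svn/repo-new
svnadmin load /srv/svn/repo-new < repo.dump
```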



Re: flushing caches upon repository replacement - was: Copy and Reduce the size of SVN repos

Posted by Ben Reser <be...@reser.org>.
On 3/9/15 3:12 PM, Ivan Zhakov wrote:
> On 9 March 2015 at 23:31, Daniel Shahaf <d....@daniel.shahaf.name> wrote:
>> Andreas Stieger wrote on Sun, Mar 08, 2015 at 17:52:55 +0100:
>>> On 08/03/15 17:45, Branko Čibej wrote:
>>>> And it bears repeating: If you replace a repository, please make sure to
>>>> restart Apache and/or svnserve to clear stale caches.
>>>
>>> Is there something that can be done in the code to take care of that?
>>> Like watching the inode as {uuid,inode}-2-tuple and flush the caches if
>>> it changes?
> This problem has already been fixed in r1618138 [1].
> 
> [1] http://svn.apache.org/r1618138

The reason Branko said you need to restart the server after replacing the
repository isn't fixed in a released version (nor will it be fixed in 1.9).
1.9 does fix collisions between two repositories at different paths that have
the same UUID.  The original commit (the one you linked in your post) tried to
solve both, but we ultimately reverted the instance ids being in the cache keys
because the problem is impossible to fully fix given our architecture (there's
a note from the future on the commit message you mention pointing to the
discussion about this and why it was reverted).

But the short version is that even inserting the instance ids into the cache
keys only narrows the window for the problems that arise from not doing what
Branko suggests (restarting the server after replacing a repository).  Most of
what we end up fixing are the warning signs that help educate users that they
shouldn't be doing this; i.e., we eliminate errors in the non-destructive cases
and leave the potentially destructive cases, which are less likely to happen.



Re: flushing caches upon repository replacement - was: Copy and Reduce the size of SVN repos

Posted by Ivan Zhakov <iv...@visualsvn.com>.
On 9 March 2015 at 23:31, Daniel Shahaf <d....@daniel.shahaf.name> wrote:
> Andreas Stieger wrote on Sun, Mar 08, 2015 at 17:52:55 +0100:
>> On 08/03/15 17:45, Branko Čibej wrote:
>> > And it bears repeating: If you replace a repository, please make sure to
>> > restart Apache and/or svnserve to clear stale caches.
>>
>> Is there something that can be done in the code to take care of that?
>> Like watching the inode as {uuid,inode}-2-tuple and flush the caches if
>> it changes?
This problem has already been fixed in r1618138 [1].

[1] http://svn.apache.org/r1618138


-- 
Ivan Zhakov

Re: flushing caches upon repository replacement - was: Copy and Reduce the size of SVN repos

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
Andreas Stieger wrote on Sun, Mar 08, 2015 at 17:52:55 +0100:
> On 08/03/15 17:45, Branko Čibej wrote:
> > And it bears repeating: If you replace a repository, please make sure to
> > restart Apache and/or svnserve to clear stale caches.
> 
> Is there something that can be done in the code to take care of that?
> Like watching the inode as {uuid,inode}-2-tuple and flush the caches if
> it changes?

It might be interesting to try and flush the caches if "youngest" had
gone back in time.  That is, if we know that the repository at a given
(UUID, dirent) had once had r200, and right now its youngest is r100,
then flush all caches related to that (UUID, dirent) pair.

Perhaps there are other invariants we could also check in a similar
manner, for example, the md5 of the text-rep of the node-revision of the
root directory at the (cached value of youngest) revision: if the md5
changes, then the cache is stale.

Daniel

flushing caches upon repository replacement - was: Copy and Reduce the size of SVN repos

Posted by Andreas Stieger <an...@gmx.de>.
Hello,

On 08/03/15 17:45, Branko Čibej wrote:
> And it bears repeating: If you replace a repository, please make sure to
> restart Apache and/or svnserve to clear stale caches.

Is there something that can be done in the code to take care of that?
Like watching the inode as {uuid,inode}-2-tuple and flush the caches if
it changes?

Andreas

Re: Copy and Reduce the size of SVN repos

Posted by Branko Čibej <br...@wandisco.com>.
On 08.03.2015 17:39, Andreas Stieger wrote:
> Hello,
>
> On 08/03/15 05:57, Rajesh Kumar wrote:
>> 2.    How can I reduce the repository size drastically without impacting
>> the integrity and versions of the files?
> Several points:
> A. Are you talking about the on-server repository size or the size of a
> working copy? The reasons one needs to ask are that many users regularly
> confuse the terms, and that many will check out the root of the
> repository into a working copy which unnecessarily increases the on-disk
> size of a working copy by duplicating /branches and /tags.
>
> B. For the server on-disk size, ensure representation sharing is enabled
> throughout the lifetime of the repository. When using deep tree
> structures and large properties also enable "directory and property
> storage reduction" available in 1.8. As these only take effect for added
> data, you need to perform what is referred to as a dump-load cycle and
> switch to the new but content-identical repository. Dump/load are
> documented.
>
>> Here my repository size is 1TB and I want to make it smaller without
>> deleting any files?
> Whoa, this is kind of what an SCM is designed not to allow. And deleting
> any files inside the repository tree does not reduce its size, as you of
> course retain all history, including deleted items.
>
>> 1.    Duplicating the whole 1TB repository in a short span of time to
>> create another SVN repository.
> You can perform a seamless migration to a second otherwise identical
> repository with reduced size. First prepare a replacement offline while
> keeping it up-to-date with the original by using svnsync configuration
> as documented in the Subversion book. You will need some migration space
> for that, which can be on the same or another server. The repository URL may
> or may not change in the course of that; if it does, do take care that
> you seamlessly direct users to the new data.

And it bears repeating: If you replace a repository, please make sure to
restart Apache and/or svnserve to clear stale caches.
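On a systemd-based host, that restart might look like the following; the unit names are assumptions and vary by distribution.

```shell
# Restart Apache httpd (unit is 'apache2' on Debian/Ubuntu, 'httpd' on RHEL).
sudo systemctl restart apache2

# Restart svnserve if it runs as a service.
sudo systemctl restart svnserve
```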

-- Brane

Re: Copy and Reduce the size of SVN repos

Posted by Andreas Stieger <an...@gmx.de>.
Hello,

On 08/03/15 05:57, Rajesh Kumar wrote:
> 2.    How can I reduce the repository size drastically without impacting
> the integrity and versions of the files?

Several points:
A. Are you talking about the on-server repository size or the size of a
working copy? The reasons one needs to ask are that many users regularly
confuse the terms, and that many will check out the root of the
repository into a working copy which unnecessarily increases the on-disk
size of a working copy by duplicating /branches and /tags.

B. For the server on-disk size, ensure representation sharing is enabled
throughout the lifetime of the repository. When using deep tree
structures and large properties also enable "directory and property
storage reduction" available in 1.8. As these only take effect for added
data, you need to perform what is referred to as a dump-load cycle and
switch to the new but content-identical repository. Dump/load are
documented.
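Representation sharing is a per-repository FSFS setting; one quick way to verify it is on (repository path is a hypothetical example; the option lives in db/fsfs.conf and defaults to enabled for repositories created with 1.6 or later):

```shell
# Print the rep-sharing setting from the repository's FSFS config.
grep -n 'enable-rep-sharing' /srv/svn/repo/db/fsfs.conf
```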

> Here my repository size is 1TB and I want to make it smaller without
> deleting any files?

Whoa, this is kind of what an SCM is designed not to allow. And deleting
any files inside the repository tree does not reduce its size, as you of
course retain all history, including deleted items.

> 1.    Duplicating the whole 1TB repository in a short span of time to
> create another SVN repository.

You can perform a seamless migration to a second otherwise identical
repository with reduced size. First prepare a replacement offline while
keeping it up-to-date with the original by using svnsync configuration
as documented in the Subversion book. You will need some migration space
for that, which can be on the same or another server. The repository URL may
or may not change in the course of that; if it does, do take care that
you seamlessly direct users to the new data.
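In outline, the svnsync mirroring described above might look like this sketch; URLs and paths are hypothetical examples.

```shell
# Create the mirror repository; svnsync needs to set revision properties,
# so install a permissive pre-revprop-change hook.
svnadmin create /srv/svn/mirror
printf '#!/bin/sh\nexit 0\n' > /srv/svn/mirror/hooks/pre-revprop-change
chmod +x /srv/svn/mirror/hooks/pre-revprop-change

# Point the mirror at the source repository and copy all history.
svnsync init file:///srv/svn/mirror https://svn.example.com/oldrepo
svnsync sync file:///srv/svn/mirror

# Re-run 'svnsync sync' periodically until cutover to pick up new commits.
```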

Andreas