Posted to users@subversion.apache.org by Jason Keltz <ja...@cse.yorku.ca> on 2013/01/31 23:10:30 UTC

software distribution with subversion

Hi.

I am faced with a problem where I need to distribute a directory 
containing about 60 GB worth of software on a Linux file server to about 
100 systems.  The software must be localized on those systems and not 
shared out over NFS.  On a regular basis, software may be added or 
removed from the directory, and all the clients should update 
accordingly in the evening.  During the update period, some client 
systems may be off.

I think that Subversion would be a reasonable way to solve this problem 
which isn't quite the type of problem that rsync is intended to handle 
(because of the number of machines).  However, for a variety of reasons, 
I don't want to run subversion on the actual file server.  Instead, 
nightly, I'd like to rsync changes in the contents of the software 
directory on the file server to a software distribution server which 
would run its own svnserve.  The clients would then connect up to the 
server nightly, and update themselves accordingly.  Because of the 
versioning, if a client misses an update, it would be updated the next 
time around, even if it's been off for a while.

The initial update between the file server and the software update server 
would require rsyncing the whole 60 GB of software to a "working 
directory", after which, to make subversion see this as a "working 
directory", I would have to commit the entire directory, then check it 
back out.  This process seems like a bit of a waste, but it's a one time 
process, and I don't really see any way around it.  In the future, I 
would like to be able to rsync changes between the file server and the 
working directory on the software distribution server, which would 
include using --delete to ensure that software deleted from the file 
server is also deleted from the subversion working directory, and 
excluding the .svn directory of the working copy from the rsync.  
However, after the rsync happens, I now need to run a command that would 
update the repository with the state of the working directory.  However, 
it's not exactly clear how this would work?  Running an "svn update" 
isn't going to delete directories from the repository that were deleted 
from the working directory.  I believe you need to use "svn delete" for 
this?

Any ideas that anyone might be able to offer?

I'm not on the list, so please ensure that you CC: me in any response.

Thanks for your help!

Jason.

Re: software distribution with subversion

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.

The OP isn't subscribed and so probably didn't see your reply.

Branko Čibej wrote on Fri, Feb 01, 2013 at 00:37:41 +0100:
> I expect you've considered this option, but just to add it to the list:

Re: software distribution with subversion

Posted by Branko Čibej <br...@wandisco.com>.

I expect you've considered this option, but just to add it to the list:
Why not use a package manager like apt or yum, or a distributed
configuration manager such as puppet, to manage your servers?

While Subversion can be used as a substitute, it's not really suited for
this kind of application -- especially if the software has to be
customized for each node.

-- Brane

On 31.01.2013 23:10, Jason Keltz wrote:
> Hi.
>
> I am faced with a problem where I need to distribute a directory
> containing about 60 GB worth of software on a Linux file server to
> about 100 systems.  The software must be localized on those systems
> and not shared out over NFS.  On a regular basis, software may be
> added or removed from the directory, and all the clients should update
> accordingly in the evening.  During the update period, some client
> systems may be off.
>
> I think that Subversion would be a reasonable way to solve this
> problem which isn't quite the type of problem that rsync is intended
> to handle (because of the number of machines).  However, for a variety
> of reasons, I don't want to run subversion on the actual file server. 
> Instead, nightly, I'd like to rsync changes in the contents of the
> software directory on the file server to a software distribution
> server which would run its own svnserve.  The clients would then
> connect up to the server nightly, and update themselves accordingly. 
> Because of the versioning, if a client misses an update, it would be
> updated the next time around, even if it's been off for a while.
>
> The initial update between the file server and the software update
> server would require rsyncing the whole 60 GB of software to a
> "working directory", after which, to make subversion see this as a
> "working directory", I would have to commit the entire directory, then
> check it back out.  This process seems like a bit of a waste, but it's
> a one time process, and I don't really see any way around it.  In the
> future, I would like to be able to rsync changes between the file
> server and the working directory on the software distribution server,
> which would include using --delete to ensure that software deleted
> from the file server is also deleted from the subversion working
> directory, and excluding the .svn directory of the working copy from
> the rsync.  However, after the rsync happens, I now need to run a
> command that would update the repository with the state of the working
> directory.  However, it's not exactly clear how this would work? 
> Running an "svn update" isn't going to delete directories from the
> repository that were deleted from the working directory.  I believe
> you need to use "svn delete" for this?
>
> Any ideas that anyone might be able to offer?
>
> I'm not on the list, so please ensure that you CC: me in any response.
>
> Thanks for your help!
>
> Jason.


-- 
Branko Čibej
Director of Subversion | WANdisco | www.wandisco.com

Re: software distribution with subversion

Posted by Les Mikesell <le...@gmail.com>.

On Thu, Jan 31, 2013 at 8:14 PM, Jason Keltz <ja...@cse.yorku.ca> wrote:
>>
>> I'd think it is exactly the problem that rsync is intended to handle.
>
> rsync is great when you want to sync the contents from one machine to
> another machine in one direction.. (unison if you need dual direction
> sync...) ....  I thought about using rsync to solve this problem... two ways
> I can think of..
>
> 1)  All the machines run rsync against the server.. kills the server, but
> let's say they do it all at different times.. the server is hefty..  hey, it
> would work, but for every single rsync, the server needs to look at its
> entire file tree to see which files have changed.... 100 syncs = 100 times
> processing the same thing over and over again... If only rsync would let me
> save that state to a file so that it doesn't need to reload it every time it
> runs, then I know which solution I'd be using...  other problem is, it would
> take a long time..

Is this on linux?  If the host has a reasonable amount of RAM the
directory info will mostly be cached between accesses. Have you timed
it before deciding it is a problem?  I suspect it won't be unless
there are millions of small files in that tree or a very fast rate of
change.

> 2) log/tree approach --- server updates one client, then the server and the
> one client each update another client, then each of those 3 update
> another...  much faster, but again, you have to read the server state each
> and every time... and then I have to deal with the fact that various random
> machines are off ...

If you can't complete from one distribution server - or if the
geographic location makes sense, fan it out to a few redistribution
instances.   Not sure what you mean about reading server state -
directory reads aren't all that expensive.

>> Subversion would give you the option of intentionally maintaining your
>> targets at different revision levels, but at a cost of needing a
>> 'working copy' format where you have an unneeded 'pristine' duplicate
>> copy of everything.
>
> The truth is, I wouldn't intentionally have the machines at different
> software levels... (well, that could be useful for testing, but that's
> another story)....  but a machine could be off during the update and would
> be able to "catch up" no matter how long it was off...

Rsync would always catch up.  If the exact refresh timing isn't
critical you could just run from cron on the clients with some skew to
avoid overloading the distribution server.
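
The skewed cron schedule suggested here could be generated per client along the following lines. This is only a sketch: the hashing scheme, the 01:00 start, the four-hour window, and the /usr/local/sbin/sync-software wrapper path are all assumptions made for illustration, not anything taken from the thread.

```python
import hashlib


def skew_minutes(hostname: str, window_minutes: int = 240) -> int:
    """Map a hostname to a stable offset in [0, window_minutes)."""
    digest = hashlib.sha256(hostname.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % window_minutes


def cron_line(hostname: str, start_hour: int = 1, window_minutes: int = 240,
              command: str = "/usr/local/sbin/sync-software") -> str:
    """Build a crontab entry that runs the (hypothetical) sync wrapper nightly.

    The same hostname always gets the same minute and hour, so the load on
    the distribution server is spread over the window without coordination.
    """
    offset = skew_minutes(hostname, window_minutes)
    hour = (start_hour + offset // 60) % 24
    minute = offset % 60
    return f"{minute} {hour} * * * {command}"


if __name__ == "__main__":
    for host in ("lab01", "lab02", "lab03"):
        print(host, cron_line(host))
```

A client that is off at its scheduled time simply misses that run; because rsync compares current state rather than replaying history, the next successful run brings it fully up to date.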

>>> I believe you need to use "svn delete"
>>> for this?
>>
>> That is for when you are making the changes you intend to commit.
>>
>
> I'll have to try that again .. didn't seem to be working the way I expected
> it to...

You have to 'svn delete'  in the working copy, then commit the change,
then an update will replicate the action in another working copy.

-- 
   Les Mikesell
      lesmikesell@gmail.com

Re: software distribution with subversion

Posted by Jason Keltz <ja...@cse.yorku.ca>.

On 31/01/2013 6:40 PM, Les Mikesell wrote:
> On Thu, Jan 31, 2013 at 4:10 PM, Jason Keltz <ja...@cse.yorku.ca> wrote:
>> I am faced with a problem where I need to distribute a directory containing
>> about 60 GB worth of software on a Linux file server to about 100 systems.
>> The software must be localized on those systems and not shared out over NFS.
>> On a regular basis, software may be added or removed from the directory, and
>> all the clients should update accordingly in the evening.  During the update
>> period, some client systems may be off.
>>
>> I think that Subversion would be a reasonable way to solve this problem
>> which isn't quite the type of problem that rsync is intended to handle
>> (because of the number of machines).
> I'd think it is exactly the problem that rsync is intended to handle.
rsync is great when you want to sync the contents from one machine to 
another machine in one direction.. (unison if you need dual direction 
sync...) ....  I thought about using rsync to solve this problem... two 
ways I can think of..

1)  All the machines run rsync against the server.. kills the server, 
but let's say they do it all at different times.. the server is hefty..  
hey, it would work, but for every single rsync, the server needs to look 
at its entire file tree to see which files have changed.... 100 syncs = 
100 times processing the same thing over and over again... If only rsync 
would let me save that state to a file so that it doesn't need to reload 
it every time it runs, then I know which solution I'd be using...  other 
problem is, it would take a long time..
2) log/tree approach --- server updates one client, then the server and 
the one client each update another client, then each of those 3 update 
another...  much faster, but again, you have to read the server state 
each and every time... and then I have to deal with the fact that 
various random machines are off ...
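
A rough way to compare the two approaches is to count how many rounds the tree approach needs. The sketch below assumes an idealized model that is not stated in the thread: every machine is on, every transfer takes one round, and each machine that already has the software updates exactly one other machine per round.

```python
def rounds_to_reach(clients: int) -> int:
    """Rounds needed when every up-to-date machine updates one more per round."""
    have_copy = 1  # the distribution server starts with the only copy
    rounds = 0
    while have_copy < clients + 1:
        have_copy *= 2  # each holder of the copy updates one new machine
        rounds += 1
    return rounds


if __name__ == "__main__":
    print(rounds_to_reach(100))
```

Under these assumptions 100 clients are covered in 7 rounds, compared with 100 sequential transfers when every client pulls from the single server one at a time. Machines that are off break the idealized doubling, which is the complication raised above.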

It's a really interesting problem..

>> However, for a variety of reasons, I
>> don't want to run subversion on the actual file server.  Instead, nightly,
>> I'd like to rsync changes in the contents of the software directory on the
>> file server to a software distribution server which would run its own
>> svnserve.  The clients would then connect up to the server nightly, and
>> update themselves accordingly.  Because of the versioning, if a client
>> misses an update, it would be updated the next time around, even if it's been
>> off for a while.
> Subversion would give you the option of intentionally maintaining your
> targets at different revision levels, but at a cost of needing a
> 'working copy' format where you have an unneeded 'pristine' duplicate
> copy of everything.
The truth is, I wouldn't intentionally have the machines at different 
software levels... (well, that could be useful for testing, but that's 
another story)....  but a machine could be off during the update and 
would be able to "catch up" no matter how long it was off...
>> However, after the rsync happens, I now need to run a
>> command that would update the repository with the state of the working
>> directory.  However, it's not exactly clear how this would work?  Running an
>> "svn update" isn't going to delete directories from the repository that were
>> deleted from the working directory.
> Sure it will - it will make it match the state of whatever version you
> are updating to.
>
>> I believe you need to use "svn delete"
>> for this?
> That is for when you are making the changes you intend to commit.
>

I'll have to try that again .. didn't seem to be working the way I 
expected it to...

Jason.


-- 
Jason Keltz
Manager of Development
Department of Computer Science and Engineering
York University, Toronto, Canada
Tel: 416-736-2100 x. 33570
Fax: 416-736-5872

Re: software distribution with subversion

Posted by Les Mikesell <le...@gmail.com>.

On Thu, Jan 31, 2013 at 4:10 PM, Jason Keltz <ja...@cse.yorku.ca> wrote:
>
> I am faced with a problem where I need to distribute a directory containing
> about 60 GB worth of software on a Linux file server to about 100 systems.
> The software must be localized on those systems and not shared out over NFS.
> On a regular basis, software may be added or removed from the directory, and
> all the clients should update accordingly in the evening.  During the update
> period, some client systems may be off.
>
> I think that Subversion would be a reasonable way to solve this problem
> which isn't quite the type of problem that rsync is intended to handle
> (because of the number of machines).

I'd think it is exactly the problem that rsync is intended to handle.

> However, for a variety of reasons, I
> don't want to run subversion on the actual file server.  Instead, nightly,
> I'd like to rsync changes in the contents of the software directory on the
> file server to a software distribution server which would run its own
> svnserve.  The clients would then connect up to the server nightly, and
> update themselves accordingly.  Because of the versioning, if a client
> misses an update, it would be updated the next time around, even if it's been
> off for a while.

Subversion would give you the option of intentionally maintaining your
targets at different revision levels, but at a cost of needing a
'working copy' format where you have an unneeded 'pristine' duplicate
copy of everything.

> However, after the rsync happens, I now need to run a
> command that would update the repository with the state of the working
> directory.  However, it's not exactly clear how this would work?  Running an
> "svn update" isn't going to delete directories from the repository that were
> deleted from the working directory.

Sure it will - it will make it match the state of whatever version you
are updating to.

> I believe you need to use "svn delete"
> for this?

That is for when you are making the changes you intend to commit.

-- 
  Les Mikesell
     lesmikesell@gmail.com

Re: software distribution with subversion

Posted by Les Mikesell <le...@gmail.com>.

On Thu, Jan 31, 2013 at 8:18 PM, Jason Keltz <ja...@cse.yorku.ca> wrote:
> >
> See my email to Les...  If only the rsync server could save a copy of the
> file checksums when it runs, it would probably decrease the sync time by
> half and save a whole lot of disk activity...

If you don't use the --ignore-times option with rsync it will skip the
checksum comparison on files where the timestamp, length, etc. match.
It should only have to read the directory on both sides if nothing
changes between runs.   Be sure you are using options that propagate
the timestamp (like -a) for this to work.
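
The skip decision described here is rsync's default "quick check", which compares size and modification time rather than file contents. The following is a simplified model of that decision, not rsync's actual code; the function name is made up for illustration, and whole-second timestamp comparison is an assumption.

```python
import os


def quick_check_differs(src_path: str, dst_path: str) -> bool:
    """Return True if dst is missing or differs from src in size or mtime.

    Only file metadata is read, so unchanged files cost one stat() call on
    each side and their contents are never opened.
    """
    try:
        dst = os.stat(dst_path)
    except FileNotFoundError:
        return True
    src = os.stat(src_path)
    return (src.st_size != dst.st_size
            or int(src.st_mtime) != int(dst.st_mtime))
```

This is also why the timestamps have to be propagated: if the copy on the client gets a fresh modification time on every transfer, the comparison never matches and every file looks changed on the next run.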

-- 
   Les Mikesell
     lesmikesell@gmail.com

Re: software distribution with subversion

Posted by Nico Kadel-Garcia <nk...@gmail.com>.

On Thu, Jan 31, 2013 at 9:18 PM, Jason Keltz <ja...@cse.yorku.ca> wrote:
> On 31/01/2013 9:13 PM, Ryan Schmidt wrote:

>> Subversion is not a software distribution tool; it is a document and
>> revision management system. Use a different tool. As someone else said,
>> rsync seems like a good tool for this job; I didn't understand why you think
>> using rsync directly between your file server and your clients won't work.
>>
>
> See my email to Les...  If only the rsync server could save a copy of the
> file checksums when it runs, it would probably decrease the sync time by
> half and save a whole lot of disk activity...

This.... sounds like someone wants to use the same screwdriver for all
screws in this birdhouse.

It's theoretically possible to set up a canonical Subversion repository
and auto-propagate changes to it, from the "file server" or from an
rsynced copy of the fileserver with a local working copy on the
Subversion master. But it's going to be bulky, and slow. If that 60
GBytes has a lot of churn due to rapidly changing binaries or
extensive static database files, it's going to get awkward indeed. And
because the "file server" you're propagating these changes from is
neither a Subversion server, nor a Subversion client, it's much
harder. Moreover, this doesn't seem to be the kind of "rollback the
changes to a well-defined date" that Subversion does so well, and the
changes from the master get fed to a trunk and will then have to be
propagated to branches, and each machine will need a different
branch.

This.... gets tricky. One can differentiate among the slightly
different environments by maintaining a trunk and merging the changes
to the branches, but that can get awkward. Is it possible to set up
tags that have "svn:externals" settings that point to sets of software
from the master, and then the individual hosts are configured locally
and have their changes propagated to the branches on the master?

And you know, this sounds like an absolute flipping deployment
disaster I dealt with about 12 years ago. The site architect thought
the clever thing to do was make a complete tarball bundle for all
deployments, and the whole compressed tarball had to be pushed *every
time*, and releases could only happen with the complete tarball.
Various forms of chaos ensued. I taught them to use packages, to
deploy kernels, in particular, as a separate object so they could be
deployed separately and with rollback separate from the rest of the
system. This fixed the ongoing problem that any one component that
failed would stop the *whole* deployment and push back even the
smallest fixes for as much as six months.

So while I've offered some hints, I'm going to really suggest to Jason
that he think hard about modularizing the components of this set of
packages before he even starts this project.

Re: software distribution with subversion

Posted by Jason Keltz <ja...@cse.yorku.ca>.

On 31/01/2013 9:13 PM, Ryan Schmidt wrote:
> On Jan 31, 2013, at 20:05, Jason Keltz wrote:
>
>> On 31/01/2013 6:06 PM, Bob Archer wrote:
>>> What you need to do could work. I assume this "software" in order to run can be built or whatever during your nightly update on each client?
>>>
>>> You keep saying "rsyncing" ... you wouldn't use that, of course; you would use the svn client binary.
>> Actually, maybe I wasn't clear..
>> The software includes various packages like say, Matlab, or Maple, or whatever else, already installed...  imagine a directory on the fileserver.. say, /local/software which includes "bin", "lib", etc...    I'm not "installing" the software.   it's already been installed..  I'm just syncing a directory between machines..
>> As for rsyncing.. I would rsync the software from the "file server" to the "software distribution" server, and then use svn from there to check in all the changes.
>>
>>> For your initial load... if the software is on the server where you will house your repository you can just import the data into the repository from that file... there is no need to send the data twice. In other words, you can have both a working copy and a repository on your central server.
>> Yes.  Initially I would do an import, but the problem is... the next day, the software gets updated on the "real" file server... say, new version of Matlab or something...  in the evening, I want the process to run that would rsync the data (with all the changes) from the file server to the software distribution server,  do something to commit the changes, then the 100 clients would eventually each "svn update".     However, to be able to commit the changes, I need to have a working copy on the software distribution server....
>>
>>>> However, after the rsync happens, I now need to run a command that would
>>>> update the repository with the state of the working directory.  However, it's not
>>>> exactly clear how this would work?  Running an "svn update"
>>> "svn update" brings any changes in the repository to your working copy. "svn commit" does the opposite... it puts any changes in a working directory into the repository.
>> See, this is where I'm confused... I created a few directories including "bin" and "pkg" for a test.  All committed fine... erased them from the working copy, did a commit then a status and I see:
>>
>> !       bin
>> !       pkg
>>
>> but when I go into a different directory and check out the current state..
>>
>> A    pkg
>> A    bin
>> Checked out revision 2.
>>
>> they're still there...
> Correct. Subversion does not track your movements. You must tell Subversion what you are moving and deleting by doing the moves and deletes using "svn mv" and "svn rm", not using regular OS commands.
>
>
>>> Hth...
>>>
>>> That said, if this is actual software, wouldn't using one of the many package management tools available in Linux be a better fit?
>> The thing is, I'm moving around already installed software, and there's nothing that great, as far as I can see, for doing that. The twitter guys are using something they wrote called "murder" which uses torrent to do this kind of thing...  excellent idea, but it uses Ruby and several other tools ...   and I don't want to get into that at the moment...
> Subversion is not going to be a satisfactory solution for this use case. Besides all the issues you're describing with setting up the server-side infrastructure for this, and as was already mentioned, when you check out a working copy of this on your clients, there will be a "duplicate" pristine copy of everything. So if you have 60GB of software, it'll take up 120GB of space on the client machine.
I'm glad you brought that up :)

> Subversion is not a software distribution tool; it is a document and revision management system. Use a different tool. As someone else said, rsync seems like a good tool for this job; I didn't understand why you think using rsync directly between your file server and your clients won't work.
>

See my email to Les...  If only the rsync server could save a copy of 
the file checksums when it runs, it would probably decrease the sync 
time by half and save a whole lot of disk activity...


-- 
Jason Keltz
Manager of Development
Department of Computer Science and Engineering
York University, Toronto, Canada
Tel: 416-736-2100 x. 33570
Fax: 416-736-5872

Re: software distribution with subversion

Posted by Ryan Schmidt <su...@ryandesign.com>.

On Jan 31, 2013, at 20:05, Jason Keltz wrote:

> On 31/01/2013 6:06 PM, Bob Archer wrote:
>> 
>> What you need to do could work. I assume this "software" in order to run can be built or whatever during your nightly update on each client?
>> 
>> You keep saying "rsyncing" ... you wouldn't use that, of course; you would use the svn client binary.
> Actually, maybe I wasn't clear..
> The software includes various packages like say, Matlab, or Maple, or whatever else, already installed...  imagine a directory on the fileserver.. say, /local/software which includes "bin", "lib", etc...    I'm not "installing" the software.   it's already been installed..  I'm just syncing a directory between machines..
> As for rsyncing.. I would rsync the software from the "file server" to the "software distribution" server, and then use svn from there to check in all the changes.
> 
>> For your initial load... if the software is on the server where you will house your repository you can just import the data into the repository from that file... there is no need to send the data twice. In other words, you can have both a working copy and a repository on your central server.
> Yes.  Initially I would do an import, but the problem is... the next day, the software gets updated on the "real" file server... say, new version of Matlab or something...  in the evening, I want the process to run that would rsync the data (with all the changes) from the file server to the software distribution server,  do something to commit the changes, then the 100 clients would eventually each "svn update".     However, to be able to commit the changes, I need to have a working copy on the software distribution server....
> 
>>> However, after the rsync happens, I now need to run a command that would
>>> update the repository with the state of the working directory.  However, it's not
>>> exactly clear how this would work?  Running an "svn update"
>> "svn update" brings any changes in the repository to your working copy. "svn commit" does the opposite... it puts any changes in a working directory into the repository.
> See, this is where I'm confused... I created a few directories including "bin" and "pkg" for a test.  All committed fine... erased them from the working copy, did a commit then a status and I see:
> 
> !       bin
> !       pkg
> 
> but when I go into a different directory and check out the current state..
> 
> A    pkg
> A    bin
> Checked out revision 2.
> 
> they're still there...

Correct. Subversion does not track your movements. You must tell Subversion what you are moving and deleting by doing the moves and deletes using "svn mv" and "svn rm", not using regular OS commands.
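
For the nightly job, that bookkeeping can be derived from "svn status" after the rsync has run: items reported as "!" (missing) need "svn rm", and items reported as "?" (unversioned) need "svn add". The sketch below only parses status text and builds the command lists; it does not run anything. It assumes the status layout shown in the test output above, where the path starts at the ninth column, and the function names and commit message are made up for illustration.

```python
def plan_from_status(status_output: str):
    """Split `svn status` output into paths to add and paths to delete."""
    to_add, to_delete = [], []
    for line in status_output.splitlines():
        if len(line) < 9:
            continue  # too short to hold a status column plus a path
        code, path = line[0], line[8:]
        if code == "?":      # unversioned: copied in by rsync
            to_add.append(path)
        elif code == "!":    # missing: removed by rsync --delete
            to_delete.append(path)
    return to_add, to_delete


def build_commands(to_add, to_delete, message="nightly sync"):
    """Return argument lists suitable for subprocess.run(), in order."""
    cmds = []
    if to_delete:
        cmds.append(["svn", "rm"] + to_delete)
    if to_add:
        cmds.append(["svn", "add"] + to_add)
    cmds.append(["svn", "commit", "-m", message])
    return cmds
```

Files that rsync merely modified show up with status "M" and need no extra step; the final commit picks them up.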


>> Hth...
>> 
>> That said, if this is actual software, wouldn't using one of the many package management tools available in Linux be a better fit?
> 
> The thing is, I'm moving around already installed software, and there's nothing that great, as far as I can see, for doing that. The twitter guys are using something they wrote called "murder" which uses torrent to do this kind of thing...  excellent idea, but it uses Ruby and several other tools ...   and I don't want to get into that at the moment...

Subversion is not going to be a satisfactory solution for this use case. Besides all the issues you're describing with setting up the server-side infrastructure for this, and as was already mentioned, when you check out a working copy of this on your clients, there will be a "duplicate" pristine copy of everything. So if you have 60GB of software, it'll take up 120GB of space on the client machine.

Subversion is not a software distribution tool; it is a document and revision management system. Use a different tool. As someone else said, rsync seems like a good tool for this job; I didn't understand why you think using rsync directly between your file server and your clients won't work.

Re: software distribution with subversion

Posted by Jason Keltz <ja...@cse.yorku.ca>.

On 31/01/2013 6:06 PM, Bob Archer wrote:
>> I am faced with a problem where I need to distribute a directory containing
>> about 60 GB worth of software on a Linux file server to about
>> 100 systems.  The software must be localized on those systems and not shared
>> out over NFS.  On a regular basis, software may be added or removed from the
>> directory, and all the clients should update accordingly in the evening.  During
>> the update period, some client systems may be off.
>>
>> I think that Subversion would be a reasonable way to solve this problem which
>> isn't quite the type of problem that rsync is intended to handle (because of the
>> number of machines).  However, for a variety of reasons, I don't want to run
>> subversion on the actual file server.  Instead, nightly, I'd like to rsync changes in
>> the contents of the software directory on the file server to a software
>> distribution server which would run its own svnserve.  The clients would then
>> connect up to the server nightly, and update themselves accordingly.  Because
>> of the versioning, if a client misses an update, it would be updated the next
>> time around, even if it's been off for a while.
>>
>> The initial update between the file server and the software update server would
>> require rsyncing the whole 60 GB of software to a "working directory", after
>> which, to make subversion see this as a "working directory", I would have to
>> commit the entire directory, then check it back out.  This process seems like a
>> bit of a waste, but it's a one time process, and I don't really see any way around
>> it.  In the future, I would like to be able to rsync changes between the file
>> server and the working directory on the software distribution server, which
>> would include using --delete to ensure that software deleted from the file
>> server is also deleted from the subversion working directory, and excluding
>> the .svn directory of the working copy from the rsync.
>> However, after the rsync happens, I now need to run a command that would
>> update the repository with the state of the working directory.  However, it's not
>> exactly clear how this would work?  Running an "svn update"
>> isn't going to delete directories from the repository that were deleted from the
>> working directory.  I believe you need to use "svn delete" for this?
>>
>> Any ideas that anyone might be able to offer?
>>
>> I'm not on the list, so please ensure that you CC: me in any response.
>>
>> Thanks for your help!
>>
>
> What you need to do could work. I assume this "software" in order to run can be built or whatever during your nightly update on each client?
>
> You keep saying "rsyncing" ... you wouldn't use that, of course; you would use the svn client binary.
Actually, maybe I wasn't clear..
The software includes various packages like say, Matlab, or Maple, or 
whatever else, already installed...  imagine a directory on the 
fileserver.. say, /local/software which includes "bin", "lib", etc...    
I'm not "installing" the software.   it's already been installed..  I'm 
just syncing a directory between machines..
As for rsyncing.. I would rsync the software from the "file server" to 
the "software distribution" server, and then use svn from there to check 
in all the changes.

> For your initial load... if the software is on the server where you will house your repository you can just import the data into the repository from that file... there is no need to send the data twice. In other words, you can have both a working copy and a repository on your central server.
Yes.  Initially I would do an import, but the problem is... the next 
day, the software gets updated on the "real" file server... say, new 
version of Matlab or something...  in the evening, I want the process to 
run that would rsync the data (with all the changes) from the file 
server to the software distribution server,  do something to commit the 
changes, then the 100 clients would eventually each "svn update".     
However, to be able to commit the changes, I need to have a working copy 
on the software distribution server....

>> However, after the rsync happens, I now need to run a command that would
>> update the repository with the state of the working directory.  However, it's not
>> exactly clear how this would work?  Running an "svn update"
> "svn update" brings any changes in the repository to your working copy. "svn commit" does the opposite... it puts any changes in a working directory into the repository.
See, this is where I'm confused... I created a few directories including 
"bin" and "pkg" for a test.  All committed fine... erased them from the 
working copy, did a commit then a status and I see:

!       bin
!       pkg

but when I go into a different directory and check out the current state..

A    pkg
A    bin
Checked out revision 2.

they're still there...
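
The "!" entries above mean the directories are missing from the working copy without Subversion having been told to delete them, so the commit had no deletion to send. Below is a minimal sketch of one way to reconcile a working copy after an rsync. It assumes plain "svn status" output, with the status code in the first column and the path starting at the ninth. The function names plan_svn_commands and commit_working_copy and the commit message are hypothetical; only the svn status, svn delete, svn add and svn commit subcommands are real. Paths containing "@" would need extra handling that is not shown.

```python
import subprocess


def plan_svn_commands(status_output):
    """Turn plain `svn status` output into the svn commands to run before a commit.

    Assumes the default format: status code in column 1, path from column 9.
    """
    commands = []
    for line in status_output.splitlines():
        if len(line) < 9:
            continue  # blank or too short to hold a status code and a path
        code, path = line[0], line[8:]
        if code == "!":
            # Missing from disk (e.g. removed by rsync --delete): schedule deletion.
            commands.append(["svn", "delete", path])
        elif code == "?":
            # Unversioned (e.g. newly rsynced): schedule addition, recursive for dirs.
            commands.append(["svn", "add", path])
        # Other codes (such as "M" for modified) need no extra step before commit.
    return commands


def commit_working_copy(wc_path, message="nightly sync"):
    """Hypothetical nightly step: reconcile the working copy, then commit it."""
    status = subprocess.run(
        ["svn", "status"], cwd=wc_path, check=True,
        capture_output=True, text=True,
    ).stdout
    for command in plan_svn_commands(status):
        subprocess.run(command, cwd=wc_path, check=True)
    subprocess.run(["svn", "commit", "-m", message], cwd=wc_path, check=True)
```

For the status output shown above, plan_svn_commands would yield one "svn delete" for bin and one for pkg; once those are run and committed, a fresh checkout should no longer contain the two directories.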

> Hth...
>
> That said, if this is actual software, wouldn't using one of the many package management tools available in Linux be a better fit?

The thing is, I'm moving around already installed software, and there's 
nothing that great, as far as I can see, for doing that. The Twitter 
guys are using something they wrote called "murder" which uses BitTorrent 
to do this kind of thing...  excellent idea, but it uses Ruby and 
several other tools ...   and I don't want to get into that at the moment...

Jason.

RE: software distribution with subversion

Posted by Bob Archer <Bo...@amsi.com>.

> I am faced with a problem where I need to distribute a directory containing
> about 60 GB worth of software on a Linux file server to about
> 100 systems.  The software must be localized on those systems and not shared
> out over NFS.  On a regular basis, software may be added or removed from the
> directory, and all the clients should update accordingly in the evening.  During
> the update period, some client systems may be off.
> 
> I think that Subversion would be a reasonable way to solve this problem which
> isn't quite the type of problem that rsync is intended to handle (because of the
> number of machines).  However, for a variety of reasons, I don't want to run
> subversion on the actual file server.  Instead, nightly, I'd like to rsync changes in
> the contents of the software directory on the file server to a software
> distribution server which would run its own svnserve.  The clients would then
> connect up to the server nightly, and update themselves accordingly.  Because
> of the versioning, if a client misses an update, it would be updated the next
> time around, even if it's been off for a while.
> 
> The initial update between the file server and the software update server would
> require rsyncing the whole 60 GB of software to a "working directory", after
> which, to make subversion see this as a "working directory", I would have to
> commit the entire directory, then check it back out.  This process seems like a
> bit of a waste, but it's a one time process, and I don't really see any way around
> it.  In the future, I would like to be able to rsync changes between the file
> server and the working directory on the software distribution server, which
> would include using --delete to ensure that software deleted from the file
> server is also deleted from the subversion working directory, and excluding
> the working copy's .svn directory.
> However, after the rsync happens, I now need to run a command that would
> update the repository with the state of the working directory.  However, it's not
> exactly clear how this would work?  Running an "svn update"
> isn't going to delete directories from the repository that were deleted from the
> working directory.  I believe you need to use "svn delete" for this?
> 
> Any ideas that anyone might be able to offer?
> 
> I'm not on the list, so please ensure that you CC: me in any response.
> 
> Thanks for your help!
> 


What you need to do could work. I assume this "software", in order to run, can be built or whatever during your nightly update on each client?

You keep saying "rsyncing" ... you wouldn't use that, of course; you would use the svn client binary.

For your initial load... if the software is on the server where you will house your repository you can just import the data into the repository from that file... there is no need to send the data twice. In other words, you can have both a working copy and a repository on your central server.

> However, after the rsync happens, I now need to run a command that would
> update the repository with the state of the working directory.  However, it's not
> exactly clear how this would work?  Running an "svn update"

"svn update" brings any changes in the repository to your working copy. "svn commit" does the opposite... it puts any changes in a working directory into the repository. 

Hth... 

That said, if this is actual software, wouldn't using one of the many package management tools available in Linux be a better fit?

BOb

Re: software distribution with subversion

Posted by Jason Keltz <ja...@cse.yorku.ca>.

Thanks to everyone who provided me with very helpful feedback re: my 
problem of "software distribution with subversion".  I am re-evaluating 
the project, and how to complete it best.

Thanks!

Jason.

Re: software distribution with subversion

Posted by Thorsten Schöning <ts...@am-soft.de>.

Hello Jason Keltz,
on Thursday, 31 January 2013 at 23:10 you wrote:

> Any ideas that anyone might be able to offer?

As most answers seem to vote against Subversion and for rsync or some
alternative, I would like to add some ideas in favor of Subversion,
because I use an approach similar to yours with one of our products
for some of my customers. It's not about 60 GB of data, though, only
around 75 MB per client, but each client needs some small
customizations, for example. In my environment my customers use SSH
to establish a tunnel to a special svnserve instance that serves only
binary data which can be used directly, without installation or
anything else. It's just a simple directory structure which in most
cases is saved on a customer's file server and used from there by
that customer's own clients. I found this to work quite well, as it's
an easy and flexible setup. Some of my customers use this access to
get the data and build customized MSI installation packages for
Windows on their own.

1. rsync files to deployment server

Apart from your reasons for not running svnserve on the file server
itself, why not consider keeping the working copy for your trunk, or
whatever will be the source for your deployment server, on the file
server? This would need more space on the file server, of course, but
it would save you the rsync to the deployment server and the working
copy on the deployment server. How are changes to the directory
structure on the file server applied at all? If they come from users,
those users could already use Subversion clients to apply the changes
and could handle moves, renames, deletes etc. in the proper
Subversion way, which would provide full tracking and history of the
changes.

2. repo size

Depending on your data, Subversion's representation sharing of
content could save you a lot of repo space. While the clients would
still have to deal with pristine copies etc., the space needed on
your deployment server may be a lot less than the 60 GB you
mentioned. Another good thing about representation sharing is that it
works on a per-repo basis, not e.g. per directory. This means that
even if you branch a lot of clients for any reason and each client
needs to get updated binaries, the total amount of space needed won't
scale with the number of clients branched, but only with the distinct
binary changes committed, which could be a lot less in an environment
where all clients need to have the same binaries.
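
The space argument can be illustrated with a toy model. The sketch below is not Subversion's actual storage format; it only assumes that content is stored once per distinct checksum for the whole repository, which is the general idea behind representation sharing. The class name ToyRepo and the branch paths are hypothetical.

```python
import hashlib


class ToyRepo:
    """Toy content-addressed store; NOT Subversion's real on-disk format."""

    def __init__(self):
        self.blobs = {}  # checksum -> content, each distinct content stored once
        self.tree = {}   # path -> checksum of the content currently at that path

    def commit(self, path, data):
        """Record `data` at `path`, reusing storage if the content is already known."""
        digest = hashlib.sha1(data).hexdigest()
        self.blobs.setdefault(digest, data)
        self.tree[path] = digest

    def stored_bytes(self):
        """Total size of the distinct contents held by the repository."""
        return sum(len(blob) for blob in self.blobs.values())


repo = ToyRepo()
binary = b"\x7fELF" + b"\0" * 996  # stand-in for a 1000-byte updated binary

# The same updated binary is committed separately to 100 client branches.
for n in range(100):
    repo.commit(f"branches/client{n:03d}/bin/tool", binary)

print(len(repo.tree), "paths,", repo.stored_bytes(), "bytes stored")
```

In this model the stored size grows with the number of distinct contents committed, not with the number of client branches that receive them.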

3. customizations for some clients

Your description reads as if each and every client has exactly the
same data set to fetch. What's the chance of exceptions, where some
clients need special binaries, configuration files or whatever for
some reason? With a "simple" rsync approach this would really
complicate your setup, as you would need another directory structure
with the full data, or need to symlink some parts of your directory
structure, or whatever. Using Subversion you could easily, quickly
and efficiently branch the clients which need customizations, and if
you use working copies on the file server your users could easily
apply those customizations and see which customizations have been
made. This would make your maintenance life much easier than
generating a diff between two directory structures which are used as
an rsync basis.

4. updates by revision

Depending on the size of your changes, I agree that using Subversion
for updates will be much more efficient than running rsync and
letting it calculate the needed changes. It's not a question of
whether rsync will be slower, only of how much slower, and whether
that would be a real problem in your environment. From my point of
view, Subversion vs. rsync is a simple trade-off of space vs.
processing time.

5. pristine space needed on clients

You didn't seem to mention what kind of clients you need to update.
Depending on their kind the doubled space for pristine copies may not
even be a problem at all.

6. Server access and security

Have you already thought about security? Is there any need to secure
clients against each other at all? Using rsync, and especially with
customizations, you might need to create a lot of users and/or groups
and use security on the file system and OS level. Subversion provides
its own configuration, which may even be versioned itself for
documentation purposes etc.

Kind regards,

Thorsten Schöning

-- 
Thorsten Schöning       E-Mail:Thorsten.Schoening@AM-SoFT.de
AM-SoFT IT-Systeme      http://www.AM-SoFT.de/

Telefon...........05151-  9468- 55
Fax...............05151-  9468- 88
Mobil..............0178-8 9468- 04

AM-SoFT GmbH IT-Systeme, Brandenburger Str. 7c, 31789 Hameln
AG Hannover HRB 207 694 - Geschäftsführer: Andreas Muchow