You are viewing a plain text version of this content. The canonical link for it is here.

Posted to users@subversion.apache.org by Brad Rhoads <br...@zethcon.com> on 2004/05/28 21:42:56 UTC

Organization Advise

Hi Everyone,

I'm just getting started with subversion and would like some advise on how
to best organize our repository. We have a fairly complex system consisting
of the following:

1. Desktop app written in Delphi
2. Background processes (~ 7 different executables) written in C
3. Terminal based application written in C
4. Web interface written in Java/JSP
5. All of the above access an Oracle database. We script everything. There
are some procs and tables that are only used by 1 of the above, but most are
used by all.

I would like to be able to easily tag 1-5 as release x.x. But I don't want
every developer to have to have working copies of all everything. Plus we
are always doing bug fixes and installing patches for individual clients. So
we need a good way to keep track of exactly which version each client is
running. What's the best way to organize our repository in terms of projects
based on these requirements? One big project vs. several projects? How do we
deal with the SQL as it is functionally related to other parts of the
system?

Thank you very much for your advise.

-------------------------------------------
Brad Rhoads
Zethcon Corporation
847-318-0800 x 53
http://www.zethcon.com
http://www.bydesignwebsights.com/bradrhoads 



---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@subversion.tigris.org
For additional commands, e-mail: users-help@subversion.tigris.org

Re: Organization Advise

Posted by Wesley J Landaker <wj...@icecavern.net>.

On Friday 28 May 2004 3:42 pm, Brad Rhoads wrote:
> Hi Everyone,
>
> I'm just getting started with subversion and would like some advise
> on how to best organize our repository. We have a fairly complex
> system consisting of the following:

I'd suggest that you read the Subversion Book 
<http://svnbook.red-bean.com/>, particular the part on choosing a 
repository layout 
<http://svnbook.red-bean.com/svnbook/ch05s04.html#svn-ch-5-sect-6.1>, 
and all of chapter 4, which goes over the issues in branching and 
merging (since your desired branching structure may influence how you 
want layout your repository).

> 1. Desktop app written in Delphi
> 2. Background processes (~ 7 different executables) written in C
> 3. Terminal based application written in C
> 4. Web interface written in Java/JSP
> 5. All of the above access an Oracle database. We script everything.
> There are some procs and tables that are only used by 1 of the above,
> but most are used by all.

Plus the ones you described, we get:

6. Easily be able to tag entire project.
7. Don't make every developer check out everything
8. Track patches for individual clients

If it's feasible to have a single developer check out just one of 1-5 
and work on it with little or no dependancies on the other pieces, you 
might be best suited to have a layout like this:

/repos
  /desktop_app
  /bg_proc[1-7]
  /terminal
  /web_interface
  /database
  /common_code

Where each subdir has it's own trunk/tag/branch subdirectories. The only 
issues left are #7 and #8.

To handle #7, you could handle this pretty easily by adding:

/repos
  /global_tags
     /release-1.0.0
       /desktop_app
       /bg_proc[1-7]
       /terminal
       ...
     /release-1.0.2
       ...

Where to create each release, you would tag the individual modules, and 
then collect those tags into a /global_tags/release-x.x.x/ directory 
for each access/reference.

To handle #8, you can do something like the above, but just use a 
branch, instead of a tag, named after each client:

/repos
  /clients_current
    /wes_landaker
      /desktop_app
      /bg_proc[1-7]
      ...

Anyway, that's just one method, and not necessarily "the best way to do 
it", but the important thing to realize is that with subversion, it's 
not only pretty easy to do all this stuff, but it's pretty easy to 
reorganize your repos after-the-fact if you ever have a change in 
requirements. =)

-- 
Wesley J. Landaker <wj...@icecavern.net>
OpenPGP FP: 4135 2A3B 4726 ACC5 9094  0097 F0A9 8A4C 4CD6 E3D2

Re: Managing SQL in the repository

Posted by Gary Affonso <gl...@greywether.com>.

On 6/1/04 6:31 AM, "m christensen" <df...@xmission.com> wrote:

> Gary Affonso wrote:
> 
>> On 5/30/04 4:08 PM, "m christensen" <df...@xmission.com> wrote:
>>> I don't know what you mean by "keeping configuration data in the DB".

>> You original reply stated this:
>>> #3 Compiled binaries often pull parameters or configuration info from
>>> the database
>>> 
>> This item doesn't apply to our situation.

> I wouldn't be too quick to assume so.

Uh, good for you, I guess.  Caution is a virtue and all that.

> The data I'm thinking of is database schema information and lots of
> systems validate, configure or bind
> variables or memory locations by looking at the database schema as part
> of the 'compile' process.

Again, ours does not.

- Gary


---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@subversion.tigris.org
For additional commands, e-mail: users-help@subversion.tigris.org

Re: Managing SQL in the repository

Posted by m christensen <df...@xmission.com>.

Gary Affonso wrote:

>On 5/30/04 4:08 PM, "m christensen" <df...@xmission.com> wrote:
>
>  
>
>>I don't know what you mean by "keeping configuration data in the DB".
>>    
>>
>
>You original reply stated this:
>
>  
>
>>#3 Compiled binaries often pull parameters or configuration info from
>>the database
>>    
>>
>
>This item doesn't apply to our situation.
>
>- Gary
>
>  
>
I wouldn't be too quick to assume so.

The data I'm thinking of is database schema information and lots of 
systems validate, configure or bind
variables or memory locations by looking at the database schema as part 
of the 'compile' process.

Marc


---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@subversion.tigris.org
For additional commands, e-mail: users-help@subversion.tigris.org

RE: Managing SQL in the repository

Posted by Mark <ma...@msdhub.com>.

We recently started managing our database schema in svn, and we came up with
a similar method, with one difference. When a dev checks in a change to a
create (table|proc|func|...) script, they must also check in an alter or a
drop/create than will migrate the current version to the new version,
enforced by a pre-commit hook. That way, for any given revision, the raw
creates and the last tag's creates + subsequent modifications will give the
same schema.

This might only be possible because we're a pretty small team (6) of people
committing.

Mark

-----Original Message-----
From: Gary Affonso [mailto:glists@greywether.com] 
Sent: Monday, May 31, 2004 1:05 PM
Subject: Re: Managing SQL in the repository

On 5/30/04 4:08 PM, "m christensen" <df...@xmission.com> wrote:

> I don't know what you mean by "keeping configuration data in the DB".

You original reply stated this:

> #3 Compiled binaries often pull parameters or configuration info from
> the database

This item doesn't apply to our situation.

- Gary

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@subversion.tigris.org
For additional commands, e-mail: users-help@subversion.tigris.org

Re: Managing SQL in the repository

Posted by Gary Affonso <gl...@greywether.com>.

On 5/30/04 4:08 PM, "m christensen" <df...@xmission.com> wrote:

> I don't know what you mean by "keeping configuration data in the DB".

You original reply stated this:

> #3 Compiled binaries often pull parameters or configuration info from
> the database

This item doesn't apply to our situation.

- Gary


---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@subversion.tigris.org
For additional commands, e-mail: users-help@subversion.tigris.org

Re: Managing SQL in the repository

Posted by m christensen <df...@xmission.com>.

Gary Affonso wrote:

>On 5/29/04 10:47 PM, "m christensen" <df...@xmission.com> wrote:
>
>  
>
>>First Off I'll ASSUME y'all are not falling for the Micro$oft pitfall of
>>bastardizing an
>>official 30 year old standard definition, i.e."SQL" into a closed source
>>proprietary brand name for
>>Microsoft SQL Server.
>>    
>>
>
>Good lord, man.  What do you think I am?  Of course I know the difference.
>Does anybody on this list really not?
>
>  
>
Well, that was a joke. mostly...
Previous posts used the term "Sproc" while the concept of Database Side 
Stored Procedures is rather
common with high-end RDBMS systems. the term SPROC is usually used in 
reference to MS SQL Server.
If there is any doubt about the misuse/abuse (IMHO)  of the term SQL,  
reply to job postings simply listing
SQL as a main requirement. With 20 years experience in 
Oracle/Informix/DB2/Sybase. Odds are you'll get a
bunch of  "You are clearly clueless and UNQUALIFIED We obviously mean 
SQL Server. Sheesh we SAID 'SQL'"

>>The approach taken will depend on the target of your development effort.
>>The 2 general targets I see are
>>a shipped product and incremental in-house system modification.
>>    
>>
>
>We are an incremental, in-house project so I'll respond to that portion of
>your email.
>
>  
>
>>A. Actual backup copies of  the Production database are restored as a
>>starting point for databased used in part D.
>>B. No Diff.
>>C. All SQL EVER run against The Production database are checked in as
>>patches. Both DDL & DML are included.
>>    
>>
>
>So to rebuild the database schema in this scenario, you'd have to run the
>initial "create" script and then run the series of alter scripts that
>brought everything "current"?  Is that right?
>
>  
>
No.
In this situation you would NEVER rebuild the database schema from your 
code.
As I said in point A.You will only Restore it to a point-in-time via. 
the recovery tools specific to the
RDBMS in question.

>>   In other words, I don't care if it's 'Just a data change' or major
>>table schema change.
>>   It's a script, it's checked in, and it has been tested or it
>>doesn't run in production, Period.
>>    
>>
>
>I'm confused by this.  You say above that you script (and revision-control)
>all changes to the db including those for data.  But then you say:
>
>  
>
>>It's hard enough to track system schema changes as shown here and in
>>previous posts, but it is nearly IMPOSSIBLE to
>>track production database data changes from a live system, after all the
>>application is constantly changing  data by definition.
>>    
>>
>
>Can you clarify?
>
>  
>
Those scripts are checked in for several reasons.
--->None of those reasons are to rebuild the database from scratch.<---

The Reasons to archive them in my experience are to track who is doing 
what to MY database.
Interactive system modification leaves no audit trail. Some bonehead 
developer dropping a table using a tool like
TOAD or Via an interactive database login would NEVER just Drop a table, 
or at least never admit to it.
Boy, do I have war stories to back-up this personal requirement but I'll 
spare y'all.
When the system suddenly fails after a patch They are VERY valuble for 
figuring out what went wrong and
what  actions to take to fiix the problem

>>D. ALL Modern Decent Production level RDBMS Systems have a "Point in
>>Time" recovery capability. IMHO.
>>    
>>
>
>We are, of course, doing incremental backups.  My desire for a DDL "script
>history" isn't just for their recovery value.
>
>A DDL script history also lets us:
>
>1) Easily setup duplicate test/development systems or build from scratch if
>we need/want to.
>
>  
>
As I stated, I only build test and development systems from Production 
backups.

I can't stress enough the DATA in the database is as critical as the 
schema itself.
Without that data, development and testing is done in a vacum, all too 
often I've had patches
kill the PRODUCTION system because they were 'tested" against a customer 
table with 200 dummy
records.Poorly constructed update statements or logic can choke and run 
out of memory or take
several days to run when they hit REAL production data or relationships 
the developer or builder of
dummy data didn't anticipate.
"Really, a customer can have multiple addresses, or even NONE ?!?!?. I 
didn't know that."
Now I have a dead/hung or very SLOW production system to recover and as 
the DBA, I'm the one the
users blame.

>2) They provide clear and detailed documentation on the evolution of the
>database schema
>
>  
>
My current system is 16 months old. I have 983 incremental changes. I 
HAVE all those scripts.
Using them alone to deduce a general picture of the evolution of the 
database schema is of no value to me.
Those 5-6,000 script snippets may as well be printed and piled in the 
middle of the floor. Without some other
method to relate those scripts to the database itself that are worthless.
Automated tools that run on a schedule and document schema changes to an 
archive for later reference
is handy, As are good design tools like ERwin or Oracle Designer, if 
kept current.

>3) They let us correlate this evolution to issues in our issue tracker.
>
>  
>
Yes, they are critical to deduce what was done and why. All versions of 
code in my repository are tied to
Specific SCRs in out problem tracking and assignment system.

>4) They give us a good starting point for building scripts that implement
>the database schema on other servers (something we're likely to be faced
>with this summer).
>
>  
>
I have tools that will generate ALL the SQL required to build an emply 
copy of a database based on
extracting schema info from a running system. That is how I build 
databases in this case.

>>It's hard enough to track system schema changes as shown here and in
>>previous posts, but it is nearly IMPOSSIBLE to
>>track production database data changes from a live system, after all the
>>application is constantly changing  data by definition.
>>I think the only real approach is to track overt system changes from the
>>developers for other purposes but to admit the
>>only way to get a copy of a database at a specific point is thru system
>>backups and  point-in-time recovery capabilities
>>of the RDBMS itself.
>>    
>>
>
>Thanks for your detailed reply.
>
>I don't think all the aspects of your system apply to us (we are not keeping
>configuration data in the DB) but it's helpful to see that I'm not missing
>"easy" solution.  Misery loves company, I guess.
>
>  
>
I don't know what you mean by "keeping configuration data in the DB".

Marc


---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@subversion.tigris.org
For additional commands, e-mail: users-help@subversion.tigris.org

Re: Managing SQL in the repository

Posted by Gary Affonso <gl...@greywether.com>.

On 5/29/04 10:47 PM, "m christensen" <df...@xmission.com> wrote:

> First Off I'll ASSUME y'all are not falling for the Micro$oft pitfall of
> bastardizing an
> official 30 year old standard definition, i.e."SQL" into a closed source
> proprietary brand name for
> Microsoft SQL Server.

Good lord, man.  What do you think I am?  Of course I know the difference.
Does anybody on this list really not?

> The approach taken will depend on the target of your development effort.
> The 2 general targets I see are
> a shipped product and incremental in-house system modification.

We are an incremental, in-house project so I'll respond to that portion of
your email.

> A. Actual backup copies of  the Production database are restored as a
> starting point for databased used in part D.
> B. No Diff.
> C. All SQL EVER run against The Production database are checked in as
> patches. Both DDL & DML are included.

So to rebuild the database schema in this scenario, you'd have to run the
initial "create" script and then run the series of alter scripts that
brought everything "current"?  Is that right?

>    In other words, I don't care if it's 'Just a data change' or major
> table schema change.
>    It's a script, it's checked in, and it has been tested or it
> doesn't run in production, Period.

I'm confused by this.  You say above that you script (and revision-control)
all changes to the db including those for data.  But then you say:

> It's hard enough to track system schema changes as shown here and in
> previous posts, but it is nearly IMPOSSIBLE to
> track production database data changes from a live system, after all the
> application is constantly changing  data by definition.

Can you clarify?

> D. ALL Modern Decent Production level RDBMS Systems have a "Point in
> Time" recovery capability. IMHO.

We are, of course, doing incremental backups.  My desire for a DDL "script
history" isn't just for their recovery value.

A DDL script history also lets us:

1) Easily setup duplicate test/development systems or build from scratch if
we need/want to.

2) They provide clear and detailed documentation on the evolution of the
database schema

3) They let us correlate this evolution to issues in our issue tracker.

4) They give us a good starting point for building scripts that implement
the database schema on other servers (something we're likely to be faced
with this summer).

> It's hard enough to track system schema changes as shown here and in
> previous posts, but it is nearly IMPOSSIBLE to
> track production database data changes from a live system, after all the
> application is constantly changing  data by definition.
> I think the only real approach is to track overt system changes from the
> developers for other purposes but to admit the
> only way to get a copy of a database at a specific point is thru system
> backups and  point-in-time recovery capabilities
> of the RDBMS itself.

Thanks for your detailed reply.

I don't think all the aspects of your system apply to us (we are not keeping
configuration data in the DB) but it's helpful to see that I'm not missing
"easy" solution.  Misery loves company, I guess.

- Gary

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@subversion.tigris.org
For additional commands, e-mail: users-help@subversion.tigris.org

Re: Managing SQL in the repository

Posted by m christensen <df...@xmission.com>.

Gary Affonso wrote:

>On 5/28/04 2:42 PM, "Brad Rhoads" <br...@zethcon.com> wrote:
>
>  
>
>>How do we
>>deal with the SQL as it is functionally related to other parts of the
>>system?
>>    
>>
>
>We're dealing this this one, too, right now and our solution for Store
>Procedures is this:
>
>1) We require that our developers "script" their schema-related (not data
>related) changes to the script.  In short, they're not allowed to simply
>open up an sproc-editor and make a quick change.  They have to put the
>change in a "create sproc" script.
>
>We keep these scripts in a directory on our repository.
>
>2) Ensure that the "script" is written such that it drops the old resource
>before adding the new one.  In short we disallow the use of alter statements
>for sprocs.
>
>3) We require that there is one file for every sproc on the database.
>
>
>And then we just manage those scripts as if they were standard code.
>
>This gets us all the normal repository abilities (snapshots, branches,
>merges, diffs) on sproc-related SQL resources.
>
>----
>
>Tables are trickier because we can't "drop" the old table before adding new
>one (the data would go away).  This means we're forced to use alter
>statements to make table-related mods (instead of the drop/replace scheme we
>use for sprocs).
>
>And that means that a table's given state is a combination of the original
>create-scripts and a handful of alter scripts that have come after its
>original creation.  Which means no ability to easily diff different table
>versions.
>
>So for table-related DDL, I'm still not sure how we'll be handling it, I'd
>love to hear some suggestions.
>
>- Gary
>
>  
>
First Off I'll ASSUME y'all are not falling for the Micro$oft pitfall of 
bastardizing an
official 30 year old standard definition, i.e."SQL" into a closed source 
proprietary brand name for
Microsoft SQL Server.

So, assuming we are talking about 'SQL' as a standard language used by 
over a dozen RDBMS
systems answers are by definition generalities....

My particular target Database is Oracle, but the issues are common.

#1 Application 'Source'  rev. ctl. is fairly straightforward.
#2 RDBMS data and Schema changes are problematic as mentioned above.
#3 Compiled binaries often pull parameters or configuration info from 
the database, this means a particular
      Binary is a product of the "Source Code", the database Schema AND 
the data in the database.

In the intrest of brevity I'll skip the proofs of the above 3 
statements, if the validity of the points are in question
I'd be happy to elaborate, but I'll assume they are a given.

The approach taken will depend on the target of your development effort. 
The 2 general targets I see are
a shipped product and incremental in-house system modification. 

The way I manage a 'shipped product' development is based on several 
general principles and steps.
A. Representative copies of  supported legacy databases are maintained.
B. Application source code is managed in a Rev. Ctl. system.
C. Database build and migration scripts are managed as part of the 
source tree.
      As such there is a Create table, Alter table or similar 'single 
point' which defines every specific table.
       If the table is new, there is a create table statement, if a new 
column is needed, the create statement is changed.
        How this applies to alter statement or migration code is a 
exercise for the reader.
      The point is there is NOT a stream of schema modifications spread 
thru hundreds of  SQL 'Patch scripts'.
D. A 'current' database is built 'From Scratch' via the Creation / 
Migration scripts for EVERY build of the
     Development, Testing and Production Databases.
E. Application Binaries are built from the source in the repository 
against the database built from the same version of
     the build / patch scripts.
F. Those Binaries are checked into the Rev. Ctl Software (Some may not 
agree this is required or appropriate, I think is is, IMHO).
G. Those biraries go thru the system testing process and are what gets 
released, They are NEVER rebuilt without retesting.

An ongoing development effort has more pitfalls, I'll only detail the 
differences from above.

A. Actual backup copies of  the Production database are restored as a 
starting point for databased used in part D.
B. No Diff.
C. All SQL EVER run against The Production database are checked in as 
patches. Both DDL & DML are included.
     In other words, I don't care if it's 'Just a data change' or major 
table schema change.
     It's a script, it's checked in, and it has been tested or it 
doesn't run in production, Period.
D. ALL Modern Decent Production level RDBMS Systems have a "Point in 
Time" recovery capability. IMHO.
     I would never run a Production RDBMS with ongoing development with 
this feature turned off.
    As such, I have a backup copy of all files and logs required to 
rebuild a full and complete copy of the database
    as of 'May 30 2004 at 15:34:12'.
E. Some SQL will need to be run before the binaries are rebuilt, and 
some can only be run after, how these dependencies are
     Tracked and enforced is an excersise for the reader.
     The current  "Timestamp" or System Change Number is logged to 
identify a specific recoverable point-in-time of the database itself.
     Required pre-build patch scripts are pulled from the repository and 
run.
     Binaries are built from a specific rev of source code and the 
database at a specific point in time.
     Required post-build patch scripts are pulled from the repository 
and run.
  
F. No Diff.
G. No Diff.
H. Production moves also need to execute the SQL PRE & POST build patch 
scripts although the binaries are not rebuilt.

It's hard enough to track system schema changes as shown here and in 
previous posts, but it is nearly IMPOSSIBLE to
track production database data changes from a live system, after all the 
application is constantly changing  data by definition.
I think the only real approach is to track overt system changes from the 
developers for other purposes but to admit the
only way to get a copy of a database at a specific point is thru system 
backups and  point-in-time recovery capabilities
of the RDBMS itself.


Marc


---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@subversion.tigris.org
For additional commands, e-mail: users-help@subversion.tigris.org

Managing SQL in the repository (was: Organization Advise)

Posted by Gary Affonso <gl...@greywether.com>.

On 5/28/04 2:42 PM, "Brad Rhoads" <br...@zethcon.com> wrote:

> How do we
> deal with the SQL as it is functionally related to other parts of the
> system?

We're dealing this this one, too, right now and our solution for Store
Procedures is this:

1) We require that our developers "script" their schema-related (not data
related) changes to the script.  In short, they're not allowed to simply
open up an sproc-editor and make a quick change.  They have to put the
change in a "create sproc" script.

We keep these scripts in a directory on our repository.

2) Ensure that the "script" is written such that it drops the old resource
before adding the new one.  In short we disallow the use of alter statements
for sprocs.

3) We require that there is one file for every sproc on the database.

And then we just manage those scripts as if they were standard code.

This gets us all the normal repository abilities (snapshots, branches,
merges, diffs) on sproc-related SQL resources.

----

Tables are trickier because we can't "drop" the old table before adding new
one (the data would go away).  This means we're forced to use alter
statements to make table-related mods (instead of the drop/replace scheme we
use for sprocs).

And that means that a table's given state is a combination of the original
create-scripts and a handful of alter scripts that have come after its
original creation.  Which means no ability to easily diff different table
versions.

So for table-related DDL, I'm still not sure how we'll be handling it, I'd
love to hear some suggestions.

- Gary

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@subversion.tigris.org
For additional commands, e-mail: users-help@subversion.tigris.org