Posted to dev@subversion.apache.org by "Glenn A. Thompson" <gt...@cdr.net> on 2002/05/11 20:51:27 UTC

DB and Node sanity.

Folks:

Will whatever Node sanity you arrive at be three separate thingies?  Or all one
blob sort of thing?  I'm hoping for three separate thing-a-ma-bobs or
(do-hickys:-).  When you guys mention space conservation in your discussions, I
get a little squeamish.  SQL DBs piss disk away like beer at a football game.  Am
I going to get beat about the head and shoulders for their wastefulness?  I was
planning on keeping this ID as a number in the SQL tables.  But I don't want to
argue about it.  I will do strings if you guys want me to.  One thing I do prefer
is that I keep any ID (or potentially indexed sort of thing) a fixed length if
possible.  Some DBs have the ability to widen a column on the fly if needed, or
you can always export and then import into the wider columns.

I've been getting my DB dev environment sorted out.  I have resurrected (twice,
damn disk drives) a Sparc Ultra10 with Ultra2 SCSI drives running Oracle 9i and
MySQL 4.0.1.  I may put up PostgreSQL as well.  Anyhoo.  It will serve only as my
DB box.

I want to give ODBC a run around the block.  I've never used it before, but there
seem to be many recent improvements under 3.51.x.  I've always used Pro*C or
MySQL's native interface from my C code.  However, the bulk of my DB usage has
been from Java (JDBC) and Perl (DBI).  ODBC feels more like them to me.  Plus the
management layer may come in handy.  Given all the heated debates over performance,
I thought I would give the list an opportunity to veto ODBC before I waste any
time evaluating it.  My mind is hardly made up.  I thought some of the Windows
guys might want to weigh in.  I've always understood them to be the biggest users
of ODBC.

I have to run.  I'll check back to the list later tonight.

Thanks,
gat

Karl Fogel wrote:

> Greg Stein <gs...@lyra.org> writes:
> > > Also, Greg, you argued for txn keys to be base36 when you were here in
> > > Chicago last.  Well, not "argued", because we all agreed with you, but
> > > you were in fact the person who emphasized it first.
> >
> > Wha??!?!  Um, no. If I *really* did, then I was smoking crack because that
> > certainly isn't what we want to build.
>
> Oh -- hunh, I do remember it pretty clearly, but since you don't
> really hold that position, it probably means I misunderstood what you
> were saying at the time.
>
> > Base-36 (or your suggestion for base-62) is way beyond the insane.
> > Completely non-standard in all respects, none of the speed Brane was talking
> > about with base-16 encoding, and still obfuscates the underlying integer
> > that these keys are derived from.
>
> Base 36 is what I'd like to do; I've already presented the reasons.
> Calling it "way beyond insane" doesn't add any substance to the
> discussion.
>
> > Work out the math. If you manage to *sustain* one per second, you've got 136
> > years worth (unsigned 32-bit number). You'll definitely burst at times, but
> > a sustained rate of one per second, year after year, is absolutely amazing.
> > Running out is simply not going to happen. If there is any possible scenario
> > where we believe that 4 billion transactions will actually occur, then hell:
> > make it an apr_uint64_t.
>
> This is not "amazing" at all.  Not only humans use repositories; other
> programs will start using them too.  And Moore's Law still held last
> time I checked.  Why not make *sure* the problem is solved, once and
> for all?  It's trivial for us to do -- I know, because I did it for
> txns already and it was no effort -- it doesn't affect performance in
> any meaningful way, and (for me at least, don't know about others)
> improves maintainability by not implying ordering when there is no
> ordering going on, and by making IDs a bit more readable.
>
> When the designers of IPv6 considered network address allocation
> (which was already supposed to be more than enough at 32 bits, years
> ago when IPv4 was designed), they jumped all the way to 128 bits,
> wisely skipping 64.  They realized that the *rate* of consumption can
> increase unexpectedly -- since it already happened once, as everyone
> and their toaster started wanting an IP address.  (Nevertheless,
> within our lifetimes I expect to see debates about how to handle the
> impending exhaustion of 128 bits.)
>
> +1 on base36 with `const char *'; -1 on integers whatever the
> marshalling.
>
> I believe we've both presented our complete arguments for and against.
> If you have something new and constructive to add, please do!  FUD
> like "way beyond insane" and "completely non-standard in all respects"
> doesn't count :-).
>
> -K
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
> For additional commands, e-mail: dev-help@subversion.tigris.org
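
[Editor's note] The "136 years" figure in the quoted exchange above is easy to
check, and the same arithmetic shows how sensitive it is to the sustained
transaction rate, which is the point Karl raises in reply. The following is only
a rough sketch: the function name, the sample rates, and the 365.25-day year are
assumptions made for illustration and do not come from the thread or from the
Subversion code.

```python
# Rough check of the "136 years" figure; names here are illustrative only.
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60  # assumption: 365.25-day year


def years_to_exhaust(bits, txns_per_second=1.0):
    """Years until an unsigned counter of `bits` bits runs out of values."""
    return (2 ** bits) / txns_per_second / SECONDS_PER_YEAR


if __name__ == "__main__":
    # Sample rates are arbitrary; they only show how the figure scales.
    for rate in (1, 100, 10000):
        print(f"32-bit at {rate}/s: {years_to_exhaust(32, rate):.2f} years")
    print(f"64-bit at 1/s: {years_to_exhaust(64):.3g} years")
```

At one transaction per second a 32-bit counter lasts roughly 136 years, as
stated; at a sustained 100 per second it would last a little over one year.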



Re: DB and Node sanity.

Posted by "Glenn A. Thompson" <gt...@cdr.net>.
Hey

>    - The DB backend can/should impose a key width limit.  In the
>      unlikely event that some repos actually bumps up against the
>      limit, it's no big deal to resize.
>

Cool!

>
>    - But the in-memory form of the keys should not impose any limit.
>      That way, no code needs to change just because someone expanded
>      their backend's width.
>
> (Hmmm, you know, taken out of context by a non-programmer, that last
> sentence could really sound odd.)
>

Even to a programmer:-)

gat



Re: DB and Node sanity.

Posted by Karl Fogel <kf...@newton.ch.collab.net>.
cmpilato@collab.net writes:
> Regardless of what is chosen, we will not allow a schema in which the
> keys can grow without bound (heh...though in this case, we sure wish
> we could easily let them grow forever :-).  If we choose a 64-bit
> integer, you can simply make sure to set your table width to however
> many characters are needed to hold the largest of those things.

Ah, okay, now I think I know how to make the distinction I've been
trying to make:

   - The DB backend can/should impose a key width limit.  In the
     unlikely event that some repos actually bumps up against the
     limit, it's no big deal to resize.

   - But the in-memory form of the keys should not impose any limit.
     That way, no code needs to change just because someone expanded
     their backend's width.

(Hmmm, you know, taken out of context by a non-programmer, that last
sentence could really sound odd.)
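
[Editor's note] The distinction drawn above might be sketched as follows: the
in-memory key is an ordinary string that simply gains a digit when it overflows,
while the width limit is enforced only at the point where a key is handed to the
backend. This is a minimal illustration under stated assumptions, not
Subversion's implementation. The names `next_key`, `check_backend_width`, and
`KeyTooWide`, and the lowercase 0-9a-z alphabet for base 36, are all
hypothetical.

```python
import string

# Assumption: base-36 keys use the digits 0-9 followed by lowercase a-z.
DIGITS = string.digits + string.ascii_lowercase


class KeyTooWide(Exception):
    """Raised when a key no longer fits the backend's configured width."""


def next_key(key):
    """Return the base-36 successor of `key`; the string grows as needed."""
    chars = list(key)
    i = len(chars) - 1
    while i >= 0:
        pos = DIGITS.index(chars[i])
        if pos < len(DIGITS) - 1:
            chars[i] = DIGITS[pos + 1]
            return "".join(chars)
        chars[i] = "0"  # this digit wraps; carry into the next one
        i -= 1
    return "1" + "".join(chars)  # every digit wrapped: add a new leading digit


def check_backend_width(key, column_width):
    """Only the storage layer knows about, and enforces, a width limit."""
    if len(key) > column_width:
        raise KeyTooWide(f"key {key!r} exceeds column width {column_width}")
    return key
```

With this split, widening the backend column means changing only the
`column_width` value passed to the check; `next_key` itself is unaffected.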


Re: DB and Node sanity.

Posted by cm...@collab.net.
"Glenn A. Thompson" <gt...@cdr.net> writes:

> Folks:
> 
> Will whatever Node sanity you arrive at be three separate thingies?
> Or all one blob sort of thing?  I'm hoping for three separate
> thing-a-ma-bobs or (do-hickys:-).  When you guys mention space
> conservation in your discussions, I get a little squeamish.  SQL DBs
> piss disk away like beer at a football game.  Am I going to get beat
> about the head and shoulders for their wastefulness?  I was planning
> on keeping this ID as a number in the SQL tables.  But I don't want
> to argue about it.  I will do strings if you guys want me to.  One
> thing I do prefer is that I keep any ID (or potentially indexed sort
> of thing) a fixed length if possible.  Some DBs have the ability to
> widen a column on the fly if needed, or you can always export and
> then import into the wider columns.

Regardless of what is chosen, we will not allow a schema in which the
keys can grow without bound (heh...though in this case, we sure wish
we could easily let them grow forever :-).  If we choose a 64-bit
integer, you can simply make sure to set your table width to however
many characters are needed to hold the largest of those things.
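
[Editor's note] For a fixed-width column, the number of characters needed
follows directly from the integer size and the text encoding chosen for the key.
The sketch below works that out. The function name `max_key_width` is
hypothetical, and the bases shown are simply the ones mentioned in this thread,
plus decimal.

```python
def max_key_width(bits, base):
    """Characters needed to write the largest unsigned `bits`-bit value in `base`."""
    largest = 2 ** bits - 1
    width = 1
    while largest >= base:
        largest //= base  # strip one digit per iteration
        width += 1
    return width


if __name__ == "__main__":
    for base in (10, 16, 36):
        print(f"64-bit key in base {base}: {max_key_width(64, base)} characters")
```

A 64-bit value needs 20 characters in decimal, 16 in base 16, and 13 in base 36.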
