Posted to derby-user@db.apache.org by Legolas Woodland <le...@gmail.com> on 2006/05/03 18:08:59 UTC

Re: Spawning Data on Multiple Directories

Rodrigo Madera wrote:
> Hello to all,
>
> I was wondering how can I span Derby's Database on multiple disks. For
> example, if I were to use:
>
> - C:\Somewhere\Derby1     40 GB
> - D:\ANicePlace\Derby2     20 GB
> - F:\Looking\Good\Dir\Derby3    15GB
>
> Is this possible?
>
> Thanks for any input,
> Rodrigo
>
Hi,
Can Derby handle such a big database effectively in a multi-user
environment?
I think 75 GB is a very big database.
Does anyone have experience with databases of that size on Derby?

Re: Spawning Data on Multiple Directories

Posted by Rodrigo Madera <ro...@gmail.com>.

Well, it's been a hot discussion on the other thread this one generated,
discussing politics...

I prefer to remain in my field. I will file a JIRA and will check what
has to be done. If the StorageFactory is correctly designed then it
will not be that hard to implement (fingers crossed).

Let's Rock and Roll...

Rodrigo

On 5/4/06, Andrew McIntyre <mc...@gmail.com> wrote:
> On 5/4/06, Rodrigo Madera <ro...@gmail.com> wrote:
> >
> > What I'm asking for is available in other DBMS, where you specify how
> > many bytes to store in each directory. So you can make special
> > combinations on this.
> >
> > Indeed there is a select group of people to actually take advantage of
> > such a feature.
>
> I'd suggest filing an enhancement request in JIRA. As you point out,
> there may be others who would like such a feature and if there is a
> JIRA filed for it, those interested can track any progress.
>
> > Of course, people who are playing with embedding databases in web
> > pages will be concerned. In that case, turn the plugin off. People who
> > upload the Derby engine in their wrist watch will like it to be small
> > too... but then again, is Derby supposed to be functional? small? 100%
> > Java?... Why not all of the above? Just make it modular.
>
> Derby is somewhat modularized but could probably use some improvement
> in this area. But there is a StorageFactory interface that might make
> it easier for you to write support for this feature. For example, one
> user wrote an all-in-memory StorageFactory:
>
> http://issues.apache.org/jira/browse/DERBY-646
>
> Perhaps you could write a StorageFactory that allowed you to span
> multiple disks. I don't know if that's possible, since I'm not an
> expert in the store code, but at first glance it seems like it might.
>
> > I would love to contribute in the spirit of open source and make such
> > an improvement, but I just don't have time to Get To Know The Code
> > (TM) first. If there was a structured and accepted plugin engine I
> > would definitely invest two weekends on this.
>
> After filing the enhancement request in JIRA, I would suggest
> following up on derby-dev. You can probably get some pointers from
> people knowledgeable in the relevant code who could help you scope out
> the project and help you determine if this is possible and something
> you have time for.
>
> Best regards,
> andrew
>

Re: Spawning Data on Multiple Directories

Posted by Andrew McIntyre <mc...@gmail.com>.

On 5/4/06, Rodrigo Madera <ro...@gmail.com> wrote:
>
> What I'm asking for is available in other DBMS, where you specify how
> many bytes to store in each directory. So you can make special
> combinations on this.
>
> Indeed there is a select group of people to actually take advantage of
> such a feature.

I'd suggest filing an enhancement request in JIRA. As you point out,
there may be others who would like such a feature and if there is a
JIRA filed for it, those interested can track any progress.

> Of course, people who are playing with embedding databases in web
> pages will be concerned. In that case, turn the plugin off. People who
> upload the Derby engine in their wrist watch will like it to be small
> too... but then again, is Derby supposed to be functional? small? 100%
> Java?... Why not all of the above? Just make it modular.

Derby is somewhat modularized but could probably use some improvement
in this area. But there is a StorageFactory interface that might make
it easier for you to write support for this feature. For example, one
user wrote an all-in-memory StorageFactory:

http://issues.apache.org/jira/browse/DERBY-646

Perhaps you could write a StorageFactory that allowed you to span
multiple disks. I don't know if that's possible, since I'm not an
expert in the store code, but at first glance it seems like it might.
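
As a rough illustration of the idea only: the per-directory byte limits
Rodrigo describes could be modeled with a small first-fit placement policy
like the one below. This is not Derby code and it does not implement the
StorageFactory interface; the class name DirectoryQuotaAllocator and its
methods are made up for the example, and a real implementation would have
to plug such a policy into the store layer.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Hypothetical sketch: decides which directory a new file of a given size
 * should go to, given a byte quota per directory. Not part of Derby.
 */
final class DirectoryQuotaAllocator {
    // Insertion order is kept so directories are tried in the order added.
    private final Map<String, Long> capacity = new LinkedHashMap<>();
    private final Map<String, Long> used = new LinkedHashMap<>();

    /** Registers a directory with the number of bytes it may hold. */
    void addDirectory(String path, long capacityBytes) {
        capacity.put(path, capacityBytes);
        used.put(path, 0L);
    }

    /** First-fit: reserves space in the first directory with enough quota left. */
    String allocate(long bytes) {
        for (Map.Entry<String, Long> e : capacity.entrySet()) {
            long inUse = used.get(e.getKey());
            if (inUse + bytes <= e.getValue()) {
                used.put(e.getKey(), inUse + bytes);
                return e.getKey();
            }
        }
        throw new IllegalStateException("no directory has " + bytes + " bytes free");
    }

    /** Sum of all registered quotas, in bytes. */
    long totalCapacity() {
        long total = 0;
        for (long c : capacity.values()) {
            total += c;
        }
        return total;
    }
}
```

With the three directories from the original question registered at 40 GB,
20 GB and 15 GB, totalCapacity() reports 75 GB, and allocate() fills the
first directory before moving on to the next one that still has room.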

> I would love to contribute in the spirit of open source and make such
> an improvement, but I just don't have time to Get To Know The Code
> (TM) first. If there was a structured and accepted plugin engine I
> would definitely invest two weekends on this.

After filing the enhancement request in JIRA, I would suggest
following up on derby-dev. You can probably get some pointers from
people knowledgeable in the relevant code who could help you scope out
the project and help you determine if this is possible and something
you have time for.

Best regards,
andrew

RE: Spawning Data on Multiple Directories

Posted by de...@segel.com.

See my comments below

> -----Original Message-----
> From: Rodrigo Madera [mailto:rodrigo.madera@gmail.com]
> Sent: Thursday, May 04, 2006 3:09 PM
> To: Derby Discussion; msegel@segel.com
> Subject: Re: Spawning Data on Multiple Directories
> 
> Well, back to the original subject...
> 
> Yes, it's a real benefit to be able to use different _directories_ in
> the three only for certain users. In an enterprise where you have
> large RAIDs and special RAM disks this isn't attractive, but since
> Derby is not in that market space we can just leave that to
> PostgreSQL.
> 
[mjs] 
Well, don't sell your idea short.

How databases store their data on the physical disk is very important with
respect to size and performance.

Suppose you wanted to use Derby as part of a digital jukebox. You may want
to store the titles and artist information in one space, and then store the
actual MP3s on a different disk. 

> What I'm asking for is available in other DBMS, where you specify how
> many bytes to store in each directory. So you can make special
> combinations on this.
> 
> Indeed there is a select group of people to actually take advantage of
> such a feature.
> 
> Anyways, this feature is not going to have an impact on footprint. And
> yes, Derby needs an architectural improvement to allow plugins (or raw
> branches) for people who would like to add technical candy to it. It's
> nice to have a 2MB footprint, but in my case, I have one thousand
> times that in RAM, so I just don't care.
> 
[mjs] 
Uhm, well, actually it could. Take a step back and look at the larger
picture. You're now asking that Derby store certain tables in certain
spaces, and that these spaces can occur anywhere on disk. So now instead of
managing a single source of data, you now have to manage multiple sources.

> Of course, people who are playing with embedding databases in web
> pages will be concerned. In that case, turn the plugin off. People who
> upload the Derby engine in their wrist watch will like it to be small
> too... but then again, is Derby supposed to be functional? small? 100%
> Java?... Why not all of the above? Just make it modular.
> 
[mjs] 
Exactly.

But to make Derby modular, and to do it right would mean taking the time to
do a redesign of Derby. And that's the crux of it. Revamping the storage
infrastructure is a large project, larger than just a Jira issue.

You could say hey! Let's make authentication modular and possibly support
PAM. 
(Well you get the idea.) 

Since IBM and Sun have both the resources and a vested interest in Derby,
then it would make sense that they step up to the plate.

> I would love to contribute in the spirit of open source and make such
> an improvement, but I just don't have time to Get To Know The Code
> (TM) first. If there was a structured and accepted plugin engine I
> would definitely invest two weekends on this.
> 
[mjs] 
Two weekends? LOL...

It's actually a larger effort than that.
Of course before you volunteer, you really need to read the Apache licensing
and then you need to either sign off or get your employer's sign off on
indemnifying Apache for any potential litigation you may face on code that
you have provided.

Re: Spawning Data on Multiple Directories

Posted by Rodrigo Madera <ro...@gmail.com>.

Well, back to the original subject...

Yes, it's a real benefit to be able to use different _directories_ in
the three only for certain users. In an enterprise where you have
large RAIDs and special RAM disks this isn't attractive, but since
Derby is not in that market space we can just leave that to
PostgreSQL.

What I'm asking for is available in other DBMS, where you specify how
many bytes to store in each directory. So you can make special
combinations on this.

Indeed there is a select group of people to actually take advantage of
such a feature.

Anyways, this feature is not going to have an impact on footprint. And
yes, Derby needs an architectural improvement to allow plugins (or raw
branches) for people who would like to add technical candy to it. It's
nice to have a 2MB footprint, but in my case, I have one thousand
times that in RAM, so I just don't care.

Of course, people who are playing with embedding databases in web
pages will be concerned. In that case, turn the plugin off. People who
upload the Derby engine in their wrist watch will like it to be small
too... but then again, is Derby supposed to be functional? small? 100%
Java?... Why not all of the above? Just make it modular.

I would love to contribute in the spirit of open source and make such
an improvement, but I just don't have time to Get To Know The Code
(TM) first. If there was a structured and accepted plugin engine I
would definitely invest two weekends on this.

Thanks for the input to all,
Rodrigo

On 5/4/06, derby@segel.com <de...@segel.com> wrote:
>
>
> > -----Original Message-----
> > From: Rodrigo Madera [mailto:rodrigo.madera@gmail.com]
> > Sent: Wednesday, May 03, 2006 1:28 PM
> > To: Derby Discussion; mikem_app@sbcglobal.net
> > Subject: Re: Spawning Data on Multiple Directories
> >
> > In that case, what about a proposal for the development team to allow
> > such one-database-multiple-directories architectural improvements?
> >
> [mjs]
> Well,
>
> Why stop at "multiple directories"?
>
> Not to get on a soap box, but what Rodrigo is asking for is the ability to
> create chunks or table spaces on either cooked or raw file system space.
> (Each chunk could be allocated on different disks/filesystems/etc...)
>
> The crux is that such a design will increase the footprint, and limit
> Derby's viability in certain niche markets. Heck, just the other day,
> someone was touting a URL to someone who embedded Derby into a web page...
>
> This goes back to the larger issue: the need to look at
> redesigning Derby's framework to allow for options to be installed/removed
> at the time of deployment. (And there are some headaches even there too...)
>
>
> The point is that Derby is growing in popularity and certain larger issues
> than just quick fixes that can be addressed in a single JIRA issue need to
> be addressed.
>
> Of course, what do I know? ;-)
>
> I'm just a voice in cyber space. I no longer work for the blue pig nor do I
> work for Sun. Both heavy weights in Derby and both have their own agendas.
> What is being asked, will require the deep pockets of either or both
> companies. My bet is on Sun. IBM is still digesting the Informix vs DB2
> issue and adding Derby to the mix is not in their best interest. (At least
> on the surface. ;-)
>
> -Gumby
>
> > Depending on the OS to mount several devices (or directories) into a
> > parent one is not something trivial. And the gain in having such
> > flexibility is a sure winner.
> >
> > Rodrigo
> >
> > On 5/3/06, Mike Matrigali <mi...@sbcglobal.net> wrote:
> > > The data of a single database can only be in a single directory in
> > > the current derby implementation, spreading that data across multiple
> > > devices is left to the OS.  The log can be placed on
> > > a separate disk.
> > >
> [SNIP]
>
>
>
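
For reference, Mike Matrigali's note quoted above says the log can be placed
on a separate disk even though the data stays in one directory. To the best
of my knowledge this is done with Derby's logDevice connection URL attribute
when the database is created. The sketch below only builds such a URL; the
helper class DerbyUrls and the database and log paths are made up for
illustration, and actually connecting would need the Derby driver on the
classpath.

```java
/** Hypothetical helper that only assembles a Derby connection URL. */
final class DerbyUrls {
    private DerbyUrls() {}

    /**
     * Builds a URL that creates the database at dbPath and asks Derby to
     * keep its transaction log under logDir (the logDevice attribute).
     */
    static String createWithSeparateLog(String dbPath, String logDir) {
        return "jdbc:derby:" + dbPath + ";create=true;logDevice=" + logDir;
    }
}
```

The resulting string would be passed to java.sql.DriverManager.getConnection.
That puts the log on a second disk, but it does not spread the data files
themselves across directories, which is what the original question asks for.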

Re: How Open Source Works (was Re: Spawning Data on Multiple Directories)

Posted by Andrew McIntyre <mc...@gmail.com>.

On 5/5/06, derby@segel.com <de...@segel.com> wrote:
>
> Yet, when users constantly ask for features that are not currently
> implemented in Derby, they are all pointed to creating a Jira entry.

Yes, because that's the first step to working on something here, and
the last for a user without the time to get involved. Even large,
complex, architectural issues can be resolved by a single user if they
are willing and able and have enough time and support. That's what
Google's Summer of Code is all about, for example. The effort involved
in implementing the originally requested feature is unknown, since all
that's been done so far is to describe the feature. Personally, I
don't think the effort involved with what Rodrigo requested is
necessarily that large.

> It is only when someone points out that there needs to be a redesign of Derby
> to allow for a systematic approach or to discuss the ramifications of these
> requested mods that you step in and try to "control" the discussion.

I have seen neither any evidence that this user's requested feature
requires a redesign of Derby nor have I seen that anyone is trying to
control discussion about the feature. In fact, I suggested that
discussion about the feature continue on the dev list if Rodrigo was
interested in working on it and that seems to be as far as it's gone.

andrew

Re: How Open Source Works (was Re: Spawning Data on Multiple Directories)

Posted by Daniel John Debrunner <dj...@apache.org>.

Jean T. Anderson wrote:

> A better subject would have been "How Open Source Works at Apache". In
> fact, the Derby web site uses "How Development Works at Apache":

I think, given the context it was in, it was obvious to most people that
you were talking about how Apache Derby open source community worked.

Dan.

Re: How Open Source Works (was Re: Spawning Data on Multiple Directories)

Posted by Andrew McIntyre <mc...@gmail.com>.

On 5/5/06, Michael Segel <ms...@segel.com> wrote:
>
> Posting in Jira and introducing it are two different things.

Filter by Resolved.

> I notice that there is an Andrew McIntyre who works for IBM in Atlanta.
> Is that you?

Nope.

andrew

RE: How Open Source Works (was Re: Spawning Data on Multiple Directories)

Posted by Michael Segel <ms...@segel.com>.

Uhm...
Posting in Jira and introducing it are two different things.

I notice that there is an Andrew McIntyre who works for IBM in Atlanta.
Is that you? 


> -----Original Message-----
> From: Andrew McIntyre [mailto:mcintyre.a@gmail.com]
> Sent: Friday, May 05, 2006 11:20 AM
> To: Derby Discussion; msegel@segel.com
> Subject: Re: How Open Source Works (was Re: Spawning Data on Multiple
> Directories)
> 
> On 5/5/06, derby@segel.com <de...@segel.com> wrote:
> >
> > Where in what you wrote is there any discussion of net new technology
> > being introduced by either Sun or IBM?
> 
> You can search in JIRA for new features. I suggest you do so, you
> might find the results enlightening.
> 
> andrew

Re: How Open Source Works (was Re: Spawning Data on Multiple Directories)

Posted by Andrew McIntyre <mc...@gmail.com>.

On 5/5/06, derby@segel.com <de...@segel.com> wrote:
>
> Where in what you wrote is there any discussion of net new technology being
> introduced by either Sun or IBM?

You can search in JIRA for new features. I suggest you do so, you
might find the results enlightening.

andrew

RE: How Open Source Works (was Re: Spawning Data on Multiple Directories)

Posted by de...@segel.com.

> -----Original Message-----
> From: Daniel John Debrunner [mailto:djd@apache.org]
> Sent: Friday, May 05, 2006 9:34 AM
> To: Derby Discussion
> Subject: Re: How Open Source Works (was Re: Spawning Data on Multiple
> Directories)
> 
> derby@segel.com wrote:
> 
> > Thus, any involvement of IBM and Sun is going to be limited in scope in
> > order to avoid the potential of "leakage" of patented and/or licensed IP.
> > It appears that their scope is limited to "bug fixes" only.
> 
> That is plainly not true.
> 
> IBM and Sun have assigned engineers on Derby that have implemented many
> features, written documentation & whitepapers, setup regression testing,
> run additional testing, produced releases, answered user & developer
> questions and also fixed bugs.
> 
[mjs] 
Ok. 
Let's look at what you just wrote and what I said.

Where in what you wrote is there any discussion of net new technology being
introduced by either Sun or IBM?

I would suggest that you consult with someone in IBM legal and have a
serious discussion about IP.

I seem to recall SCO's lawsuit against IBM contains issues on this topic.
SCO did make claims that IBM did allow some "IP leakage" to occur.... ;-)

Oh, there's more, but since this is an open forum, I'm not allowed to discuss
it. ;-)

But I digress.

It seems that you fail to recognize that the actions you have listed occur
because it's required by both IBM and Sun to support their commercial
operations... In short, these actions are consistent with a "break/fix"
model. So what I posted is not incorrect.

> People from outside IBM and Sun have also done all of the above. That's
> what Apache Derby open source is about, a community of people working
> together.
> 
> Dan.

[mjs] 
I think that there must be an outbreak of SVS occurring. (SVS == Silicon
Valley Syndrome). The paragraph was discussing IBM and Sun involvement in
Derby.  In context, both IBM and Sun as commercial entities that are
publicly traded companies have a fiduciary responsibility to their
shareholders to protect IP that has competitive value. 

Preventing leakage into code/projects under Apache is a high priority.
Again, talk to IBM's legal about IP and IP protection. 

I find your post highly interesting.  

Corporations do not perform altruistic acts.

Re: How Open Source Works (was Re: Spawning Data on Multiple Directories)

Posted by Daniel John Debrunner <dj...@apache.org>.

derby@segel.com wrote:

> Thus, any involvement of IBM and Sun is going to be limited in scope in
> order to avoid the potential of "leakage" of patented and/or licensed IP. It
> appears that their scope is limited to "bug fixes" only.

That is plainly not true.

IBM and Sun have assigned engineers on Derby that have implemented many
features, written documentation & whitepapers, setup regression testing,
run additional testing, produced releases, answered user & developer
questions and also fixed bugs.

People from outside IBM and Sun have also done all of the above. That's
what Apache Derby open source is about, a community of people working
together.

Dan.

Re: How Open Source Works (was Re: Spawning Data on Multiple Directories)

Posted by "Jean T. Anderson" <jt...@bristowhill.com>.

derby@segel.com wrote:
> Jean,
> 
> I sense some hostility on your part as well as a continued difficulty in
> comprehending the thread of posts. 

hostility? no. But you seem to be deliberately baiting debates and I
frankly don't have time/inclination. Other things on my plate need
attention.

 -jean

<snip>

RE: How Open Source Works (was Re: Spawning Data on Multiple Directories)

Posted by de...@segel.com.

Jean,

I sense some hostility on your part as well as a continued difficulty in
comprehending the thread of posts. 

Please see below....

> -----Original Message-----
> From: Jean T. Anderson [mailto:jta@bristowhill.com]
> Sent: Friday, May 05, 2006 8:52 AM
> To: Derby Discussion
> Subject: Re: How Open Source Works (was Re: Spawning Data on Multiple
> Directories)
> 
> derby@segel.com wrote:
> > Jean,
> >
> > In IBM speak, what is your value add proposition?
> 
> My value added to this thread is to clarify how the ASF works.
> 
[mjs] 
I don't believe how ASF "works" was ever an issue.  I am very familiar with
how different models of Open Source work as well as how it is possible for
companies to get involved in Open Source from a commercial perspective.

Specifically with respect to Derby, IBM did release their Cloudscape code
under Apache's Open Source License, as well as continuing to sell support
for Cloudscape. Sun Microsystems did announce a commercial licensed version
JavaDB which they also pledged to support and to maintain as part of the
derby code stream.  Neither is altruistic in their actions.

> Don't demand that others fix *your* issues.
[mjs] 
I don't recall ever demanding anything.

In fact, I don't recall that I ever raised any *issues* that were specific
to my adoption of Derby.

> Don't demand that others take the product in *your* direction.
> 
[mjs] 
Again, this is a tad confusing. What exactly is *my* direction?
I do not recall ever attempting to *dictate* the future of Derby or to
even exert an effort to control the direction of Derby. I'll leave that to
you.

What I did raise is that Derby is going to approach a junction where
requested features by a certain segment of core users will start to conflict
with those that adopted Derby based on its early core features. (ie small
embeddable footprint.)

What I and others have done is to suggest that there be some design changes
that would allow those who implement solutions using Derby, to use a "plug n
play" method of including those features they require, limiting the size of
the footprint. Note too that I didn't *dictate* the direction. Actually the
suggestion came from a Sun Employee. I merely stated the need or rather
identified the problem.  What I did state was that such a discussion *is*
required since any redesign of this magnitude would cross multiple Jira
issues along with requiring a concerted effort.

Again, this discussion is at such a high level, it has nothing to do with
the actual *development* of Derby code, just a proposal to determine the
future direction of development of Derby.

I merely pointed out that both Sun and IBM have the ability to gain from
this discussion on design and that they should step up to the plate and
allocate resources to work on the future product development of Derby. They
have the most to gain. 

Going back to your earlier post, software development under Apache, while an
Open Source model, still has to follow the same path as development of
software under a commercial entity.

Unless the software is in a "break/fix" mode, meaning that there will be no
net new features, just patches to correct existing defects, then there has
to be a centralized team directing the flow of resources.

It's also interesting to note that while both IBM and Sun have the most to
gain from any talks about the future direction, they also have the most to
lose.

From IBM's perspective, they need to protect their existing implementations,
thus they would want to continue to keep Derby's footprint small. Hence, the
"break/fix" model. It would also be consistent in that it would limit IP
"leakage" risk.  If anyone were to "donate" usable IP, then it would be to
IBM's advantage in that they would gain from that IP...

From Sun Micro's perspective, they lack a DB to compete with IBM, and HP and
Microsoft/Intel so they could use JavaDB as a full featured relational
database, which would increase the size of the foot print. In addition, Sun
can see that there is a benefit to maintaining the small footprint.

It is interesting that neither company, both with deep enough pockets to
fund the development costs, are willing to cede an IP advantage to the
other....

> You're welcome to become part of this community and work with others
> toward goals that are commonly agreed upon. We welcome any contributor.
> 
[mjs] 
Sorry, unlike you, I don't have the deep pockets of IBM to cover any costs
of indemnification that might arise.  There is no economic incentive for me
to contribute as a developer.

But since you asked earlier ...

I am here as a user of Derby.  I use Derby in solutions that I know can take
advantage of existing code and that the current defects have a minimal
impact. Actually I intend to use Derby in phase II of a project that I am
currently developing....

It is as a user that I have noticed the trend and that there has been a lack
of overt leadership by either IBM or Sun. (Both are looking for a free lunch
so to speak....)

It seems that you, as an IBM employee, are defensive when someone suggests
that IBM or even Sun step forward with resources to enhance Derby.

I would suggest that your time would be better spent on focusing on the
future direction of Derby, than trying to act as a gatekeeper.

But hey!
What do I know? ;-) 
It's not like I had the sense to talk to an attorney about the restrictions
that they wanted to place upon me as an employee, and about any potential IP
ownership issues I might face even if I did work on my own equipment on my
own time... Or that I had the sense to stay pigeonholed as a sales critter
when I could have worked in the lab as a developer....
Naw. Stuff like that takes too much thought process. ;-) Yet I digress.

What do I know? I guess, not a whole heck of a lot... ;-)

-Gumby

Re: How Open Source Works (was Re: Spawning Data on Multiple Directories)

Posted by "Jean T. Anderson" <jt...@bristowhill.com>.

derby@segel.com wrote:
> Jean,
> 
> In IBM speak, what is your value add proposition?

My value added to this thread is to clarify how the ASF works.

Don't demand that others fix *your* issues.
Don't demand that others take the product in *your* direction.

You're welcome to become part of this community and work with others
toward goals that are commonly agreed upon. We welcome any contributor.

 -jean

<snip>

RE: How Open Source Works (was Re: Spawning Data on Multiple Directories)

Posted by de...@segel.com.

Jean,

In IBM speak, what is your value add proposition?

Clearly, you admit that you've overstated your position on how Open Source
works and even when the topic is limited in scope, to deal specifically with
Apache's Open Source world, you still fail to grasp the significance of Sun
and IBM's involvement. You seem to focus on the mechanics of posting rather
than the topic itself. It's very disappointing. 

Yet I digress.

Let's be clear.
As posted earlier in this thread, Commercial and Open Source are *not*
mutually exclusive concepts. One concept deals with making money, the other
concept deals with the control of intellectual property rights.

Thus, due to IBM and Sun's reselling of support services, and their
commitment to maintain a single code stream, Derby has become a "commercial"
product.

Under Apache, there are no IP rights withheld from the public. Once
published, Apache code has a very liberal use and re-license policy. So much
so, in an effort to shield Apache from potential litigation, all
contributors agree to indemnify Apache. (I won't bore you with contractual
issues surrounding software development... )

Thus, any involvement of IBM and Sun is going to be limited in scope in
order to avoid the potential of "leakage" of patented and/or licensed IP. It
appears that their scope is limited to "bug fixes" only.

You are correct that this is the user thread. Sort of a "catch all" for all
the questions that users of Derby may have. Yet when a topic questions IBM
or Sun's involvement in Derby, you seem to jump in and try to act as a gate
keeper. (Seems ironic that Stanley's WWD is not off topic in your eyes...
;-)

Yet, when users constantly ask for features that are not currently
implemented in Derby, they are all pointed to creating a Jira entry. It is
only when someone points out that there needs to be a redesign of Derby to
allow for a systematic approach or to discuss the ramifications of these
requested mods that you step in and try to "control" the discussion.

The simple fact is that because this topic of discussion is at a high level,
it is not appropriate to occur on the developer's list. This discussion is
more conceptual in nature along with the politics of Apache and the
commercialization of Derby.

Thus, it is appropriate to have some level of discussion as to the pros and
cons regarding "feature requests" that many users have repeatedly asked for.

Keeping on topic in both the older thread and this new "thread"...

Sun and IBM both have the resources along with a potential economic gain in
advancing Derby. Therefore it is appropriate to ask them to "step up to the
plate" and donate resources to an effort to restructure Derby's framework.
The amount of work exceeds that of a simple patch or fix and then
regression testing. (Which they already do...)

So, why won't they step up? 

To a couple of posters... While there is "modular", this is different from
the term "modular" when used in context of "plug n play". Also if you take
the approach of enhancing Derby that all you want is a "quick fix" to solve
an immediate issue, the odds are that within 1 - 2 generations, Derby will
become a kludge.

The sad truth is that Open Source development must also follow the same
process of code development in commercial products. 

While Rodrigo asked for what he thought was a simple enhancement, there is
much more to this issue. Even though this is open source, you can not become
myopic and lose the overall perspective....

But hey! What do I know? ;-)

-Gumby

PS.

Jean,
I know your employment agreement with IBM. Had you been offered a retention
package, you would have had to sign an even more restrictive agreement.
While portions of said agreement would be unenforceable under Californian
law, the IP portion would be. (And this is true of any other IBM employee)

Therefore, as IBM employees, you are limited in your Open Source
involvement. Any IP that you may create for Derby in the form of an
enhancement would need to be vetted by IBM for two reasons. 1) To ensure
that it does not conflict with existing IP. 2) To ensure that the IP could
not be retained, giving IBM competitive advantage.
(Hint: Why do you think I never took a development role within IBM? ;-)

> -----Original Message-----
> From: Jean T. Anderson [mailto:jta@bristowhill.com]
> Sent: Thursday, May 04, 2006 3:48 PM
> To: Derby Discussion
> Subject: Re: How Open Source Works (was Re: Spawning Data on Multiple
> Directories)
> 
> derby@segel.com wrote:
> > Jean,
> >
> > Your subject line is a tad dangerous.
> 
> A better subject would have been "How Open Source Works at Apache". In
> fact, the Derby web site uses "How Development Works at Apache":
> 
> http://db.apache.org/derby/derby_comm.html#Understand+How+Development+Works+at+Apache
> 
> In any event, it's important to change the subject to move the off topic
> discussion out of the original thread. See that tip for mail list
> participation and more at:
> 
> http://www.apache.org/dev/contrib-email-tips.html
> 
> The derby-user list is for users to obtain (and exchange) help using the
> product -- it isn't the right place for discussions about product
> restructuring. Feel free to raise development-related proposals on
> derby-dev@db.apache.org -- that's where the development discussions and
> decisions get made by *all* development participants in the Derby
> community.
> 
> regards,
> 
>  -jean
> 
> <snip>

Re: How Open Source Works (was Re: Spawning Data on Multiple Directories)

Posted by "Jean T. Anderson" <jt...@bristowhill.com>.

derby@segel.com wrote:
> Jean,
> 
> Your subject line is a tad dangerous. 

A better subject would have been "How Open Source Works at Apache". In
fact, the Derby web site uses "How Development Works at Apache":

http://db.apache.org/derby/derby_comm.html#Understand+How+Development+Works+at+Apache

In any event, it's important to change the subject to move the off topic
discussion out of the original thread. See that tip for mail list
participation and more at:

http://www.apache.org/dev/contrib-email-tips.html

The derby-user list is for users to obtain (and exchange) help using the
product -- it isn't the right place for discussions about product
restructuring. Feel free to raise development-related proposals on
derby-dev@db.apache.org -- that's where the development discussions and
decisions get made by *all* development participants in the Derby
community.

regards,

 -jean

<snip>

RE: How Open Source Works (was Re: Spawning Data on Multiple Directories)

Posted by de...@segel.com.

Jean,

Your subject line is a tad dangerous. 

While Apache is "open source", it is only one model of OpenSource.
There's Eclipse, the Linux Kernel, and individual apps under GPL like
Dovecot that all have different models of control over the opensource.

But since you're both a member of Apache and IBM, let's focus on the three
individual groups: IBM, SUN and Apache since they all have a vested
interest.

First and foremost, being open source and being a commercial product are not
mutually exclusive. (Sendmail, Dovecot, MySQL and of course Derby prove
this...)

Derby is both an "open source" database, along with a "commercial" product.
Note that both IBM and SUN are selling support for Cloudscape and JavaDB
respectively. Both pledge to keep the codestreams consistent with Derby
which implies that any "fixes" made by IBM or Sun will be placed back into
the public eye.

Now your comments regarding Apache being a "volunteer" based approach to
software development... 

That, again, is not mutually exclusive with having either IBM or Sun step up
and "volunteer" some of the resources that are supporting Cloudscape /
JavaDB to help "direct" the future architecture of Derby.

And that's the point.

The amount of resources required to drive any major restructuring of Derby's
framework would require the involvement of a corporation stepping up and
"volunteering" the services of their employees. (Hmmm. Come to think of it,
you're an IBM employee... ;-)

The point of my postings is that such an endeavor is more than just a single
Jira issue; it would actually encompass several issues. It would also have a
major impact on existing Derby/Cloudscape implementations... Since both IBM
and Sun are selling support services, they both have a vested interest in
the future development of Derby. 

So, it again appears that you fail to grasp the situation and how software
is developed. 

I am kind of shocked that you don't realize the importance of identifying
stakeholders and aligning your ideas to their needs. That's consulting 101,
or actually software development 101. 

Specifically to this situation, both IBM and Sun hold potential resources
that have the necessary skills to drive this sort of redesign. And since
this *is* Apache, these resources could not even "volunteer" on an
individual level without first gaining the approval of their employer.

So, rather than add constructive input, you seem to detract from the issue
at hand.

This is not the first post from users of Derby who want advanced features.
Rather than look at each feature individually, it's important to see the
larger picture.

So why don't you focus on that?

But hey! What do I know? 
It's not like I've worked for Informix/IBM, or ran my own shop. ;-)

> -----Original Message-----
> From: Jean T. Anderson [mailto:jta@bristowhill.com]
> Sent: Thursday, May 04, 2006 10:45 AM
> To: Derby Discussion
> Subject: How Open Source Works (was Re: Spawning Data on Multiple
> Directories)
> 
> derby@segel.com wrote:
> <snip>
> > This goes back to the larger issue is the need to consider a look at
> > redesigning Derby's framework to allow for options to be installed/removed
> > at the time of deployment. (And there are some headaches even there too...)
> >
> > The point is that Derby is growing in popularity and certain larger issues
> > than just quick fixes that can be addressed in a single JIRA issue need to
> > be addressed.
> <snip>
> 
> It's time for a reminder to the list of how software development works
> at Apache because it *is* open source and it *isn't* commercial. This
> can sometimes be a bit confusing to new users who are familiar with how
> commercial products work.
> 
> The first thing to understand is *individuals* volunteer to tackle
> various tasks. Here are some relevant snippets from
> http://www.apache.org/foundation/how-it-works.html :
> 
>    "Projects are normally auto governing and driven by the people who
>     volunteer for the job. This is sometimes referred to as "do-ocracy"
>     -- power of those who do. This functions well for most cases."
> 
>    "All of the ASF including the board, the other officers, the
>     committers, and the members, are participating as individuals.
>     That is one strength of the ASF, affiliations do not cloud the
>     personal contributions.
> 
>     Unless they specifically state otherwise, whatever they post on any
>     mailing list is done *as themselves*. It is the individual
>     point-of-view, wearing their personal hat and not as a mouthpiece
>     for whatever company happens to be signing their paychecks right
>     now, and not even as a director of the ASF."
> 
> But, now lest it look like everybody is working diligently solely as an
> individual (and possibly at cross purposes with others), a lot of
> community coordination and contribution occurs on the Apache mail lists.
> The http://www.apache.org/foundation/how-it-works.html page has lots of
> helpful context.
> 
> At any rate, Derby depends on the community to work together as a whole
> to change the product.
> 
> How can Derby users actively contribute to these changes?
> 
> First, you can open Jira issues to report problems you have stumbled
> upon. More information is at
> http://db.apache.org/derby/DerbyBugGuidelines.html . However, remember
> that volunteers fix issues -- here's a valuable snippet from
> http://www.apache.org/foundation/faq.html#what-is-apache-NOT-about :
> 
>     What is Apache not about?
> 
>     To [... jean deleted text to highlight tail of sentence ...] demand
>     someone else to fix your bugs.
> 
> Second, you can vote on Jira issues that you feel strongly should be
> fixed. The Derby developers do look at the votes.
> 
> Third, if you want to participate even more in the Derby development
> process, you're welcome to subscribe to derby-dev@db.apache.org. That's
> where core development discussions occur and decisions get made.
> 
> Fourth, if you want to actually start doing development, the
> http://wiki.apache.org/db-derby/ForNewDevelopers page has wonderful
> suggestions and tips for new Derby developers.
> 
> Whether a particular user decides to volunteer or not, it's still
> helpful to understand how Apache works.
> 
>  -jean
>

How Open Source Works (was Re: Spawning Data on Multiple Directories)

Posted by "Jean T. Anderson" <jt...@bristowhill.com>.

derby@segel.com wrote:
<snip>
> This goes back to the larger issue is the need to consider a look at
> redesigning Derby's framework to allow for options to be installed/removed
> at the time of deployment. (And there are some headaches even there too...)
>  
> The point is that Derby is growing in popularity and certain larger issues
> than just quick fixes that can be addressed in a single JIRA issue need to
> be addressed.  
<snip>

It's time for a reminder to the list of how software development works
at Apache because it *is* open source and it *isn't* commercial. This
can sometimes be a bit confusing to new users who are familiar with how
commercial products work.

The first thing to understand is *individuals* volunteer to tackle
various tasks. Here are some relevant snippets from
http://www.apache.org/foundation/how-it-works.html :

   "Projects are normally auto governing and driven by the people who
    volunteer for the job. This is sometimes referred to as "do-ocracy"
    -- power of those who do. This functions well for most cases."

   "All of the ASF including the board, the other officers, the
    committers, and the members, are participating as individuals.
    That is one strength of the ASF, affiliations do not cloud the
    personal contributions.

    Unless they specifically state otherwise, whatever they post on any
    mailing list is done *as themselves*. It is the individual
    point-of-view, wearing their personal hat and not as a mouthpiece
    for whatever company happens to be signing their paychecks right
    now, and not even as a director of the ASF."

But, now lest it look like everybody is working diligently solely as an
individual (and possibly at cross purposes with others), a lot of
community coordination and contribution occurs on the Apache mail lists.
The http://www.apache.org/foundation/how-it-works.html page has lots of
helpful context.

At any rate, Derby depends on the community to work together as a whole
to change the product.

How can Derby users actively contribute to these changes?

First, you can open Jira issues to report problems you have stumbled
upon. More information is at
http://db.apache.org/derby/DerbyBugGuidelines.html . However, remember
that volunteers fix issues -- here's a valuable snippet from
http://www.apache.org/foundation/faq.html#what-is-apache-NOT-about :

    What is Apache not about?

    To [... jean deleted text to highlight tail of sentence ...] demand
    someone else to fix your bugs.

Second, you can vote on Jira issues that you feel strongly should be
fixed. The Derby developers do look at the votes.

Third, if you want to participate even more in the Derby development
process, you're welcome to subscribe to derby-dev@db.apache.org. That's
where core development discussions occur and decisions get made.

Fourth, if you want to actually start doing development, the
http://wiki.apache.org/db-derby/ForNewDevelopers page has wonderful
suggestions and tips for new Derby developers.

Whether a particular user decides to volunteer or not, it's still
helpful to understand how Apache works.

 -jean

RE: Spawning Data on Multiple Directories

Posted by de...@segel.com.

> -----Original Message-----
> From: Rodrigo Madera [mailto:rodrigo.madera@gmail.com]
> Sent: Wednesday, May 03, 2006 1:28 PM
> To: Derby Discussion; mikem_app@sbcglobal.net
> Subject: Re: Spawning Data on Multiple Directories
> 
> In that case, what about a proposal for the development team to allow
> such one-database-multiple-directories architectural improvements?
> 
[mjs] 
Well,

Why stop at "multiple directories"?

Not to get on a soap box, but what Rodrigo is asking for is the ability to
create chunks or table spaces on either cooked or raw file system space.
(Each chunk could be allocated on different disks/filesystems/etc...)

The crux is that such a design will increase the footprint, and limit
Derby's viability in certain niche markets. Heck, just the other day,
someone was touting a URL for someone who had embedded Derby into a web page...

This goes back to the larger issue: the need to consider redesigning
Derby's framework so that options can be installed or removed at the
time of deployment. (And there are some headaches even there too...)

The point is that Derby is growing in popularity, and certain issues larger
than the quick fixes that fit in a single JIRA issue need to be
addressed.

Of course, what do I know? ;-)

I'm just a voice in cyber space. I no longer work for the blue pig nor do I
work for Sun. Both are heavyweights in Derby and both have their own agendas.
What is being asked will require the deep pockets of either or both
companies. My bet is on Sun. IBM is still digesting the Informix vs DB2
issue and adding Derby to the mix is not in their best interest. (At least
on the surface. ;-)

-Gumby

> Depending on the OS to mount several devices (or directories) into a
> parent one is not something trivial. And the gain in having such
> flexibility is a sure winner.
> 
> Rodrigo
> 
> On 5/3/06, Mike Matrigali <mi...@sbcglobal.net> wrote:
> > The data of a single database can only be in a single directory in
> > the current derby implementation, spreading that data across multiple
> > devices is left to the OS.  The log can be placed on
> > a separate disk.
> >
[SNIP]

Re: Spawning Data on Multiple Directories

Posted by Stanley Bradbury <St...@gmail.com>.

Rodrigo Madera wrote:

> In that case, what about a proposal for the development team to allow
> such one-database-multiple-directories architectural improvements?
>
> Depending on the OS to mount several devices (or directories) into a
> parent one is not something trivial. And the gain in having such
> flexibility is a sure winner.
>
> Rodrigo

===== SNIP ===
Hi Rodrigo -

It's not clear to me why you want this capability built into the DBMS 
engine.  Derby has managed to maintain a very small footprint by 
carefully weighing the benefit of adding new features and generally only 
implementing capabilities that are not available by utilizing existing 
technologies.   It seems that your request might be able to be handled 
by OS configuration.  If the goal is to spread I/O over multiple 
spindles (i.e. for performance) then RAID arrays and disk striping are good 
technologies to use.  If none of these options meets your need then 
I recommend filing a feature request for DERBY in JIRA.  Be sure to 
explain the need and benefit of having the new feature.

HTH

Re: Spawning Data on Multiple Directories

Posted by Rodrigo Madera <ro...@gmail.com>.

In that case, what about a proposal for the development team to allow
such one-database-multiple-directories architectural improvements?

Depending on the OS to mount several devices (or directories) into a
parent one is not something trivial. And the gain in having such
flexibility is a sure winner.

Rodrigo

On 5/3/06, Mike Matrigali <mi...@sbcglobal.net> wrote:
> The data of a single database can only be in a single directory in
> the current derby implementation, spreading that data across multiple
> devices is left to the OS.  The log can be placed on
> a separate disk.
>
> Depending on the OS involved one may be able to make those 3 disks
> look like one logical disk and thus be able to place one derby db
> on those 3 disks.  This is usually easier done when setting up
> a machine rather than after the fact looking at an existing machine.
>
> Rodrigo Madera wrote:
> > Well even if the numbers were smaller, would it be possible?
> >
> > Rodrigo
> >
> > On 5/3/06, Legolas Woodland <le...@gmail.com> wrote:
> >
> >> Rodrigo Madera wrote:
> >> > Hello to all,
> >> >
> >> > I was wondering how can I span Derby's Database on multiple disks. For
> >> > example, if I were to use:
> >> >
> >> > - C:\Somewhere\Derby1     40 GB
> >> > - D:\ANicePlace\Derby2     20 GB
> >> > - F:\Looking\Good\Dir\Derby3    15GB
> >> >
> >> > Is this possible?
> >> >
> >> > Thanks for any input,
> >> > Rodrigo
> >> >
> >> hi
> >> Does derby could handle such a big Database effectively in a multi user
> >> environment ?
> >> I think it is a big big database , 75GB .
> >> does any one has experiences with such databases and Derby ?
> >>
> >>
> >
> >
>
>

Re: Spawning Data on Multiple Directories

Posted by Mike Matrigali <mi...@sbcglobal.net>.

The data of a single database can only be in a single directory in
the current Derby implementation; spreading that data across multiple
devices is left to the OS.  The log can be placed on
a separate disk.
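
A minimal sketch of the separate-log-disk part, assuming the embedded
driver and Derby's logDevice connection URL attribute, which is given
when the database is created. The directories are borrowed from
Rodrigo's example; the database name "mydb", the log directory name and
the class and method names are hypothetical, for illustration only.
Forward slashes are used in the paths, which Derby accepts on Windows.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

public class LogDeviceExample {

    // Builds an embedded Derby URL that creates the database at dbPath
    // and asks Derby to keep its transaction log under logDir.
    public static String buildUrl(String dbPath, String logDir) {
        return "jdbc:derby:" + dbPath + ";create=true;logDevice=" + logDir;
    }

    public static void main(String[] args) {
        // Hypothetical layout: data on C:, log on D:.
        String url = buildUrl("C:/Somewhere/Derby1/mydb",
                              "D:/ANicePlace/Derby2/mydb-log");
        System.out.println(url);

        // Connecting only works if derby.jar is on the classpath.
        try (Connection conn = DriverManager.getConnection(url)) {
            System.out.println("Database created; log is on the second disk.");
        } catch (SQLException e) {
            System.out.println("Could not connect: " + e.getMessage());
        }
    }
}
```

This only moves the log. The data files themselves still live in the
single database directory, as described above.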

Depending on the OS involved one may be able to make those 3 disks
look like one logical disk and thus be able to place one derby db
on those 3 disks.  This is usually easier done when setting up
a machine rather than after the fact looking at an existing machine.

Rodrigo Madera wrote:
> Well even if the numbers were smaller, would it be possible?
> 
> Rodrigo
> 
> On 5/3/06, Legolas Woodland <le...@gmail.com> wrote:
> 
>> Rodrigo Madera wrote:
>> > Hello to all,
>> >
>> > I was wondering how can I span Derby's Database on multiple disks. For
>> > example, if I were to use:
>> >
>> > - C:\Somewhere\Derby1     40 GB
>> > - D:\ANicePlace\Derby2     20 GB
>> > - F:\Looking\Good\Dir\Derby3    15GB
>> >
>> > Is this possible?
>> >
>> > Thanks for any input,
>> > Rodrigo
>> >
>> hi
>> Does derby could handle such a big Database effectively in a multi user
>> environment ?
>> I think it is a big big database , 75GB .
>> does any one has experiences with such databases and Derby ?
>>
>>
> 
>

Re: Spawning Data on Multiple Directories

Posted by Rodrigo Madera <ro...@gmail.com>.

Well even if the numbers were smaller, would it be possible?

Rodrigo

On 5/3/06, Legolas Woodland <le...@gmail.com> wrote:
> Rodrigo Madera wrote:
> > Hello to all,
> >
> > I was wondering how can I span Derby's Database on multiple disks. For
> > example, if I were to use:
> >
> > - C:\Somewhere\Derby1     40 GB
> > - D:\ANicePlace\Derby2     20 GB
> > - F:\Looking\Good\Dir\Derby3    15GB
> >
> > Is this possible?
> >
> > Thanks for any input,
> > Rodrigo
> >
> hi
> Does derby could handle such a big Database effectively in a multi user
> environment ?
> I think it is a big big database , 75GB .
> does any one has experiences with such databases and Derby ?
>
>