Posted to dev@commons.apache.org by Rodney Waldhoff <rw...@apache.org> on 2002/12/31 00:46:58 UTC

[general][lang] monolithic components considered harmful

It may seem that I'm picking on [lang] here, but that's not my intention.
I just feel like I'm watching an impending train-wreck, and intend to
throw the switch while there's still time.

The Jakarta-Commons charter suggests (well, literally requires [0]) that:

"Each package must have a clearly defined purpose, scope, and API -- Do
one thing well, and keep your contracts"

and suggests in a number of ways that small, single-purpose components are
preferable to monolithic ones.  (Perhaps most succinctly as "Place types
that are commonly used, changed, and released together, or mutually
dependant on each other, into the same package [and types that are not
used, changed, and released together, or mutually dependent into different
packages]".)  Yet there seems to be an increasing tendency here toward
lumping discrete units into monolithic components.

Allow me to justify this position.

The arguments in favor of monolithic components I've seen seem to boil
down to concerns about minimizing dependencies and preventing
circularities.  This may seem superficially correct, but it is misguided.
The number of JARs I need to have in my classpath is at best an indirect
metric for the absence or presence of dependency issues, and at worst a
misleading one.  Adding a new JAR to the classpath is a trivial issue, and
tools like Maven [1], ClassWorld's UberJar [2], Commons-Combo [3] and even
Java Web Start [4] make it even less of an issue (for better or worse).
The real concerns here should be those of configuration management. For
example, which version of X does Y require, and is that compatible with
the version of X that Z requires?  How many applications will be impacted
by a given change?  How small can I make my (end-user) application?

Monolithic components make configuration management problems worse, not
better.

Here's how:

1) Monolithic components introduce false dependencies.

Let's suppose, as some have suggested, that we release [lang] with new
reflection and math packages.  Suppose further that [cli] uses the
lang.math utilities and that [beanutils] uses the lang.reflect utilities,
and that I've got an application that uses both [cli] and [beanutils].

One might think this gives a simple dependency graph:

      [LANG]
        ^
        |
    .--' '--.
    |       |
  [CLI] [BEANUTILS]
    ^       ^
    |       |
    '--. .--'
        |
     [MY APP]

(where [X] <-- [Y] means Y depends on X)

but the reality is more complicated.  Suppose the latest version of
[beanutils] required some changes to lang.reflect.  In the same period,
some changes have been made to lang.math, but [cli] has not yet been
updated to support that.  This makes the version of [lang] required by
[beanutils] incompatible with the version of [lang] required by [cli].
(And if your solution is "we'll just keep [cli] up-to-date", replace [cli]
in this example with some third-party, possibly closed-source component.)

This means:

  [LANG]  [LANG']
    ^       ^
    |       |
    |       |
  [CLI] [BEANUTILS]
    ^       ^
    |       |
    '--. .--'
        |
     [MY APP]


but since [lang] != [lang'], I can't do that.  This problem isn't caused
by any true incompatibilities, but by an artificial coupling of unrelated
code.

If [reflect] and [math] are teased apart, the artificial problems go away:

  [MATH] [REFLECT]
    ^       ^
    |       |
    |       |
  [CLI] [BEANUTILS]
    ^       ^
    |       |
    '--. .--'
        |
     [MY APP]

I can now replace [reflect] with [reflect'], and I only need to worry
about updating those components that depend upon the [reflect] classes.
This is true even if both [math] and [reflect] depend upon some other
stuff in [lang]:

      [LANG]
        ^
        |
    .--' '--.
    |       |
  [MATH] [REFLECT]
    ^       ^
    |       |
    |       |
  [CLI] [BEANUTILS]
    ^       ^
    |       |
    '--. .--'
        |
     [MY APP]
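
To make (1) concrete, here is a rough sketch in Java.  Everything in it is
hypothetical: the VersionConflicts class, the version strings and the
"each client requires one exact version" model are assumptions made for
illustration, not how any real build tool resolves dependencies.  It
simply reports which components are required at more than one version:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical model: each client names the exact version of every
// component it was built against.
class VersionConflicts {

    /** Returns the components that different clients require at different versions. */
    static Set<String> conflicts(Map<String, Map<String, String>> requirements) {
        Map<String, Set<String>> versionsSeen = new TreeMap<>();
        for (Map<String, String> required : requirements.values()) {
            for (Map.Entry<String, String> e : required.entrySet()) {
                versionsSeen.computeIfAbsent(e.getKey(), k -> new TreeSet<>()).add(e.getValue());
            }
        }
        Set<String> result = new TreeSet<>();
        for (Map.Entry<String, Set<String>> e : versionsSeen.entrySet()) {
            if (e.getValue().size() > 1) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    /** Monolithic packaging: math and reflect both ship inside lang. */
    static Map<String, Map<String, String>> monolithic() {
        Map<String, Map<String, String>> m = new LinkedHashMap<>();
        m.put("cli", Map.of("lang", "1.0"));        // built against the old lang.math
        m.put("beanutils", Map.of("lang", "1.1"));  // needs the new lang.reflect
        return m;
    }

    /** Split packaging: math and reflect are released separately. */
    static Map<String, Map<String, String>> split() {
        Map<String, Map<String, String>> m = new LinkedHashMap<>();
        m.put("cli", Map.of("math", "1.0"));
        m.put("beanutils", Map.of("reflect", "1.1"));
        return m;
    }

    public static void main(String[] args) {
        System.out.println("monolithic conflicts: " + conflicts(monolithic()));
        System.out.println("split conflicts:      " + conflicts(split()));
    }
}
```

With [lang] packaged as a monolith the two clients collide on [lang]; with
[math] and [reflect] split out there is nothing to collide on.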


2) Monolithic components encourage superfluous dependencies and
inappropriate coupling.

Bundling unrelated code into a single component inappropriately lowers the
cost of crossing interface boundaries.  Since the code is distributed
together, it would seem that the cost of using, say, a method of
lang.SerializationUtils within lang.functor.FactoryUtils, is negligible.
But the true cost here isn't in getting SerializationUtils into the
classpath, it's in coupling of the two classes--making FactoryUtils
sensitive to changes in SerializationUtils.

Consider, for instance, lang.StringUtils.  There are a number of handy
methods there, some of them non-trivial and all of them offering better
readability than the naive alternative.  I sympathize with the desire for
increased readability and reuse, and in some circumstances it may be a
Good Thing to use, for example, StringUtils.trim(String):

    public static String trim(String str) {
        return (str == null ? null : str.trim());
    }

instead of simply inlining the (str == null ? null : str.trim()) clause.

But when used infrequently in an otherwise unrelated class, the price paid
for this trivial reuse is fairly high, coupling this code with a 1700+
line class to reuse 33 characters of code. (And StringUtils uses
CharSetUtils, which uses CharSet, which uses various java collection
classes, etc.)
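
For comparison, the inlined alternative might look like the sketch below.
The TrimExample class and the trimOrNull name are made up for
illustration; the body is the same clause quoted above, and the class
depends on nothing outside java.lang:

```java
class TrimExample {

    // Local helper with the same behavior as the clause quoted above.
    // Keeping it here avoids coupling this class to a large utility class.
    static String trimOrNull(String str) {
        return (str == null ? null : str.trim());
    }

    public static void main(String[] args) {
        System.out.println(trimOrNull("  hello  "));
        System.out.println(trimOrNull(null));
    }
}
```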

There are times when trivial code is just that.  Lumping together
unrelated code in a monolithic component encourages me to be lazy about
these dependencies and more importantly, these couplings.  Packaging
unrelated code into distinct components forces me to consider whether
introducing a new coupling is justified.

3) Monolithic components slow the pace of development.

When components are small and single purpose, changes are small,
well-contained, readily tested and easily understood. New releases can be
performed more readily, more easily and hence more frequently.

Bundling unrelated code into a monolithic component means I need to
synchronize development of that unrelated code: Maybe I'd like to do a new
release of sub-component X, but I can't since sub-component Y is in the
midst of a major refactoring.  Maybe I'd like to do a major refactoring of
sub-component A but I can't since sub-component B is preparing for a
release.

The more "foundational" a component is, the more this problem multiplies.
E.g., suppose we can't release lang.reflect because we're screwing around
with lang.time, and beanutils can't release without a released version of
lang.reflect, and struts can't release without a released version of beanutils,
etc.

(Decoupling the CVS HEAD of lang.time and released version of lang.reflect
(i.e., releasing lang with the latest lang.reflect but without lang.time),
as we've done in other circumstances, only demonstrates that these really
are unrelated packages, and causes problems for those that work from a
SNAPSHOT.)

4) Monolithic components make it more difficult for clients to track and
communicate their dependencies.

Following our versioning guidelines [5], non-backward compatible changes
to public APIs require new major version numbers.  Hence a non-backward
compatible change to sub-component X will require new major version
number, even though sub-component Y may be fully backwards compatible.
Clients that only depend upon Y (and since X and Y are not strongly
related, this is a significant set) will find the contract implied by the
versioning guidelines broken--the version numbers suggest a major change,
but there isn't as far as Y is concerned.  Clients that only depend upon Y
are forced to confirm that nothing has been broken, and perhaps even
update existing deployments even though there has been no change to Y.
This weakens the utility of the versioning heuristics, and makes it more
difficult for clients to track and manage their dependencies.

5) Monolithic components only hide circularities, and may even encourage
them.

Whenever A depends upon B and B depends on A, we have a circular
dependency, wherever the code for A and B is located.  As with most forms
of strong coupling, such circularities should be avoided whenever
possible.  Building A and B in the same compilation run may make it
possible to deal with a circular dependency, but it doesn't prevent it.
Similarly, placing A and B in different components doesn't create a
circular dependency, it exposes it.
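
A small sketch may help here.  Whether a cycle exists is a property of the
dependency graph alone; nothing about JAR or component boundaries enters
into the check.  The CycleCheck class below is hypothetical and uses a
plain depth-first search over a "depends on" map:

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class CycleCheck {

    /** Returns true if the "depends on" graph contains a circular dependency. */
    static boolean hasCycle(Map<String, List<String>> dependsOn) {
        Set<String> done = new HashSet<>();    // nodes fully explored
        Set<String> onPath = new HashSet<>();  // nodes on the current DFS path
        for (String node : dependsOn.keySet()) {
            if (visit(node, dependsOn, done, onPath)) {
                return true;
            }
        }
        return false;
    }

    private static boolean visit(String node, Map<String, List<String>> dependsOn,
                                 Set<String> done, Set<String> onPath) {
        if (done.contains(node)) {
            return false;              // already explored, no cycle through here
        }
        if (!onPath.add(node)) {
            return true;               // reached a node already on the current path
        }
        for (String dep : dependsOn.getOrDefault(node, List.of())) {
            if (visit(dep, dependsOn, done, onPath)) {
                return true;
            }
        }
        onPath.remove(node);
        done.add(node);
        return false;
    }

    public static void main(String[] args) {
        // A depends on B and B depends on A, wherever the code happens to live.
        Map<String, List<String>> graph = Map.of("A", List.of("B"), "B", List.of("A"));
        System.out.println(hasCycle(graph));
    }
}
```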

The "circular dependency" issue is largely hypothetical anyway.  In the case
of [lang] for example, several of the sub-packages have literally no
dependency on the rest of the package, and most that do have very weak
coupling at best.  Moreover, it is trivial to combine two previously
independent components.  Following (1) and (2), it may be substantially
more difficult to tease apart classes that were once part of the same
component.

6) Monolithic components only get bigger, making all of these problems
worse.

For instance, the [lang] proposal that was approved describes its scope
as:

"[A] package of Java utility classes for the classes that are in
java.lang's hierarchy, or are considered to be so standard as to justify
existence in java.lang. The Lang Package also applies to primitives and
arrays." [6]

In the five months since that proposal was accepted, the scope of lang has
expanded significantly ([7], [8], [9], [10], [11]) and now includes or is
proposed to include:

 * math utilities [12]
 * serialization utilities [13]
 * currency and unit classes [14]
 * date and time utilities [15]
 * reflection and introspection utilities [16]
 * functors [17]
 * and much more [18], [19], [20], [21], [22]

And the more the scope expands, the more the scope expands--the existence
of the [lang] monolith has encouraged a reduction in ([23], [24], others)
and discouraged the growth of ([25], [26], others) other components, and
has discouraged the introduction of new components ([27], [28], others).


As above and before, if classes aren't commonly used, changed, and
released together, or mutually dependent on each other, they should be in
distinct components.  If we want a catch-all JAR, we've got one [3].
Given the principles enumerated in the commons guidelines and detrimental
effects enumerated here, I'm not sure why we'd follow any other course.

 - Rod

[0] <http://jakarta.apache.org/commons/charter.html>
[1] <http://jakarta.apache.org/turbine/maven/>
[2] <http://classworlds.werken.com/uberjar.html>
[3] <http://cvs.apache.org/viewcvs/jakarta-commons/combo/>
[4] <http://java.sun.com/products/javawebstart/>
[5] <http://jakarta.apache.org/commons/versioning.html>
[6] <http://cvs.apache.org/viewcvs/jakarta-commons/lang/PROPOSAL.html?rev=1.1&content-type=text/vnd.viewcvs-markup>
[7] <http://cvs.apache.org/viewcvs/jakarta-commons/lang/STATUS.html.diff?r1=1.10&r2=1.12&diff_format=h>
[8] <http://cvs.apache.org/viewcvs/jakarta-commons/lang/STATUS.html.diff?r1=1.25&r2=1.26&diff_format=h>
[9] <http://cvs.apache.org/viewcvs/jakarta-commons/lang/STATUS.html.diff?r1=1.28&r2=1.29&diff_format=h>
[10] <http://cvs.apache.org/viewcvs/jakarta-commons/lang/STATUS.html.diff?r1=1.30&r2=1.31&diff_format=h>
[11] <http://cvs.apache.org/viewcvs/jakarta-commons/lang/STATUS.html.diff?r1=1.31&r2=1.32&diff_format=h>
[12] <http://archives.apache.org/eyebrowse/ReadMsg?listName=commons-dev@jakarta.apache.org&msgId=586315>
[13] <http://archives.apache.org/eyebrowse/ReadMsg?listName=commons-dev@jakarta.apache.org&msgId=457636>
[14] <http://archives.apache.org/eyebrowse/ReadMsg?listName=commons-dev@jakarta.apache.org&msgNo=18957>
[15] <http://archives.apache.org/eyebrowse/ReadMsg?listName=commons-dev@jakarta.apache.org&msgId=577799>
[16] <http://archives.apache.org/eyebrowse/ReadMsg?listName=commons-dev@jakarta.apache.org&msgId=411302>
[17] <http://archives.apache.org/eyebrowse/ReadMsg?listName=commons-dev@jakarta.apache.org&msgId=577713>
[18] <http://archives.apache.org/eyebrowse/ReadMsg?listName=commons-dev@jakarta.apache.org&msgNo=16718>
[19] <http://archives.apache.org/eyebrowse/ReadMsg?listName=commons-dev@jakarta.apache.org&msgNo=18778>
[20] <http://archives.apache.org/eyebrowse/ReadMsg?listName=commons-dev@jakarta.apache.org&msgNo=19885>
[21] <http://archives.apache.org/eyebrowse/ReadMsg?listName=commons-dev@jakarta.apache.org&msgId=512176>
[22] <http://archives.apache.org/eyebrowse/ReadMsg?listName=commons-dev@jakarta.apache.org&msgId=581065>
[23] <http://archives.apache.org/eyebrowse/ReadMsg?listName=commons-dev@jakarta.apache.org&msgId=519705>
[24] <http://archives.apache.org/eyebrowse/ReadMsg?listName=commons-dev@jakarta.apache.org&msgNo=20304>
[25] <http://archives.apache.org/eyebrowse/ReadMsg?listName=commons-dev@jakarta.apache.org&msgNo=19847>
[26] <http://archives.apache.org/eyebrowse/ReadMsg?listId=15&msgNo=19865>
[27] <http://archives.apache.org/eyebrowse/ReadMsg?listName=commons-dev@jakarta.apache.org&msgId=551801>
[28] <http://archives.apache.org/eyebrowse/ReadMsg?listId=15&msgNo=19221>


--
To unsubscribe, e-mail:   <ma...@jakarta.apache.org>
For additional commands, e-mail: <ma...@jakarta.apache.org>

back to functors (was Re: [general][lang] monolithic components considered harmful)

Posted by Rodney Waldhoff <rw...@apache.org>.

On Wed, 1 Jan 2003, Costin Manolache wrote:

> If you define "functor" the way Craig did - an interface that
> could be used as a common hook mechanism - I would gladly change my vote
> to +1.
>
> And it will be: "if you want a hook mechanism, use commons-functor"

I would have phrased it as "if you want to treat functions as objects",
but I think we're largely talking about the same thing.  Being able to
"plug in" a new function by registering an "extension point" or passing in
a "callback" method are examples of treating functions as
objects--i.e., they rely upon being able to pass around references to
functions the way Java supports passing around references to objects.

> There are a few requirements I have:

This won't be fully determined by me of course, but...

> 1. You must be well aware that this is an interface package.

This is very much an interface package.  One could easily imagine a core
distribution containing nothing but interfaces and simple adapters between
those interfaces (e.g., something that turns a Predicate into a
Boolean-returning Transformer, or vice versa).
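
As a rough sketch of what such an adapter could look like: the interface
names Predicate and Transformer come from this discussion, but the method
names (evaluate, transform), the FunctorSketch wrapper class and the
adapter method names are assumptions for illustration only, not a
proposed API.  It sticks to anonymous inner classes and Object so that it
needs nothing beyond JDK 1.1 language features:

```java
class FunctorSketch {

    // Hypothetical minimal interfaces; the method names are assumptions.
    interface Predicate {
        boolean evaluate(Object input);
    }

    interface Transformer {
        Object transform(Object input);
    }

    /** Wraps a Predicate as a Transformer returning Boolean.TRUE or Boolean.FALSE. */
    static Transformer asTransformer(final Predicate predicate) {
        return new Transformer() {
            public Object transform(Object input) {
                return predicate.evaluate(input) ? Boolean.TRUE : Boolean.FALSE;
            }
        };
    }

    /** Wraps a Boolean-returning Transformer as a Predicate. */
    static Predicate asPredicate(final Transformer transformer) {
        return new Predicate() {
            public boolean evaluate(Object input) {
                // Anything other than Boolean.TRUE (including null) counts as false.
                return Boolean.TRUE.equals(transformer.transform(input));
            }
        };
    }

    public static void main(String[] args) {
        Predicate isNull = new Predicate() {
            public boolean evaluate(Object input) {
                return input == null;
            }
        };
        Transformer asBoolean = asTransformer(isNull);
        System.out.println(asBoolean.transform(null));
        System.out.println(asPredicate(asBoolean).evaluate("not null"));
    }
}
```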

> It just won't work with 3 +1 votes as a regular commons interface.
> You need buy-in and participation from significant apache projects.

To be considered successful, [functor] should be used and/or supported by
other projects (apache or other).  The same is true for most if not all
commons components.

> 2. It shouldn't get too much into implementing functors, just define
> the interface and tools ( and maybe wrappers for existing hook
> mechanisms ).

Agreed. I imagine functor containing only the interfaces and the most
basic implementations and adapters.  One can imagine the possibility of
add-on components that implement a non-trivial collection of functors for
some specialized purpose but that's a topic for another discussion and
other proposals.

> 3. It should be able to support existing patterns:

> - iterative invocation of functors as well as recursive (
> valve/interceptors:-)

I would expect support for, as Craig and Tom discussed, the composition of
functors and the chain of responsibility pattern, as well as things like
strategy, visitor (with support from the component defining the structure
being iterated over, e.g., collections), variations on template method
(using composition rather than inheritance), etc.  Indeed many of the GoF
"behavioral" patterns seem to apply, extend or suggest the functor idiom.

> - JDK1.1 compat ( so it could be used someday in Ant ).

I also imagine that the core interfaces and most if not all basic
implementations wouldn't need anything not found in JDK 1.1.

> This must be config-neutral and support regular bean patterns ( so
> it can be managed by modeler and integrated in existing apps ).

I would expect most if not all of [functor] to be not just config-neutral,
but config-free.

> I also have a problem with the name "functor". I would rather have
> "callback", "hook", "extension point", "plugin" ( if this is what you
> have in mind ).

I believe "functor" to be a less ambiguous label for this component than
something like "plugin" or "callback", but frankly if that's the
difference between moving forward with [functor] or not, I'd be open to
alternatives.

> In other words: one hook mechanism to rule them all :-)

I'm not interested in ruling anything, although "one hook mechanism usable
by all" sounds admirable enough.  More than anything what I'd like to see
is a world where simple functor and functor-based utilities are
interoperable, so that, for example, if [io] (or [io-functors] or
[functors-io] or whatever) implements IsDirectoryPredicate and
[collections] implements PredicateIterator, I can put the two together to
iterate over the directories in a List of Files.  And where I can do this
without caring about the release status of EnumUtils or DoubleRange.
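
A rough sketch of that kind of interoperability follows.  The names
IsDirectoryPredicate and PredicateIterator are the ones used above, but
the implementations, the Predicate interface with its evaluate method,
and the DirectoryFilterSketch wrapper are all invented for illustration;
this is not the actual [io] or [collections] code.  It uses
java.util.Iterator and generics for brevity, so it is not a JDK 1.1
example:

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

class DirectoryFilterSketch {

    // Hypothetical shared interface, as a [functor] component might define it.
    interface Predicate {
        boolean evaluate(Object input);
    }

    /** As an [io]-flavored component might supply it. */
    static class IsDirectoryPredicate implements Predicate {
        public boolean evaluate(Object input) {
            return input instanceof File && ((File) input).isDirectory();
        }
    }

    /** As [collections] might supply it: yields only elements the predicate accepts. */
    static class PredicateIterator implements Iterator<Object> {
        private final Iterator<?> source;
        private final Predicate predicate;
        private Object next;
        private boolean hasNext;

        PredicateIterator(Iterator<?> source, Predicate predicate) {
            this.source = source;
            this.predicate = predicate;
            advance();
        }

        // Moves to the next accepted element, if there is one.
        private void advance() {
            hasNext = false;
            while (source.hasNext()) {
                Object candidate = source.next();
                if (predicate.evaluate(candidate)) {
                    next = candidate;
                    hasNext = true;
                    return;
                }
            }
        }

        public boolean hasNext() {
            return hasNext;
        }

        public Object next() {
            if (!hasNext) {
                throw new NoSuchElementException();
            }
            Object result = next;
            advance();
            return result;
        }
    }

    /** Puts the two together: the directories in a List of Files. */
    static List<Object> directoriesIn(List<File> files) {
        List<Object> result = new ArrayList<>();
        Iterator<Object> it = new PredicateIterator(files.iterator(), new IsDirectoryPredicate());
        while (it.hasNext()) {
            result.add(it.next());
        }
        return result;
    }
}
```

Neither class knows about the other; the only thing they share is the
Predicate interface.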

 - Rod

> Costin


Re: [general][lang] monolithic components considered harmful

Posted by Costin Manolache <cm...@yahoo.com>.

Rodney Waldhoff wrote:

> Try this: fill in the blanks in the following
> 
>   If you want to ___, you may want to use ___.
> 
> For example:
> 
> * interact with JavaBeans via reflection and introspection; beanutils
> * translate between JavaBeans and XML; betwixt
> * parse command line arguments; cli
> * work with abstract data structures; collections
> * parse xml configuration files; digester
> * discover services that have been externally configured; discovery
> * pool database connections; dbcp
> * implement an XML scripting language; jelly
> * process multipart/form-data HTTP requests; fileupload
> * interact with HTTP servers; httpclient
> * work with XPath expressions in java; jxpath
> * functional test HTTP applications; latka
> * write debugging and logging messages; logging
> * support JMX via Model MBeans; modeler
> * pool objects; pool
> * validate user input; validator
> 
> Now try it with [lang], [util] or [pattern] and any scope significantly
> different from "develop software in Java".

If you define "functor" the way Craig did - an interface that 
could be used as a common hook mechanism - I would gladly change my vote
to +1.

And it will be: "if you want a hook mechanism, use commons-functor"

There are a few requirements I have:

1. You must be well aware that this is an interface package. 
It just won't work with 3 +1 votes as a regular commons interface.
You need buy-in and participation from significant apache projects.

2. It shouldn't get too much into implementing functors, just define
the interface and tools ( and maybe wrappers for existing hook
mechanisms ).

3. It should be able to support existing patterns:
- iterative invocation of functors as well as recursive ( 
valve/interceptors:-)
- be able to replace existing mechanisms
- JDK1.1 compat ( so it could be used someday in Ant ).

It could go the same way as logging and define adapters and wrappers
for other interfaces used for this ( in tomcat, ant, axis, etc ). 
It would be ideal if bidirectional wrappers would be provided so
that functors are useable in existing systems.

This must be config-neutral and support regular bean patterns ( so
it can be managed by modeler and integrated in existing apps ).

I also have a problem with the name "functor". I would rather have
"callback", "hook", "extension point", "plugin" ( if this is what you
have in mind ).

In other words: one hook mechanism to rule them all :-)

Costin


Re: [general][lang] monolithic components considered harmful

Posted by Berin Loritsch <bl...@apache.org>.

Stephen Colebourne wrote:
> Let me attempt to demonstrate why multiple jars won't work. Imagine we do
> the split of [lang] into jars based on Common Reuse/Reuse-Release
> Equivalence/Common Closure Principles.

<snip/>

> 
> So, 22 new commons components. <sarcasm>Now that's a good idea isn't
> it</sarcasm>

:) Fragmentation and Separation are two completely different things.
I think you hit on that one pretty well.

<snip/>

> Should we complain that the JDK contains stuff we don't use?

I do--but it doesn't do any good.  I'd say on any one project I only
use 10% of what is in the JDK.  That is because J2SE is so vast.  I
don't need all of it.  I think there is merit to lessening the weight
of downloading 30 MB all at one time if I can have the install only
download what I am using.  Core stuff like java.lang, java.util,
and java.text are important right off the bat.  However my server
projects don't use javax.swing and java.awt--and my GUI projects
don't always use java.security and java.net.

If the JDK was split into several 100-400 KB JARs, and only downloaded
what I needed--then Sun could provide partial upgrades and fix bugs a lot
quicker and I would only ever have to download what I needed.

There is a limit.  When you get down into JAR files that are roughly
the same size as a Class or two, then it is ridiculous.  However,
I can see a separate JAR at the package level.

Keep in mind a new JAR doesn't necessarily mean a new project....


Re: [general][lang] monolithic components considered harmful

Posted by Henri Yandell <ba...@generationjava.com>.

I agree with the views in this email. While I agree that there is a line
between whether something should go in a project or not [and this still
applies to Lang], I don't believe that Rodney's suggested criteria will
work as they would lead to Stephen's examples. Any criteria which cannot
be applied retroactively and produce sane results, will not work for the
future.

It doesn't even make sense in Rodney's own examples as a [time] project
and a [math] project do not currently make any sense. If we're going to
have a [time] project, we should just invite Joda-time in, and if we're
going to have a [math] project it ought to have enough functionality in it
to be worth doing. Any [math] project with enough functionality would, in
my opinion, no longer be for the common developer but for mathematicians,
in which case it is outside the scope of Jakarta Commons.

[functor] is worthy of argument, [I have long hovered on the fence on
functor], as it _just_ has the size and scope to be a project.
[exceptions] is an example of a project which lacks the scope to be a
project.

Lastly, I continue to believe that Jars are similar to Taglibs. While it is
easier to maintain dependencies between a million tiny jars or taglibs
[even if one is not reusing things but inlining them], JSTL's
success has shown that the user does not want a million tiny jars.

Hen

On Wed, 1 Jan 2003, Stephen Colebourne wrote:

> Let me attempt to demonstrate why multiple jars won't work. Imagine we do
> the split of [lang] into jars based on Common Reuse/Reuse-Release
> Equivalence/Common Closure Principles.
>
> [arrayutil]
> ArrayUtils.java
> - depends on [builder]
>
> [booleanutil]
> BooleanUtils.java
> - depends on [numberutil]
>
> [charsetutil]
> CharRange.java
> CharSet.java
> CharSetUtils.java
>
> [stringutil]
> RandomStringUtils.java
> StringUtils.java
>
> [classutil]
> ClassUtils.java
>
> [notifierutil]
> Notifier.java
> NotifierException.java
> - depends on [exception]
>
> [numberutil]
> NumberUtils.java
>
> [objectutil]
> ObjectUtils.java
>
> [serializationutil]
> SerializationUtils.java
> SerializationException.java
> - depends on [exception]
>
> [systemutil]
> SystemUtils.java
>
> [builder]
> CompareToBuilder.java
> EqualsBuilder.java
> HashCodeBuilder.java
> StandardToStringStyle.java
> ToStringBuilder.java
> ToStringStyle.java
> - depends on [numberutil], [systemutil]
>
> [enum]
> Enum.java
> EnumUtils.java
> ValuedEnum.java
>
> [exception]
> ExceptionUtils.java
> Nestable.java
> NestableDelegate.java
> NestableError.java
> NestableException.java
> NestableRuntimeException.java
> - depends on [arrayutil], [systemutil]
>
> [functor]
> Executor.java
> ExecutorException.java
> ExecutorUtils.java
> Factory.java
> FactoryException.java
> FactoryUtils.java
> Predicate.java
> PredicateException.java
> PredicateUtils.java
> Transformer.java
> TransformerException.java
> TransformerUtils.java
> - depends on [exception], [serialization]
>
> [numberrange]
> DoubleRange.java
> FloatRange.java
> IntRange.java
> LongRange.java
> NumberRange.java
> Range.java
> - depends on [numberutil]
>
> [fraction]
> Fraction.java
>
> [reflect]
> ConstructorUtils.java
> FieldUtils.java
> MethodUtils.java
> ReflectionException.java
> ReflectionUtils.java
> package.html
> - depends on [arrayutil], [classutil], [stringutil], [exception]
>
> [timeutil]
> CalendarUtils.java
> DateUtils.java
>
> [timingutil]
> StopWatch.java
>
> [bitfield]
> BitField.java
>
> [identifier]
> IdentifierUtils.java
> - depends on [functor]
>
> [validate]
> Validate.java
>
> So, 22 new commons components. <sarcasm>Now that's a good idea isn't
> it</sarcasm>
>
> And if you think this is pedantic look at the list again. Any combination of
> the above components is combining two concepts that don't have a direct
> connection.
>
> Because that's what [lang] is, and that's why it doesn't fit the holy commons
> charter. It's a combination of useful utilities
> - too small to exist alone (one class)
> - that gain strength by being together (see some of the more unusual
> dependencies above)
> - encourage reuse and discourage cut and paste by offering more (I might cut
> and paste if I want one routine. I might depend if I want more than one. The
> broader range available increases reuse over cut and paste)
> - builds a viable community (not just counting committers, but users. The
> time and math packages are user suggestions. The util package is the useful
> parts of the dead community [util]. The functor package is the useful parts
> of the dead community [pattern].)
>
> Should we complain that the JDK contains stuff we don't use?
>
> Stephen
>
> ----- Original Message -----
> From: "Rodney Waldhoff" <rw...@apache.org>
> >  Person ${p} suggests feature ${f} for component ${c}. Person ${q} insists
> > it belongs in [lang].
> >
> > and ${q} pulls from a very small set.
>
> and {q} = Stephen
> Rodney, please feel free to make it personal if thats what you believe.

Re: [general][lang] monolithic components considered harmful

Posted by Ola Berg <ol...@arkitema.se>.

On Thursday 02 January 2003 19.04, you wrote:
> Bah. I think you're being willfully obtuse about this.
>
> The heuristic is to place types that are commonly used or changed
> together, or mutually dependant on each other, into the same component.
>
> At the risk of getting mired in the details (and just considering the
> stuff currently in lang) here's one such partitioning:
>
> [functor]
>  - containing things like: lang.functor.*
>  - use this when you need to: treat functions as objects
>
> [reflect]
>  - containing things like: lang.reflect.*, ClassUtils, etc.
>  - use this when you need to: manipulate objects via reflection and
> introspection
>
> [math]
>  - containing things like: NumberUtils, lang.math.*, numerical analysis
> functions like gcd, lcm, isPrime, getFactors, etc.
>  - use this when you need to: manipulate numbers or use numerical
> algorithms
>
> [datetime]
>  - containing things like: lang.time.*
>  - use this when you need to: work with dates and times

You are forgetting the case when things in one component need, or would do
better to utilize, things in other components. The functors, the exception
utilities, Validate etc are extremely useful in all the other components. And
what do you do when the implementation of Validate utilizes the exception
utilities, the exception utilities utilize the reflection package, the
reflection package utilizes Validate along with some of the functors etc?
[Circular dependencies alert]

I agree that a full blown math package would be better off on its own, and 
the same applies to datetime, advanced reflection etc. 

But [lang] isn't scoping for the full blown in one field, but for the basic 
stuff in many fields. Basic date/time handling, number ranges, simple 
validation, things that the application programmer needs regardless of the 
application, and things that the components need themselves.

There is nothing that stops a more full blown 
datetime/math/reflection/whatever package. I just don't think that you should 
have to incorporate the full blown when you just need the basics.

> [assert]
>  - containing things like: Validate
>  - use this when you need to: use assertions before JDK 1.4

Well, no. Validate is for input validation. Assertions in JDK 1.4 are
expressly not for input validation.

> * serialization stuff could move to [io] or to a new [serialization]

I agree with you on that one.

> * BitField could move to [collections] (i.e., as a collection of bits) or
> to [math] (for things like unsigned int, unsigned short, etc) or to
> [converter] (i.e., as bit-to-number conversion)

Note the "or" you just used. It indicates that BitFields are useful outside
of both collections and math, which is an indication that they shouldn't go
into any of them. Given their general usefulness, I think they fit in lang.

> * BooleanUtils are by and large, converters like those derived from
> [functor], currently found in [beanutils], or could be part of a
> [converter] component as some have suggested

I disagree. They are very useful outside of the conversion applications. 
Handling a primitive value. Lang will do fine.

> * the methods in ArrayUtils, BooleanUtils and ObjectUtils really just do
> one of the following:
>
>  - wrap around various builder methods (toString, equals, hashCode)
>
>  - replace a trivial cast (clone, reverse, identityToString)
>
>  - implement trivial predicates, relations, or converters (isSameLength,
> isSameType, equals, defaultIfNull, toBoolean[Object], toInteger[Object],
> etc.)
>
> and could probably be naturally split among the other components
> (including lang) in ways more in line with the common reuse and common
> closure principles.

They are useful for any application doing generic handling of core java 
language elements (arrays, object, boolean/Boolean). Isn't that what [lang] 
is good for?

I put a general discussion on the scope of [lang] in a separate thread.

/O


Re: [general][lang] monolithic components considered harmful

Posted by Ola Berg <ol...@arkitema.se>.

On Thursday 02 January 2003 19.04, you wrote:
> Bah. I think you're being willfully obtuse about this.
>
> The heuristic is to place types that are commonly used or changed
> together, or mutually dependant on each other, into the same component.
>
> At the risk of getting mired in the details (and just considering the
> stuff currently in lang) here's one such partitioning:
>
> [functor]
>  - containing things like: lang.functor.*
>  - use this when you need to: treat functions as objects
>
> [reflect]
>  - containing things like: lang.reflect.*, ClassUtils, etc.
>  - use this when you need to: manipulate objects via reflection and
> introspection
>
> [math]
>  - containing things like: NumberUtils, lang.math.*, numerical analysis
> functions like gcd, lcm, isPrime, getFactors, etc.
>  - use this when you need to: manipulate numbers or use numerical
> algorithms
>
> [datetime]
>  - containing things like: lang.time.*
>  - use this when you need to: work with dates and times

You are forgetting the case where things in one component need, or would be
better off using, things in other components. The functors, the exception
utilities, Validate etc. are extremely useful in all the other components. And
what do you do when the implementation of Validate uses the exception
utilities, the exception utilities use the reflection package, and the
reflection package uses Validate along with some of the functors?
[Circular dependencies alert]
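
The circularity described above can be made concrete with a small sketch. The
component names and edges below are only the hypothetical scenario from this
paragraph (Validate uses the exception utilities, which use reflection, which
uses Validate and the functors); they are not taken from the real [lang]
sources, and the class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: the edges model the scenario in the text,
// not the actual dependencies of any released component.
class CycleCheck {

    // Returns one dependency cycle reachable from 'start' as a path,
    // or an empty list if there is none.
    static List<String> findCycle(Map<String, List<String>> deps, String start) {
        List<String> path = new ArrayList<>();
        Set<String> done = new HashSet<>();
        return visit(deps, start, path, done) ? path : new ArrayList<>();
    }

    private static boolean visit(Map<String, List<String>> deps, String node,
                                 List<String> path, Set<String> done) {
        int seenAt = path.indexOf(node);
        if (seenAt >= 0) {
            // node is already on the current path: drop the lead-in, close the loop
            path.subList(0, seenAt).clear();
            path.add(node);
            return true;
        }
        if (done.contains(node)) {
            return false; // fully explored earlier, no cycle through it
        }
        path.add(node);
        for (String next : deps.getOrDefault(node, List.of())) {
            if (visit(deps, next, path, done)) {
                return true;
            }
        }
        path.remove(path.size() - 1);
        done.add(node);
        return false;
    }

    // The scenario from the text, with each piece split into its own component.
    static Map<String, List<String>> scenario() {
        Map<String, List<String>> deps = new LinkedHashMap<>();
        deps.put("validate", List.of("exception"));
        deps.put("exception", List.of("reflect"));
        deps.put("reflect", List.of("validate", "functor"));
        deps.put("functor", List.of());
        return deps;
    }

    public static void main(String[] args) {
        // prints [validate, exception, reflect, validate]
        System.out.println(findCycle(scenario(), "validate"));
    }
}
```

If each of these pieces were a separate component, none of the three in the
loop could be built or released without the other two.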

I agree that a full blown math package would be better off on its own, and 
the same applies to datetime, advanced reflection etc. 

But [lang] isn't scoped to cover any one field in full; it covers the basic
stuff in many fields: basic date/time handling, number ranges, simple
validation, things that the application programmer needs regardless of the
application, and things that the components need themselves.

Nothing stops anyone from creating a more full-blown
datetime/math/reflection/whatever package. I just don't think that you should
have to pull in the full-blown package when you just need the basics.

> [assert]
>  - containing things like: Validate
>  - use this when you need to: use assertions before JDK 1.4

Well, no. Validate is for input validation. Assertions in JDK 1.4 are
expressly not for input validation.
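
The distinction can be sketched as follows. InputCheck and its methods are
hypothetical names used only for illustration; they are not the API of the
Validate class under discussion. The point is that an explicit check always
runs, while a JDK 1.4 assert is skipped unless the JVM is started with -ea.

```java
// Hypothetical sketch: InputCheck is an illustrative stand-in,
// not the real Validate class from [lang].
class InputCheck {

    // Input validation: always runs and reports bad arguments to the caller.
    static void notNull(Object value, String message) {
        if (value == null) {
            throw new IllegalArgumentException(message);
        }
    }

    static int length(String name) {
        notNull(name, "name must not be null");
        int result = name.length();
        // Assertion: documents an internal invariant. It is disabled by
        // default, so it cannot be relied upon to reject bad input.
        assert result >= 0 : "length is never negative";
        return result;
    }

    public static void main(String[] args) {
        System.out.println(length("lang")); // prints 4
        try {
            length(null);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```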

> * serialization stuff could move to [io] or to a new [serialization]

I agree with you on that one.

> * BitField could move to [collections] (i.e., as a collection of bits) or
> to [math] (for things like unsigned int, unsigned short, etc) or to
> [converter] (i.e., as bit-to-number conversion)

Note the "or" you just used. It indicates that BitFields are useful outside
of both collections and math, which suggests that they shouldn't go
into either of them. Given their general usefulness, I think they fit in lang.

> * BooleanUtils are by and large, converters like those derived from
> [functor], currently found in [beanutils], or could be part of a
> [converter] component as some have suggested

I disagree. They are very useful outside of conversion applications. They
handle a primitive value, so [lang] will do fine.

> * the methods in ArrayUtils, BooleanUtils and ObjectUtils really just do
> one of the following:
>
>  - wrap around various builder methods (toString, equals, hashCode)
>
>  - replace a trivial cast (clone, reverse, identityToString)
>
>  - implement trivial predicates, relations, or converters (isSameLength,
> isSameType, equals, defaultIfNull, toBoolean[Object], toInteger[Object],
> etc.)
>
> and could probably be naturally split among the other components
> (including lang) in ways more in line with the common reuse and common
> closure principles.

They are useful for any application doing generic handling of core java 
language elements (arrays, object, boolean/Boolean). Isn't that what [lang] 
is good for?

I put a general discussion on the scope of [lang] in a separate thread.

/O


Re: [general][lang] monolithic components considered harmful

Posted by Rodney Waldhoff <rw...@apache.org>.

On Thu, 2 Jan 2003, Henri Yandell wrote:

>
> On Thu, 2 Jan 2003, Rodney Waldhoff wrote:
>
> > The result is 4 to 7 new components, some of which might reasonably start
> > in the sandbox, some of which could be proposed for commons proper
> > immediately.  All of these have stand-alone utility, strong cohesion, a
> > readily identifiable scope, and few if any inter-dependencies.
>
> -1 to any released code moving to the sandbox on the general principle of
> it. Anything that is released should either be deprecated because it has
> moved, or deprecated because it is redundant.
>

None of the code I referenced is currently released.




Re: [general][lang] monolithic components considered harmful

Posted by Henri Yandell <ba...@generationjava.com>.

On Thu, 2 Jan 2003, Rodney Waldhoff wrote:

> Bah. I think you're being willfully obtuse about this.

Possibly, but if your criteria can't be applied to everything then they
are not worth applying [okay, maybe a bit too bombastic].

> The heuristic is to place types that are commonly used or changed
> together, or mutually dependant on each other, into the same component.

I believe this heuristic is akin to the problem of over-modeling. You see
a future problem [enormous lib] and want to solve it now with a minimum
of knowledge. If [lang] grows enormous, I can easily see it evolving to
the structure you describe, but it does not have that size.

> At the risk of getting mired in the details (and just considering the
> stuff currently in lang) here's one such partitioning:
>
> [functor]
>  - containing things like: lang.functor.*
>  - use this when you need to: treat functions as objects

[functor]'s are about 18 years old it seems. They're not quite ready to
live on their own, but they want to be out there it seems. I am happy to
accept functors [and the name does seem to fit the pattern [see
proposal]] being their own project, but believe they will suffer to start
with. They have sufficient size to overcome the slowness of being a new
project.

One good thing this series of discussions has shown is that there is a lot
of interest in functors. Rodney, Stephen and Costin have all shown
creative input in this set of mail, and the original authors of functors
[Ola? Stephen? Steven? others?] obviously also have shown that input.

It seems that [functor] needs to do some soul-searching. If this were a
company and I were a big-cat CEO [get my shoes Smithers], then I would
suggest [this is why I'm not a CEO, I think I'm meant to demand or order]
that those involved start discussing the next step for functor.

IFF [functor] is to see its scope widen very soon, i.e., along the lines of
Costin's ideas, then it should form its own project. Functor, with a lot of
mail arguing, an increase in size and a big need for good documentation, will
quickly become the pig in [lang]'s belly.

IFF [functor] is seen to be pretty much complete, then it should go into
[lang], to be promoted out IFF it is shown to grow in scope at a later
date.

I'm going to repeat this in another email for brevity.

> [reflect]
>  - containing things like: lang.reflect.*, ClassUtils, etc.
>  - use this when you need to: manipulate objects via reflection and
> introspection

The code is not even finished, and I believe [lang] is still the right
place for it, unless it becomes a religion that coders have to adhere to,
ie) clazz. I doubt it will grow that large.

> [math]
>  - containing things like: NumberUtils, lang.math.*, numerical analysis
> functions like gcd, lcm, isPrime, getFactors, etc.
>  - use this when you need to: manipulate numbers or use numerical
> algorithms

Even with these, which is stretching the limits of math, it will be a
sufficient sub-package. I believe [math] should continue [once it is even
coded] to live in [lang] until such a time as it is big enough to leave
home.

> [datetime]
>  - containing things like: lang.time.*
>  - use this when you need to: work with dates and times

Ditto as for [math], except that I think making a [time] project would be
a waste of time, as Stephen has already walked this road and is suitably
licensed (I assume).

> A more fine-grained partitioning might add:
>
> [string]
>  - containing things like: CharRange, CharSet, CharSetUtils,
> RandomStringUtils, StringUtils, [WordWrapUtils], etc.
>  - use this when you need to: manipulate strings
>
> [exception]
>  - containing things like: lang.exception.*
>  - use this when you need to: manipulate Throwables, or use nested
> exceptions before JDK 1.4
>
>
> or even:
>
> [assert]
>  - containing things like: Validate
>  - use this when you need to: use assertions before JDK 1.4
>
> several of the remaining classes *might* logically move to an existing
> component, become the kernal of new component, or simply stay in lang:
>
> * serialization stuff could move to [io] or to a new [serialization]

I still reckon that Serializable should be in java.lang. Once [io] exists,
maybe this move would be right. As [io] is unlikely to surface soon, I
don't think this should happen for a long time.

> * BitField could move to [collections] (i.e., as a collection of bits) or
> to [math] (for things like unsigned int, unsigned short, etc) or to
> [converter] (i.e., as bit-to-number conversion)

I'm happy for this to happen now. Collections didn't want it when [util]
was broken up before [lang] started. It's been a problem-child.

> ....

> The result is 4 to 7 new components, some of which might reasonably start
> in the sandbox, some of which could be proposed for commons proper
> immediately.  All of these have stand-alone utility, strong cohesion, a
> readily identifiable scope, and few if any inter-dependencies.

-1 to any released code moving to the sandbox on the general principle of
it. Anything that is released should either be deprecated because it has
moved, or deprecated because it is redundant.

Many of these ideas will be applicable this time next year I suspect, but
not now.

Hen


Re: [general][lang] monolithic components considered harmful

Posted by Rodney Waldhoff <rw...@apache.org>.

Bah. I think you're being willfully obtuse about this.

The heuristic is to place types that are commonly used or changed
together, or mutually dependent on each other, into the same component.

At the risk of getting mired in the details (and just considering the
stuff currently in lang) here's one such partitioning:

[functor]
 - containing things like: lang.functor.*
 - use this when you need to: treat functions as objects

[reflect]
 - containing things like: lang.reflect.*, ClassUtils, etc.
 - use this when you need to: manipulate objects via reflection and
introspection

[math]
 - containing things like: NumberUtils, lang.math.*, numerical analysis
functions like gcd, lcm, isPrime, getFactors, etc.
 - use this when you need to: manipulate numbers or use numerical
algorithms

[datetime]
 - containing things like: lang.time.*
 - use this when you need to: work with dates and times

A more fine-grained partitioning might add:

[string]
 - containing things like: CharRange, CharSet, CharSetUtils,
RandomStringUtils, StringUtils, [WordWrapUtils], etc.
 - use this when you need to: manipulate strings

[exception]
 - containing things like: lang.exception.*
 - use this when you need to: manipulate Throwables, or use nested
exceptions before JDK 1.4

or even:

[assert]
 - containing things like: Validate
 - use this when you need to: use assertions before JDK 1.4

several of the remaining classes *might* logically move to an existing
component, become the kernel of a new component, or simply stay in lang:

* serialization stuff could move to [io] or to a new [serialization]

* BitField could move to [collections] (i.e., as a collection of bits) or
to [math] (for things like unsigned int, unsigned short, etc) or to
[converter] (i.e., as bit-to-number conversion)

* BooleanUtils are by and large, converters like those derived from
[functor], currently found in [beanutils], or could be part of a
[converter] component as some have suggested

* the methods in ArrayUtils, BooleanUtils and ObjectUtils really just do
one of the following:

 - wrap around various builder methods (toString, equals, hashCode)

 - replace a trivial cast (clone, reverse, identityToString)

 - implement trivial predicates, relations, or converters (isSameLength,
isSameType, equals, defaultIfNull, toBoolean[Object], toInteger[Object],
etc.)

and could probably be naturally split among the other components
(including lang) in ways more in line with the common reuse and common
closure principles.


The result is 4 to 7 new components, some of which might reasonably start
in the sandbox, some of which could be proposed for commons proper
immediately.  All of these have stand-alone utility, strong cohesion, a
readily identifiable scope, and few if any inter-dependencies.

 - R.

On Wed, 1 Jan 2003, Stephen Colebourne wrote:

> Let me attempt to demonstrate why multiple jars won't work. Imagine we do
> the split of [lang] into jars based on Common Reuse/Reuse-Release
> Equivalence/Common Closure Principles.
>
> [arrayutil]
> ArrayUtils.java
> - depends on [builder]
>
> [booleanutil]
> BooleanUtils.java
> - depends on [numberutil]
>
> [charsetutil]
> CharRange.java
> CharSet.java
> CharSetUtils.java
>
> [stringutil]
> RandomStringUtils.java
> StringUtils.java
>
> [classutil]
> ClassUtils.java
>
> [notifierutil]
> Notifier.java
> NotifierException.java
> - depends on [exception]
>
> [numberutil]
> NumberUtils.java
>
> [objectutil]
> ObjectUtils.java
>
> [serializationutil]
> SerializationUtils.java
> SerializationException.java
> - depends on [exception]
>
> [systemutil]
> SystemUtils.java
>
> [builder]
> CompareToBuilder.java
> EqualsBuilder.java
> HashCodeBuilder.java
> StandardToStringStyle.java
> ToStringBuilder.java
> ToStringStyle.java
> - depends on [numberutil], [systemutil]
>
> [enum]
> Enum.java
> EnumUtils.java
> ValuedEnum.java
>
> [exception]
> ExceptionUtils.java
> Nestable.java
> NestableDelegate.java
> NestableError.java
> NestableException.java
> NestableRuntimeException.java
> - depends on [arrayutil], [systemutil]
>
> [functor]
> Executor.java
> ExecutorException.java
> ExecutorUtils.java
> Factory.java
> FactoryException.java
> FactoryUtils.java
> Predicate.java
> PredicateException.java
> PredicateUtils.java
> Transformer.java
> TransformerException.java
> TransformerUtils.java
> - depends on [exception], [serialization]
>
> [numberrange]
> DoubleRange.java
> FloatRange.java
> IntRange.java
> LongRange.java
> NumberRange.java
> Range.java
> - depends on [numberutil]
>
> [fraction]
> Fraction.java
>
> [reflect]
> ConstructorUtils.java
> FieldUtils.java
> MethodUtils.java
> ReflectionException.java
> ReflectionUtils.java
> package.html
> - depends on [arrayutil], [classutil], [stringutil], [exception]
>
> [timeutil]
> CalendarUtils.java
> DateUtils.java
>
> [timingutil]
> StopWatch.java
>
> [bitfield]
> BitField.java
>
> [identifier]
> IdentifierUtils.java
> - depends on [functor]
>
> [validate]
> Validate.java
>
> So, 22 new commons components. <sarcasm>Now thats a good idea isn't
> it</sarcasm>
>
> And if you think this is pedantic look at the list again. Any combination of
> the above components is combining two concepts that don't have a direct
> connection.
>
> Because thats what [lang] is, and thats why is doesn't fit the holy commons
> charter. Its a combination of useful utilities
> - too small to exist alone (one class)
> - that gain strength by being together (see some of the more unusual
> dependencies above)
> - encourage reuse and discourage cut and paste by offering more (I might cut
> and paste if I want one routine. I might depend if I want more than one. The
> broader range available increases reuse over cut and paste)
> - builds a viable community (not just counting committers, but users. The
> time and math packages are user suggestions. The util package is the useful
> parts of the dead community [util]. The functor package is the useful parts
> of the dead community [pattern].)
>
> Should we complain that the JDK contains stuff we don't use?
>
> Stephen
>
> ----- Original Message -----
> From: "Rodney Waldhoff" <rw...@apache.org>
> >  Person ${p} suggests feature ${f} for component ${c}. Person ${q} insists
> > it belongs in [lang].
> >
> > and ${q} pulls from a very small set.
>
> and {q} = Stephen
> Rodney, please feel free to make it personal if thats what you believe.
>
>



Re: [general][lang] monolithic components considered harmful

Posted by Stephen Colebourne <sc...@btopenworld.com>.

Let me attempt to demonstrate why multiple jars won't work. Imagine we do
the split of [lang] into jars based on Common Reuse/Reuse-Release
Equivalence/Common Closure Principles.

[arrayutil]
ArrayUtils.java
- depends on [builder]

[booleanutil]
BooleanUtils.java
- depends on [numberutil]

[charsetutil]
CharRange.java
CharSet.java
CharSetUtils.java

[stringutil]
RandomStringUtils.java
StringUtils.java

[classutil]
ClassUtils.java

[notifierutil]
Notifier.java
NotifierException.java
- depends on [exception]

[numberutil]
NumberUtils.java

[objectutil]
ObjectUtils.java

[serializationutil]
SerializationUtils.java
SerializationException.java
- depends on [exception]

[systemutil]
SystemUtils.java

[builder]
CompareToBuilder.java
EqualsBuilder.java
HashCodeBuilder.java
StandardToStringStyle.java
ToStringBuilder.java
ToStringStyle.java
- depends on [numberutil], [systemutil]

[enum]
Enum.java
EnumUtils.java
ValuedEnum.java

[exception]
ExceptionUtils.java
Nestable.java
NestableDelegate.java
NestableError.java
NestableException.java
NestableRuntimeException.java
- depends on [arrayutil], [systemutil]

[functor]
Executor.java
ExecutorException.java
ExecutorUtils.java
Factory.java
FactoryException.java
FactoryUtils.java
Predicate.java
PredicateException.java
PredicateUtils.java
Transformer.java
TransformerException.java
TransformerUtils.java
- depends on [exception], [serialization]

[numberrange]
DoubleRange.java
FloatRange.java
IntRange.java
LongRange.java
NumberRange.java
Range.java
- depends on [numberutil]

[fraction]
Fraction.java

[reflect]
ConstructorUtils.java
FieldUtils.java
MethodUtils.java
ReflectionException.java
ReflectionUtils.java
package.html
- depends on [arrayutil], [classutil], [stringutil], [exception]

[timeutil]
CalendarUtils.java
DateUtils.java

[timingutil]
StopWatch.java

[bitfield]
BitField.java

[identifier]
IdentifierUtils.java
- depends on [functor]

[validate]
Validate.java

So, 22 new commons components. <sarcasm>Now that's a good idea, isn't
it?</sarcasm>

And if you think this is pedantic look at the list again. Any combination of
the above components is combining two concepts that don't have a direct
connection.
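
To see what the listing above implies for a single client, the direct
dependencies can be put into a map and followed transitively. This is only a
sketch: the class and method names are made up, the edges are copied from the
list as written (including [serialization], which does not appear under that
exact name among the 22 components), and components without a "depends on"
line are treated as having no dependencies.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch built from the "depends on" lines in the listing above.
class LangSplit {

    // Direct dependencies as listed; components not present here have none.
    static Map<String, List<String>> directDeps() {
        Map<String, List<String>> deps = new HashMap<>();
        deps.put("arrayutil", List.of("builder"));
        deps.put("booleanutil", List.of("numberutil"));
        deps.put("notifierutil", List.of("exception"));
        deps.put("serializationutil", List.of("exception"));
        deps.put("builder", List.of("numberutil", "systemutil"));
        deps.put("exception", List.of("arrayutil", "systemutil"));
        deps.put("functor", List.of("exception", "serialization")); // name kept as written
        deps.put("numberrange", List.of("numberutil"));
        deps.put("reflect", List.of("arrayutil", "classutil", "stringutil", "exception"));
        deps.put("identifier", List.of("functor"));
        return deps;
    }

    // Every component that 'component' needs, directly or indirectly, sorted by name.
    static Set<String> transitiveDeps(Map<String, List<String>> deps, String component) {
        Set<String> seen = new TreeSet<>();
        Deque<String> todo = new ArrayDeque<>(deps.getOrDefault(component, List.of()));
        while (!todo.isEmpty()) {
            String next = todo.pop();
            if (seen.add(next)) {
                todo.addAll(deps.getOrDefault(next, List.of()));
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        // prints [arrayutil, builder, classutil, exception, numberutil, stringutil, systemutil]
        System.out.println(transitiveDeps(directDeps(), "reflect"));
        // prints []
        System.out.println(transitiveDeps(directDeps(), "validate"));
    }
}
```

Under this split, a user of [reflect] would need seven other components, while
a user of [validate] would need none.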

Because that's what [lang] is, and that's why it doesn't fit the holy commons
charter. It's a combination of useful utilities that
- are too small to exist alone (one class)
- gain strength by being together (see some of the more unusual
dependencies above)
- encourage reuse and discourage cut and paste by offering more (I might cut
and paste if I want one routine. I might depend if I want more than one. The
broader range available increases reuse over cut and paste)
- build a viable community (not just counting committers, but users. The
time and math packages are user suggestions. The util package is the useful
parts of the dead community [util]. The functor package is the useful parts
of the dead community [pattern].)

Should we complain that the JDK contains stuff we don't use?

Stephen

----- Original Message -----
From: "Rodney Waldhoff" <rw...@apache.org>
>  Person ${p} suggests feature ${f} for component ${c}. Person ${q} insists
> it belongs in [lang].
>
> and ${q} pulls from a very small set.

and {q} = Stephen
Rodney, please feel free to make it personal if that's what you believe.




Re: [general][lang] monolithic components considered harmful

Posted by Rodney Waldhoff <rw...@apache.org>.

In an attempt to make a coherent reply, I've snipped and reordered this
thread significantly.  I hope I haven't taken anything too far out of
context.

rw> 1) Monolithic components introduce
rw> false dependencies.

sc> Adding any dependency to your application
sc> adds a risk, and you need to get a
sc> reward for that risk. I would argue that
sc> adding more dependencies (more smaller jars,
sc> each with their own dependencies) makes the
sc> overall situation worse not better.

Again, the number of JARs in my classpath is a poor metric for the magnitude
of my dependencies.

E.g., this:

for f in *.jar; do jar -xf "$f"; done
rm *.jar
jar -cf monolithic.jar .

doesn't change my dependencies.

A class X depends upon class Y when a change in Y may cause a change in X.
We try to minimize X's dependencies so it is more resilient in the face of
change (this is what we mean by "adds a risk"--the risk that something we
depend upon may change).

For most users, a release is the unit of change.  When a new release
occurs, any of the classes in that bundle *may* have changed, so I
conservatively need to treat it as if they did--retest, perhaps even
modify my dependent code.

When Y and Z are only released and distributed together, I don't really
have a Y or Z, only the pair {Y,Z}.  Then a release of Z requires a
release of {Y,Z}, which is a release of Y.  But while a *change* in Z is a
change in {Y,Z}, it only *looks like* a change in Y.  This means I've had
to re-test X for no reason at all.  This has, for all practical purposes,
*added* a dependency on Z to X, since I can't readily distinguish changes
in Y from changes in Z.  This adds risk and effort to X, since the pair
{Y,Z} changes much more frequently than either of Y or Z, especially when
Y and Z are not mutually dependent or likely to be changed together.
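
The argument can be illustrated with a small count. This is only a sketch
under stated assumptions: the class name, the jar names and the change history
are hypothetical, and every change is assumed to trigger one release of the
bundle that contains the changed class. X depends only on Y.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of the {Y,Z} argument; names and history are made up.
class RetestCount {

    // Counts how often a client must retest: once per release of any bundle
    // that contains a class the client depends on.
    static int retests(Set<String> clientDeps,
                       Map<String, String> bundleOf,
                       List<String> changedClasses) {
        int count = 0;
        for (String changed : changedClasses) {
            String releasedBundle = bundleOf.get(changed);
            for (String dep : clientDeps) {
                if (releasedBundle.equals(bundleOf.get(dep))) {
                    count++;
                    break; // one retest per release, however many deps it holds
                }
            }
        }
        return count;
    }

    public static void main(String[] args) {
        Set<String> xDependsOn = Set.of("Y");
        List<String> changes = List.of("Z", "Z", "Y", "Z"); // assumed change history
        Map<String, String> separate = Map.of("Y", "y.jar", "Z", "z.jar");
        Map<String, String> bundled = Map.of("Y", "yz.jar", "Z", "yz.jar");
        System.out.println(retests(xDependsOn, separate, changes)); // prints 1
        System.out.println(retests(xDependsOn, bundled, changes));  // prints 4
    }
}
```

With separate releases X is retested only when Y changes; with the bundled
release X is retested on every change to either class.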

rw> 4) Monolithic components make it more
rw> difficult for clients to track and
rw> communicate their dependencies.

sc> Maybe you place more store in version
sc> numbers than I do. If I pickup any new
sc> jar, I'd test it whatever the version
sc> number difference.

My point exactly.  When, say, I depend upon something in [math], and
[math] and [reflect] are only released together, and there's a new release
of [reflect], I need to "pickup a new jar" and therefore "test it", even
if [math] hasn't changed.  If [math] and [reflect] aren't released
together, users of [math] don't need to worry about changes to [reflect].
If they are not used together, they shouldn't be released together (the
Reuse/Release Equivalence Principle).

rw> Bundling unrelated code into a monolithic component
rw> means I need to synchronize development of that
rw> unrelated code: Maybe I'd like to do a new release
rw> of sub-component X, but I can't since sub-component
rw> Y is in the midst of a major refactoring.  Maybe
rw> I'd like to do a major refactoring of sub-component
rw> A but I can't since sub-component B is preparing
rw> for a release.

sc> Virtually all the [lang] classes are fundamentally
sc> independent, so refactoring isn't an issue. And
sc> this actually highlights that to be proper about
sc> this would require a component for virtually
sc> each class.

No. To be proper about this would require a component for each bundle of
classes that are likely to change at the same time (the Common Closure
Principle).



rw> 6) Monolithic components only get bigger,
rw> making all of these problems worse.

sc> Not all of the ideas presented in
sc> the list above will end up in [lang]
sc> (some get rejected).

Of the 11 ideas I cited, all but one of them already exist in the HEAD of
lang, and that one is listed as an action item in the status file.
Specifically:

rw> * math utilities [12]
See o.a.c.lang.math.

rw> * serialization utilities [13]
See o.a.c.lang.SerializationUtils.

rw> * currency and unit classes [14]
See
<http://cvs.apache.org/viewcvs/~checkout~/jakarta-commons/lang/STATUS.html>.

rw> * date and time utilities [15]
See o.a.c.lang.time.

rw> * functors [17]
See o.a.c.lang.functor.

rw> * reflection and introspection utilities [16]
See o.a.c.lang.reflect.

rw> and much more [...]

rw> [18]
See o.a.c.lang.Notifier and others.

rw> [19]
See o.a.c.lang.StopWatch.

rw> [20]
See o.a.c.lang.exception, o.a.c.lang.functor,
o.a.c.lang.util.IdentifierUtils, o.a.c.lang.*Utils.

rw> [21]
See o.a.c.lang.util.Validator.

rw> [22]
See o.a.c.lang.util.BitField, o.a.c.lang.time.StopWatch,
o.a.c.lang.util.IdentifierUtils, and others


sc> Or viewed alternately, [lang] has had
sc> the community to grow and stay active
sc> while other components have not.

sc> [...] As a group of functions, [lang] has ideas,
sc> momentum, growth and life. [...]

sc> [a monolithic lang] allows us to actually develop
sc> code without arguing about which should depend on
sc> which all the time, and that is terribly wasteful.

Either of these arguments would carry more weight if people were clamoring
to add code to [lang], but the script seems to run more like this:

 Person ${p} suggests feature ${f} for component ${c}. Person ${q} insists
it belongs in [lang].

and ${q} pulls from a very small set.

Here are 10 examples:

* p=Travis; f=currency and unit utilities; c=any; q=Stephen
<http://archives.apache.org/eyebrowse/ReadMsg?listId=15&msgNo=18957>

* p=various; f=reflection and introspection utilities; c=beanutils;
q=Stephen
<http://archives.apache.org/eyebrowse/ReadMsg?listId=15&msgId=411302>

* p=Rodney; f=ConstructorUtils; c=beanutils, q=Robert
<http://archives.apache.org/eyebrowse/ReadMsg?listId=15&msgNo=19847>

* p=Tom; f=additional functors; c=collections; q=Stephen
<http://archives.apache.org/eyebrowse/ReadMsg?listId=15&msgNo=19865>

* p=various; f=functors; c=collections; q=Stephen
<http://archives.apache.org/eyebrowse/ReadMsg?listId=15&msgNo=19885>
<http://archives.apache.org/eyebrowse/ReadMsg?listId=15&msgId=577713>
and others.

* p=Michael; f=SerializationUtils; c=io; q=Stephen
<http://archives.apache.org/eyebrowse/ReadMsg?listId=15&msgId=457636>

* p=various; f=reflection and introspection utilities; c=beanutils; q=Robert
<http://archives.apache.org/eyebrowse/ReadMsg?listId=15&msgId=519705>

* p=Rodney; f=functors; c=functors; q=Stephen
<http://archives.apache.org/eyebrowse/ReadMsg?listId=15&msgId=577713>

* p=various; f=type conversion utilities; c=beanutils; q=Stephen
<http://archives.apache.org/eyebrowse/ReadMsg?listId=15&msgNo=19885>

* p=Ola; f=type conversion utilities; c=converter; q=Stephen
<http://archives.apache.org/eyebrowse/ReadMsg?listId=15&msgId=551801>


sc> I see it more about having a viable community.
sc> [lang] has that community, I don't believe
sc> 10 separate components would.

sc> Because open source is about community first, code second.

This argument is a non-starter.

All the commons components share user and dev mailing lists, voting
rights, and karma.  By design, there is a high degree of overlap between
both the active developers and the regular users of a number of
components.  There is a high degree of interdependency between components.
There is no reason to think that the "community" around, say,
[lang]/[functor]/[reflect], would be any different than the community
around [lang]/[lang.functor]/[lang.reflect]. There are a number of
"sub-communities" that have developed around related components.

Components that have a readily identifiable community largely distinct
from or with little overlap with the rest of jakarta-commons are probably
destined to move out of jakarta-commons (like Cactus, HttpClient, Jelly,
maybe even Latka someday). (And I say that expecting to follow HttpClient,
Jelly and Latka wherever they end up.)

sc> As a small isolated component (in line with commons
sc> guidelines), [util] and [pattern] have both
sc> died through lack of interest.

sc> [util] languished for over a year with no
sc> action. No one took responsibility to promote,
sc> fix, manage, look after or release the code.
sc> This has now been noted on the recent Jakarta
sc> PMC report

I'd characterize both [util] and [pattern] as more similar to [lang] than
anything else in commons, and I'll argue that none of them are in line
with the commons guidelines--all of them lack cohesion and clearly defined
purpose or scope.

Try this: fill in the blanks in the following:

  If you want to ___, you may want to use ___.

For example:

* interact with JavaBeans via reflection and introspection; beanutils
* translate between JavaBeans and XML; betwixt
* parse command line arguments; cli
* work with abstract data structures; collections
* parse xml configuration files; digester
* discover services that have been externally configured; discovery
* pool database connections; dbcp
* implement an XML scripting language; jelly
* process multipart/form-data HTTP requests; fileupload
* interact with HTTP servers; httpclient
* work with XPath expressions in java; jxpath
* functional test HTTP applications; latka
* write debugging and logging messages; logging
* support JMX via Model MBeans; modeler
* pool objects; pool
* validate user input; validator

Now try it with [lang], [util] or [pattern] and any scope significantly
different from "develop software in Java".




Re: [general][lang] monolithic components considered harmful

Posted by Morgan Delagrange <md...@yahoo.com>.

--- Stephen Colebourne <sc...@btopenworld.com>
wrote:
> Reply in-line, not terribly elegant as its after
> midnight here ;-)
> 
> From: "Rodney Waldhoff" <rw...@apache.org>
> > It may seem that I'm picking on [lang] here, but
> that's not my intention.
> > I just feel like I'm watching an impending
> train-wreck, and intend to
> > throw the switch while there's still time.
> Or cause a fork to sourceforge.
> 
> > The arguments in favor of monolithic components
> I've seen seem to boil
> > down to concerns about minimizing dependencies and
> preventing
> > circularities.
> Partly, but I see it more about having a viable
> community. [lang] has that
> community, I don't believe 10 separate components
> would.

Community is important, but consider that developers
come and go.  You want components that are conducive
to focused development.  If you do too much
out-of-scope stuff, I think you risk alienating your
future committers.

> > 1) Monolithic components introduce false
> dependencies.
> Adding any dependency to your application adds a
> risk, and you need to get a
> reward for that risk. I would argue that adding more
> dependencies (more
> smaller jars, each with their own dependencies)
> makes the overall situation
> worse not better.

But with smaller jars the average per-component
dependencies become smaller.  It's much easier to
track and maintain the functionality you actually care
about.

> > 2) Monolithic components encourage superfluous
> dependencies and
> > inappropriate coupling.
> > <snip>
> > But when used infrequently in an otherwise
> unrelated class, the price paid
> > for this trivial reuse is fairly high, coupling
> this code with a 1700+
> > line class to reuse 33 characters of code. (And
> StringUtils uses
> > CharSetUtils, which uses CharSet, which uses
> various java collection
> > classes, etc.)
> This is actually a key point for the defence. If
> this argument is followed
> through, the logical conclusion is not to use
> StringUtils, not to share
> code, not to reuse. It is only if an application
> uses a sufficient amount of
> the functionality from any component that it will
> choose to add the
> dependency, rather than recode. If a component only
> contains a couple of
> classes people simply won't use it.

I think it's more complicated than that.

You actually should avoid adding a component
dependency in your code if you need some trivial piece
of functionality that's easily copied.  I actually
noticed you doing this earlier this week, which I
thought was very good:

http://marc.theaimsgroup.com/?l=jakarta-commons-dev&m=104059848924662&w=2

There are of course many circumstances where you
actually should add dependencies.  One is what you
mention; when a component has a lot of functionality
you plan to use.  It probably doesn't make sense to
copy code, even trivial code, in that case.

Another equally important reason to add dependencies
is when you're utilizing some non-trivial
functionality.  You don't want to copy and paste if
you expect the underlying implementation to evolve. 
You can find prime examples of this throughout DBCP,
Pool, HttpClient and Collections to name a few
components.  You can anticipate that HTTP clients will
get more stable, FastHashMaps will get faster, etc. 
Base64 encoding, on the other hand, will not change
very much at all.  IMO the proposed Functor component
also fits neatly into this category of evolving
implementations. 

I'm concerned with trying to create a
dependency-worthy component artificially by building
it up with lots of useful but disconnected code. 
Frankly, if you have a small component and a user
decides it's so trivial to implement that he simply
cuts and pastes, that's a GOOD thing.  Using a
component as a code snippet library for minor
functionality is totally legitimate in my view and a
good way to keep your dependencies reasonable.
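
To make the cut-and-paste case concrete, here is a minimal sketch in Java. It assumes the only thing needed is the null-safe trim whose body is quoted elsewhere in this thread. The class name LocalStrings and the method name trimOrNull are hypothetical, chosen for illustration, and are not part of [lang].

```java
// Hypothetical local helper; the names are illustrative and not from [lang].
final class LocalStrings {
    private LocalStrings() {}

    // Same logic as the StringUtils.trim(String) body quoted in this thread:
    // null in, null out; otherwise delegate to String.trim().
    static String trimOrNull(String str) {
        return (str == null) ? null : str.trim();
    }

    public static void main(String[] args) {
        System.out.println("[" + trimOrNull("  lang  ") + "]"); // prints [lang]
        System.out.println(trimOrNull(null));                   // prints null
    }
}
```

The trade-off is the one described above: a local copy never picks up upstream changes, which is acceptable here only because a null check plus String.trim() is unlikely to evolve.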

> > 3) Monolithic components slow the pace of development.
> >
> > When components are small and single purpose, changes are small,
> > well-contained, readily tested and easily understood. New releases can be
> > performed more readily, more easily and hence more frequently.
> I totally disagree with this. [util] languished for over a year with no
> action. No one took responsibility to promote, fix, manage, look after or
> release the code. This has now been noted on the recent Jakarta PMC report

I think util failed because it lacked that specific
purpose.  Size of the component is irrelevant.

> 
> Releases take a significant amount of time. And thats time away from coding,
> bug fixing etc. 10 components means 10 times the effort on releases, and I
> simply do not accept that is viable.

Bigger components take longer to release, so I don't
think you're actually talking about 10 times as much
effort.  Smaller components can be released very
easily as long as releases are frequent.  I think
that's one of the problems we have in Commons; we let
components sit so long in development that the
releases get more difficult than necessary.  Everyone
is eager to write new code and reluctant to spend time
stabilizing what they've already done; this is a
natural impulse.

I think that smaller specific components take no more
effort to release in aggregate than that same code
packaged up into one broader component.  However, if
there is increased technical debt but the result is
higher quality and longer lasting components, I think
it's worth it.

> > Bundling unrelated code into a monolithic component means I need to
> > synchronize development of that unrelated code: Maybe I'd like to do a new
> > release of sub-component X, but I can't since sub-component Y is in the
> > midst of a major refactoring.  Maybe I'd like to do a major refactoring of
> > sub-component A but I can't since sub-component B is preparing for a
> > release.
> Virtually all the [lang] classes are fundamentally independent, so
> refactoring isn't an issue. And this actually highlights that to be proper
> about this would require a component for virtually each class.

I don't think that's true.  The current code in lang
has lots of potentially separable domains with
substantial (and GROWABLE) implementations: functor,
reflect, math, etc.

The scope of lang seems to be stuff that should have
been in the JDK.  But that's how many good components
start out.  Technically, the JDK has Collections, HTTP
connections, and JavaBean classes, but there was room
for improvement.  That's fuel for good, focused
components.  It's a proven strategy.

> > 4) Monolithic components make it more difficult for clients to track and
> > communicate their dependencies.
> Maybe you place more store in version numbers than I do. If I pickup any new
> jar, I'd test it whatever the version number difference.

That doesn't mean the effort is wasted.

> > 5) Monolithic components only hide circularities, and may even encourage
> > them.
> Perhaps it does. But it allows us to actually develop code without arguing
> about which should depend on which all the time, and that is terribly
> wasteful.

Depends on who you ask.  I'd hesitate to depend on
code that I thought was out-of-scope for the
component.

> > 6) Monolithic components only get bigger, making all of these problems
> > worse.
> >
> > For instance, the [lang] proposal that was approved describes its scope
> > as:
> >
> > "[A] package of Java utility classes for the classes that are in
> > java.lang's hierarchy, or are considered to be so standard as to justify
> > existence in java.lang. The Lang Package also applies to primitives and
> > arrays." [6]
> I agree that the proposal does not fully define [lang] anymore. Nor does the
> name.
>
> "A component of Java utility classes to supplement those provided in the JDK
> java.lang and java.util package hierarchies. The component also applies to
> primitives and arrays. The component shall depend only on the JDK."

And there are functors, and there is math, and there
is serialization, and there are Java implementations
of C structures.  And why is "[t]he component shall
depend only on the JDK" part of the scope?

I don't want to pick on lang specifically, and I don't
suggest that components can't grow and evolve. 
However scope changes should be made deliberately,
visibly and sensibly.

> > In the five months since that proposal was accepted, the scope of lang has
> > expanded significantly ([7], [8], [9], [10], [11]) and now includes or is
> > proposed to include:
> >
> >  * math utilities [12]
> >  * serialization utilities [13]
> >  * currency and unit classes [14]
> >  * date and time utilities [15]
> >  * reflection and introspection utilities [16]
> >  * functors [17]
> >  * and much more [18], [19], [20], [21], [22]
> >
> > And the more the scope expands, the more the scope expands--the existence
> > of the [lang] monolith has encouraged a reduction in ([23], [24], others)
> > and discouraged the growth of ([25], [26], others) other components, and
> > has discouraged the introduction of new components ([27], [28], others).
> Or viewed alternately, [lang] has had the community to grow and stay active
> while other components have not. Not all of the ideas presented in the list
> above will end up in [lang] (some get rejected). Many should though, as they
> provide functionality that the JDK should provide - and thats what [lang] is
> about.

I think good long-term components describe a specific
domain: HTTP requests, pools, Collections, math,
reflection, etc. etc. etc.  "Functionality that the
JDK should provide" is a pretty wide net.  

> > As above and before, if classes aren't commonly used, changed, and
> > released together, or mutually dependant on each other, they should be in
> > distinct components.  If we want a catch-all JAR, we've got one [3].
> > Given the principles enumerated in the commons guidelines and detrimental
> > effects enumerated here, I'm not sure why we'd follow any other course.
> Because open source is about community first, code second. As a group of
> functions, [lang] has ideas, momentum, growth and life. As a small isolated
> component (in line with commons guidelines), [util] and [pattern] have both
> died through lack of interest.
>
> Stephen

I think they died from lack of focus:

http://marc.theaimsgroup.com/?t=102504081900002&r=1&w=2

I thought that the 1.0 version of lang was a pretty
coherent package that mainly did supplement java.lang
classes as advertised.  I'm concerned that it's now
moving in so many different directions without
definition.  I've been careful not to be discouraging
of a lot of the recent proposals even though I felt
them out of scope, mainly because I'm not a lang
developer and I don't want to interfere.  However, I
do believe that lang is becoming rather monolithic.  

I'm also concerned that the functor package was not
approved when it's both worthy of a separate component
and out of scope in lang.  I'm really surprised that
you don't see the benefits of a visible functor
package, rather than burying the code in lang.  I
think it would get much more interest on its own.  I'm
also a little surprised that functor only got two
positive votes.  Robert asked for a formal vote so he
could support it, but then he never actually cast that
crucial vote.  Tsk, tsk.  He must have been waylaid by
bandits.  :)

- Morgan

=====
Morgan Delagrange
http://jakarta.apache.org/taglibs
http://jakarta.apache.org/commons
http://axion.tigris.org
http://jakarta.apache.org/watchdog

--
To unsubscribe, e-mail:   <ma...@jakarta.apache.org>
For additional commands, e-mail: <ma...@jakarta.apache.org>

Re: [general][lang] monolithic components considered harmful

Posted by Stephen Colebourne <sc...@btopenworld.com>.

Reply in-line, not terribly elegant as it's after midnight here ;-)

From: "Rodney Waldhoff" <rw...@apache.org>
> It may seem that I'm picking on [lang] here, but that's not my intention.
> I just feel like I'm watching an impending train-wreck, and intend to
> throw the switch while there's still time.
Or cause a fork to sourceforge.

> The arguments in favor of monolithic components I've seen seem to boil
> down to concerns about minimizing dependencies and preventing
> circularities.
Partly, but I see it more about having a viable community. [lang] has that
community, I don't believe 10 separate components would.

> 1) Monolithic components introduce false dependencies.
Adding any dependency to your application adds a risk, and you need to get a
reward for that risk. I would argue that adding more dependencies (more
smaller jars, each with their own dependencies) makes the overall situation
worse not better.

> 2) Monolithic components encourage superfluous dependencies and
> inappropriate coupling.
> <snip>
> But when used infrequently in an otherwise unrelated class, the price paid
> for this trivial reuse is fairly high, coupling this code with a 1700+
> line class to reuse 33 characters of code. (And StringUtils uses
> CharSetUtils, which uses CharSet, which uses various java collection
> classes, etc.)
This is actually a key point for the defence. If this argument is followed
through, the logical conclusion is not to use StringUtils, not to share
code, not to reuse. It is only if an application uses a sufficient amount of
the functionality from any component that it will choose to add the
dependency, rather than recode. If a component only contains a couple of
classes people simply won't use it.

> 3) Monolithic components slow the pace of development.
>
> When components are small and single purpose, changes are small,
> well-contained, readily tested and easily understood. New releases can be
> performed more readily, more easily and hence more frequently.
I totally disagree with this. [util] languished for over a year with no
action. No one took responsibility to promote, fix, manage, look after or
release the code. This has now been noted on the recent Jakarta PMC report

Releases take a significant amount of time. And that's time away from coding,
bug fixing etc. 10 components means 10 times the effort on releases, and I
simply do not accept that is viable.

> Bundling unrelated code into a monolithic component means I need to
> synchronize development of that unrelated code: Maybe I'd like to do a new
> release of sub-component X, but I can't since sub-component Y is in the
> midst of a major refactoring.  Maybe I'd like to do a major refactoring of
> sub-component A but I can't since sub-component B is preparing for a
> release.
Virtually all the [lang] classes are fundamentally independent, so
refactoring isn't an issue. And this actually highlights that to be proper
about this would require a component for virtually each class.

> 4) Monolithic components make it more difficult for clients to track and
> communicate their dependencies.
Maybe you place more store in version numbers than I do. If I pick up any new
jar, I'd test it whatever the version number difference.
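
The contract at issue in point 4, under which a non-backward-compatible API change requires a new major version number, can be sketched as a simple check. This is only an illustration: the class and method names are hypothetical, a plain major.minor[.patch] numbering is assumed, and none of this is code from the versioning guidelines.

```java
// Hypothetical sketch of the "major number signals incompatibility" heuristic.
final class VersionSignal {
    private VersionSignal() {}

    // Leading numeric field of a version string such as "2.0" or "1.0.1".
    static int major(String version) {
        int dot = version.indexOf('.');
        return Integer.parseInt(dot < 0 ? version : version.substring(0, dot));
    }

    // True when the major number differs, i.e. the number alone warns of a
    // possibly incompatible API change somewhere in the component.
    static boolean signalsBreakingChange(String from, String to) {
        return major(from) != major(to);
    }

    public static void main(String[] args) {
        System.out.println(signalsBreakingChange("1.0", "1.1")); // prints false
        System.out.println(signalsBreakingChange("1.1", "2.0")); // prints true
    }
}
```

The signal is per JAR, so a client that uses only an unchanged part of a large component still sees a major bump and has to re-verify. Testing every new JAR regardless of its number, as suggested above, sidesteps the heuristic rather than repairing it.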

> 5) Monolithic components only hide circularities, and may even encourage
> them.
Perhaps it does. But it allows us to actually develop code without arguing
about which should depend on which all the time, and that is terribly
wasteful.

> 6) Monolithic components only get bigger, making all of these problems
> worse.
>
> For instance, the [lang] proposal that was approved describes its scope
> as:
>
> "[A] package of Java utility classes for the classes that are in
> java.lang's hierarchy, or are considered to be so standard as to justify
> existence in java.lang. The Lang Package also applies to primitives and
> arrays." [6]
I agree that the proposal does not fully define [lang] anymore. Nor does the
name.

"A component of Java utility classes to supplement those provided in the JDK
java.lang and java.util package hierarchies. The component also applies to
primitives and arrays. The component shall depend only on the JDK."

> In the five months since that proposal was accepted, the scope of lang has
> expanded significantly ([7], [8], [9], [10], [11]) and now includes or is
> proposed to include:
>
>  * math utilities [12]
>  * serialization utilities [13]
>  * currency and unit classes [14]
>  * date and time utilities [15]
>  * reflection and introspection utilities [16]
>  * functors [17]
>  * and much more [18], [19], [20], [21], [22]
>
> And the more the scope expands, the more the scope expands--the existence
> of the [lang] monolith has encouraged a reduction in ([23], [24], others)
> and discouraged the growth of ([25], [26], others) other components, and
> has discouraged the introduction of new components ([27], [28], others).
Or viewed alternately, [lang] has had the community to grow and stay active
while other components have not. Not all of the ideas presented in the list
above will end up in [lang] (some get rejected). Many should though, as they
provide functionality that the JDK should provide - and that's what [lang] is
about.

> As above and before, if classes aren't commonly used, changed, and
> released together, or mutually dependant on each other, they should be in
> distinct components.  If we want a catch-all JAR, we've got one [3].
> Given the principles enumerated in the commons guidelines and detrimental
> effects enumerated here, I'm not sure why we'd follow any other course.
Because open source is about community first, code second. As a group of
functions, [lang] has ideas, momentum, growth and life. As a small isolated
component (in line with commons guidelines), [util] and [pattern] have both
died through lack of interest.

Stephen




Re: [general][lang] monolithic components considered harmful

Posted by Henri Yandell <ba...@generationjava.com>.

On Mon, 30 Dec 2002, Rodney Waldhoff wrote:

> The Jakarta-Commons charter suggests (well, literally requires [0]) that:
>
> "Each package must have a clearly defined purpose, scope, and API -- Do
> one thing well, and keep your contracts"
>
> and suggests in a number of ways that small, single-purpose components are
> preferable to monolithic ones.  (Perhaps most succinctly as "Place types
> that are commonly used, changed, and released together, or mutually
> dependant on each other, into the same package [and types that are not
> used, changed, and released together, or mutually dependent into different
> packages].)  Yet there seems to be an increasing tendency here toward
> lumping discrete units into monolithic components.
>
> Allow me justify this position.
>
> The arguments in favor of monolithic components I've seen seem to boil
> down to concerns about minimizing dependencies and preventing
> circularities.

Also, management is increasingly easier. Community is easier. Incubating a
small piece of code in Commons tends to lead to a dead piece of code,
while incubating it inside an already active component and then breaking
it off when it reaches maturity is more successful [where that maturity
may not be apparent originally].

There is also a question of size. A conceptual component may only be worth
having six methods on one class, but they may be so useful and reusable
that many people will enjoy using them. Placing them in their own jar
for a year seems wrong.

An example away from Commons. Jakarta Taglib's Log taglib is a tiny
taglib. This is quite nice, but it could be argued that JSTL has shown
that the tiny taglibs can easily be replaced by a more palatable larger
taglib. However, my point is that the Log taglib contains a 'dump' tag
which has nothing to do with Log4J per se, but is just a generic tag which
would otherwise go in 'Misc' or 'Util' on its own. A taglib with one tag
is as nonsensical as a Commons component with 1 class.

Just a point, it's not intended to apply to functor/reflect/math/others,
though I do believe that they can begin in one project and then move out,
rather than immediately find life in another project. So with math I'm
quite in favour of it being in lang, with functor/reflect I am less
focused. They have both had time to grow a little in
Collections/BeanUtils, though neither would keep much of the same codebase
for a next version.

> This may seem superficially correct, but it is misguided.
> The number of JARs I need to have in my classpath is at best an indirect
> metric for the absence or presence of dependency issues, and at worst a
> misleading one.  Adding a new JAR to the classpath is a trivial issue, and
> tools like Maven [1], ClassWorld's UberJar [2], Commons-Combo [3] and even
> Java Web Start [4] make it even less of an issue (for better or worse).

Accepted, although this depends on who the user is perceived to be. Other
Jakarta projects will be on Maven, or Centipede, or Ant. Users outside of
Jakarta may not be. Do we care about them? I feel the general feeling is
that the more important users are the other Jakarta projects.

> The real concerns here should be those of configuration management. For
> example, which version of X does Y require, and is that compatible with
> the version of X that Z requires?  How many applications will be impacted
> by a given change?  How small can I make my (end-user) application?

I disagree. The number one concern for me is whether lots of tiny
components can maintain community above a level of anarchy. However the
configuration management is important.

> Monolithic components make configuration management problems worse, not
> better.
>
> Here's how:
>
> 1) Monolithic components introduce false dependencies.
>
> Let's suppose, as some have suggested, that we release [lang] with new
> reflection and math packages.  Suppose further that [cli] uses the
> lang.math utilities and that [beanutils] uses the lang.reflect utilities,
> and that I've got an application that uses both [cli] and [beanutils].
>
> but the reality is more complicated.  Suppose the latest version of
> [beanutils] required some changes to lang.reflect.  In the same period,
> some changes have been made to lang.math, but [cli] has not yet been
> updated to support that.  This makes the version of [lang] required by
> [beanutils] incompatible with the version of [lang] required by [cli].
> (And if your solution is "we'll just keep [cli] up-to-date", replace [cli]
> in this example with some third-party, possibly closed-source component.)
>
> but since [lang] != [lang'], I can't do that.  This problem isn't caused
> by any true incompatibilities, but by an artificial coupling of unrelated
> code.

This is a pushed point though. CLI and BeanUtils could have been dependent
on the same feature in Lang. This is particular to Lang in that all of
Lang can be split up into tiny components in exactly the way you describe.

I'm not against this, but I don't believe it should be done in Commons.
The tiny-weeny projects would become a murky fog around the larger
goliaths like Jelly and HttpClient. True Commons stuff.

A Commons-Jade-like project [java additions to the default environment or
something, guy in france created it] which managed multiple internal jars
isn't bad.

> If [reflect] and [math] are teased apart, the artificial problems go away:
>
>   [MATH] [REFLECT]
>     ^       ^
>     |       |
>     |       |
>   [CLI] [BEANUTILS]
>     ^       ^
>     |       |
>     '--. .--'
>         |
>      [MY APP]
>
> I can now replace [reflect] with [reflect'], and I only need to worry
> about updating those components that depend upon the [reflect] classes.
> This is true even if both [math] and [reflect] depend upon some other
> stuff in [lang]:

Erm. Except that Reflect' is using Lang 1.2 and Math is using Lang 1.1, or
they want to be.
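
The residual conflict described here can be sketched as a small check over declared requirements. This is only an illustration: the class VersionConflicts, its conflicts method, and the flat "component -> (dependency -> version)" map are assumptions made for the example, not anything that exists in Commons. The version numbers 1.1 and 1.2 are the ones from the sentence above.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical sketch: find dependencies that are required at more than one version.
final class VersionConflicts {
    private VersionConflicts() {}

    // requirements maps each component to the versions of the things it depends on.
    static Map<String, Set<String>> conflicts(Map<String, Map<String, String>> requirements) {
        Map<String, Set<String>> versionsByDependency = new TreeMap<>();
        for (Map<String, String> deps : requirements.values()) {
            for (Map.Entry<String, String> e : deps.entrySet()) {
                versionsByDependency
                        .computeIfAbsent(e.getKey(), k -> new TreeSet<>())
                        .add(e.getValue());
            }
        }
        // Keep only dependencies for which two or more distinct versions were requested.
        versionsByDependency.values().removeIf(versions -> versions.size() < 2);
        return versionsByDependency;
    }

    public static void main(String[] args) {
        Map<String, Map<String, String>> requirements = new LinkedHashMap<>();
        requirements.put("reflect'", Collections.singletonMap("lang", "1.2"));
        requirements.put("math", Collections.singletonMap("lang", "1.1"));
        System.out.println(conflicts(requirements)); // prints {lang=[1.1, 1.2]}
    }
}
```

Splitting [math] and [reflect] apart removes the conflict between those two, but the check still reports one on whatever they share, which is the objection being raised.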

> 2) Monolithic components encourage superfluous dependencies and
> inappropriate coupling.
>
> Bundling unrelated code into a single component inappropriately lowers the
> cost of crossing interface boundaries.  Since the code is distributed
> together, it would seem that the cost of using, say, a method of
> lang.SerializationUtils within lang.functor.FactoryUtils, is negligible.
> But the true cost here isn't in getting SerializationUtils into the
> classpath, it's in coupling of the two classes--making FactoryUtils
> sensitive to changes in SerializationUtils.

Yes, though this is also a good thing. Code may be reused among packages,
e.g. java.lang probably uses java.util things.

> Consider, for instance, lang.StringUtils.  There are number of handy
> methods there, some of them non-trivial and all of them offering better
> readability than the naive alternative.  I sympathize with the desire for
> increased readability and reuse, and in some circumstances it may be a
> Good Thing to use, for example, StringUtils.trim(String):
>
>     public static String trim(String str) {
>         return (str == null ? null : str.trim());
>     }
>
> instead of simply inlining the (str == null ? null : str.trim()) clause.
>
> But when used infrequently in an otherwise unrelated class, the price paid
> for this trivial reuse is fairly high, coupling this code with a 1700+
> line class to reuse 33 characters of code. (And StringUtils uses
> CharSetUtils, which uses CharSet, which uses various java collection
> classes, etc.)

But jar dependencies are easy to manage and we shouldn't worry about lots
of dependency. So Commons-Serialisation is dependent on Commons-String,
and who cares??

There is the performance hit of loading the Class though. The only
solution to that seems to be that we publish a reusable set of APIs, but
internally we write the ugliest inlined code we can achieve. ie) We don't
use our own products.

> There are times when trivial code is just that.  Lumping together
> unrelated code in a monolithic component encourages me to be lazy about
> these dependencies and more importantly, these couplings.  Packaging
> unrelated code into distinct components forces me to consider whether
> introducing a new coupling is justified.

Lang is in general completely unrelated code though. Is Commons ready for
Lang to be split into 6 to 10 new projects? Or would it be preferred that
Lang generates multiple jars? ie) commons-lang-exception.jar etc.

> 3) Monolithic components slow the pace of development.
>
> When components are small and single purpose, changes are small,
> well-contained, readily tested and easily understood. New releases can be
> performed more readily, more easily and hence more frequently.

True. With a system in which we can deploy a new version with negligible
effort, I can see this being correct. However, a deploy is time consuming:
it involves requesting a vote of all concerned, waiting a day or two,
[technically asking the PMC as well] then doing a test-deploy, checking
this worked with a user or so, then deploying, deploying the new
documentation, announcing, updating the website.

As Commons projects tend to the tiny, this becomes an impenetrable
barrier to release soon release often, unless the community grows
substantially to support the tiny components.

> Bundling unrelated code into a monolithic component means I need to
> synchronize development of that unrelated code: Maybe I'd like to do a new
> release of sub-component X, but I can't since sub-component Y is in the
> midst of a major refactoring.  Maybe I'd like to do a major refactoring of
> sub-component A but I can't since sub-component B is preparing for a
> release.

Yep. But that hits us anyway. Lang releases 5.0. Jelly wants to release
10.3 that evening but finds that the new release of Lang breaks it.

Maven had this exact same problem when a Lang beta was released. It broke
Velocity which broke Maven. At least in your scenario, the problem is seen
up front and dealt with [which has happened in Collections and Lang, a
sub-package is flagged as not for release]. I imagine in some cases a
sub-package could be flagged to be taken from the Tag for a release.
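
The knock-on breakage described here (a Lang beta breaking Velocity, which in turn broke Maven) amounts to asking which components are reachable, in reverse, from the one that changed. A minimal sketch under stated assumptions follows: the class ImpactedBy and its method are hypothetical, and the dependency edges are limited to the ones mentioned in this message.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: which components are affected, directly or
// transitively, when one component changes?
final class ImpactedBy {
    private ImpactedBy() {}

    // dependsOn maps each component to the components it uses directly.
    static Set<String> impactedBy(String changed, Map<String, List<String>> dependsOn) {
        Set<String> impacted = new TreeSet<>();
        Deque<String> pending = new ArrayDeque<>();
        pending.push(changed);
        while (!pending.isEmpty()) {
            String current = pending.pop();
            for (Map.Entry<String, List<String>> e : dependsOn.entrySet()) {
                // A component is impacted if it directly uses something impacted.
                if (e.getValue().contains(current) && impacted.add(e.getKey())) {
                    pending.push(e.getKey());
                }
            }
        }
        return impacted;
    }

    public static void main(String[] args) {
        Map<String, List<String>> dependsOn = new HashMap<>();
        dependsOn.put("velocity", Arrays.asList("lang"));
        dependsOn.put("maven", Arrays.asList("velocity"));
        dependsOn.put("jelly", Arrays.asList("lang"));
        System.out.println(impactedBy("lang", dependsOn)); // prints [jelly, maven, velocity]
    }
}
```

The size of the returned set is one rough way to compare the two packagings: the more unrelated code sits in the changed component, the more edges point at it.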

> The more "foundational" a component is, the more this problem multiplies.
> E.g., suppose we can't release lang.reflect because we're screwing around
> with lang.time, and beanutils can't release without a released version of
> lang.reflect, and struts can't release with released version of beanutils,
> etc.

Agreed. The decoupling does help here in that someone who cares not about
Commons-Time but is waiting on a release of Commons-Reflect can do a
release.

> 4) Monolithic components make it more difficult for clients to track and
> communicate their dependencies.
>
> Following our versioning guidelines [5], non-backward compatible changes
> to public APIs require new major version numbers.  Hence a non-backward
> compatible change to sub-component X will require new major version
> number, even though sub-component Y may be fully backwards compatible.
> Clients that only depend upon Y (and since X and Y are not strongly
> related, this is a significant set) will find the contract implied by the
> versioning guidelines broken--the version numbers suggest a major change,
> but there isn't as far as Y is concerned.  Clients that only depend upon Y
> are forced to confirm that nothing has been broken, and perhaps even
> update existing deployments even though there has been no change to Y.
> This weakens the utility of the versioning heuristics, and makes it more
> difficult for clients to track and manage their dependencies.
>
> 5) Monolithic components only hide circularities, and may even encourage
> them.
>
> Whenever A depends upon B and B depends on A, we have a circular
> dependency, wherever the code for A and B is located.  As with most forms
> of strong coupling, such circularities should be avoided whenever
> possible.  Building A and B in the same compilation run may make it
> possible to deal with a circular dependency, but it doesn't prevent it.
> Similarly, placing A and B are in different components doesn't create a
> circular dependency, it exposes it.
>
> The "circular dependency" issue is largely hypothetical anyway.  In case
> of [lang] for example, several of the sub-packages have literally no
> dependency on the rest of the package, and most that do have very weak
> coupling at best.  Moreover, it is trivial to combine two previously
> independent components.  Following (1) and (2), it may be substantially
> more difficult to tease apart classes that were once part of the same
> component.

Agreed.
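
The claim in point 5, that a circular dependency exists wherever the code for A and B happens to live, can be illustrated with a plain depth-first search over "uses" edges. The sketch below is an assumption-laden illustration: CycleCheck and hasCycle are hypothetical names, and the graph is keyed by unit name only, so packaging does not enter into the result.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: detect a circular dependency with a depth-first search.
final class CycleCheck {
    private CycleCheck() {}

    // dependsOn maps each unit to the units it uses directly.
    static boolean hasCycle(Map<String, List<String>> dependsOn) {
        Set<String> visited = new HashSet<>();
        Set<String> onPath = new HashSet<>();
        for (String node : dependsOn.keySet()) {
            if (visit(node, dependsOn, visited, onPath)) {
                return true;
            }
        }
        return false;
    }

    private static boolean visit(String node, Map<String, List<String>> dependsOn,
                                 Set<String> visited, Set<String> onPath) {
        if (onPath.contains(node)) {
            return true;  // reached a unit that is still being explored: a cycle
        }
        if (!visited.add(node)) {
            return false; // already fully explored without finding a cycle
        }
        onPath.add(node);
        for (String dep : dependsOn.getOrDefault(node, Collections.<String>emptyList())) {
            if (visit(dep, dependsOn, visited, onPath)) {
                return true;
            }
        }
        onPath.remove(node);
        return false;
    }

    public static void main(String[] args) {
        Map<String, List<String>> uses = new HashMap<>();
        uses.put("A", Arrays.asList("B"));
        uses.put("B", Arrays.asList("A"));
        System.out.println(hasCycle(uses)); // prints true
    }
}
```

Nothing in the input says whether A and B ship in one JAR or two, so the answer is the same either way; separate components only make the cycle visible at build time.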

> 6) Monolithic components only get bigger, making all of these problems
> worse.
>
> For instance, the [lang] proposal that was approved describes its scope
> as:
>
> "[A] package of Java utility classes for the classes that are in
> java.lang's hierarchy, or are considered to be so standard as to justify
> existence in java.lang. The Lang Package also applies to primitives and
> arrays." [6]
>
> In the five months since that proposal was accepted, the scope of lang has
> expanded significantly ([7], [8], [9], [10], [11]) and now includes or is
> proposed to include:
>
>  * math utilities [12]
>  * serialization utilities [13]
>  * currency and unit classes [14]
>  * date and time utilities [15]
>  * reflection and introspection utilities [16]
>  * functors [17]
>  * and much more [18], [19], [20], [21], [22]

Okay okay :) Some of those are happily in the Lang scope. Others are
realistically within it [java.util.Date and java.io.Serializable] and
others are unlikely to happen ([14]), but I agree that Lang is a
collection of differing concepts, much like java.lang is.

Lang has pushed a lot of Util-like things out to keep. Many of those
components would become religions if they were to gain independence, ie)
JODA-Time being an example of [15] as an independent project. Lang's
approach to them is to take a very common subset of use that only depends
on the JDK. Functors/Converters are 'special' in that they are religious,
and their position in Lang less solid.

> And the more the scope expands, the more the scope expands--the existence
> of the [lang] monolith has encouraged a reduction in ([23], [24], others)
> and discouraged the growth of ([25], [26], others) other components, and
> has discouraged the introduction of new components ([27], [28], others).
>
>
> As above and before, if classes aren't commonly used, changed, and
> released together, or mutually dependant on each other, they should be in
> distinct components.  If we want a catch-all JAR, we've got one [3].

I disagree. Combo has no community, and no release cycles. It includes
large projects and tiny projects and is too huge for the average user.
There are no Javadocs and no documentation/obvious support. All it does is
provide one binary.

I'd prefer the opposite. Multiple binaries under a tighter project
[see previous Commons Core comments].

> Given the principles enumerated in the commons guidelines and detrimental
> effects enumerated here, I'm not sure why we'd follow any other course.

I don't see these all as detrimental effects. Your examples for
discouraging the introduction of new components are examples of a piece of
[beanutils] being touted for migration out into another project. The same
as reflect and functor. Reflect happily grew inside Lang and functor
happily grew outside of Lang. Lang made no difference here, it was the
lack of anyone having an itch for Converters.

[Sorry if that was all a bit jumbled, it's a bit hard to not repeat
myself]

In conclusion, I agree that the dependency issue is important. Projects
like Lang and Collections and IO have to continually ask themselves if the
new functionality is core to Lang/Collections/IO [and as Lang lacks an
actual functionality concept, it's harder]. If they were broken down into
more discrete items [and we're mainly talking Lang], then I'd like to see
a project wrapping them inside Commons, or maybe Commons just needs to
kick some projects upstairs.

Hen


Re: [general][lang] monolithic components considered harmful

Posted by Martin Cooper <ma...@apache.org>.

Excellent message, Rodney!

I'm not even going to try to reply in the various threads, but I'd like to
add a couple of observations.

A) Some people have expressed the notion that more, smaller components
would lead to a lack of community around those components. I don't see it
that way. The "community" around here is the Commons community, and people
get involved with the components they are interested in.

B) I believe people are more likely to dive into a component when it has a
narrow, well-defined scope that matches with their interest and knowledge.
Someone interested in reflection code _might_ decide to become a committer
to Lang to work on that piece, but I believe they would be more likely to
sign on with a more focussed component such as Reflect.

C) Taking the previous comment further, I believe that finding people to
step up for the job of Release Manager will be _much_ easier if we have
more, smaller components. Would I step up for Lang? No way - there's too
much stuff in there that I don't have a deep enough understanding of to
commit to managing a release. Might I step up to the task for a smaller
component such as Reflect? Hell, yes, I might, because it's something I
have an interest in seeing released, and something I have a pretty decent
understanding of.

In short, I think monolithic components are more likely to scare away
potential developers, while smaller, more focussed components will draw
those who are interested.

Finally, I think it's worth saying that, although all of us are using Lang
and various (potential) sub-components of Lang for illustrative purposes
in this thread, Rodney's original text applies much more widely than just
Lang.

--
Martin Cooper


On Mon, 30 Dec 2002, Rodney Waldhoff wrote:

> It may seem that I'm picking on [lang] here, but that's not my intention.
> I just feel like I'm watching an impending train-wreck, and intend to
> throw the switch while there's still time.
>
> The Jakarta-Commons charter suggests (well, literally requires [0]) that:
>
> "Each package must have a clearly defined purpose, scope, and API -- Do
> one thing well, and keep your contracts"
>
> and suggests in a number of ways that small, single-purpose components are
> preferable to monolithic ones.  (Perhaps most succinctly as "Place types
> that are commonly used, changed, and released together, or mutually
> dependent on each other, into the same package [and types that are not
> used, changed, and released together, or mutually dependent into different
> packages]".)  Yet there seems to be an increasing tendency here toward
> lumping discrete units into monolithic components.
>
> Allow me to justify this position.
>
> The arguments in favor of monolithic components I've seen seem to boil
> down to concerns about minimizing dependencies and preventing
> circularities.  This may seem superficially correct, but it is misguided.
> The number of JARs I need to have in my classpath is at best an indirect
> metric for the absence or presence of dependency issues, and at worst a
> misleading one.  Adding a new JAR to the classpath is a trivial issue, and
> tools like Maven [1], ClassWorld's UberJar [2], Commons-Combo [3] and even
> Java Web Start [4] make it even less of an issue (for better or worse).
> The real concerns here should be those of configuration management. For
> example, which version of X does Y require, and is that compatible with
> the version of X that Z requires?  How many applications will be impacted
> by a given change?  How small can I make my (end-user) application?
>
> Monolithic components make configuration management problems worse, not
> better.
>
> Here's how:
>
> 1) Monolithic components introduce false dependencies.
>
> Let's suppose, as some have suggested, that we release [lang] with new
> reflection and math packages.  Suppose further that [cli] uses the
> lang.math utilities and that [beanutils] uses the lang.reflect utilities,
> and that I've got an application that uses both [cli] and [beanutils].
>
> One might think this gives a simple dependency graph:
>
>       [LANG]
>         ^
>         |
>     .--' '--.
>     |       |
>   [CLI] [BEANUTILS]
>     ^       ^
>     |       |
>     '--. .--'
>         |
>      [MY APP]
>
> (where [X] <-- [Y] means Y depends on X)
>
> but the reality is more complicated.  Suppose the latest version of
> [beanutils] required some changes to lang.reflect.  In the same period,
> some changes have been made to lang.math, but [cli] has not yet been
> updated to support that.  This makes the version of [lang] required by
> [beanutils] incompatible with the version of [lang] required by [cli].
> (And if your solution is "we'll just keep [cli] up-to-date", replace [cli]
> in this example with some third-party, possibly closed-source component.)
>
> This means:
>
>   [LANG]  [LANG']
>     ^       ^
>     |       |
>     |       |
>   [CLI] [BEANUTILS]
>     ^       ^
>     |       |
>     '--. .--'
>         |
>      [MY APP]
>
>
> but since [lang] != [lang'], I can't do that.  This problem isn't caused
> by any true incompatibilities, but by an artificial coupling of unrelated
> code.
>
> If [reflect] and [math] are teased apart, the artificial problems go away:
>
>   [MATH] [REFLECT]
>     ^       ^
>     |       |
>     |       |
>   [CLI] [BEANUTILS]
>     ^       ^
>     |       |
>     '--. .--'
>         |
>      [MY APP]
>
> I can now replace [reflect] with [reflect'], and I only need to worry
> about updating those components that depend upon the [reflect] classes.
> This is true even if both [math] and [reflect] depend upon some other
> stuff in [lang]:
>
>       [LANG]
>         ^
>         |
>     .--' '--.
>     |       |
>   [MATH] [REFLECT]
>     ^       ^
>     |       |
>     |       |
>   [CLI] [BEANUTILS]
>     ^       ^
>     |       |
>     '--. .--'
>         |
>      [MY APP]
>
>
> 2) Monolithic components encourage superfluous dependencies and
> inappropriate coupling.
>
> Bundling unrelated code into a single component inappropriately lowers the
> cost of crossing interface boundaries.  Since the code is distributed
> together, it would seem that the cost of using, say, a method of
> lang.SerializationUtils within lang.functor.FactoryUtils, is negligible.
> But the true cost here isn't in getting SerializationUtils into the
> classpath, it's in coupling of the two classes--making FactoryUtils
> sensitive to changes in SerializationUtils.
>
> Consider, for instance, lang.StringUtils.  There are a number of handy
> methods there, some of them non-trivial and all of them offering better
> readability than the naive alternative.  I sympathize with the desire for
> increased readability and reuse, and in some circumstances it may be a
> Good Thing to use, for example, StringUtils.trim(String):
>
>     public static String trim(String str) {
>         return (str == null ? null : str.trim());
>     }
>
> instead of simply inlining the (str == null ? null : str.trim()) clause.
>
> But when used infrequently in an otherwise unrelated class, the price paid
> for this trivial reuse is fairly high, coupling this code with a 1700+
> line class to reuse 33 characters of code. (And StringUtils uses
> CharSetUtils, which uses CharSet, which uses various java collection
> classes, etc.)
>
> There are times when trivial code is just that.  Lumping together
> unrelated code in a monolithic component encourages me to be lazy about
> these dependencies and more importantly, these couplings.  Packaging
> unrelated code into distinct components forces me to consider whether
> introducing a new coupling is justified.
>
> 3) Monolithic components slow the pace of development.
>
> When components are small and single purpose, changes are small,
> well-contained, readily tested and easily understood. New releases can be
> performed more readily, more easily and hence more frequently.
>
> Bundling unrelated code into a monolithic component means I need to
> synchronize development of that unrelated code: Maybe I'd like to do a new
> release of sub-component X, but I can't since sub-component Y is in the
> midst of a major refactoring.  Maybe I'd like to do a major refactoring of
> sub-component A but I can't since sub-component B is preparing for a
> release.
>
> The more "foundational" a component is, the more this problem multiplies.
> E.g., suppose we can't release lang.reflect because we're screwing around
> with lang.time, and beanutils can't release without a released version of
> lang.reflect, and struts can't release without a released version of
> beanutils, etc.
>
> (Decoupling the CVS HEAD of lang.time and released version of lang.reflect
> (i.e., releasing lang with the latest lang.reflect but without lang.time),
> as we've done in other circumstances only demonstrates that these really
> are unrelated packages, and causes problems for those that work from a
> SNAPSHOT.)
>
> 4) Monolithic components make it more difficult for clients to track and
> communicate their dependencies.
>
> Following our versioning guidelines [5], non-backward compatible changes
> to public APIs require new major version numbers.  Hence a non-backward
> compatible change to sub-component X will require a new major version
> number, even though sub-component Y may be fully backwards compatible.
> Clients that only depend upon Y (and since X and Y are not strongly
> related, this is a significant set) will find the contract implied by the
> versioning guidelines broken--the version numbers suggest a major change,
> but there isn't as far as Y is concerned.  Clients that only depend upon Y
> are forced to confirm that nothing has been broken, and perhaps even
> update existing deployments even though there has been no change to Y.
> This weakens the utility of the versioning heuristics, and makes it more
> difficult for clients to track and manage their dependencies.
>
> 5) Monolithic components only hide circularities, and may even encourage
> them.
>
> Whenever A depends upon B and B depends on A, we have a circular
> dependency, wherever the code for A and B is located.  As with most forms
> of strong coupling, such circularities should be avoided whenever
> possible.  Building A and B in the same compilation run may make it
> possible to deal with a circular dependency, but it doesn't prevent it.
> Similarly, placing A and B in different components doesn't create a
> circular dependency, it exposes it.
>
> The "circular dependency" issue is largely hypothetical anyway.  In the
> case of [lang] for example, several of the sub-packages have literally no
> dependency on the rest of the package, and most that do have very weak
> coupling at best.  Moreover, it is trivial to combine two previously
> independent components.  Following (1) and (2), it may be substantially
> more difficult to tease apart classes that were once part of the same
> component.
>
> 6) Monolithic components only get bigger, making all of these problems
> worse.
>
> For instance, the [lang] proposal that was approved describes its scope
> as:
>
> "[A] package of Java utility classes for the classes that are in
> java.lang's hierarchy, or are considered to be so standard as to justify
> existence in java.lang. The Lang Package also applies to primitives and
> arrays." [6]
>
> In the five months since that proposal was accepted, the scope of lang has
> expanded significantly ([7], [8], [9], [10], [11]) and now includes or is
> proposed to include:
>
>  * math utilities [12]
>  * serialization utilities [13]
>  * currency and unit classes [14]
>  * date and time utilities [15]
>  * reflection and introspection utilities [16]
>  * functors [17]
>  * and much more [18], [19], [20], [21], [22]
>
> And the more the scope expands, the more the scope expands--the existence
> of the [lang] monolith has encouraged a reduction in ([23], [24], others)
> and discouraged the growth of ([25], [26], others) other components, and
> has discouraged the introduction of new components ([27], [28], others).
>
>
> As above and before, if classes aren't commonly used, changed, and
> released together, or mutually dependent on each other, they should be in
> distinct components.  If we want a catch-all JAR, we've got one [3].
> Given the principles enumerated in the commons guidelines and detrimental
> effects enumerated here, I'm not sure why we'd follow any other course.
>
>  - Rod
>
> [0] <http://jakarta.apache.org/commons/charter.html>
> [1] <http://jakarta.apache.org/turbine/maven/>
> [2] <http://classworlds.werken.com/uberjar.html>
> [3] <http://cvs.apache.org/viewcvs/jakarta-commons/combo/>
> [4] <http://java.sun.com/products/javawebstart/>
> [5] <http://jakarta.apache.org/commons/versioning.html>
> [6] <http://cvs.apache.org/viewcvs/jakarta-commons/lang/PROPOSAL.html?rev=1.1&content-type=text/vnd.viewcvs-markup>
> [7] <http://cvs.apache.org/viewcvs/jakarta-commons/lang/STATUS.html.diff?r1=1.10&r2=1.12&diff_format=h>
> [8] <http://cvs.apache.org/viewcvs/jakarta-commons/lang/STATUS.html.diff?r1=1.25&r2=1.26&diff_format=h>
> [9] <http://cvs.apache.org/viewcvs/jakarta-commons/lang/STATUS.html.diff?r1=1.28&r2=1.29&diff_format=h>
> [10] <http://cvs.apache.org/viewcvs/jakarta-commons/lang/STATUS.html.diff?r1=1.30&r2=1.31&diff_format=h>
> [11] <http://cvs.apache.org/viewcvs/jakarta-commons/lang/STATUS.html.diff?r1=1.31&r2=1.32&diff_format=h>
> [12] <http://archives.apache.org/eyebrowse/ReadMsg?listName=commons-dev@jakarta.apache.org&msgId=586315>
> [13] <http://archives.apache.org/eyebrowse/ReadMsg?listName=commons-dev@jakarta.apache.org&msgId=457636>
> [14] <http://archives.apache.org/eyebrowse/ReadMsg?listName=commons-dev@jakarta.apache.org&msgNo=18957>
> [15] <http://archives.apache.org/eyebrowse/ReadMsg?listName=commons-dev@jakarta.apache.org&msgId=577799>
> [16] <http://archives.apache.org/eyebrowse/ReadMsg?listName=commons-dev@jakarta.apache.org&msgId=411302>
> [17] <http://archives.apache.org/eyebrowse/ReadMsg?listName=commons-dev@jakarta.apache.org&msgId=577713>
> [18] <http://archives.apache.org/eyebrowse/ReadMsg?listName=commons-dev@jakarta.apache.org&msgNo=16718>
> [19] <http://archives.apache.org/eyebrowse/ReadMsg?listName=commons-dev@jakarta.apache.org&msgNo=18778>
> [20] <http://archives.apache.org/eyebrowse/ReadMsg?listName=commons-dev@jakarta.apache.org&msgNo=19885>
> [21] <http://archives.apache.org/eyebrowse/ReadMsg?listName=commons-dev@jakarta.apache.org&msgId=512176>
> [22] <http://archives.apache.org/eyebrowse/ReadMsg?listName=commons-dev@jakarta.apache.org&msgId=581065>
> [23] <http://archives.apache.org/eyebrowse/ReadMsg?listName=commons-dev@jakarta.apache.org&msgId=519705>
> [24] <http://archives.apache.org/eyebrowse/ReadMsg?listName=commons-dev@jakarta.apache.org&msgNo=20304>
> [25] <http://archives.apache.org/eyebrowse/ReadMsg?listName=commons-dev@jakarta.apache.org&msgNo=19847>
> [26] <http://archives.apache.org/eyebrowse/ReadMsg?listId=15&msgNo=19865>
> [27] <http://archives.apache.org/eyebrowse/ReadMsg?listName=commons-dev@jakarta.apache.org&msgId=551801>
> [28] <http://archives.apache.org/eyebrowse/ReadMsg?listId=15&msgNo=19221>