Posted to dev@commons.apache.org by Doug Bateman <do...@dougbateman.net> on 2009/11/21 13:41:39 UTC

Consolidating Subprojects

Howdy all,

Tonight while browsing through the latest commons projects, it
occurred to me the once modest commons project has grown substantially
over time and has a lot of great stuff.  There are now lots of
subprojects.  Actually, it's 37 active subprojects to be exact.  Plus the
archived projects.  And the release numbering, JRE requirements, and
dependency graph can get a bit confusing.  (Took me several hours to
map it out.)

So I started thinking about what really makes it important to separate projects:
+ Different large external dependencies you wish to keep separate.
+ Projects that aren't related.
+ Projects that have very different user communities (e.g.
commons-math versus commons-dbcp).

And on the flip side, there are the benefits that come from having one
artifact and one release number to track, instead of a whole heap of
release numbers and dependencies to manage.

Then it hit me... most of the dependencies are with other commons
projects, and on Java itself.  And quite a few of the projects run on
the same version of Java.  Could this network of interrelated projects
be simplified by consolidating a handful of projects together as a
single project?

If so, what are the tradeoffs and what might it look like?

So that's the question... would it potentially be beneficial to
consolidate some of the projects?  What do we lose?  What do we gain?

To help get discussion and thoughts flowing, I've attached some notes
to the bottom of this email (and in attached txt files).  Please don't
interpret these considerations as an actual final proposal... I'm
including them simply as a basis to initiate a possible discussion.

Regards,
Doug

Attachment 1: A possible consolidation plan.
Attachment 2: A starter list of pros and cons.
Attachment 3: A map of project dependencies and minimum Java version
requirements.

Caveat: Please forgive me for any spelling errors or missing words
that went unnoticed while pulling all this together.  Spelling isn't
my strong suit.


*Attachment 1:  A possible consolidation plan...*

Commons-Lang (JDK 1.1/1.2 Compatible)
- lang
- collections
- logging
- primitives
- discovery

Commons-Utils (JDK 1.5... plus create one-time releases for JDK 1.3
and 1.4, using the appropriate past releases of the items below)
- attributes client api
- beanutils
- command line utils (CLI)
- codec
- compress/zip
- dbutils
- exec
- io
- jxpath
- pool
- proxy (unclear... this might not belong as it has some *optional*
dependencies)
- validator

Commons-Config (Could be rolled into Commons-Utils)
- configuration
- digester

Commons-Net (JDK 1.4+.  Keep old release of original commons-net
available for those needing jdk 1.2 support.)
- net
- email
- vfs

Commons-EE (JavaEE 1.4+.)
- dbcp
- el
- fileupload
- transaction
- chain (Note: The non-JavaEE stuff could go into utils group above if desired.)

Keep as Independent Subprojects:
- sanselan (hasn't reached 1.0 yet)
- betwixt (hasn't reached 1.0 yet)
- scxml (hasn't reached 1.0 yet)
- daemon (contains native code)
- modeler (requires jmx jars unless jdk 1.6+)
- math (maintain focus on math/sci community)
- launcher (ant dependency)
- jelly
- attributes' ant tasks
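
As a rough sanity check on the JDK levels quoted for the groups above, the
floor for a consolidated jar would be the highest minimum JDK among its
members.  The following is a minimal sketch, assuming the JDK figures from
Attachment 3 (taking 1.2 for collections and 1.3 for configuration, the
lowest figures listed there).  The class and method names are made up for
illustration and are not part of any commons component.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class GroupJdkFloor {
    // Minimum JDK per component, stored as the minor version (2 means JDK 1.2).
    // Figures are copied from Attachment 3 and are only as accurate as that list.
    static final Map<String, Integer> MIN_JDK_MINOR = new HashMap<>();
    static {
        MIN_JDK_MINOR.put("lang", 2);
        MIN_JDK_MINOR.put("collections", 2);   // assumption: ignores the 1.5 target in trunk
        MIN_JDK_MINOR.put("logging", 1);
        MIN_JDK_MINOR.put("primitives", 1);
        MIN_JDK_MINOR.put("discovery", 1);
        MIN_JDK_MINOR.put("net", 2);
        MIN_JDK_MINOR.put("email", 4);
        MIN_JDK_MINOR.put("vfs", 4);
        MIN_JDK_MINOR.put("configuration", 3); // assumption: lowest of "1.3 + xml, 1.4, or 1.5"
        MIN_JDK_MINOR.put("digester", 5);
    }

    /** Returns the lowest JDK that a jar containing all the given members could target. */
    static String floorFor(List<String> members) {
        int highest = 0;
        for (String member : members) {
            Integer minor = MIN_JDK_MINOR.get(member);
            if (minor == null) {
                throw new IllegalArgumentException("No JDK figure for " + member);
            }
            highest = Math.max(highest, minor);
        }
        return "1." + highest;
    }

    public static void main(String[] args) {
        System.out.println("Commons-Lang:   " + floorFor(
                Arrays.asList("lang", "collections", "logging", "primitives", "discovery")));
        System.out.println("Commons-Net:    " + floorFor(Arrays.asList("net", "email", "vfs")));
        System.out.println("Commons-Config: " + floorFor(Arrays.asList("configuration", "digester")));
    }
}
```

With these figures the sketch gives 1.2 for the Commons-Lang group, 1.4 for
Commons-Net and 1.5 for Commons-Config, since digester already needs 1.5.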



*Attachment 2: Pros and Cons*

Benefits of the separate projects in the status quo:
+ *JRE Versions*: Some projects run on Java releases as old as Java
1.1.  Other projects have generics and need Java 5.0.
+ *Fewer Transitive Dependencies*: By keeping project A and B
separate, a project can use A without cluttering its transitive
dependency graph with all of B's dependencies.
+ *Smaller Jar Files*: Rather than having one single 1 MB jar file, I
can pick out just the handful of 150kb jar files I really need.
+ *Separate Release Cycles*: I can release project A without releasing
project B.
+ *Separate Committer Lists*: Project A and Project B can have
separate lists of committers, and access to the SVN directories can be
controlled accordingly.
+ *Legacy*: We've always done it this way, and it works.  Why confuse
people with change?

Important Observations:
+ *Similar Committer Lists*: When you look at the projects, there is a
core group of individuals who are listed as committers on the majority
of projects.  The committer lists between projects aren't all that
different.  And since the subprojects are all so interdependent,
one project is likely to find and fix another project's bugs.
+ *Very Few Transitive Dependencies*: Most of the commons subprojects
simply depend on each other.  There are actually very few required
dependencies on external artifacts that could be undesirably included.
+ *Small Jars*: Most of the jars are only about 150kb.  If they were
combined, the resulting jar would still only be about 1MB, which is
hardly problematic for 90% of the apps out there.
+ *Stable Projects*: Subprojects that support old versions of Java
tend to be the more mature and stable projects.  Generally speaking,
new versions incorporate new features rather than bug fixes for very
very old features (there are exceptions, of course).
+ *Java 1.4 is 7 years old*: Java 1.4 came out in 2002.  People
running Java 1.1, 1.2, and 1.3 are most likely more interested in the
existing commons releases than in the newest stuff.
(Certainly those who run old Java but want the newest stuff fall in
the <10% category.  As Craig once said... Apache designs primarily for
the more common uses (80%) rather than the exceptional uncommon cases
(<20%).  Craig... did I get the quote correct this time around? ;-) )
+ *Java 5.0 is 5 years old*: Java 5.0 came out in 2004.  Sure, lots of
vendors were slow to fully support 5.0 (ahemIBMahem), but it's out and
established now.  Even dbutils has elected to require Java 1.5 for
its newest releases.  If you need a Java 1.4 version of dbutils, you
can simply pick up the older releases.

Benefits of consolidating:
+ *Simpler Dependency Management*: Rather than tracking down the
required versions of various dependencies and transitive dependencies,
it really can be much easier to just grab a single JAR with
everything, so long as you don't accidentally introduce new transitive
dependencies you didn't want.
+ *Easier Refactoring*: It's easier to manage clean design within a
single project (for example, there might be some sharable code between
commons-net and commons-vfs).

Other minor benefits of consolidating:
+ *Easier Corporate Approval*: So this might not seem such a big deal,
but organizations which have to strong-arm the corporate lawyers to
get approval for various open-source libraries don't do so well
handing the lawyers a list of 37 separate projects to approve.
+ *Less Bewildering*: A new person looking at the website is hit by a
list of 37 projects, and tends to instinctively become overwhelmed
before getting started.



*Attachment 3: Commons Projects and their dependencies*

Disclaimer: I put this list together based on a variety of sources,
ranging from trunk/pom.xml to project.properties to project webpages.
Please don't flame me if I didn't get everything exactly right.  I put
this list together quickly to get a sense of what the issues are in
consolidating projects and dependency lists.

attributes
  - JDK: 1.4
  - Required: qdox
  - Optional/Provided: ant, maven-xdoc

beanutils
  - JDK: 1.3
  - Required: commons-logging
  - Optional/Provided: commons-collections

betwixt
  - Note: Not yet at release 1.0
  - JDK: 1.3 + xml
  - Required: beanutils, collections, digester, logging
  - Optional/Provided:

chain
  - JDK: 1.3
  - JavaEE: 1.3
  - Required: commons-digester
  - Optional/Provided: javax.servlet 2.3, javax.portlet 1.0, javax.faces 1.1

cli
  - JDK: 1.4
  - Required: None
  - Optional/Provided: None

codec
  - JDK: 1.4
  - Required:
  - Optional/Provided:

collections
  - JDK: 1.2 with partial 1.1 support, pom.xml in trunk targets jdk 1.5
  - Required: None
  - Optional/Provided: None

compress
  - JDK: 1.4
  - Required: None
  - Optional/Provided: None

configuration
  - JDK: 1.3 + xml, 1.4, or 1.5
  - Required: commons-collections, commons-lang, commons-logging,
              commons-digester, commons-beanutils
  - Optional/Provided: commons-jxpath, commons-codec, commons-jexl,
              commons-vfs, javax.servlet, ant

daemon
  - Note: Includes native C code
  - JDK: 1.3
  - Required: None
  - Optional/Provided: None

dbcp
  - JDK: 1.4 (1.3 version comes with JDBC 3.0 support)
  - Required: commons-pool
  - Optional/Provided: JavaEE, concurrent

dbutils
  - JDK: 1.5 (generics)
  - Required: None
  - Optional/Provided: None

digester
  - JDK: 1.5
  - Required: commons-logging, commons-beanutils
  - Optional/Provided: None

discovery
  - JDK: 1.1
  - Required: commons-logging
  - Optional/Provided: None

el
  - JDK: 1.4
  - JavaEE: 1.4
  - Required: commons-logging
  - Optional/Provided: javax.servlet 2.4 (for JSP 2.0 support)

email
  - JDK: 1.4
  - Required: javax.activation, javax.mail
  - Optional/Provided:

exec
  - JDK: 1.3
  - Required: None
  - Optional/Provided: None

fileupload
  - JDK: 1.3
  - JavaEE: 2.4
  - Required: commons-io
  - Optional/Provided: javax.portlet, javax.servlet 2.4

io
  - JDK: 1.5
  - Required: None
  - Optional/Provided: None

jci
  - JDK: 1.4
  - Required: None
  - Optional/Provided: A compiler

jelly
  - JDK: 1.3 + xml
  - Required: commons-beanutils, commons-cli, commons-collections,
              commons-jexl, commons-lang, commons-logging,
              dom4j, forehead, jaxen
  - Optional/Provided: javax.servlet, jstl

jexl
  - JDK: 1.5
  - Required: commons-logging
  - Optional/Provided: None

jxpath
  - JDK: 1.3 + xml
  - Required: commons-logging
  - Optional/Provided: javax.servlet, jdom, commons-beanutils

lang
  - JDK: 1.2
  - Required: None
  - Optional/Provided: None

launcher
  - JDK: 1.3 + xml
  - Required: ant
  - Optional/Provided:

logging
  - JDK: 1.1
  - Required: None
  - Optional/Provided: log4j, logkit, avalon, javax.servlet

math
  - JDK: 1.5
  - Required: None
  - Optional/Provided: None

modeler
  - JDK: 1.3 + xml
  - Required: commons-digester, commons-logging, jmx
  - Optional/Provided: ant

net
  - JDK: 1.2
  - Required: oro
  - Optional/Provided: None

pool
  - JDK: 1.3
  - Required: None
  - Optional/Provided: None

primitives
  - JDK: 1.1
  - Required: commons-collections
  - Optional/Provided:

proxy
  - JDK: 1.4
  - Required: None
  - Optional/Provided: Lots, but they're all optional

sanselan
  - Note: Not yet at release 1.0
  - JDK: 1.4
  - Required: None
  - Optional/Provided: None

scxml
  - JDK: 1.4
  - Required: commons-logging, commons-digester, commons-beanutils
  - Optional/Provided: javax.servlet, javax.faces, commons-el, commons-jexl

transaction
  - JDK: 1.2
  - JavaEE: 1.4
  - Required: commons-codec, commons-logging, log4j
  - Optional/Provided: JavaEE 1.4 (servlet, jta, etc, etc.)

validator
  - JDK: 1.4
  - Required: commons-beanutils, commons-digester, commons-logging
  - Optional/Provided: oro

vfs
  - JDK: 1.4
  - Required: None
  - Optional/Provided: None
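
To see how much of the graph above stays inside commons, one can walk the
Required entries transitively.  A minimal sketch, using only a handful of
the Required entries from the list above (with the commons- prefix
dropped); the class and method names are hypothetical, and the data is only
as accurate as the list itself.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class CommonsDependencyClosure {
    // Required commons-to-commons dependencies, copied from a few entries above.
    // Components that are not keys here are treated as having no required dependencies.
    static final Map<String, List<String>> REQUIRED = new HashMap<>();
    static {
        REQUIRED.put("beanutils", Arrays.asList("logging"));
        REQUIRED.put("betwixt", Arrays.asList("beanutils", "collections", "digester", "logging"));
        REQUIRED.put("configuration",
                Arrays.asList("collections", "lang", "logging", "digester", "beanutils"));
        REQUIRED.put("dbcp", Arrays.asList("pool"));
        REQUIRED.put("digester", Arrays.asList("logging", "beanutils"));
        REQUIRED.put("primitives", Arrays.asList("collections"));
        REQUIRED.put("validator", Arrays.asList("beanutils", "digester", "logging"));
    }

    /** Returns every component reachable from root through required dependencies, sorted by name. */
    static Set<String> closure(String root) {
        Set<String> seen = new TreeSet<>();
        Deque<String> todo = new ArrayDeque<>();
        todo.push(root);
        while (!todo.isEmpty()) {
            String current = todo.pop();
            for (String dep : REQUIRED.getOrDefault(current, Collections.<String>emptyList())) {
                // Only queue a dependency the first time it is seen, so cycles cannot loop forever.
                if (seen.add(dep)) {
                    todo.push(dep);
                }
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println("validator -> " + closure("validator"));
        System.out.println("betwixt   -> " + closure("betwixt"));
        System.out.println("dbcp      -> " + closure("dbcp"));
    }
}
```

For validator this yields beanutils, digester and logging, all of them
commons components, which is the situation described under "Very Few
Transitive Dependencies" in Attachment 2.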

Re: Consolidating Subprojects

Posted by Henri Yandell <fl...@gmail.com>.
Very impressive email.

With regard to JDK dependency, I would treat everything as 1.5
dependent and worry about how to handle 1.6/1.7. Lang's trunk is 1.5
and there are 3 issues that would like 1.6.

Some more/repeated issues:

* Ties release cycles together. Some major issue in one component
would block another.
* When bugfix releases are thrown in, it means more releases. It means
fewer jars though, so might balance out.
* It means much bigger jars. 3M jars when people complain about the
size of the 500k Collections jar. I think a lot of that is not the
size of the jar, but the ability to grok the fullness of the API -
anyone who actually cares about jar size should be using tools like
jarjar. Still, it's often been the main complaint and there are people
who strongly dislike the 500k collections jar.

Some of the details of the proposal would need changing, for example
for components that are dead or would more likely be better off as a
TLP at some point in the future. Also the pros/cons about commit karma
can be scrubbed as we don't differentiate karma (or want to).

Hen

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: Consolidating Subprojects

Posted by Mat Booth <ap...@matbooth.co.uk>.
2009/11/22 Doug Bateman <do...@dougbateman.net>:
>> It means much bigger jars. 3M jars when people complain about the
>> size of the 500k Collections jar. I think a lot of that is not the
>> size of the jar, but the ability to grok the fullness of the API -
>> anyone who actually cares about jar size should be using tools like
>> jarjar. Still, it's often been the main complaint and there are people
>> who strongly dislike the 500k collections jar.
>
> Regarding jar file sizes mentioned earlier in the thread.  I've never
> really understood the tendency for people to complain about 300kb
> jars.  Or even a larger Java API.  (The JRE for 1.6 is 80MB, which is
> nothing compared to even a tiny 60GB disk.)

We use some commons projects in one of our commercial products that
uses in-browser applets. Size is very important for some of our
customers who have clients connecting in from geographically remote
outstations over links with about as much bandwidth as a piece of wet
string.

We ship the workstations with the JRE installed so that's never a
problem, but maintaining a connection long enough for the browser to
download and cache all the necessary jars almost always is.

Note that I'm not opposed to a consolidated commons distribution, it's
just that you should understand why size does matter for some
applications.

-- 
Mat Booth


Re: Consolidating Subprojects

Posted by Torsten Curdt <tc...@vafer.org>.
>>> Sure, but how many users have that requirement? More than 1%?
>>
>> Even if it is 1%, it's worth considering. I am really not comfortable
>> with project addressing only the 80% more frequent cases. Minority
>> should never be neglected. It's clearly a philosophical and personal choice.

You just cannot address all the concerns of each and every user - and
we never have. That's just not realistic.

The point is that it is easily solvable. As long as people can make it
work I don't see a problem.

> Torsten addressed it by writing Jarjar :)

Minijar! Jarjar is from someone else :) ...and there also is ProGuard
and many more now.

cheers
--
Torsten


Re: Consolidating Subprojects

Posted by Henri Yandell <fl...@gmail.com>.
On Mon, Nov 23, 2009 at 9:58 AM, Luc Maisonobe <Lu...@free.fr> wrote:
> Torsten Curdt wrote:
>>> We use some commons projects in one of our commercial products that
>>> uses in-browser applets. Size is very important for some of our
>>> customers who have clients connecting in from geographically remote
>>> outstations over links with about as much bandwidth as a piece of wet
>>> string.
>>>
>>> We ship the workstations with the JRE installed so that's never a
>>> problem, but maintaining a connection long enough for the browser
>>> download and cache all the necessary jars almost always is.
>>>
>>> Note that I'm not opposed to a consolidated commons distribution, it's
>>> just that you should understand why size does matter for some
>>> applications.
>>
>> Sure, but how many users have that requirement? More than 1%?
>
> Even if it is 1%, it's worth considering. I am really not comfortable
> with project addressing only the 80% more frequent cases. Minority
> should never be neglected. It's clearly a philosophical and personal choice.

Torsten addressed it by writing Jarjar :)

Hen


Re: Consolidating Subprojects

Posted by Luc Maisonobe <Lu...@free.fr>.
Torsten Curdt wrote:
>> We use some commons projects in one of our commercial products that
>> uses in-browser applets. Size is very important for some of our
>> customers who have clients connecting in from geographically remote
>> outstations over links with about as much bandwidth as a piece of wet
>> string.
>>
>> We ship the workstations with the JRE installed so that's never a
>> problem, but maintaining a connection long enough for the browser
>> download and cache all the necessary jars almost always is.
>>
>> Note that I'm not opposed to a consolidated commons distribution, it's
>> just that you should understand why size does matter for some
>> applications.
> 
> Sure, but how many users have that requirement? More than 1%?

Even if it is 1%, it's worth considering. I am really not comfortable
with a project addressing only the 80% most frequent cases. Minorities
should never be neglected. It's clearly a philosophical and personal choice.

Luc



Re: Consolidating Subprojects

Posted by Torsten Curdt <tc...@vafer.org>.
> We use some commons projects in one of our commercial products that
> uses in-browser applets. Size is very important for some of our
> customers who have clients connecting in from geographically remote
> outstations over links with about as much bandwidth as a piece of wet
> string.
>
> We ship the workstations with the JRE installed so that's never a
> problem, but maintaining a connection long enough for the browser
> download and cache all the necessary jars almost always is.
>
> Note that I'm not opposed to a consolidated commons distribution, it's
> just that you should understand why size does matter for some
> applications.

Sure, but how many users have that requirement? More than 1%?

Someone that cares about jar sizes can easily use the right tools to
reduce the footprint. IMO jar size should be less of a concern.

That said I am still not sure what to make of the proposal.

Less jar juggling - sure ...but also more dependencies of what needs
to be fixed before a release.
Plus (I am sure) people will be complaining that they "don't need all
this stuff" - which is not that great from Commons' image point of
view.

cheers
--
Torsten


Re: Consolidating Subprojects

Posted by Mat Booth <ma...@gmail.com>.
2009/11/22 Doug Bateman <do...@dougbateman.net>:
>> It means much bigger jars. 3M jars when people complain about the
>> size of the 500k Collections jar. I think a lot of that is not the
>> size of the jar, but the ability to grok the fullness of the API -
>> anyone who actually cares about jar size should be using tools like
>> jarjar. Still, it's often been the main complaint and there are people
>> who strongly dislike the 500k collections jar.
>
> Regarding jar file sizes mentioned earlier in the thread.  I've never
> really understood the tendency for people to complain about 300kb
> jars.  Or even a larger Java API.  (The JRE for 1.6 is 80MB, which is
> nothing compared to even a tiny 60GB disk.)

We use some commons projects in one of our commercial products that
uses in-browser applets. Size is very important for some of our
customers who have clients connecting in from geographically remote
outstations over links with about as much bandwidth as a piece of wet
string.

We ship the workstations with the JRE installed so that's never a
problem, but maintaining a connection long enough for the browser
to download and cache all the necessary jars almost always is.

Note that I'm not opposed to a consolidated commons distribution, it's
just that you should understand why size does matter for some
applications.

-- 
Mat Booth



Re: Consolidating Subprojects

Posted by Doug Bateman <do...@dougbateman.net>.
> Raising one question: do you think some production project could use
> latest version of project commons.A and old version of commons.B? They
> would probably not like merging both projects into one. Is this a use
> case we want to address or is it too theoretical?

Well, the dependencies between versions exist now in the status quo.
Except perhaps you could have a newer version of commons logging with
an older package that uses commons logging.  But I tend to find the
problem with needing very specific versions in the classpath arises
from classloader issues in non-OSGi environments.  (For example
deployed war files on JBoss where the classloading isn't standards
compliant by default.)  And in these cases, the version of the lowest
level component, e.g. commons logging, tends to be the main problem.
So my problem is usually that my version of component X requires a
newer version of commons logging than is provided by JBoss in the
classpath.  And it's usually found by a junior programmer on a project
who already spent 2 days trying to troubleshoot before asking for
help.
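
For anyone stuck in that kind of troubleshooting, here is a minimal sketch
of how to check where a class was actually loaded from.  The class name
WhichJar and the helper locationOf are made up for illustration; only
standard JDK calls are used.  It only reports what the current classloader
resolved, so it has to be run inside the environment that shows the problem.

```java
import java.security.CodeSource;

public class WhichJar {
    /** Returns the location a class was loaded from, or a note if it is unknown. */
    static String locationOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        // Classes loaded by the bootstrap class loader (e.g. java.lang.String)
        // have no code source, so guard against null.
        if (source == null || source.getLocation() == null) {
            return "(bootstrap or unknown)";
        }
        return source.getLocation().toString();
    }

    public static void main(String[] args) throws Exception {
        // Pass a fully qualified class name, e.g. a commons-logging class.
        String name = args.length > 0 ? args[0] : "java.lang.String";
        Class<?> type = Class.forName(name);
        System.out.println(name + " loaded from " + locationOf(type)
                + " by " + type.getClassLoader());
    }
}
```

Inside a container, the same locationOf call could be dropped into a
servlet or a log statement to see which copy of a jar won.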

But to answer your question accurately, I really don't know.  I can't
think of a truly valid reason.  But that doesn't mean someone hasn't.
Is it in the 80% case?

> It means much bigger jars. 3M jars when people complain about the
> size of the 500k Collections jar. I think a lot of that is not the
> size of the jar, but the ability to grok the fullness of the API -
> anyone who actually cares about jar size should be using tools like
> jarjar. Still, it's often been the main complaint and there are people
> who strongly dislike the 500k collections jar.

Regarding jar file sizes mentioned earlier in the thread.  I've never
really understood the tendency for people to complain about 300kb
jars.  Or even a larger Java API.  (The JRE for 1.6 is 80MB, which is
nothing compared to even a tiny 60GB disk.)  I think you're right that
it's more about the ability to grok the full API.  A lesson I've taken
from the training business is that people initially feel overwhelmed.
But when you teach them it's okay to just know the 20% primer, and
show them how to find other answers on demand, it sets their mind at
ease (and in fact leads to empowerment).  So perhaps it's just a
matter of having a nice concise "getting oriented" document.

Taking the opposite point of view for a moment... It is fair to say
that jarjar doesn't automatically solve every problem.  When I use a
dependency injection framework that configures the factories in XML or
.properties, jarjar usually can't automatically figure out which
.class files I really use.  So I have to teach jarjar what not to
throw out (which isn't always so easy).  But heck, for every time I've
had to do that, I can think of 10 times I had to deal with version
number hell.
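
To make the configuration case concrete, here is a minimal sketch of a
factory whose implementation class is named only in a properties entry.
ConfiguredFactory, create and the key list.impl are hypothetical names, and
java.util.LinkedList stands in for an application class.  The assumption is
that a shrinking tool decides what to keep by following references between
compiled classes; here the only link to the target class is a string read
at runtime, so there is nothing for such a tool to follow.

```java
import java.io.StringReader;
import java.util.Properties;

public class ConfiguredFactory {
    /** Instantiates the class named under the given key; the name is only known at runtime. */
    static Object create(Properties config, String key) throws Exception {
        String className = config.getProperty(key);
        // No compiled reference to the target class exists here, only a String.
        return Class.forName(className).getDeclaredConstructor().newInstance();
    }

    public static void main(String[] args) throws Exception {
        Properties config = new Properties();
        // Stand-in for a .properties file found on the classpath.
        config.load(new StringReader("list.impl=java.util.LinkedList\n"));
        Object list = create(config, "list.impl");
        System.out.println(list.getClass().getName());
    }
}
```

If the class named in the configuration were stripped from the jar,
Class.forName would fail with a ClassNotFoundException at runtime rather
than at build time.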

The bottom line for me is believing that the simplest solution that
does the job is often the best and most flexible (that's partly why I
like Struts and Wicket).  And I strongly believe 1 fat jar is simpler to
manage than 10 thin ones.  But I also agree people would be up in
arms about combining commons-lang with commons-logging.  I don't know
why, but they would.  Perhaps exploring that question would lead to
insights into the larger topic.

Warmest Regards,
Doug



Re: Consolidating Subprojects

Posted by Luc Maisonobe <Lu...@free.fr>.
Doug Bateman wrote:
> Howdy all,
> 
> Tonight while browsing through the latest commons projects, it
> occurred to me the once modest commons project has grown substantially
> over time and has a lot of great stuff.  There are now lots of sub
> projects.  Actually, it's 37 active subprojects to be exact.  Plus the
> archived projects.  And the release numbering, JRE requirements, and
> dependency graph can get a bit confusing.  (Took me several hours to
> map it out.)
> 
> So I started thinking about what really makes it important to separate projects:
> + Different large external dependencies you wish to keep separate.
> + Projects that aren't related.
> + Projects that have very different user communities (e.g.
> commons-math versus commons-dbcp).

Raising one question: do you think some production project could use
the latest version of project commons.A and an old version of commons.B?
They would probably not like merging both projects into one. Is this a
use case we want to address, or is it too theoretical?

Luc

> 
> And on the flip side, the benefits that come from having one artifact
> and one release number to track, instead of a whole heap of release
> numbers and dependencies to manage.
> 
> Then it hit me... most of the dependencies are with other commons
> projects, and on Java itself.  And quite a few of the projects run on
> the same version of Java.  Could this network of interrelated projects
> be simplified by consolidating a handful of projects together as a
> single project?
> 
> If so, what are the tradeoffs and what might it look like.
> 
> So that's the question... would it potentially be beneficial to
> consolidate some of the projects?  What do we lose?  What do we gain?
> 
> To help get discussion and thoughts flowing, I've attached some notes
> to the bottom of this email (and in attached txt files).  Please don't
> interpret these considerations as an actual final proposal... I'm
> including them simply as a basis to initiate a possible discussion.
> 
> Regards,
> Doug



Re: Consolidating Subprojects

Posted by Phil Steitz <ph...@gmail.com>.
Doug Bateman wrote:
> Howdy all,
> 
> Tonight while browsing through the latest commons projects, it
> occurred to me the once modest commons project has grown substantially
> over time and has a lot of great stuff.  There are now lots of sub
> projects.  Actually, it's 37 active subprojects to be exact.  Plus the
> archived projects.  And the release numbering, JRE requirements, and
> dependency graph can get a bit confusing.  (Took me several hours to
> map it out.)

Thanks! Very interesting.
> 
> So I started thinking about what really makes it important to separate projects:
> + Different large external dependencies you wish to keep separate.

As you and others have pointed out, we really don't have these and
for the last several years we have been pushing hard to minimize
dependencies even among the commons components.

> + Projects that aren't related.
> + Projects that have very different user communities (e.g.
> commons-math versus commons-dbcp).

A little humorous for me personally, as I cut the last releases of
each of the two above ;)

One of the things that makes Commons work is that there are quite a
few of us who contribute to multiple components.  Sometimes that
contribution is to the main development, sometimes it is to things
like cleaning up javadoc, checkstyle/findbugs, site maintenance or
help with cutting releases. What is "related" about all of the
commons components is that they all live here and all get the
benefit of some level of attention from the commons community. We
have talked several times in the past about breaking commons up via
different schemes and each time we come back to this central benefit
of "the commons" and decide to keep it as one project.  We could of
course change our minds about that, which is why we appreciate your
suggestion to take stock of where we are.

> 
> And on the flip side, the benefits that come from having one artifact
> and one release number to track, instead of a whole heap of release
> numbers and dependencies to manage.

As others have pointed out, this cuts both ways.  Years ago, we had
a "combo-jar" release that turned out to be too difficult to
maintain and of limited value to users, so it was dropped.  As Hen
pointed out, most feedback we have gotten is that users like smaller
jars with fewer dependencies and good backward compatibility.  As
Torsten pointed out, the difficulty managing "multi-releases" is
also not to be underestimated. IMO we do not do as good a job as we
should releasing often and early in Commons and I would be hesitant
to make a change that made that situation worse.

At the end of the day, I am in the same place as Torsten on this -
what exactly do we expect to gain by "consolidation?"  If it is to
make it easier for users to grab what they need in "bundles" that
are sure to work together, I guess that could be a benefit for some
users, but I would not want to break up the community to accomplish
that.  I would also worry about the mechanics of maintaining all of
the bundled distros and the havoc that it would wreak on the
established user base of the current, more granular, components.  As
a user, I would want to stick with the granular components.

One slightly OT point raised by one of Ralph's comments.  The
"committer lists" in the POMs for our components (so on the web
pages) are woefully out of date.  This is because we have
traditionally added ourselves to components lists as we start to
work on them, but we rarely if ever remove ourselves and we never
remove people who have left.  The same problem exists wrt @author
tags that still litter the codebase, most from former Commons
committers who are no longer active.  Should we clean these up?  We
could use svn history for the committer lists - say, no commits in
the last year and you get removed, but you are welcome to add
yourself back with your first new commit?  Should we uniformly
remove @authors?

Phil





Re: Consolidating Subprojects

Posted by Ralph Goers <ra...@dslextreme.com>.
On Nov 21, 2009, at 4:41 AM, Doug Bateman wrote:

> So that's the question... would it potentially be beneficial to
> consolidate some of the projects?  What do we lose?  What do we gain?
> 

The developer community around many of the commons projects is small. Looking at the committer lists can be deceptive since people may not have been active on a particular project in quite a while. Putting together projects that largely share the same set of committers could make sense. But grouping lang, collections, etc. with logging would not make a lot of sense to me. And it would be a huge problem if they were all packaged in the same jar.

This would also make it much harder to upgrade a project to a new code base with different requirements as each of the subprojects would have to agree. 

In short, I would be in favor of this only if it somehow would increase the size of the developer community in the projects. Otherwise I'm not sure the benefit is really there. It seems to me that the items you list in your benefits list can mostly be solved in other ways.

1.  "Simpler dependency management" - assumes that each project generates a single jar. My guess is that they wouldn't, in which case this benefit doesn't exist.
2.  "Easier refactoring" - Since commons committers can commit to any project this can already be done.
3.  "Easier corporate approval" - My experience with corporate attorneys is that if they like the Apache license then you will get blanket approval for any project under that license. They won't care if it is 3 or 37.
4.  "Less bewildering" - I'm not even sure how to quantify this. How would this help the end user figure out what Configuration and Digester do and how to use them if they are combined? It seems to me this would make things worse since they are designed to do fundamentally different things.

It might make sense if the grouping you are suggesting happened on the main Commons web page so that the user didn't immediately see 37 projects but instead saw a small number of groups with a clickable + next to them. So clicking on "Language Projects" would show those projects.

 
> 
> 
> vfs
>  - JDK: 1.4
>  - Required: None
>  - Optional/Provided: None

You must have been looking at the root pom.xml. The dependency list for vfs should look more like:
Required: commons-logging, commons-httpclient
Optional/Provided: ant, commons-net, commons-collections, jdom, jackrabbit-webdav, jsch, xml-apis

