Posted to dev@commons.apache.org by Stephen Colebourne <sc...@btopenworld.com> on 2005/08/23 00:53:48 UTC

[fileupload] Remove commons-io dependency

I would like to propose the removal of the commons-io dependency from 
[fileupload].

The dependency consists of three classes:
- FileCleaner
- DeferredFileOutputStream
- ThresholdingOutputStream (required by DeferredFileOutputStream)

This proposal is to copy-and-paste these three classes to package scoped 
in the [fileupload] project, and mark them in documentation (in both io 
and fileupload) as duplicates. (There is the potential for FileCleaner 
to use reflection to try and contact the commons-io version of the class 
to avoid a thread creation)
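
A minimal sketch of the reflection idea in the parenthetical above, written in modern Java rather than the Java 1.4 of the time. It assumes commons-io exposes a static FileCleaner.track(File, Object) method. The class name LocalFileCleaner, the wording of the duplicate notice, and the counter standing in for the copied tracking logic are all hypothetical.

```java
import java.io.File;
import java.lang.reflect.Method;

/*
 * DUPLICATE NOTICE (illustrative wording only): this class is a copy of
 * org.apache.commons.io.FileCleaner. A fix made here should also be
 * applied to the commons-io original, and vice versa.
 */
final class LocalFileCleaner {

    /** Looked up by name at runtime, so there is no compile-time dependency. */
    private static final String IO_CLEANER = "org.apache.commons.io.FileCleaner";

    /** The commons-io track(File, Object) method, or null if it is not on the class path. */
    private static final Method IO_TRACK = lookupIoTrack();

    /** Hypothetical stand-in for the state of the copied implementation. */
    private static int localTracked = 0;

    private LocalFileCleaner() {
    }

    private static Method lookupIoTrack() {
        try {
            Class<?> cleaner = Class.forName(IO_CLEANER);
            // Assumed signature: public static void track(File, Object)
            return cleaner.getMethod("track", File.class, Object.class);
        } catch (ClassNotFoundException | NoSuchMethodException | LinkageError e) {
            return null; // commons-io absent or incompatible: use the local copy
        }
    }

    static boolean delegatesToCommonsIo() {
        return IO_TRACK != null;
    }

    static synchronized int localTrackedCount() {
        return localTracked;
    }

    /** Tracks a file, preferring the commons-io cleaner so only one reaper thread runs. */
    static void track(File file, Object marker) {
        if (IO_TRACK != null) {
            try {
                IO_TRACK.invoke(null, file, marker);
                return;
            } catch (ReflectiveOperationException e) {
                // Fall through to the local copy if the reflective call fails.
            }
        }
        trackLocally(file, marker);
    }

    private static synchronized void trackLocally(File file, Object marker) {
        // The copied FileCleaner logic (reference queue plus reaper thread) would go here.
        localTracked++;
    }
}
```

If commons-io happens to be on the class path, its own reaper thread is reused; otherwise the copy does the work, and [fileupload] never needs the jar to compile or run.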


While I understand that many people have an instinctive reaction against 
copy-and-paste, and that it might seem normal and rational to eat our 
own dogfood and reuse code, the truth is that in complex servlet 
environments it causes issues.

Unless every method in every class in every release that your dependency 
makes is 100% binary, source and semantically compatible forevermore, 
then you may have a problem. These problems are generally rare, but are 
in many cases unnecessary.

[fileupload] 1.0 had no dependency on [io]. Let's remove that dependency 
from v1.1 and thus speed a release.

Stephen

---------------------------------------------------------------------
To unsubscribe, e-mail: commons-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: commons-dev-help@jakarta.apache.org


Re: [fileupload] Remove commons-io dependency

Posted by robert burrell donkin <ro...@blueyonder.co.uk>.
On Tue, 2005-09-13 at 00:15 +0100, Stephen Colebourne wrote:
> Stephen Colebourne wrote:
> > I am not a [fileupload] maintainer, so cannot -1 or force this issue. 
> > All I can do is try to persuade the actual maintainers that this 
> > dependency is not in your users' best interests.
> 
> BTW, here is yet another blogger on the pain of dependencies:
> 
> "Compare Jason Hunter's excellent COS package which provides file upload 
> capabilities as well as several other servlet utilities with zero 
> dependencies, to the Jakarta Commons File Upload which also requires 
> Commons Logging, Commons IO, Commons BeanUtils and Commons Digester."
> 
> http://jroller.com/page/tfenne?entry=framework_and_library_usability
> 
> 
> (Note, this actually isn't true anymore of all except IO, yet the 
> perception is out there damaging commons...)

IMHO quite a lot of the stuff out there in the blogosphere really
isn't very well informed. the reason given in that blog is fundamentally
a middleware and framework issue (not a library one). creators of
middleware and frameworks are becoming more aware of the problems that
their dependencies cause for their users and are moving towards
repackaging dependencies.

there are good reasons to avoid core dependencies between basic
libraries. often the coupling is just a class or two but the dependency
drags in a whole lot of different classes. this leads to the cycle
whereby breaking binary compatibility in one library leads to a cascade
of forced upgrades in all dependent projects. the graph also makes it
harder for users to download all the dependencies they need to run the
project. maven is

- robert




Re: [fileupload] Remove commons-io dependency

Posted by Stephen Colebourne <sc...@btopenworld.com>.
Stephen Colebourne wrote:
> I am not a [fileupload] maintainer, so cannot -1 or force this issue. 
> All I can do is try to persuade the actual maintainers that this 
> dependency is not in your users' best interests.

BTW, here is yet another blogger on the pain of dependencies:

"Compare Jason Hunter's excellent COS package which provides file upload 
capabilities as well as several other servlet utilities with zero 
dependencies, to the Jakarta Commons File Upload which also requires 
Commons Logging, Commons IO, Commons BeanUtils and Commons Digester."

http://jroller.com/page/tfenne?entry=framework_and_library_usability


(Note, this is actually no longer true of any of these except IO, yet the 
perception is out there damaging commons...)

Stephen



Re: [fileupload] Remove commons-io dependency

Posted by Stephen Colebourne <sc...@btopenworld.com>.
Craig McClanahan wrote:
> If we follow the path to fork those classes, it would be better to
> just rename them back into the org.apache.commons.fileupload package
> space, and dispense with any notion that the classes would remain in
> sync.  We couldn't do that with the beanutils/digester dependency on
> collections, because there were some implementation specific APIs
> (incorrectly) baked in to the public API.
> 
> Using SVN externals for this is a cute hack to avoid forgotten
> cut-n-paste edits, but it doesn't address the process issues of
> release timing or ensure that (for example) someone from one of the
> cooperating projects would not change the code in incompatible ways,
> having no clue that the SVN external trick was being used, and would
> unknowingly break somebody else.

I agree, svn:externals is clever, but copy-and-paste with comments will 
actually work better.

I am not a [fileupload] maintainer, so cannot -1 or force this issue. 
All I can do is try to persuade the actual maintainers that this 
dependency is not in your users' best interests.

Stephen



Re: [fileupload] Remove commons-io dependency

Posted by Craig McClanahan <cr...@gmail.com>.
On 8/22/05, Noel J. Bergman <no...@devtech.com> wrote:
> Martin's valid points aside, if you just want to copy the classes from IO,
> you can use external in SVN.
> 

If we follow the path to fork those classes, it would be better to
just rename them back into the org.apache.commons.fileupload package
space, and dispense with any notion that the classes would remain in
sync.  We couldn't do that with the beanutils/digester dependency on
collections, because there were some implementation specific APIs
(incorrectly) baked in to the public API.

Using SVN externals for this is a cute hack to avoid forgotten
cut-n-paste edits, but it doesn't address the process issues of
release timing or ensure that (for example) someone from one of the
cooperating projects would not change the code in incompatible ways,
having no clue that the SVN external trick was being used, and would
unknowingly break somebody else.

>         --- Noel

Craig



Re: [fileupload] Remove commons-io dependency

Posted by "Henning P. Schmiedehausen" <hp...@intermeta.de>.
Henri Yandell <fl...@gmail.com> writes:

[rereading this thread because Stephen brought up the blogger article]

>* some kind of subversion sym link? Unsure if it supports that yet.
>svn:externals could be done?

"Gotcha!". I fell into that trap, too. No, this does not
work. svn:externals can only reference directories, not single
files.

It wouldn't work either unless you want to repackage the files under
the same java package as they were in the original jar. Which would
lead to real hell: what if you missed a dependency of class a.x
repackaged into b.jar which now references a.y that is in c.jar, while
both a.* classes should've been referenced from a.jar, which just
happened to be at the back of the class path? Now along _that_ path
lies madness. I wouldn't dare to go there.

>* custom maven.xml which copies the file from commons-io, search and
>replace on 'public class' on those N files and quietly inlines the
>classes?

How? You would have to automatically find out all dependencies (what
if you reference a.x from a.jar which in turn relies on b.y. Will you
put both a.x and b.y into your c.jar?).

I understand Stephen's reason for concern. The idea of commons is to
have working building blocks for applications. Making these blocks
independent is IMHO a "nice to have" feature. If this means copying a
few stable methods into a FooUtils class that is package private (like
we did with commons-email), that is fine IMHO.

commons-fileupload has dependencies on commons-io that introduce
things like a thread for reaping files. Simply copying these classes
into commons-fileupload would mean that there is not just one but
suddenly two reaper threads (FileCleaner). Something to look out for.
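
A rough illustration of that concern, not the actual FileCleaner code: if a class starts its reaper thread in a static initializer, then the original and a copy under another package name each start their own. The three class names below are hypothetical stand-ins, and the sleeping thread body replaces the real file-deleting loop.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class ReaperDemo {

    /** Counts how many reaper threads have been started in this JVM. */
    static final AtomicInteger STARTED = new AtomicInteger();

    private ReaperDemo() {
    }

    /** Starts a daemon thread that idles; a real reaper would delete tracked files. */
    static Thread startReaper(String owner) {
        Thread reaper = new Thread(() -> {
            try {
                Thread.sleep(Long.MAX_VALUE);
            } catch (InterruptedException e) {
                // Interrupted: let the thread exit.
            }
        }, owner + " file reaper");
        reaper.setDaemon(true);
        reaper.start();
        STARTED.incrementAndGet();
        return reaper;
    }
}

/** Stands in for the original class in commons-io. */
final class IoFileCleaner {
    static final Thread REAPER = ReaperDemo.startReaper("commons-io");

    private IoFileCleaner() {
    }
}

/** Stands in for the copy pasted into commons-fileupload. */
final class UploadFileCleaner {
    static final Thread REAPER = ReaperDemo.startReaper("fileupload copy");

    private UploadFileCleaner() {
    }
}
```

An application that loads both classes ends up with two live reaper threads, one per copy, which is the cost Stephen's reflection suggestion tries to avoid.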

Once you don't have just a number of trivial dependencies but actually
use large parts of the functionality of another component, it no
longer makes sense to copy methods over. Where to draw the line? 

Our problem is that the commons are not good at _documenting_
dependencies. Basically, we throw maven in and let it generate a
dependencies page. Look e.g. at
http://jakarta.apache.org/commons/email/dependencies.html

To everyone _not_ familiar with c-e, this implies that you need
dumbster and the maven-findbugs-plugin to work with commons-email. 

In contrast, look at
http://jakarta.apache.org/commons/configuration/dependencies.html

They did a really good job of identifying which part of
commons-configuration needs which dependencies.

I understand the concerns of commons users not to be dragged into
"dependency hell". 

- As few dependencies as possible. If just StringUtils.isEmpty()
  and StringUtils.isNotEmpty() from commons-lang are needed, these
  two methods should be copied into a Utils class. These are
  methods that are unlikely to change in c-l.

- When using dependencies, try to stay in the commons.

- The commons _should_ strive for maximum backwards
  compatibility. Especially commons core packages like -lang and
  -collections. If we consider the commons as "building blocks" then our
  building blocks should have maximum compatibility to each other (think Lego).

- Document the dependencies. The current maven-built "dependencies.html"
  page is sub-optimal. If a component does not depend on "a.jar" all the
  time but only when a special sub-package is needed (think commons-fileupload
  and portlet-api), document this.
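
A minimal sketch of the first point, assuming the copied methods keep the commons-lang behaviour of treating both null and the zero-length string as empty. The class name FooUtils is borrowed from the placeholder used earlier in this message. The class is package private, so it never becomes part of the component's public API.

```java
/**
 * Package-private helpers copied from commons-lang StringUtils so that
 * this component does not need the commons-lang jar at runtime.
 */
final class FooUtils {

    private FooUtils() {
        // Static helpers only; not meant to be instantiated.
    }

    /** Returns true if the string is null or has length zero. */
    static boolean isEmpty(String str) {
        return str == null || str.length() == 0;
    }

    /** Returns true if the string is neither null nor zero-length. */
    static boolean isNotEmpty(String str) {
        return !isEmpty(str);
    }
}
```

Whitespace-only strings count as non-empty here, so FooUtils.isEmpty(" ") is false while FooUtils.isEmpty(null) and FooUtils.isEmpty("") are both true.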


In the longer run, we might think a bit more about the current commons
structure. There is "commons proper" and "commons sandbox". But we
have e.g. "building blocks" like commons-lang, commons-collections or
commons-math where every external dependency really hurts and then we
have almost 2nd level projects like jelly, latka or vfs (random
selection) that are not exactly "building blocks" any longer but
larger components that build on functionality from many packages.

It might be a good thing to define something like "commons-core" where
external dependencies without good reason are frowned upon and then
"commons-proper" where dependencies are ok.

	Best regards
		Henning


-- 
Dipl.-Inf. (Univ.) Henning P. Schmiedehausen          INTERMETA GmbH
hps@intermeta.de        +49 9131 50 654 0   http://www.intermeta.de/

RedHat Certified Engineer -- Jakarta Turbine Development  -- hero for hire
   Linux, Java, perl, Solaris -- Consulting, Training, Development

		      4 - 8 - 15 - 16 - 23 - 42



Re: [fileupload] Remove commons-io dependency

Posted by Henri Yandell <fl...@gmail.com>.
On 8/22/05, Stephen Colebourne <sc...@btopenworld.com> wrote:
> Martin Cooper wrote:
> > On 8/22/05, Stephen Colebourne <sc...@btopenworld.com> wrote:
> >>I would like to propose the removal of the commons-io dependency from
> >>[fileupload].
> >
> > I am *not* in favour of this. (And it would have been nice if you'd
> > expressed this opinion a year and a half ago, when these classes were
> > introduced to IO from FileUpload, and the dependency was created. ;)
> 
> I know ;-) [fileupload] is not a component I follow closely :-(

How about:

* some kind of subversion sym link? Unsure if it supports that yet.
svn:externals could be done?
* custom maven.xml which copies the file from commons-io, search and
replace on 'public class' on those N files and quietly inlines the
classes?

Hen



Re: [fileupload] Remove commons-io dependency

Posted by Stephen Colebourne <sc...@btopenworld.com>.
Martin Cooper wrote:
> On 8/22/05, Stephen Colebourne <sc...@btopenworld.com> wrote:
>>I would like to propose the removal of the commons-io dependency from
>>[fileupload].
> 
> I am *not* in favour of this. (And it would have been nice if you'd
> expressed this opinion a year and a half ago, when these classes were
> introduced to IO from FileUpload, and the dependency was created. ;)

I know ;-) [fileupload] is not a component I follow closely :-(


>>The dependency consists of three classes:
>>- FileCleaner
>>- DeferredFileOutputStream
>>- ThresholdingOutputStream (required by DeferredFileOutputStream)
> 
> Yes. Those classes were added to IO specifically so that they would be
> available outside of the FileUpload component, which is where they
> originated.

Yes, and I agree with their presence in [io]. My argument is that this 
doesn't mean a dependency is needed.


> Everyone seemed to be in favour at the time.

My views on inter-commons dependencies have changed over time. When I 
joined commons I wanted many more dependencies. Now I am arguing for 
removing as many as possible.

Many of our key users have stopped using commons because of, or at 
the very least bitch about, the inter-commons dependencies and the 
resulting jar version hell.


>>This proposal is to copy-and-paste these three classes to package scoped
>>in the [fileupload] project, and mark them in documentation (in both io
>>and fileupload) as duplicates. (There is the potential for FileCleaner
>>to use reflection to try and contact the commons-io version of the class
>>to avoid a thread creation)
> 
> And who is going to keep them in sync? Are the IO developers going to
> notify the FileUpload folks when bugs get fixed, or do the FileUpload
> folks need to watch all of the changes to IO in order to pick up such
> fixes?

This assumes a large number of bugs. Chances are the number will be 
small. A comment in each file informs the committer that they should 
change another file.

The system is not perfect, but I argue that it is in fact preferable to 
adding the dependency. (Preferable to our users, not preferable to us.)


>>While I understand that many people have an instinctive reaction against
>>copy-and-paste, and that it might seem normal and rational to eat our
>>own dogfood and reuse code, the truth is that in complex servlet
>>environments it causes issues.
> 
> Issues? FileUpload 1.1 also introduces dependencies on BeanUtils,
> Digester and Logging.

Maybe I've missed it, but the latest TRUNK [fileupload] has no 
BeanUtils, Digester or Logging dependency that I can see.

From [fileupload]: "Replaced the ad hoc newInstance() 
means of creating FileItem instances with a factory-based 
scheme for much greater flexibility and simpler customization. This 
change also *eliminates a dependency on* Commons BeanUtils."


> Why is IO special, so that we should copy classes from it?

It's not. I'm arguing this wherever I see it.


>>Unless every method in every class in every release that your dependency
>>makes is 100% binary, source and semantically compatible forevermore,
>>then you may have a problem. These problems are generally rare, but are
>>in many cases unnecessary.
> 
> This seems like an argument for not having Commons components. Seems a
> little odd...

The argument is that our USERS should CHOOSE to pickup a dependency 
because they actually WANT it. They should not be FORCED to have it 
because they wanted something else. Forcing something on somebody is a 
sure way to generate bad feelings, and commons badly needs to recover 
some mind-share.

Inter-commons dependencies are, I believe, the primary complaint that 
I've read again and again in non commons-dev discussions. I'm arguing to 
reduce them, even where it causes us pain as commons-developers.

Stephen




RE: [fileupload] Remove commons-io dependency

Posted by "Noel J. Bergman" <no...@devtech.com>.
Martin's valid points aside, if you just want to copy the classes from IO,
you can use external in SVN.

	--- Noel




Re: [fileupload] Remove commons-io dependency

Posted by Martin Cooper <mf...@gmail.com>.
On 8/22/05, Stephen Colebourne <sc...@btopenworld.com> wrote:
> I would like to propose the removal of the commons-io dependency from
> [fileupload].

I am *not* in favour of this. (And it would have been nice if you'd
expressed this opinion a year and a half ago, when these classes were
introduced to IO from FileUpload, and the dependency was created. ;)

> The dependency consists of three classes:
> - FileCleaner
> - DeferredFileOutputStream
> - ThresholdingOutputStream (required by DeferredFileOutputStream)

Yes. Those classes were added to IO specifically so that they would be
available outside of the FileUpload component, which is where they
originated. Everyone seemed to be in favour at the time.

> This proposal is to copy-and-paste these three classes to package scoped
> in the [fileupload] project, and mark them in documentation (in both io
> and fileupload) as duplicates. (There is the potential for FileCleaner
> to use reflection to try and contact the commons-io version of the class
> to avoid a thread creation)

And who is going to keep them in sync? Are the IO developers going to
notify the FileUpload folks when bugs get fixed, or do the FileUpload
folks need to watch all of the changes to IO in order to pick up such
fixes?

> While I understand that many people have an instinctive reaction against
> copy-and-paste, and that it might seem normal and rational to eat our
> own dogfood and reuse code, the truth is that in complex servlet
> environments it causes issues.

Issues? FileUpload 1.1 also introduces dependencies on BeanUtils,
Digester and Logging. Why is IO special, so that we should copy
classes from it?

> Unless every method in every class in every release that your dependency
> makes is 100% binary, source and semantically compatible forevermore,
> then you may have a problem. These problems are generally rare, but are
> in many cases unnecessary.

This seems like an argument for not having Commons components. Seems a
little odd...

> [fileupload] 1.0 had no dependency on [io]. Let's remove that dependency
> from v1.1 and thus speed a release.

Why are the dependencies of FileUpload 1.0 relevant here? Are we not
allowed to expand a component with new functionality, and hence
require additional dependencies?

--
Martin Cooper


> Stephen
