You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@commons.apache.org by "C. Grobmeier" <gr...@possessed.de> on 2006/04/30 13:44:34 UTC
[compress] New draft 5
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Hello all,
here is a new draft for the compress interface:
* http://grobmeier.de/commons-compress-draft-5.zip
I have improved a lot of things, based on the comments of Sandy (thanks
for that).
This draft is not perfect, as you can see in the todo list. But imho we
have a quite good base for future work, everything compiles and works
and is documented. So i would like to propose that someone is comitting
this, except you have reasons not to do this.
If you agree with me, i will open a bugzilla issue and add the link to
the code.
- - Chris.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.1 (MingW32)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
iD8DBQFEVKMikv8rKBUE/T4RAlxxAKCMYzova2roWDA/skRyoDvFcErE2gCfTjFw
bT2VrGdR8Byt+VjsRo7Cyhw=
=1GRF
-----END PGP SIGNATURE-----
---------------------------------------------------------------------
To unsubscribe, e-mail: commons-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: commons-dev-help@jakarta.apache.org
Re: [compress] New draft 5
Posted by "C. Grobmeier" <gr...@possessed.de>.
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
> My fault for only just joining in etc, but why have the underlying
> Tar/Zip classes become package scoped? Does the bridging API replace
> all need to dig into the lower-level APIs?
>
I was not thinking enough about this. In my fault i was thinking its a
good idea to lower visibility. But i didn't think about if the bridge
has enough functionallity or not.
> If so, why are the bzip classes still public?
Cause bzip still needs refactoring. There were compile errors on
severeal points, and i didn't want to dig all up.
> Should there be ZipCompressor and GZipCompressor implementations?
In my opinion, yes.
> Any reason for there not being an archiveToStream method? Especially
> in the case of the Tar implementation where you'd want to attach a
> TarArchiver to a GZipCompressor.
I don't know any reason not to do so. I will remind this in Draft 6.
Thanks,
Chris
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.1 (MingW32)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
iD8DBQFEVymhkv8rKBUE/T4RAk0yAJ9Jx6+pfSRVvsuApFCbZpkKiO7AwQCeJPLx
g3rJSBX+ovIEPZrW7JKJnLY=
=YsaR
-----END PGP SIGNATURE-----
---------------------------------------------------------------------
To unsubscribe, e-mail: commons-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: commons-dev-help@jakarta.apache.org
Re: [compress] New draft 5
Posted by Henri Yandell <fl...@gmail.com>.
On 4/30/06, C. Grobmeier <gr...@possessed.de> wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Hello all,
>
> here is a new draft for the compress interface:
>
> * http://grobmeier.de/commons-compress-draft-5.zip
My fault for only just joining in etc, but why have the underlying
Tar/Zip classes become package scoped? Does the bridging API replace
all need to dig into the lower-level APIs?
If so, why are the bzip classes still public?
Should there be ZipCompressor and GZipCompressor implementations?
Any reason for there not being an archiveToStream method? Especially
in the case of the Tar implementation where you'd want to attach a
TarArchiver to a GZipCompressor.
Hen
---------------------------------------------------------------------
To unsubscribe, e-mail: commons-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: commons-dev-help@jakarta.apache.org
Re: [compress] New draft 5
Posted by "C. Grobmeier" <gr...@possessed.de>.
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Henri Yandell wrote:
> If you could fill out the software grant:
>
> http://www.apache.org/licenses/#grants
>
> and either fax or postal-mail it (see instructions in first paragraph
> of "http://www.apache.org/licenses/icla.txt").
[x] done via fax
Please come back to me if the fax doesn't arrive
chris
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.1 (MingW32)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
iD8DBQFEWNvekv8rKBUE/T4RAnqGAJ9YQKz7KEebe2uQGvEkCgSVlxiSMwCcCSKD
y8jvogkVakzI1PtEqOXENaw=
=w/Xo
-----END PGP SIGNATURE-----
---------------------------------------------------------------------
To unsubscribe, e-mail: commons-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: commons-dev-help@jakarta.apache.org
Re: [compress] New draft 5
Posted by "C. Grobmeier" <gr...@possessed.de>.
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Henri Yandell wrote:
> If you could fill out the software grant:
>
> http://www.apache.org/licenses/#grants
>
> and either fax or postal-mail it (see instructions in first paragraph
> of "http://www.apache.org/licenses/icla.txt").
>
I will fill out the form and send it today.
> It's better to get the code in early and then have comments be patches
> than continually keep sending you back to the drawing board I think.
I agree. Having a high level look at the discussions followed by my
proposal, i will prepare a new draft. So please hold on before comitting :-)
Regards,
Chris
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.1 (MingW32)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
iD8DBQFEVydgkv8rKBUE/T4RAqWzAKCReSn/6vPAJk4W+wchxSAVTjLH7ACfQCoz
BJKgL+JWWWI0eitr/VtB8yQ=
=SbNP
-----END PGP SIGNATURE-----
---------------------------------------------------------------------
To unsubscribe, e-mail: commons-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: commons-dev-help@jakarta.apache.org
Re: [compress] New draft 5
Posted by Henri Yandell <fl...@gmail.com>.
Sounds good, I'll look at the code tonight (hopefully) and comment; in
the meantime, let's go ahead and get a software grant moving for the
code.
If you could fill out the software grant:
http://www.apache.org/licenses/#grants
and either fax or postal-mail it (see instructions in first paragraph
of "http://www.apache.org/licenses/icla.txt").
It's better to get the code in early and then have comments be patches
than continually keep sending you back to the drawing board I think.
Hen
On 4/30/06, C. Grobmeier <gr...@possessed.de> wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Hello all,
>
> here is a new draft for the compress interface:
>
> * http://grobmeier.de/commons-compress-draft-5.zip
>
> I have improved a lot of things, based on the comments of Sandy (thanks
> for that).
>
> This draft is not perfect, as you can see in the todo list. But imho we
> have a quite good base for future work, everything compiles and works
> and is documented. So i would like to propose that someone is comitting
> this, except you have reasons not to do this.
> If you agree with me, i will open a bugzilla issue and add the link to
> the code.
>
> - - Chris.
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.2.1 (MingW32)
> Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
>
> iD8DBQFEVKMikv8rKBUE/T4RAlxxAKCMYzova2roWDA/skRyoDvFcErE2gCfTjFw
> bT2VrGdR8Byt+VjsRo7Cyhw=
> =1GRF
> -----END PGP SIGNATURE-----
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: commons-dev-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: commons-dev-help@jakarta.apache.org
>
>
---------------------------------------------------------------------
To unsubscribe, e-mail: commons-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: commons-dev-help@jakarta.apache.org
Re: [compress] New draft 5
Posted by will pugh <wi...@sourcelabs.com>.
It occured to me later on, that I think this interface would be better
if it were focused on describing an Archive, rather than trying to
describe two processes (ie describing the noun rather than all the
verbs). The fall out from this, is that you don't have a member for an
unpack directory, since that is only relevent to the unpacking process.
and I think it makes the interface I went to mock this up and came up
with a couple other issues with the interface.
1) Adding an entry or a file to an archive, you need to be able to
set the entry name, otherwise you can run into problems with getting
just the right relative path in the archive for the files.
2) It is pretty common to want to recursively add directories, we
should provide this in the archive API
I've attached my mock-up so folks can see if they think these thoughts
make sense. I tried to borrow design and parameter conventions from
some of the commons-io methods I use alot.
--Will
will pugh wrote:
> Overall, I like the interfaces, but I've got a few issues:
>
> 0) Update is mentioned in the Javadoc at the beginning of Archiver,
> but is not implemented.
>
> 1) You often change method names based on the parameter types, e.g.
> Archiver.addFile + Archiver.addFileName, setUnpackDestinationName +
> setUnpackDestinationFile, etc. It seems more conventional and less
> chaotic to give all the methods the same name, and have them only
> differ based on parameter. Examples of this style are constructors
> for java.utils.zip.ZipFile, java.io.FileInputStream,
> java.io.FileOutputStream, org.apache.commons.io.IOUtils.IOUtils.copy,
> org.apache.commons.io.FileUtils.isFileNewer, etc
>
> 2) You can only add files to an archive, this makes it more difficult
> to add generated entries into an archive. You need to actually write
> them to disk before including them in an archive
>
> 3) UnpackDestination + Destination should be the same property (on
> the both the interface as well as underlying implementation), or you
> should split packing and unpacking into two different interfaces (or
> force folks to pass a path to pack + unpack, instead of setting
> properties on the object).
>
> 4) None of your interfaces deal with the case of what to do if the
> destination file is already there. Choices are either defining it in
> the interface, or adding a property defining what to do. I would
> suggest the latter, and would suggest that for Archiver this method
> should take a FileFilter (since the unpack behaviour can be
> non-trivial) and should default to FalseFileFilter.INSTANCE. For the
> Compresser interface a simple boolean is probably sufficient. e.g.
>
> Archiver.setOverwriteFilter(TrueFileFilter.INSTANCE)
> Compresser.setOverwrite(true);
>
> 5) In AbstractCompressor, I don't understand why you have a copy
> method and don't just use IOUtils.copy()
> 6) In AbstractCompressor, I don't understand why you try to create
> your own temp name, rather than just letting File.createTempFile do
> it's thing.
>
> --Will
>
> C. Grobmeier wrote:
>
>> -----BEGIN PGP SIGNED MESSAGE-----
>> Hash: SHA1
>>
>> Hello all,
>>
>> here is a new draft for the compress interface:
>>
>> * http://grobmeier.de/commons-compress-draft-5.zip
>>
>> I have improved a lot of things, based on the comments of Sandy (thanks
>> for that).
>>
>> This draft is not perfect, as you can see in the todo list. But imho we
>> have a quite good base for future work, everything compiles and works
>> and is documented. So i would like to propose that someone is comitting
>> this, except you have reasons not to do this.
>> If you agree with me, i will open a bugzilla issue and add the link to
>> the code.
>>
>> - - Chris.
>> -----BEGIN PGP SIGNATURE-----
>> Version: GnuPG v1.4.2.1 (MingW32)
>> Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
>>
>> iD8DBQFEVKMikv8rKBUE/T4RAlxxAKCMYzova2roWDA/skRyoDvFcErE2gCfTjFw
>> bT2VrGdR8Byt+VjsRo7Cyhw=
>> =1GRF
>> -----END PGP SIGNATURE-----
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: commons-dev-unsubscribe@jakarta.apache.org
>> For additional commands, e-mail: commons-dev-help@jakarta.apache.org
>>
>>
>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: commons-dev-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: commons-dev-help@jakarta.apache.org
>
Re: [compress] New draft 5
Posted by will pugh <wi...@sourcelabs.com>.
I'm not sure if there is some requirement I'm missing, but
1) It still seems strange to me that we have an interface that does two
things: pack an archive + unpack an archive, yet it has 16 methods on it.
Of these, 9 seem to be some variation of property getters/setters.
As far as I can tell, none of these properties are used by both the pack
and unpack process.
Methods that set/get properties only used
1) Why do we need getters and setters that take both a File and a
String as parameters? Why can't we standardize on just having a File
type for sourceFile and unpackDestination.
2) I'm still not sure why unpackDestination is a property and not just
a parameter on unpack? It seems that it is a property that is used in
one and only one place. It seems like logically its not a property of
the Archiver as much as a property of the actual packing process.
3) Why don't we have a constructor for each of the archivers that can
take a String or a File as the SourceFile (similar to how java.io.File,
java.util.zip.ZipFile, etc deal with this). If you didn't want to
enforce this as a convention for Archivers, it could easily be added
into the ArchiverType.getInstance() method. We could have ArchiverType
provide two getInstance methods, one could take a string and one could
take a file. This way the interface can be minimal, and we can hoist
type conversions off to helpers rather than the interface
4) What is the purpose of the ArchiverType? I'm assuming this is meant
to help people build their own factories? but the common case for
creating an Archiver is still to create using 'new'. Is this right?
--Will
Sandy McArthur wrote:
> On 5/1/06, will pugh <wi...@sourcelabs.com> wrote:
>
>>
>> Sandy McArthur wrote:
>>
>> > On 4/30/06, will pugh <wi...@sourcelabs.com> wrote:
>> >
>> >> 1) You often change method names based on the parameter types, e.g.
>> >> Archiver.addFile + Archiver.addFileName, setUnpackDestinationName +
>> >> setUnpackDestinationFile, etc. It seems more conventional and less
>> >> chaotic to give all the methods the same name, and have them only
>> differ
>> >> based on parameter. Examples of this style are constructors for
>> >> java.utils.zip.ZipFile, java.io.FileInputStream,
>> >> java.io.FileOutputStream, org.apache.commons.io.IOUtils.IOUtils.copy,
>> >> org.apache.commons.io.FileUtils.isFileNewer, etc
>> >
>> >
>> > This is intentional, in a previous version the method names were the
>> > same. The problem with using the same name but different param types
>> > breaks the JavaBean property getter/setter rules and the classes will
>> > not be as usable in at least some scripting environments.
>>
>> Hmm. What scripting envirionments don't allow you to call this kind of
>> Java method? I looked at the documentation for Jythong, JRuby and
>> Groovy and they all seemed to be O.K. with dealing with methods that
>> differ by parameter types.
>>
>> Also, it doesn't seem like the current interface adheres to the Java
>> Bean specification anyways (unless I'm missing somethign). If you
>> wanted to adhere to the Java Beans Spec, I would suggest having only one
>> getter and one setter for each property, i.e. not having a
>> setUnpackDestinationFile and setUnpackDestinationFileName, but only
>> having setUnpackDestination(File f).
>
>
> I'll agree it's not strict adherence to the JavaBean spec but
> following JavaBean conventions even partially can be useful. A few
> years ago I was treating some JavaMail objects like beans and between
> Java 1.3 and 1.4 the rules on how Java determined what was a legal
> javabean property got more strict. This caused problems when I did
> something similar to "message.bodyPart.subject" in a JSP el
> expression. and there was one getBodyPart() and two setBodyPart(Type1)
> and a setBodyPart(Type2).
>
>> As I see it, here is the list of areas I think the archiver interface
>> offers odd behaviour relative to the JavaBean spec:
>>
>> 1) The sourceFile property is recieved by getSourceFile(), but is set
>> by loadFile and loadFilebyPath. Following the JavaBeans rules, this
>> means there is a readonly property called sourceFile, rather than this
>> being a property that is gettable and settable.
>>
>> In addition to load* are bad names because they don't actually load
>> anything.
>>
>> 2) The unpackDestination property has a getter, but no setter.
>> The unpackDestinationFile property has a setter but no getter.
>> The unpackDestinationName property has a setter but no getter
>
>
> Okay, I'll agree that unpackDestinationFile getter should be renamed
> to unpackDestination but the unpackDestinationName property should
> still keep a separate name and it's okay if it's a dynamic property
> that really sets the unpackDestination property.
>
>> >> 4) None of your interfaces deal with the case of what to do if the
>> >> destination file is already there. Choices are either defining it in
>> >> the interface, or adding a property defining what to do. I would
>> >> suggest the latter, and would suggest that for Archiver this method
>> >> should take a FileFilter (since the unpack behaviour can be
>> non-trivial)
>> >> and should default to FalseFileFilter.INSTANCE. For the Compresser
>> >> interface a simple boolean is probably sufficient. e.g.
>> >>
>> >> Archiver.setOverwriteFilter(TrueFileFilter.INSTANCE)
>> >> Compresser.setOverwrite(true);
>> >
>> >
>> > I'll disagree that this should be configurable. IMO it should just do
>> > whatever FileOutputStream or whatever is used does. It's the calling
>> > code's responsibility to handle any name collisions.
>> >
>> How can the calling code handle name collisions? It takes significantly
>> more work for it to walk through all the entries in the archive and then
>> walking through all the files in the file ssytem. Are you suggesting a
>> caller should only unpack into an empty directory?
>>
>> I'm not really sure I understand what your saying about
>> FileOutputStream. The apis for the Archiver don't allow you to set an
>> OutputStream, and it's not really applicable to upack (which is writing
>> many files).
>
>
> I was confused about what you were referring to. I was thinking you
> wanted to specify the behavior for what would happen for creating a
> new archive on to an existing name.
>
> I'll agree the behavior for unpacking into an existing directory tree
> should be improved.
>
> --
> Sandy McArthur
>
> "He who dares not offend cannot be honest."
> - Thomas Paine
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: commons-dev-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: commons-dev-help@jakarta.apache.org
>
---------------------------------------------------------------------
To unsubscribe, e-mail: commons-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: commons-dev-help@jakarta.apache.org
Re: [compress] New draft 5
Posted by will pugh <wi...@sourcelabs.com>.
I may be missing some requirements here, but it seems like in light of
the JavaBean bits we were talking about, that it makes sense to restrict
the properties on the actual interface, and have them correspond
directly to JavaBean properties. In addition, I would propose getting
rid of some properties and moving them onto the method calls. Is there
a reason not to?
For an interface that basically does two things, it is suprising that it
has 16 methods on it. Of those, 9 appear to be getter/setters on
properties. Those 9 getters/setters refer to 3 properties in
AbstractArchiver. Of those 3 properties, only 1 is used by more than
one method (that is not a getter/setter). Why do we have
destinationFile and unpackDestination properties on the object, if they
are only used in one place? Why not just make these parameters? If you
did this, you would have an interface that offers the exact same
functionality, but has only 6 methods. Seems like there is a bit of
extra entropy.
In addition, I sorta like the way java.io.File, java.util.zip.ZipFile,
etc let you set the source file in the constructor. It makes the code
using the objects less verbose, with no loss of readability. I would
suggest that we add this to the archivers, by adding constructors that
can take a string or a file for the source file. In addition, we can
bubble those changes up to the ArchiverType class.
What is the purpose of the ArchiverType class? I'm assuming the idea is
that this helps folks build factories, but that it is perfectly
acceptable and encouraged to create any of the archivers via 'new'. Is
that correct?
Thanks,
--Will
Sandy McArthur wrote:
> On 5/1/06, will pugh <wi...@sourcelabs.com> wrote:
>
>>
>> Sandy McArthur wrote:
>>
>> > On 4/30/06, will pugh <wi...@sourcelabs.com> wrote:
>> >
>> >> 1) You often change method names based on the parameter types, e.g.
>> >> Archiver.addFile + Archiver.addFileName, setUnpackDestinationName +
>> >> setUnpackDestinationFile, etc. It seems more conventional and less
>> >> chaotic to give all the methods the same name, and have them only
>> differ
>> >> based on parameter. Examples of this style are constructors for
>> >> java.utils.zip.ZipFile, java.io.FileInputStream,
>> >> java.io.FileOutputStream, org.apache.commons.io.IOUtils.IOUtils.copy,
>> >> org.apache.commons.io.FileUtils.isFileNewer, etc
>> >
>> >
>> > This is intentional, in a previous version the method names were the
>> > same. The problem with using the same name but different param types
>> > breaks the JavaBean property getter/setter rules and the classes will
>> > not be as usable in at least some scripting environments.
>>
>> Hmm. What scripting envirionments don't allow you to call this kind of
>> Java method? I looked at the documentation for Jythong, JRuby and
>> Groovy and they all seemed to be O.K. with dealing with methods that
>> differ by parameter types.
>>
>> Also, it doesn't seem like the current interface adheres to the Java
>> Bean specification anyways (unless I'm missing somethign). If you
>> wanted to adhere to the Java Beans Spec, I would suggest having only one
>> getter and one setter for each property, i.e. not having a
>> setUnpackDestinationFile and setUnpackDestinationFileName, but only
>> having setUnpackDestination(File f).
>
>
> I'll agree it's not strict adherence to the JavaBean spec but
> following JavaBean conventions even partially can be useful. A few
> years ago I was treating some JavaMail objects like beans and between
> Java 1.3 and 1.4 the rules on how Java determined what was a legal
> javabean property got more strict. This caused problems when I did
> something similar to "message.bodyPart.subject" in a JSP el
> expression. and there was one getBodyPart() and two setBodyPart(Type1)
> and a setBodyPart(Type2).
>
>> As I see it, here is the list of areas I think the archiver interface
>> offers odd behaviour relative to the JavaBean spec:
>>
>> 1) The sourceFile property is recieved by getSourceFile(), but is set
>> by loadFile and loadFilebyPath. Following the JavaBeans rules, this
>> means there is a readonly property called sourceFile, rather than this
>> being a property that is gettable and settable.
>>
>> In addition to load* are bad names because they don't actually load
>> anything.
>>
>> 2) The unpackDestination property has a getter, but no setter.
>> The unpackDestinationFile property has a setter but no getter.
>> The unpackDestinationName property has a setter but no getter
>
>
> Okay, I'll agree that unpackDestinationFile getter should be renamed
> to unpackDestination but the unpackDestinationName property should
> still keep a separate name and it's okay if it's a dynamic property
> that really sets the unpackDestination property.
>
>> >> 4) None of your interfaces deal with the case of what to do if the
>> >> destination file is already there. Choices are either defining it in
>> >> the interface, or adding a property defining what to do. I would
>> >> suggest the latter, and would suggest that for Archiver this method
>> >> should take a FileFilter (since the unpack behaviour can be
>> non-trivial)
>> >> and should default to FalseFileFilter.INSTANCE. For the Compresser
>> >> interface a simple boolean is probably sufficient. e.g.
>> >>
>> >> Archiver.setOverwriteFilter(TrueFileFilter.INSTANCE)
>> >> Compresser.setOverwrite(true);
>> >
>> >
>> > I'll disagree that this should be configurable. IMO it should just do
>> > whatever FileOutputStream or whatever is used does. It's the calling
>> > code's responsibility to handle any name collisions.
>> >
>> How can the calling code handle name collisions? It takes significantly
>> more work for it to walk through all the entries in the archive and then
>> walking through all the files in the file ssytem. Are you suggesting a
>> caller should only unpack into an empty directory?
>>
>> I'm not really sure I understand what your saying about
>> FileOutputStream. The apis for the Archiver don't allow you to set an
>> OutputStream, and it's not really applicable to upack (which is writing
>> many files).
>
>
> I was confused about what you were referring to. I was thinking you
> wanted to specify the behavior for what would happen for creating a
> new archive on to an existing name.
>
> I'll agree the behavior for unpacking into an existing directory tree
> should be improved.
>
> --
> Sandy McArthur
>
> "He who dares not offend cannot be honest."
> - Thomas Paine
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: commons-dev-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: commons-dev-help@jakarta.apache.org
>
---------------------------------------------------------------------
To unsubscribe, e-mail: commons-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: commons-dev-help@jakarta.apache.org
Re: [compress] New draft 5
Posted by Sandy McArthur <sa...@apache.org>.
On 5/1/06, will pugh <wi...@sourcelabs.com> wrote:
>
> Sandy McArthur wrote:
>
> > On 4/30/06, will pugh <wi...@sourcelabs.com> wrote:
> >
> >> 1) You often change method names based on the parameter types, e.g.
> >> Archiver.addFile + Archiver.addFileName, setUnpackDestinationName +
> >> setUnpackDestinationFile, etc. It seems more conventional and less
> >> chaotic to give all the methods the same name, and have them only differ
> >> based on parameter. Examples of this style are constructors for
> >> java.utils.zip.ZipFile, java.io.FileInputStream,
> >> java.io.FileOutputStream, org.apache.commons.io.IOUtils.IOUtils.copy,
> >> org.apache.commons.io.FileUtils.isFileNewer, etc
> >
> >
> > This is intentional, in a previous version the method names were the
> > same. The problem with using the same name but different param types
> > breaks the JavaBean property getter/setter rules and the classes will
> > not be as usable in at least some scripting environments.
>
> Hmm. What scripting envirionments don't allow you to call this kind of
> Java method? I looked at the documentation for Jythong, JRuby and
> Groovy and they all seemed to be O.K. with dealing with methods that
> differ by parameter types.
>
> Also, it doesn't seem like the current interface adheres to the Java
> Bean specification anyways (unless I'm missing somethign). If you
> wanted to adhere to the Java Beans Spec, I would suggest having only one
> getter and one setter for each property, i.e. not having a
> setUnpackDestinationFile and setUnpackDestinationFileName, but only
> having setUnpackDestination(File f).
I'll agree it's not strict adherence to the JavaBean spec but
following JavaBean conventions even partially can be useful. A few
years ago I was treating some JavaMail objects like beans and between
Java 1.3 and 1.4 the rules on how Java determined what was a legal
javabean property got more strict. This caused problems when I did
something similar to "message.bodyPart.subject" in a JSP el
expression. and there was one getBodyPart() and two setBodyPart(Type1)
and a setBodyPart(Type2).
> As I see it, here is the list of areas I think the archiver interface
> offers odd behaviour relative to the JavaBean spec:
>
> 1) The sourceFile property is recieved by getSourceFile(), but is set
> by loadFile and loadFilebyPath. Following the JavaBeans rules, this
> means there is a readonly property called sourceFile, rather than this
> being a property that is gettable and settable.
>
> In addition to load* are bad names because they don't actually load
> anything.
>
> 2) The unpackDestination property has a getter, but no setter.
> The unpackDestinationFile property has a setter but no getter.
> The unpackDestinationName property has a setter but no getter
Okay, I'll agree that unpackDestinationFile getter should be renamed
to unpackDestination but the unpackDestinationName property should
still keep a separate name and it's okay if it's a dynamic property
that really sets the unpackDestination property.
> >> 4) None of your interfaces deal with the case of what to do if the
> >> destination file is already there. Choices are either defining it in
> >> the interface, or adding a property defining what to do. I would
> >> suggest the latter, and would suggest that for Archiver this method
> >> should take a FileFilter (since the unpack behaviour can be non-trivial)
> >> and should default to FalseFileFilter.INSTANCE. For the Compresser
> >> interface a simple boolean is probably sufficient. e.g.
> >>
> >> Archiver.setOverwriteFilter(TrueFileFilter.INSTANCE)
> >> Compresser.setOverwrite(true);
> >
> >
> > I'll disagree that this should be configurable. IMO it should just do
> > whatever FileOutputStream or whatever is used does. It's the calling
> > code's responsibility to handle any name collisions.
> >
> How can the calling code handle name collisions? It takes significantly
> more work for it to walk through all the entries in the archive and then
> walking through all the files in the file ssytem. Are you suggesting a
> caller should only unpack into an empty directory?
>
> I'm not really sure I understand what your saying about
> FileOutputStream. The apis for the Archiver don't allow you to set an
> OutputStream, and it's not really applicable to upack (which is writing
> many files).
I was confused about what you were referring to. I was thinking you
wanted to specify the behavior for what would happen for creating a
new archive on to an existing name.
I'll agree the behavior for unpacking into an existing directory tree
should be improved.
--
Sandy McArthur
"He who dares not offend cannot be honest."
- Thomas Paine
---------------------------------------------------------------------
To unsubscribe, e-mail: commons-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: commons-dev-help@jakarta.apache.org
Re: [compress] New draft 5
Posted by will pugh <wi...@sourcelabs.com>.
Sandy McArthur wrote:
> On 4/30/06, will pugh <wi...@sourcelabs.com> wrote:
>
>> 1) You often change method names based on the parameter types, e.g.
>> Archiver.addFile + Archiver.addFileName, setUnpackDestinationName +
>> setUnpackDestinationFile, etc. It seems more conventional and less
>> chaotic to give all the methods the same name, and have them only differ
>> based on parameter. Examples of this style are constructors for
>> java.utils.zip.ZipFile, java.io.FileInputStream,
>> java.io.FileOutputStream, org.apache.commons.io.IOUtils.IOUtils.copy,
>> org.apache.commons.io.FileUtils.isFileNewer, etc
>
>
> This is intentional, in a previous version the method names were the
> same. The problem with using the same name but different param types
> breaks the JavaBean property getter/setter rules and the classes will
> not be as usable in at least some scripting environments.
Hmm. What scripting envirionments don't allow you to call this kind of
Java method? I looked at the documentation for Jythong, JRuby and
Groovy and they all seemed to be O.K. with dealing with methods that
differ by parameter types.
Also, it doesn't seem like the current interface adheres to the Java
Bean specification anyways (unless I'm missing somethign). If you
wanted to adhere to the Java Beans Spec, I would suggest having only one
getter and one setter for each property, i.e. not having a
setUnpackDestinationFile and setUnpackDestinationFileName, but only
having setUnpackDestination(File f).
As I see it, here is the list of areas I think the archiver interface
offers odd behaviour relative to the JavaBean spec:
1) The sourceFile property is recieved by getSourceFile(), but is set
by loadFile and loadFilebyPath. Following the JavaBeans rules, this
means there is a readonly property called sourceFile, rather than this
being a property that is gettable and settable.
In addition to load* are bad names because they don't actually load
anything.
2) The unpackDestination property has a getter, but no setter.
The unpackDestinationFile property has a setter but no getter.
The unpackDestinationName property has a setter but no getter
>> 4) None of your interfaces deal with the case of what to do if the
>> destination file is already there. Choices are either defining it in
>> the interface, or adding a property defining what to do. I would
>> suggest the latter, and would suggest that for Archiver this method
>> should take a FileFilter (since the unpack behaviour can be non-trivial)
>> and should default to FalseFileFilter.INSTANCE. For the Compresser
>> interface a simple boolean is probably sufficient. e.g.
>>
>> Archiver.setOverwriteFilter(TrueFileFilter.INSTANCE)
>> Compresser.setOverwrite(true);
>
>
> I'll disagree that this should be configurable. IMO it should just do
> whatever FileOutputStream or whatever is used does. It's the calling
> code's responsibility to handle any name collisions.
>
How can the calling code handle name collisions? It takes significantly
more work for it to walk through all the entries in the archive and then
walking through all the files in the file ssytem. Are you suggesting a
caller should only unpack into an empty directory?
I'm not really sure I understand what your saying about
FileOutputStream. The apis for the Archiver don't allow you to set an
OutputStream, and it's not really applicable to upack (which is writing
many files).
FileOutputStream allows you to set a boolean for whether the stream
should overwrite or append. This is made more difficult in the unpack
scenario where there are more than one file. From personal experience,
I liked the solution used in FileUtils.iterateFiles and
FileUtils.listFiles for doing yeah/nay on a recursive operation like this.
Cheers,
--Will
> --
> Sandy McArthur
>
> "He who dares not offend cannot be honest."
> - Thomas Paine
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: commons-dev-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: commons-dev-help@jakarta.apache.org
>
---------------------------------------------------------------------
To unsubscribe, e-mail: commons-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: commons-dev-help@jakarta.apache.org
Re: [compress] New draft 5
Posted by "C. Grobmeier" <gr...@possessed.de>.
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
>> So my comments:
>>
>> o Don't like that compressor does both compression and decompression.
>
> I'm ambivalent on this one.
>
If we would do so, we have 4 different interfaces. But i like the look.
>> o Always use File not String
>
> +1. Only use 'String filename' when you're doing something to a
> filename - otherwise we bloat the APIs for the sake of a new
> File(xxx). It's not worth it.
Agreed, i will delete that.
> Also, the setXxx stuff seems a bit unnecessary.
>
> For the interface, minimal and stateless seem like important goals.
> Having setXxx just means that the implementors have to worry about
> thread safety.
OK, my new draft will remind that. I will keep it pure ;-)
Chris
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.1 (MingW32)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
iD8DBQFEV50/kv8rKBUE/T4RAiYiAKCJEGc8eQ07MgenqF/BNBjr4rQ8UQCfRIhL
8K3+oBrOYvXuAGIyUuR/yQM=
=UBTS
-----END PGP SIGNATURE-----
---------------------------------------------------------------------
To unsubscribe, e-mail: commons-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: commons-dev-help@jakarta.apache.org
Re: [compress] New draft 5
Posted by Henri Yandell <fl...@gmail.com>.
On 5/2/06, Torsten Curdt <tc...@apache.org> wrote:
> Sorry, joining the party so late ...but this thread let me to actually
> have a look ;)
>
> So my comments:
>
> o Don't like that compressor does both compression and decompression.
I'm ambivalent on this one.
> o Always use File not String
+1. Only use 'String filename' when you're doing something to a
filename - otherwise we bloat the APIs for the sake of a new
File(xxx). It's not worth it.
Also, the setXxx stuff seems a bit unnecessary.
For the interface, minimal and stateless seem like important goals.
Having setXxx just means that the implementors have to worry about
thread safety.
Hen
---------------------------------------------------------------------
To unsubscribe, e-mail: commons-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: commons-dev-help@jakarta.apache.org
Re: [compress] New draft 5
Posted by Torsten Curdt <tc...@apache.org>.
Sorry, joining the party so late ...but this thread let me to actually
have a look ;)
So my comments:
o Don't like that compressor does both compression and decompression.
o Always use File not String
o Please parameter overloading ...this method naming scheme makes
the API really ugly
But to be a bit more constructive... I would think more of something like:
Compressor compressor = CompressorFactory.newInstance(new File("file.bz2"));
Compressor compressor = CompressorFactory.newInstance("bzip2");
Compressor compressor = new Bzip2Compressor();
Compressor compressor = Bzip2Compressor.getInstance();
File output = compressor.compress(new File("input"));
File output = compressor.compress(new FileInputStream("input"));
InputStream in = compressor.compressStream(new File("input"))
InputStream in = compressor.compressStream(new FileInputStream("input"))
compressor.compressTo(new File("input"), new File("output"));
compressor.compressTo(new FileInputStream("input"), new
FileOutputStream("output"));
Decompressor decompressor = DecompressorFactory.newInstance("bzip2");
File output = decompressor.decompress(new File("input"))
File output = decompressor.decompress(new FileInputStream("input"));
InputStream in = decompressor.decompressStream(new File("input"))
InputStream in = decompressor.decompressStream(new FileInputStream("input"))
decompressor.decompressTo(new File("input"), new File("output"));
decompressor.decompressTo(new FileInputStream("input"), new
FileOutputStream("output"));
Archive arch = new TarArchive(new File("my.tar"));
Archive arch = ArchiveFactory.newInstance(new File("my.tar");
arch.add(new File("input"));
arch.add(new FileInputStream("input"));
arch.delete("path");
arch.save();
arch.save(new File("output"));
arch.save(new FileOutputStream("out"));
for(Iterator it = arch.iterator(); it.hasNext(); it) {
ArchiveEntry entry = (ArchiveEntry) it.next();
...
}
ArchiveEntry entry = arch.getEntry("path");
entry.extract(new File("output"));
entry.extract(new FileOutputStream("output"));
entry.delete();
cheers
--
Torsten
---------------------------------------------------------------------
To unsubscribe, e-mail: commons-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: commons-dev-help@jakarta.apache.org
Re: [compress] New draft 5
Posted by Torsten Curdt <tc...@apache.org>.
On 5/1/06, Sandy McArthur <sa...@apache.org> wrote:
> On 4/30/06, will pugh <wi...@sourcelabs.com> wrote:
> > 1) You often change method names based on the parameter types, e.g.
> > Archiver.addFile + Archiver.addFileName, setUnpackDestinationName +
> > setUnpackDestinationFile, etc. It seems more conventional and less
> > chaotic to give all the methods the same name, and have them only differ
> > based on parameter. Examples of this style are constructors for
> > java.utils.zip.ZipFile, java.io.FileInputStream,
> > java.io.FileOutputStream, org.apache.commons.io.IOUtils.IOUtils.copy,
> > org.apache.commons.io.FileUtils.isFileNewer, etc
>
> This is intentional, in a previous version the method names were the
> same. The problem with using the same name but different param types
> breaks the JavaBean property getter/setter rules and the classes will
> not be as usable in at least some scripting environments.
I am sorry ...whoever wants to use this API from a JavaBean
can easily write a little wrapper class. I rather have a slick API
design than adhering to the JavaBean interface.
My 2 cents
cheers
--
Torsten
---------------------------------------------------------------------
To unsubscribe, e-mail: commons-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: commons-dev-help@jakarta.apache.org
Re: [compress] New draft 5
Posted by Sandy McArthur <sa...@apache.org>.
On 4/30/06, will pugh <wi...@sourcelabs.com> wrote:
> 1) You often change method names based on the parameter types, e.g.
> Archiver.addFile + Archiver.addFileName, setUnpackDestinationName +
> setUnpackDestinationFile, etc. It seems more conventional and less
> chaotic to give all the methods the same name, and have them only differ
> based on parameter. Examples of this style are constructors for
> java.utils.zip.ZipFile, java.io.FileInputStream,
> java.io.FileOutputStream, org.apache.commons.io.IOUtils.IOUtils.copy,
> org.apache.commons.io.FileUtils.isFileNewer, etc
This is intentional, in a previous version the method names were the
same. The problem with using the same name but different param types
breaks the JavaBean property getter/setter rules and the classes will
not be as usable in at least some scripting environments.
> 4) None of your interfaces deal with the case of what to do if the
> destination file is already there. Choices are either defining it in
> the interface, or adding a property defining what to do. I would
> suggest the latter, and would suggest that for Archiver this method
> should take a FileFilter (since the unpack behaviour can be non-trivial)
> and should default to FalseFileFilter.INSTANCE. For the Compresser
> interface a simple boolean is probably sufficient. e.g.
>
> Archiver.setOverwriteFilter(TrueFileFilter.INSTANCE)
> Compresser.setOverwrite(true);
I'll disagree that this should be configurable. IMO it should just do
whatever FileOutputStream or whatever is used does. It's the calling
code's responsibility to handle any name collisions.
--
Sandy McArthur
"He who dares not offend cannot be honest."
- Thomas Paine
---------------------------------------------------------------------
To unsubscribe, e-mail: commons-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: commons-dev-help@jakarta.apache.org
Re: [compress] New draft 5
Posted by will pugh <wi...@sourcelabs.com>.
Overall, I like the interfaces, but I've got a few issues:
0) Update is mentioned in the Javadoc at the beginning of Archiver, but
is not implemented.
1) You often change method names based on the parameter types, e.g.
Archiver.addFile + Archiver.addFileName, setUnpackDestinationName +
setUnpackDestinationFile, etc. It seems more conventional and less
chaotic to give all the methods the same name, and have them only differ
based on parameter. Examples of this style are constructors for
java.utils.zip.ZipFile, java.io.FileInputStream,
java.io.FileOutputStream, org.apache.commons.io.IOUtils.IOUtils.copy,
org.apache.commons.io.FileUtils.isFileNewer, etc
2) You can only add files to an archive, this makes it more difficult
to add generated entries into an archive. You need to actually write
them to disk before including them in an archive
3) UnpackDestination + Destination should be the same property (on the
both the interface as well as underlying implementation), or you should
split packing and unpacking into two different interfaces (or force
folks to pass a path to pack + unpack, instead of setting properties on
the object).
4) None of your interfaces deal with the case of what to do if the
destination file is already there. Choices are either defining it in
the interface, or adding a property defining what to do. I would
suggest the latter, and would suggest that for Archiver this method
should take a FileFilter (since the unpack behaviour can be non-trivial)
and should default to FalseFileFilter.INSTANCE. For the Compresser
interface a simple boolean is probably sufficient. e.g.
Archiver.setOverwriteFilter(TrueFileFilter.INSTANCE)
Compresser.setOverwrite(true);
5) In AbstractCompressor, I don't understand why you have a copy method
and don't just use IOUtils.copy()
6) In AbstractCompressor, I don't understand why you try to create your
own temp name, rather than just letting File.createTempFile do it's thing.
--Will
C. Grobmeier wrote:
>-----BEGIN PGP SIGNED MESSAGE-----
>Hash: SHA1
>
>Hello all,
>
>here is a new draft for the compress interface:
>
>* http://grobmeier.de/commons-compress-draft-5.zip
>
>I have improved a lot of things, based on the comments of Sandy (thanks
>for that).
>
>This draft is not perfect, as you can see in the todo list. But imho we
>have a quite good base for future work, everything compiles and works
>and is documented. So i would like to propose that someone is comitting
>this, except you have reasons not to do this.
>If you agree with me, i will open a bugzilla issue and add the link to
>the code.
>
>- - Chris.
>-----BEGIN PGP SIGNATURE-----
>Version: GnuPG v1.4.2.1 (MingW32)
>Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
>
>iD8DBQFEVKMikv8rKBUE/T4RAlxxAKCMYzova2roWDA/skRyoDvFcErE2gCfTjFw
>bT2VrGdR8Byt+VjsRo7Cyhw=
>=1GRF
>-----END PGP SIGNATURE-----
>
>---------------------------------------------------------------------
>To unsubscribe, e-mail: commons-dev-unsubscribe@jakarta.apache.org
>For additional commands, e-mail: commons-dev-help@jakarta.apache.org
>
>
>
---------------------------------------------------------------------
To unsubscribe, e-mail: commons-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: commons-dev-help@jakarta.apache.org
Re: [compress] New draft 5
Posted by "C. Grobmeier" <gr...@possessed.de>.
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Sandy McArthur wrote:
> On 4/30/06, C. Grobmeier <gr...@possessed.de> wrote:
>> here is a new draft for the compress interface:
>>
>> * http://grobmeier.de/commons-compress-draft-5.zip
>
> Same as before, suggestions as a stream of consciousness, take the
> ones you like:
>
Thanks again to all for your comments. I will think about em, create a
new draft and come back here in a few days.
Regards,
Chris
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.1 (MingW32)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
iD8DBQFEVyfkkv8rKBUE/T4RAsarAJ0dWH03RbU8I1C8ib6Jl5V9RKLovwCeM2VM
L5n+xE2rW5fXJkwmK+Y9Veo=
=M4TW
-----END PGP SIGNATURE-----
---------------------------------------------------------------------
To unsubscribe, e-mail: commons-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: commons-dev-help@jakarta.apache.org
Re: [compress] New draft 5
Posted by Sandy McArthur <sa...@apache.org>.
On 4/30/06, C. Grobmeier <gr...@possessed.de> wrote:
> here is a new draft for the compress interface:
>
> * http://grobmeier.de/commons-compress-draft-5.zip
Same as before, suggestions as a stream of consciousness, take the
ones you like:
package org.apache.commons.compress:
In Archiver:
Why do pack() and packToFilename(String) both return a boolean and
throws a PackException? The Javadocs state "@return true, if the
operation has been ended without exceptions". IMO the return type
isn't meaningful as if there was an exception that would be thrown,
and a return of false would never actually happen. When is it
meaningful to swallow an exception and still return false?
Does the Archiver ever unpack to a file or only a directory? In Java a
directory is represented as a File but I think it would be nice if the
Javadocs made that clear for getUnpackDestination(),
setUnpackDestinationName(String), and setUnpackDestinationFile(File).
And maybe those setters should throw an exception when the parameter
isn't a directory.
I don't really like that the Archiver interface combines both both
pack and unpack and there are difference sets of properties for each
operation. In my mind you have the concept of an Archive and then you
ask for another implementation of "Exploder" and "Packer" interfaces.
Each one only does one thing and the Archiver can choose to make both
types available, only make one type available, support both types but
throw an exception when the second type is requested to enforce one
activity at a time.
In Compressor:
I don't see a need to expose that a FileInputStream is being used
internally. Someday you may want to implement an optimization that
decompresses small files totally in memory. Change:
FileInputStream streamCompressedInputStream(FileInputStream input),
FileInputStream streamCompressedFilename(String pathToFile), and
FileInputStream streamCompressedFile(File input) to return an
InputStream instead.
I also don't like that Compressor mixes compress and decompress
specific behaviors in one interface but its not so bad because there
are not properties that go with a behavior. All relevant information
is included in the method call parameters which is good.
In ArchiverType:
public static ArchiverType ZIP = ...; and public static ArchiverType
TAR = ....; should both be final.
In AbstractCompressor:
The copy( final InputStream input, final OutputStream output ) method,
while useful, doesn't really belong in AbstractCompressor. It's just a
utility method that an implementation may need. IMO it doesn't warrant
being part of the exposed API. It's nice to aid in code reuse but I
don't think it should happen as the expense of sticking to one
purpose.
Since Java 1.2 the File class has had createTempFile methods which
makes the ones in AbstractCompressor unneeded and, while unlikely, it
is possible that createTempFileName() could return the same name in
two successive calls which would be bad.
In CompressorType:
public static CompressorType BZIP2 = ... ; also, final.
For both of ArchiveType and CompressorType add a valueOf(String)
method. This is what Java 1.5 Enums have and it lets you convert a
String, say from a config file, into an Enum. To do this right you'll
need to keep track of other Types with a Map or something in the
constructor.
In CompressException, DecompressException, PackException, and UnpackException:
The constructors that takes (String, Exception) call
this.setStackTrace( e.getStackTrace() ); If Java 1.4 is a minimum
requirement than loose that as it's misleading and use the initCause
method to do exception chaining.
packages org.apache.commons.compress.archivers.tar, and
org.apache.commons.compress.archivers.zip:
These packages have a minimally exposed API and I like that. This will
help make future improvements easier to implement.
package org.apache.commons.compress.compressors.bzip2:
Same comments as before about the Bzip2{In,Out}Streams which you have
in the TODO.txt file.
> This draft is not perfect, as you can see in the todo list. But imho we
> have a quite good base for future work, everything compiles and works
> and is documented. So i would like to propose that someone is comitting
> this, except you have reasons not to do this.
> If you agree with me, i will open a bugzilla issue and add the link to
> the code.
I feel like the public interface was noticeably more usable and has
less noise. IMO it's not yet a great API but it's a very decent one.
--
Sandy McArthur
"He who dares not offend cannot be honest."
- Thomas Paine
---------------------------------------------------------------------
To unsubscribe, e-mail: commons-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: commons-dev-help@jakarta.apache.org