Posted to dev@commons.apache.org by sebb <se...@gmail.com> on 2009/03/29 14:31:07 UTC

[COMPRESS] Changeset ideas

The current ChangeSet API allows for:
+ deletion of entries by name
+ addition of entries by ArchiveEntry and InputStream.

This is fine as far as it goes, but I think it would be useful to add:
+ addition of entries by File
+ replacement of an existing named entry by File or Entry+InputStream

It may also be useful to allow the location of new entries to be
specified. For example, one might want to add META-INF data at the
front of an archive. It would be useful to specify the locations as:
+ start
+ end

[I'm not sure if there is a use-case for adding entries relative to an
existing entry, and it would complicate the processing. None of the
archivers I have used allow this.]

I think it would be quite easy to implement:
+ open output file, add any starting entries
+ for each input entry, either copy, skip or replace with new entry
+ at end of input, add any final entries.
+ close archive files

This could be achieved with:
+ HashSet containing names to be deleted
+ HashMap containing new entries for existing names
+ 2 Lists for new entries.

Using Hashes would avoid scanning the list.
Also, I don't think any of the above would need to be updated during
perform(), which would allow them to be re-used on another archive.
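
A minimal sketch of the steps and collections listed above, not the
actual ChangeSet code. To stay self-contained it models an archive as an
ordered list of (name, content) pairs instead of real archive streams;
ChangeSetSketch, entry(), addAtStart() and addAtEnd() are hypothetical
names chosen only for illustration.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: an "archive" is just an ordered list of (name, content) pairs. */
class ChangeSetSketch {
    private final Set<String> deletions = new HashSet<>();             // names to be deleted
    private final Map<String, String> replacements = new HashMap<>();  // new content for existing names
    private final List<Map.Entry<String, String>> atStart = new ArrayList<>();
    private final List<Map.Entry<String, String>> atEnd = new ArrayList<>();

    static Map.Entry<String, String> entry(String name, String content) {
        return new AbstractMap.SimpleImmutableEntry<>(name, content);
    }

    void delete(String name) { deletions.add(name); }
    void replace(String name, String content) { replacements.put(name, content); }
    void addAtStart(String name, String content) { atStart.add(entry(name, content)); }
    void addAtEnd(String name, String content) { atEnd.add(entry(name, content)); }

    /** Only reads the collections, so the same change set can be applied to another archive. */
    List<Map.Entry<String, String>> perform(List<Map.Entry<String, String>> input) {
        // open output, add any starting entries
        List<Map.Entry<String, String>> output = new ArrayList<>(atStart);
        for (Map.Entry<String, String> e : input) {
            String name = e.getKey();
            if (deletions.contains(name)) {
                continue; // skip: hash lookup, no list scan
            }
            String replacement = replacements.get(name);
            // replace with the new entry, or copy the input entry unchanged
            output.add(replacement != null ? entry(name, replacement) : e);
        }
        // at end of input, add any final entries
        output.addAll(atEnd);
        return output;
    }
}
```

Because perform() never removes anything from the four collections,
calling it twice with the same input gives the same result.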

However, the creation of the collections would be slightly more involved.

It would probably be useful to have a "NewEntry" class which either
has a File, or has an InputStream + ArchiveEntry to describe it.

The "NewEntry" class might also be useful for ArchiveOutputStream.
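
One possible shape for such a class, as a rough sketch only. EntryInfo
is a stand-in for ArchiveEntry so that the example compiles without the
library, and the factory methods of() and the accessors getName() and
open() are hypothetical. Using File.getName() as the entry name is also
just an assumption; a real version would probably need a way to supply
the path within the archive.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Objects;

/** Stand-in for ArchiveEntry so that the sketch compiles on its own. */
interface EntryInfo {
    String getName();
}

/** Hypothetical holder: backed either by a File, or by an InputStream plus its entry data. */
final class NewEntry {
    private final File file;          // set for the File variant, else null
    private final EntryInfo info;     // set for the stream variant, else null
    private final InputStream stream; // set for the stream variant, else null

    private NewEntry(File file, EntryInfo info, InputStream stream) {
        this.file = file;
        this.info = info;
        this.stream = stream;
    }

    static NewEntry of(File file) {
        return new NewEntry(Objects.requireNonNull(file, "file"), null, null);
    }

    static NewEntry of(EntryInfo info, InputStream stream) {
        return new NewEntry(null,
                Objects.requireNonNull(info, "info"),
                Objects.requireNonNull(stream, "stream"));
    }

    /** Assumption: a File-backed entry is named after the last path component. */
    String getName() {
        return file != null ? file.getName() : info.getName();
    }

    /** Opens the file afresh, or hands back the caller's stream. */
    InputStream open() throws IOException {
        return file != null ? new FileInputStream(file) : stream;
    }
}
```

Note that only the File variant can be opened more than once; a
stream-backed entry is consumed on first use, which matters if the same
change set is to be applied to a second archive.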

WDYT?

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [COMPRESS] Changeset ideas

Posted by Stefan Bodewig <bo...@apache.org>.
Your API suggestions sound useful.

On 2009-03-30, sebb <se...@gmail.com> wrote:

> On 29/03/2009, sebb <se...@gmail.com> wrote:
>> On 29/03/2009, sebb <se...@gmail.com> wrote:
>>> The current ChangeSet API allows for:
>>> + deletion of entries by name

> The same method call is currently used for both deleting a single
> file, and for deleting a directory tree. If the string matches a file,
> it is deleted, but if it happens to match a directory name, then the
> directory and any contents are deleted.

> Both of these are useful functions, but I think they should have
> different method calls to avoid accidents.

Agreed, delete and deleteTree or something similar.
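
To illustrate the difference between the two calls, here is a small
sketch of the matching rules they might use. It assumes entry names use
'/' as the separator and that directory entries may carry a trailing
slash; DeleteMatcher and its method names are hypothetical.

```java
/** Hypothetical matching rules for separate delete and deleteTree operations. */
final class DeleteMatcher {
    private DeleteMatcher() { }

    private static String stripTrailingSlash(String name) {
        return name.endsWith("/") ? name.substring(0, name.length() - 1) : name;
    }

    /** delete: removes only the entry whose name matches exactly. */
    static boolean matchesDelete(String target, String entryName) {
        return entryName.equals(target);
    }

    /** deleteTree: removes the directory entry itself and everything below it. */
    static boolean matchesDeleteTree(String dir, String entryName) {
        String d = stripTrailingSlash(dir);
        return stripTrailingSlash(entryName).equals(d)
                || entryName.startsWith(d + "/");
    }
}
```

With these rules delete("lib") leaves lib/a.jar alone, while
deleteTree("lib") removes lib/ and lib/a.jar but not library.txt.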

>> Forgot about "Move", which is not yet implemented.

>>  I think that should be called "Rename", unless it really means to move
>>  the entry elsewhere in the file.

Agreed as well.

Stefan



Re: [COMPRESS] Changeset ideas

Posted by sebb <se...@gmail.com>.
On 29/03/2009, sebb <se...@gmail.com> wrote:
> On 29/03/2009, sebb <se...@gmail.com> wrote:
>  > The current ChangeSet API allows for:
>  >  + deletion of entries by name

The same method call is currently used for both deleting a single
file, and for deleting a directory tree. If the string matches a file,
it is deleted, but if it happens to match a directory name, then the
directory and any contents are deleted.

Both of these are useful functions, but I think they should have
different method calls to avoid accidents.

>  >  + addition of entries by ArchiveEntry and InputStream.
>
>
> Forgot about "Move", which is not yet implemented.
>
>  I think that should be called "Rename", unless it really means to move
>  the entry elsewhere in the file.
>
>
>  >
>  >  This is fine as far as it goes, but I think it would be useful to add:
>  >  + addition of entries by File
>  >  + replacement of an existing named entry by File or Entry+InputStream
>  >
>  >  It may also be useful to allow the location of new entries to be
>  >  specified. For example, one might want to add META-INF data at the
>  >  front of an archive. It would be useful to specify the locations as:
>  >  + start
>  >  + end
>  >
>  >  [I'm not sure if there is a use-case for adding entries relative to an
>  >  existing entry, and it would complicate the processing. None of the
>  >  archivers I have used allow this.]
>  >
>  >  I think it would be quite easy to implement:
>  >  + open output file, add any starting entries
>  >  + for each input entry, either copy, skip or replace with new entry
>
>
> or rename, if that has been requested.
>
>
>  >  + at end of input, add any final entries.
>  >  + close archive files
>  >
>  >  This could be achieved with:
>  >  + HashSet containing names to be deleted
>  >  + HashMap containing new entries for existing names
>
>
> Or combine them into a single HashMap which has operations delete,
>  replace, rename.

And now delete directory tree.
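
A rough sketch of what such a combined map could look like, with the
four operations as an enum. ChangeTable, Op, Change and lookup() are
hypothetical names. To keep hash lookups for directory trees as well,
lookup() tries the full entry name first and then each parent path in
turn; this assumes '/'-separated entry names.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical single table of per-name operations. */
final class ChangeTable {
    enum Op { DELETE, DELETE_TREE, REPLACE, RENAME }

    static final class Change {
        final Op op;
        final String argument; // new name for RENAME; unused (null) for the deletes

        Change(Op op, String argument) {
            this.op = op;
            this.argument = argument;
        }
    }

    private final Map<String, Change> changes = new HashMap<>();

    void put(String name, Op op, String argument) {
        changes.put(name, new Change(op, argument));
    }

    /** Returns the change that applies to entryName, or null to copy the entry unchanged. */
    Change lookup(String entryName) {
        Change exact = changes.get(entryName);
        if (exact != null) {
            return exact; // an exact entry wins over a tree deletion higher up
        }
        // Walk up the parent paths looking for a DELETE_TREE registration.
        String path = entryName;
        int slash;
        while ((slash = path.lastIndexOf('/')) > 0) {
            path = path.substring(0, slash);
            Change parent = changes.get(path);
            if (parent != null && parent.op == Op.DELETE_TREE) {
                return parent;
            }
        }
        return null;
    }
}
```

The cost per entry is one hash lookup plus one per path component,
rather than a scan of all registered changes.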

>
>  >  + 2 Lists for new entries.
>
>
> Start and and.

That should be Start and End.

>
>
>  >
>  >  Using Hashes would avoid scanning the list.
>  >  Also, I don't think any of the above would need to be updated during
>  >  perform(), which would allow them to be re-used on another archive.
>  >
>  >  However, the creation of the collections would be slightly more involved.
>  >
>  >  It would probably be useful to have a "NewEntry" class which either
>  >  has a File, or has an InputStream + ArchiveEntry to describe it.
>  >
>  >  The "NewEntry" class might also be useful for ArchiveOutputStream.
>  >
>  >  WDYT?
>  >
>



Re: [COMPRESS] Changeset ideas

Posted by sebb <se...@gmail.com>.
On 29/03/2009, sebb <se...@gmail.com> wrote:
> The current ChangeSet API allows for:
>  + deletion of entries by name
>  + addition of entries by ArchiveEntry and InputStream.

Forgot about "Move", which is not yet implemented.

I think that should be called "Rename", unless it really means to move
the entry elsewhere in the file.

>
>  This is fine as far as it goes, but I think it would be useful to add:
>  + addition of entries by File
>  + replacement of an existing named entry by File or Entry+InputStream
>
>  It may also be useful to allow the location of new entries to be
>  specified. For example, one might want to add META-INF data at the
>  front of an archive. It would be useful to specify the locations as:
>  + start
>  + end
>
>  [I'm not sure if there is a use-case for adding entries relative to an
>  existing entry, and it would complicate the processing. None of the
>  archivers I have used allow this.]
>
>  I think it would be quite easy to implement:
>  + open output file, add any starting entries
>  + for each input entry, either copy, skip or replace with new entry

or rename, if that has been requested.

>  + at end of input, add any final entries.
>  + close archive files
>
>  This could be achieved with:
>  + HashSet containing names to be deleted
>  + HashMap containing new entries for existing names

Or combine them into a single HashMap which has operations delete,
replace, rename.

>  + 2 Lists for new entries.

Start and and.

>
>  Using Hashes would avoid scanning the list.
>  Also, I don't think any of the above would need to be updated during
>  perform(), which would allow them to be re-used on another archive.
>
>  However, the creation of the collections would be slightly more involved.
>
>  It would probably be useful to have a "NewEntry" class which either
>  has a File, or has an InputStream + ArchiveEntry to describe it.
>
>  The "NewEntry" class might also be useful for ArchiveOutputStream.
>
>  WDYT?
>



Re: [COMPRESS] Changeset ideas

Posted by sebb <se...@gmail.com>.
On 30/03/2009, Christian Grobmeier <gr...@gmail.com> wrote:
> Hi,
>
>
>  > The current ChangeSet API allows for:
>  > + deletion of entries by name
>  > + addition of entries by ArchiveEntry and InputStream.
>  >
>  > This is fine as far as it goes, but I think it would be useful to add:
>  > + addition of entries by File
>  > + replacement of an existing named entry by File or Entry+InputStream
>
>
> Sounds good, especially the replacement.
>  About additions by file - we use only streams in the api, is it good
>  to start with File now?

I think it would be useful to add File to the putxxx() methods, but I
suggest we track that in a separate e-mail thread.

>  I think this is useful too, having in mind that I wanted to propose
>  some util classes which allow creating zip files on File basis.
>
>
>  >
>  > It may also be useful to allow the location of new entries to be
>  > specified. For example, one might want to add META-INF data at the
>  > front of an archive. It would be useful to specify the locations as:
>  > + start
>  > + end
>
>
> I am not sure about this. I cannot imagine a use case for that.

Start is definitely useful for things such as META-INF and NOTICE,
LICENSE, README files. Otherwise, the default should be end of file.

If I remember correctly, "ar" files have ordering requirements, but
these are complicated, so are probably best dealt with by the
appropriate utility on the OS.

>
>  > I think it would be quite easy to implement:
>  > + open output file, add any starting entries
>  > + for each input entry, either copy, skip or replace with new entry
>  > + at end of input, add any final entries.
>  > + close archive files
>  >
>  > This could be achieved with:
>  > + HashSet containing names to be deleted
>  > + HashMap containing new entries for existing names
>  > + 2 Lists for new entries.
>  >
>  > Using Hashes would avoid scanning the list.
>  > Also, I don't think any of the above would need to be updated during
>  > perform(), which would allow them to be re-used on another archive.
>
>
> That would be cool. Actually, I was thinking about your suggestion of
>  operating on a copy of the Set. That would resolve this too.

Indeed.

>
>  Christian



Re: [COMPRESS] Changeset ideas

Posted by Christian Grobmeier <gr...@gmail.com>.
Hi,

> The current ChangeSet API allows for:
> + deletion of entries by name
> + addition of entries by ArchiveEntry and InputStream.
>
> This is fine as far as it goes, but I think it would be useful to add:
> + addition of entries by File
> + replacement of an existing named entry by File or Entry+InputStream

Sounds good, especially the replacement.
About additions by file - we use only streams in the api, is it good
to start with File now?
I think this is useful too, having in mind that I wanted to propose
some util classes which allow creating zip files on File basis.

>
> It may also be useful to allow the location of new entries to be
> specified. For example, one might want to add META-INF data at the
> front of an archive. It would be useful to specify the locations as:
> + start
> + end

I am not sure about this. I cannot imagine a use case for that.

> I think it would be quite easy to implement:
> + open output file, add any starting entries
> + for each input entry, either copy, skip or replace with new entry
> + at end of input, add any final entries.
> + close archive files
>
> This could be achieved with:
> + HashSet containing names to be deleted
> + HashMap containing new entries for existing names
> + 2 Lists for new entries.
>
> Using Hashes would avoid scanning the list.
> Also, I don't think any of the above would need to be updated during
> perform(), which would allow them to be re-used on another archive.

That would be cool. Actually, I was thinking about your suggestion of
operating on a copy of the Set. That would resolve this too.

Christian
