You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@commons.apache.org by Gary Gregory <ga...@gmail.com> on 2011/08/12 21:56:38 UTC

[codec] Generics added in a SVN branch

Hello All:

For a first cut at generics support in Codec, please checkout the
branch https://svn.apache.org/repos/asf/commons/proper/codec/branches/generics

I wrote a migration guide in the root of the project called
Codec2-Migration.htm.

Let's discuss.

I plan on not touching trunk until the generics code is flushed out
and brought back to trunk.

Thank you,
Gary

-- 
http://garygregory.wordpress.com/
http://garygregory.com/
http://people.apache.org/~ggregory/
http://twitter.com/GaryGregory

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [codec] Generics added in a SVN branch

Posted by sebb <se...@gmail.com>.
On 14 August 2011 15:47, Gary Gregory <ga...@gmail.com> wrote:
> On Sat, Aug 13, 2011 at 10:02 PM, sebb <se...@gmail.com> wrote:
>> On 12 August 2011 20:56, Gary Gregory <ga...@gmail.com> wrote:
>>> Hello All:
>>>
>>> For a first cut at generics support in Codec, please checkout the
>>> branch https://svn.apache.org/repos/asf/commons/proper/codec/branches/generics
>>>
>>> I wrote a migration guide in the root of the project called
>>> Codec2-Migration.htm.
>>>
>>> Let's discuss.
>>
>> The original code used Encoder and Decoder interfaces which operated
>> on Object, and extended these with
>> BinaryEncoder (byte[])
>> StringEncoder(String)
>>
>> So for example StringEncoder classes need to implement
>> encode(Object) : Object
>> encode(String) : String
>>
>> As far as I can tell, those interfaces cannot be directly generified,
>> because type erasure causes a method signature clash.
>> So adding generics here means breaking binary compatibilty.
>
> Hello Sebb,
>
> Thank you for taking the time to look this over.
>
> You are correct, a class cannot implement an interface twice, so it
> cannot implement the generics version of StringEncoder AND
> BinaryEncoder.
>
>>
>> Question is, what does adding generics to these interfaces actually provide?
>
> As you know. generics let us define precisely what types are supported
> and avoids the current bug prone guessing game for implementers and
> call sites.

Generics is one way to do this; it's not the only way.

As I see it, the problem with the current Encoder/Decoder interface is
that it allows Object.
The sub-interfaces are specific, so are not a problem.

> For example, look at the current mess in trunk's BinaryCodec:
>
> If I call encode(byte[]), I get a byte[] back.
>
> If I call encode(Object) with a byte[], I get a char[] back.
>
> Uh? Is the code wrong, is the Javadoc wrong, how am I supposed to know
> what is right? I say it's a mess.
>
> I would flip the question around: If I implemented this today (1) why
> would I not use generics, and (2) why would I want to support "Object
> encode(Object)". Codec 2.0 is the opportunity to clean this up. What I
> write about Encoders here applies to Decoders too.
>
> Most encode/decode methods look like this:
>
>    public Object encode(Object raw) throws EncoderException {
>        if (!(raw instanceof byte[])) {
>            throw new EncoderException("argument not a byte array");
>        }
>        return toAsciiChars((byte[]) raw);
>    }
>
> There is always one (or more) if instanceof expr for a typed method
> call, and codec-styled class cast exception in the form an
> EncoderException.This does not smell OO to me.

Agreed, it is messy.

> Tangent: Wouldn't this be better typed as a ClassCastException or an
> IllegalArgumentException anyway?
>
> Any advantage of this pattern (is there one? ;) is ruined by the foggy
> example above. I say foggy and not black box because we are FOSS.
>
> I also want typed asymmetric encoders, for example, for the new BM
> encoder, I want "Set<String> encode(String)", or "Set<CharSequence>
> encode(CharSequence)"

Then the problem becomes that a single class cannot implement multiple
generic interfaces.

As far as I can tell, to use generics here would require many more
smaller classes.

> Hex decoding is a similar but differently confusing example:
>
> If I call encode(byte[]), the result is:
>
> StringUtils.getBytesUnchecked(encodeHexString(array), getCharsetName());
>
> encodeHexString(array) creates a string which encode(byte[]) uses with
> getBytesUnchecked and the char set name. So encodeHexString(array)
> calls "new String(encodeHex(data))".
>
> If I call encode(Object) with a byte[], the result is:
>
> encodeHex(byteArray)
>
> Why the difference with the charset name? Do I want to spend the time
> telling the difference? Is the code wrong? Is the Javadoc wrong?

No idea.

> The Hex decode(byte[]|Object) methods suffer from the same problem.
>
> Every encode/decode(Object) typed method creates duplication and
> opportunities for bugs.
>
> So we have /at least/ three problems today with the current system
> which I claim would be eliminated by using generics because there
> would be one entry point for each type and it would get rid of
> encode/decode(Object) methods.
>
>> Would it not be just as effective to deprecate Decoder and Encoder,
>> and code can then use either BinaryEncoder or StringEncoder (etc)?
>
> We could do that too, but I would still want typed Decoder and Encoder
> to define codecs that do not fit in BinaryEncoder or StringEncoder.
>
> For an example different from the new BM encoder, we could recode the
> DoubleMetaphone encoder as DoubleMetaphoneResult<String,
> DoubleMetaphoneResult>.

I suggested using additional specific interfaces in a later e-mail.

> Today, if you want both encodings (normal and alternate,) you have to
> encode twice with a different parameter to encode(String, boolean)
> which is not in any interface (and both values are computed internally
> each time!) Granted, this problem could be solved without generics,
> but generics makes it obvious what the result type is, instead of
> reading possibly buggy Javadocs for a return typed as Object. Only
> changing the result type in 2.0 of "Object encode(Object)" from String
> to DoubleMetaphoneResult would be a runtime 'surprise' as opposed to a
> compile time 'surprise.' Catching problems the earlier, the better
> when migrating.

Agreed.

> Thank you for the discussion, :)
>
> Gary
>
>>
>> At the moment test code is of the form:
>>
>> Encoder enc = new Base64();
>> ...
>> byte [] ba = (byte[]) enc.encode(binary)); // unchecked cast
>>
>> However this could be written:
>>
>> BinaryEncoder enc = new Base64();
>> ...
>> byte [] ba = enc.encode(binary)); // no cast needed
>>
>>
>>> I plan on not touching trunk until the generics code is flushed out
>>> and brought back to trunk.
>>>
>>> Thank you,
>>> Gary
>>>
>>> --
>>> http://garygregory.wordpress.com/
>>> http://garygregory.com/
>>> http://people.apache.org/~ggregory/
>>> http://twitter.com/GaryGregory
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
>>> For additional commands, e-mail: dev-help@commons.apache.org
>>>
>>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
>> For additional commands, e-mail: dev-help@commons.apache.org
>>
>>
>
>
>
> --
> Thank you,
> Gary
>
> http://garygregory.wordpress.com/
> http://garygregory.com/
> http://people.apache.org/~ggregory/
> http://twitter.com/GaryGregory
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> For additional commands, e-mail: dev-help@commons.apache.org
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [codec] Generics added in a SVN branch

Posted by Gary Gregory <ga...@gmail.com>.
On Sat, Aug 13, 2011 at 10:02 PM, sebb <se...@gmail.com> wrote:
> On 12 August 2011 20:56, Gary Gregory <ga...@gmail.com> wrote:
>> Hello All:
>>
>> For a first cut at generics support in Codec, please checkout the
>> branch https://svn.apache.org/repos/asf/commons/proper/codec/branches/generics
>>
>> I wrote a migration guide in the root of the project called
>> Codec2-Migration.htm.
>>
>> Let's discuss.
>
> The original code used Encoder and Decoder interfaces which operated
> on Object, and extended these with
> BinaryEncoder (byte[])
> StringEncoder(String)
>
> So for example StringEncoder classes need to implement
> encode(Object) : Object
> encode(String) : String
>
> As far as I can tell, those interfaces cannot be directly generified,
> because type erasure causes a method signature clash.
> So adding generics here means breaking binary compatibilty.

Hello Sebb,

Thank you for taking the time to look this over.

You are correct, a class cannot implement an interface twice, so it
cannot implement the generics version of StringEncoder AND
BinaryEncoder.

>
> Question is, what does adding generics to these interfaces actually provide?

As you know. generics let us define precisely what types are supported
and avoids the current bug prone guessing game for implementers and
call sites.

For example, look at the current mess in trunk's BinaryCodec:

If I call encode(byte[]), I get a byte[] back.

If I call encode(Object) with a byte[], I get a char[] back.

Uh? Is the code wrong, is the Javadoc wrong, how am I supposed to know
what is right? I say it's a mess.

I would flip the question around: If I implemented this today (1) why
would I not use generics, and (2) why would I want to support "Object
encode(Object)". Codec 2.0 is the opportunity to clean this up. What I
write about Encoders here applies to Decoders too.

Most encode/decode methods look like this:

    public Object encode(Object raw) throws EncoderException {
        if (!(raw instanceof byte[])) {
            throw new EncoderException("argument not a byte array");
        }
        return toAsciiChars((byte[]) raw);
    }

There is always one (or more) if instanceof expr for a typed method
call, and codec-styled class cast exception in the form an
EncoderException.This does not smell OO to me.

Tangent: Wouldn't this be better typed as a ClassCastException or an
IllegalArgumentException anyway?

Any advantage of this pattern (is there one? ;) is ruined by the foggy
example above. I say foggy and not black box because we are FOSS.

I also want typed asymmetric encoders, for example, for the new BM
encoder, I want "Set<String> encode(String)", or "Set<CharSequence>
encode(CharSequence)"

Hex decoding is a similar but differently confusing example:

If I call encode(byte[]), the result is:

StringUtils.getBytesUnchecked(encodeHexString(array), getCharsetName());

encodeHexString(array) creates a string which encode(byte[]) uses with
getBytesUnchecked and the char set name. So encodeHexString(array)
calls "new String(encodeHex(data))".

If I call encode(Object) with a byte[], the result is:

encodeHex(byteArray)

Why the difference with the charset name? Do I want to spend the time
telling the difference? Is the code wrong? Is the Javadoc wrong?

The Hex decode(byte[]|Object) methods suffer from the same problem.

Every encode/decode(Object) typed method creates duplication and
opportunities for bugs.

So we have /at least/ three problems today with the current system
which I claim would be eliminated by using generics because there
would be one entry point for each type and it would get rid of
encode/decode(Object) methods.

> Would it not be just as effective to deprecate Decoder and Encoder,
> and code can then use either BinaryEncoder or StringEncoder (etc)?

We could do that too, but I would still want typed Decoder and Encoder
to define codecs that do not fit in BinaryEncoder or StringEncoder.

For an example different from the new BM encoder, we could recode the
DoubleMetaphone encoder as DoubleMetaphoneResult<String,
DoubleMetaphoneResult>.

Today, if you want both encodings (normal and alternate,) you have to
encode twice with a different parameter to encode(String, boolean)
which is not in any interface (and both values are computed internally
each time!) Granted, this problem could be solved without generics,
but generics makes it obvious what the result type is, instead of
reading possibly buggy Javadocs for a return typed as Object. Only
changing the result type in 2.0 of "Object encode(Object)" from String
to DoubleMetaphoneResult would be a runtime 'surprise' as opposed to a
compile time 'surprise.' Catching problems the earlier, the better
when migrating.

Thank you for the discussion, :)
Gary

>
> At the moment test code is of the form:
>
> Encoder enc = new Base64();
> ...
> byte [] ba = (byte[]) enc.encode(binary)); // unchecked cast
>
> However this could be written:
>
> BinaryEncoder enc = new Base64();
> ...
> byte [] ba = enc.encode(binary)); // no cast needed
>
>
>> I plan on not touching trunk until the generics code is flushed out
>> and brought back to trunk.
>>
>> Thank you,
>> Gary
>>
>> --
>> http://garygregory.wordpress.com/
>> http://garygregory.com/
>> http://people.apache.org/~ggregory/
>> http://twitter.com/GaryGregory
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
>> For additional commands, e-mail: dev-help@commons.apache.org
>>
>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> For additional commands, e-mail: dev-help@commons.apache.org
>
>



-- 
Thank you,
Gary

http://garygregory.wordpress.com/
http://garygregory.com/
http://people.apache.org/~ggregory/
http://twitter.com/GaryGregory

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [codec] Generics added in a SVN branch

Posted by sebb <se...@gmail.com>.
On 15 August 2011 23:53, Gary Gregory <ga...@gmail.com> wrote:
> On Mon, Aug 15, 2011 at 10:12 AM, sebb <se...@gmail.com> wrote:
>
>> On 14 August 2011 22:38, Gary Gregory <ga...@gmail.com> wrote:
>> > Hi,
>> >
>> > On Aug 14, 2011, at 10:19, sebb <se...@gmail.com> wrote:
>> >
>> >> On 14 August 2011 03:02, sebb <se...@gmail.com> wrote:
>> >>> On 12 August 2011 20:56, Gary Gregory <ga...@gmail.com> wrote:
>> >>>> Hello All:
>> >>>>
>> >>>> For a first cut at generics support in Codec, please checkout the
>> >>>> branch
>> https://svn.apache.org/repos/asf/commons/proper/codec/branches/generics
>> >>>>
>> >>>> I wrote a migration guide in the root of the project called
>> >>>> Codec2-Migration.htm.
>> >>>>
>> >>>> Let's discuss.
>> >>>
>> >>> The original code used Encoder and Decoder interfaces which operated
>> >>> on Object, and extended these with
>> >>> BinaryEncoder (byte[])
>> >>> StringEncoder(String)
>> >>>
>> >>> So for example StringEncoder classes need to implement
>> >>> encode(Object) : Object
>> >>> encode(String) : String
>> >>>
>> >>> As far as I can tell, those interfaces cannot be directly generified,
>> >>> because type erasure causes a method signature clash.
>> >>> So adding generics here means breaking binary compatibilty.
>> >>>
>> >>> Question is, what does adding generics to these interfaces actually
>> provide?
>> >>> Would it not be just as effective to deprecate Decoder and Encoder,
>> >>> and code can then use either BinaryEncoder or StringEncoder (etc)?
>> >>>
>> >>> At the moment test code is of the form:
>> >>>
>> >>> Encoder enc = new Base64();
>> >>> ...
>> >>> byte [] ba = (byte[]) enc.encode(binary)); // unchecked cast
>> >>>
>> >>> However this could be written:
>> >>>
>> >>> BinaryEncoder enc = new Base64();
>> >>> ...
>> >>> byte [] ba = enc.encode(binary)); // no cast needed
>> >>>
>> >>
>> >> Note that the Encoder/Decoder interfaces do not _require_
>> generification.
>> >>
>> >
>> > Well sure. Strictly speaking that's true but I still like the branch
>> > code better.
>> >
>> >> The only non-generified part of trunk is StringEncoderComparator.
>> >>
>> >
>> > I do not quite look at it that way, since I created the branch ;) if
>> > you just want to fix warnings then yes that class needs patching. But
>> > my claim in the branch is that the component is better and type-safe
>> > with generics.
>> >
>> >> This appears from the name as if it compares Strings, but in fact it
>> >> compares Objects in the current code.
>> >> Making it implement Comparator<Object> fixes the warning without
>> >> compromising binary (or source?) compatibility.
>> >>
>> >> The StringEncoders generate an EncoderException if the parameter is
>> >> not a String.
>> >> This does mean that comparison with a non-String will return 0 (equal)
>> >> which is rather strange, but that is the way that the code current
>> >> works.
>> >
>> > That sure sounds like a bug.
>>
>> Well, the method .StringEncoderComparator.compare(Object, Object) Javadoc
>> says:
>>
>> @return the Comparable.compareTo() return code or 0 if an encoding
>> error was caught.
>>
>> Does seem like a design bug, as one cannot distinguish failure from
>> equality.
>>
>> >>
>> >> So what I suggest is - let's release Codec as a binary compatible
>> >> implementation, at least for now.
>> >
>> > ATM there incompat changes due to deprecations and other bits.
>>
>> Yes, but those could be reverted if necessary.
>>
>> >>
>> >> If we do decide to break binary compatibility in a later release, then
>> >> I think the Encoder/Decoder hierarchy needs a bit more work.
>> >> For example, several of the new xxxCodec classes only implement
>> >> Decoder, not Encoder; this is because of the need to avoid erasure
>> >> clashes.
>> >
>> > Another set of refinements the ;)
>>
>> I don't think the redesign goes quite far enough to fix the interface
>> inheritance problem.
>>
>> What I'm suggesting is to take smaller steps to achieve the goal.
>>
>> Release a binary-compatible version now - which has lots of other
>> useful fixes - and release a completely new version later.
>> If problems are found with the new stuff, any API changes can be
>> rolled into the next major version.
>>
>
> Do you mean something like:
>
> - Create a branch for 1.6 from trunk now:
> - Branch 1.6: Revert deprecations and other changes to create a binary
> compatible release with 1.5.

I did a quick check, and there did not seem to be much to revert.

> - Release 1.6.
> - Big changes as Codec2
> - Possible Codec3
>
> ?

Yes.

I would hope that this would make the need to create Codec3 less
likely,  as there would be more time for the new code to be exercised.

Codec is quite a low-level library, so binary and source compatibility
is really important.

> Gary
>
>
>> >>
>> >> I just wonder if trying to use generics here is not causing more
>> >> problems than it solves?
>> >> It's not as if there are lots of different types that are used here -
>> >> only String, byte[] and char[].
>> >> Perhaps we could just add CharEncoder/CharDecoder interfaces instead?
>> >
>> > Please see my proposal for the typing of the new BM encoder which I
>> > want to type as Set<String> encode(String) in a previous email.
>> >
>> > Thank you,
>> > Gary
>> >
>> >>
>> >>>> I plan on not touching trunk until the generics code is flushed out
>> >>>> and brought back to trunk.
>> >>>>
>> >>>> Thank you,
>> >>>> Gary
>> >>>>
>> >>>> --
>> >>>> http://garygregory.wordpress.com/
>> >>>> http://garygregory.com/
>> >>>> http://people.apache.org/~ggregory/
>> >>>> http://twitter.com/GaryGregory
>> >>>>
>> >>>> ---------------------------------------------------------------------
>> >>>> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
>> >>>> For additional commands, e-mail: dev-help@commons.apache.org
>> >>>>
>> >>>>
>> >>>
>> >>
>> >> ---------------------------------------------------------------------
>> >> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
>> >> For additional commands, e-mail: dev-help@commons.apache.org
>> >>
>> >
>> > ---------------------------------------------------------------------
>> > To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
>> > For additional commands, e-mail: dev-help@commons.apache.org
>> >
>> >
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
>> For additional commands, e-mail: dev-help@commons.apache.org
>>
>>
>
>
> --
> Thank you,
> Gary
>
> http://garygregory.wordpress.com/
> http://garygregory.com/
> http://people.apache.org/~ggregory/
> http://twitter.com/GaryGregory
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [codec] Generics added in a SVN branch

Posted by Gary Gregory <ga...@gmail.com>.
On Mon, Aug 15, 2011 at 10:12 AM, sebb <se...@gmail.com> wrote:

> On 14 August 2011 22:38, Gary Gregory <ga...@gmail.com> wrote:
> > Hi,
> >
> > On Aug 14, 2011, at 10:19, sebb <se...@gmail.com> wrote:
> >
> >> On 14 August 2011 03:02, sebb <se...@gmail.com> wrote:
> >>> On 12 August 2011 20:56, Gary Gregory <ga...@gmail.com> wrote:
> >>>> Hello All:
> >>>>
> >>>> For a first cut at generics support in Codec, please checkout the
> >>>> branch
> https://svn.apache.org/repos/asf/commons/proper/codec/branches/generics
> >>>>
> >>>> I wrote a migration guide in the root of the project called
> >>>> Codec2-Migration.htm.
> >>>>
> >>>> Let's discuss.
> >>>
> >>> The original code used Encoder and Decoder interfaces which operated
> >>> on Object, and extended these with
> >>> BinaryEncoder (byte[])
> >>> StringEncoder(String)
> >>>
> >>> So for example StringEncoder classes need to implement
> >>> encode(Object) : Object
> >>> encode(String) : String
> >>>
> >>> As far as I can tell, those interfaces cannot be directly generified,
> >>> because type erasure causes a method signature clash.
> >>> So adding generics here means breaking binary compatibilty.
> >>>
> >>> Question is, what does adding generics to these interfaces actually
> provide?
> >>> Would it not be just as effective to deprecate Decoder and Encoder,
> >>> and code can then use either BinaryEncoder or StringEncoder (etc)?
> >>>
> >>> At the moment test code is of the form:
> >>>
> >>> Encoder enc = new Base64();
> >>> ...
> >>> byte [] ba = (byte[]) enc.encode(binary)); // unchecked cast
> >>>
> >>> However this could be written:
> >>>
> >>> BinaryEncoder enc = new Base64();
> >>> ...
> >>> byte [] ba = enc.encode(binary)); // no cast needed
> >>>
> >>
> >> Note that the Encoder/Decoder interfaces do not _require_
> generification.
> >>
> >
> > Well sure. Strictly speaking that's true but I still like the branch
> > code better.
> >
> >> The only non-generified part of trunk is StringEncoderComparator.
> >>
> >
> > I do not quite look at it that way, since I created the branch ;) if
> > you just want to fix warnings then yes that class needs patching. But
> > my claim in the branch is that the component is better and type-safe
> > with generics.
> >
> >> This appears from the name as if it compares Strings, but in fact it
> >> compares Objects in the current code.
> >> Making it implement Comparator<Object> fixes the warning without
> >> compromising binary (or source?) compatibility.
> >>
> >> The StringEncoders generate an EncoderException if the parameter is
> >> not a String.
> >> This does mean that comparison with a non-String will return 0 (equal)
> >> which is rather strange, but that is the way that the code current
> >> works.
> >
> > That sure sounds like a bug.
>
> Well, the method .StringEncoderComparator.compare(Object, Object) Javadoc
> says:
>
> @return the Comparable.compareTo() return code or 0 if an encoding
> error was caught.
>
> Does seem like a design bug, as one cannot distinguish failure from
> equality.
>
> >>
> >> So what I suggest is - let's release Codec as a binary compatible
> >> implementation, at least for now.
> >
> > ATM there incompat changes due to deprecations and other bits.
>
> Yes, but those could be reverted if necessary.
>
> >>
> >> If we do decide to break binary compatibility in a later release, then
> >> I think the Encoder/Decoder hierarchy needs a bit more work.
> >> For example, several of the new xxxCodec classes only implement
> >> Decoder, not Encoder; this is because of the need to avoid erasure
> >> clashes.
> >
> > Another set of refinements the ;)
>
> I don't think the redesign goes quite far enough to fix the interface
> inheritance problem.
>
> What I'm suggesting is to take smaller steps to achieve the goal.
>
> Release a binary-compatible version now - which has lots of other
> useful fixes - and release a completely new version later.
> If problems are found with the new stuff, any API changes can be
> rolled into the next major version.
>

Do you mean something like:

- Create a branch for 1.6 from trunk now:
- Branch 1.6: Revert deprecations and other changes to create a binary
compatible release with 1.5.
- Release 1.6.
- Big changes as Codec2
- Possible Codec3

?

Gary


> >>
> >> I just wonder if trying to use generics here is not causing more
> >> problems than it solves?
> >> It's not as if there are lots of different types that are used here -
> >> only String, byte[] and char[].
> >> Perhaps we could just add CharEncoder/CharDecoder interfaces instead?
> >
> > Please see my proposal for the typing of the new BM encoder which I
> > want to type as Set<String> encode(String) in a previous email.
> >
> > Thank you,
> > Gary
> >
> >>
> >>>> I plan on not touching trunk until the generics code is flushed out
> >>>> and brought back to trunk.
> >>>>
> >>>> Thank you,
> >>>> Gary
> >>>>
> >>>> --
> >>>> http://garygregory.wordpress.com/
> >>>> http://garygregory.com/
> >>>> http://people.apache.org/~ggregory/
> >>>> http://twitter.com/GaryGregory
> >>>>
> >>>> ---------------------------------------------------------------------
> >>>> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> >>>> For additional commands, e-mail: dev-help@commons.apache.org
> >>>>
> >>>>
> >>>
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> >> For additional commands, e-mail: dev-help@commons.apache.org
> >>
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> > For additional commands, e-mail: dev-help@commons.apache.org
> >
> >
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> For additional commands, e-mail: dev-help@commons.apache.org
>
>


-- 
Thank you,
Gary

http://garygregory.wordpress.com/
http://garygregory.com/
http://people.apache.org/~ggregory/
http://twitter.com/GaryGregory

Re: [codec] Generics added in a SVN branch

Posted by sebb <se...@gmail.com>.
On 14 August 2011 22:38, Gary Gregory <ga...@gmail.com> wrote:
> Hi,
>
> On Aug 14, 2011, at 10:19, sebb <se...@gmail.com> wrote:
>
>> On 14 August 2011 03:02, sebb <se...@gmail.com> wrote:
>>> On 12 August 2011 20:56, Gary Gregory <ga...@gmail.com> wrote:
>>>> Hello All:
>>>>
>>>> For a first cut at generics support in Codec, please checkout the
>>>> branch https://svn.apache.org/repos/asf/commons/proper/codec/branches/generics
>>>>
>>>> I wrote a migration guide in the root of the project called
>>>> Codec2-Migration.htm.
>>>>
>>>> Let's discuss.
>>>
>>> The original code used Encoder and Decoder interfaces which operated
>>> on Object, and extended these with
>>> BinaryEncoder (byte[])
>>> StringEncoder(String)
>>>
>>> So for example StringEncoder classes need to implement
>>> encode(Object) : Object
>>> encode(String) : String
>>>
>>> As far as I can tell, those interfaces cannot be directly generified,
>>> because type erasure causes a method signature clash.
>>> So adding generics here means breaking binary compatibilty.
>>>
>>> Question is, what does adding generics to these interfaces actually provide?
>>> Would it not be just as effective to deprecate Decoder and Encoder,
>>> and code can then use either BinaryEncoder or StringEncoder (etc)?
>>>
>>> At the moment test code is of the form:
>>>
>>> Encoder enc = new Base64();
>>> ...
>>> byte [] ba = (byte[]) enc.encode(binary)); // unchecked cast
>>>
>>> However this could be written:
>>>
>>> BinaryEncoder enc = new Base64();
>>> ...
>>> byte [] ba = enc.encode(binary)); // no cast needed
>>>
>>
>> Note that the Encoder/Decoder interfaces do not _require_ generification.
>>
>
> Well sure. Strictly speaking that's true but I still like the branch
> code better.
>
>> The only non-generified part of trunk is StringEncoderComparator.
>>
>
> I do not quite look at it that way, since I created the branch ;) if
> you just want to fix warnings then yes that class needs patching. But
> my claim in the branch is that the component is better and type-safe
> with generics.
>
>> This appears from the name as if it compares Strings, but in fact it
>> compares Objects in the current code.
>> Making it implement Comparator<Object> fixes the warning without
>> compromising binary (or source?) compatibility.
>>
>> The StringEncoders generate an EncoderException if the parameter is
>> not a String.
>> This does mean that comparison with a non-String will return 0 (equal)
>> which is rather strange, but that is the way that the code current
>> works.
>
> That sure sounds like a bug.

Well, the method .StringEncoderComparator.compare(Object, Object) Javadoc says:

@return the Comparable.compareTo() return code or 0 if an encoding
error was caught.

Does seem like a design bug, as one cannot distinguish failure from equality.

>>
>> So what I suggest is - let's release Codec as a binary compatible
>> implementation, at least for now.
>
> ATM there incompat changes due to deprecations and other bits.

Yes, but those could be reverted if necessary.

>>
>> If we do decide to break binary compatibility in a later release, then
>> I think the Encoder/Decoder hierarchy needs a bit more work.
>> For example, several of the new xxxCodec classes only implement
>> Decoder, not Encoder; this is because of the need to avoid erasure
>> clashes.
>
> Another set of refinements the ;)

I don't think the redesign goes quite far enough to fix the interface
inheritance problem.

What I'm suggesting is to take smaller steps to achieve the goal.

Release a binary-compatible version now - which has lots of other
useful fixes - and release a completely new version later.
If problems are found with the new stuff, any API changes can be
rolled into the next major version.

>>
>> I just wonder if trying to use generics here is not causing more
>> problems than it solves?
>> It's not as if there are lots of different types that are used here -
>> only String, byte[] and char[].
>> Perhaps we could just add CharEncoder/CharDecoder interfaces instead?
>
> Please see my proposal for the typing of the new BM encoder which I
> want to type as Set<String> encode(String) in a previous email.
>
> Thank you,
> Gary
>
>>
>>>> I plan on not touching trunk until the generics code is flushed out
>>>> and brought back to trunk.
>>>>
>>>> Thank you,
>>>> Gary
>>>>
>>>> --
>>>> http://garygregory.wordpress.com/
>>>> http://garygregory.com/
>>>> http://people.apache.org/~ggregory/
>>>> http://twitter.com/GaryGregory
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
>>>> For additional commands, e-mail: dev-help@commons.apache.org
>>>>
>>>>
>>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
>> For additional commands, e-mail: dev-help@commons.apache.org
>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> For additional commands, e-mail: dev-help@commons.apache.org
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [codec] Generics added in a SVN branch

Posted by Gary Gregory <ga...@gmail.com>.
Hi,

On Aug 14, 2011, at 10:19, sebb <se...@gmail.com> wrote:

> On 14 August 2011 03:02, sebb <se...@gmail.com> wrote:
>> On 12 August 2011 20:56, Gary Gregory <ga...@gmail.com> wrote:
>>> Hello All:
>>>
>>> For a first cut at generics support in Codec, please checkout the
>>> branch https://svn.apache.org/repos/asf/commons/proper/codec/branches/generics
>>>
>>> I wrote a migration guide in the root of the project called
>>> Codec2-Migration.htm.
>>>
>>> Let's discuss.
>>
>> The original code used Encoder and Decoder interfaces which operated
>> on Object, and extended these with
>> BinaryEncoder (byte[])
>> StringEncoder(String)
>>
>> So for example StringEncoder classes need to implement
>> encode(Object) : Object
>> encode(String) : String
>>
>> As far as I can tell, those interfaces cannot be directly generified,
>> because type erasure causes a method signature clash.
>> So adding generics here means breaking binary compatibilty.
>>
>> Question is, what does adding generics to these interfaces actually provide?
>> Would it not be just as effective to deprecate Decoder and Encoder,
>> and code can then use either BinaryEncoder or StringEncoder (etc)?
>>
>> At the moment test code is of the form:
>>
>> Encoder enc = new Base64();
>> ...
>> byte [] ba = (byte[]) enc.encode(binary)); // unchecked cast
>>
>> However this could be written:
>>
>> BinaryEncoder enc = new Base64();
>> ...
>> byte [] ba = enc.encode(binary)); // no cast needed
>>
>
> Note that the Encoder/Decoder interfaces do not _require_ generification.
>

Well sure. Strictly speaking that's true but I still like the branch
code better.

> The only non-generified part of trunk is StringEncoderComparator.
>

I do not quite look at it that way, since I created the branch ;) if
you just want to fix warnings then yes that class needs patching. But
my claim in the branch is that the component is better and type-safe
with generics.

> This appears from the name as if it compares Strings, but in fact it
> compares Objects in the current code.
> Making it implement Comparator<Object> fixes the warning without
> compromising binary (or source?) compatibility.
>
> The StringEncoders generate an EncoderException if the parameter is
> not a String.
> This does mean that comparison with a non-String will return 0 (equal)
> which is rather strange, but that is the way that the code current
> works.

That sure sounds like a bug.

>
> So what I suggest is - let's release Codec as a binary compatible
> implementation, at least for now.

ATM there incompat changes due to deprecations and other bits.

>
> If we do decide to break binary compatibility in a later release, then
> I think the Encoder/Decoder hierarchy needs a bit more work.
> For example, several of the new xxxCodec classes only implement
> Decoder, not Encoder; this is because of the need to avoid erasure
> clashes.

Another set of refinements the ;)

>
> I just wonder if trying to use generics here is not causing more
> problems than it solves?
> It's not as if there are lots of different types that are used here -
> only String, byte[] and char[].
> Perhaps we could just add CharEncoder/CharDecoder interfaces instead?

Please see my proposal for the typing of the new BM encoder which I
want to type as Set<String> encode(String) in a previous email.

Thank you,
Gary

>
>>> I plan on not touching trunk until the generics code is flushed out
>>> and brought back to trunk.
>>>
>>> Thank you,
>>> Gary
>>>
>>> --
>>> http://garygregory.wordpress.com/
>>> http://garygregory.com/
>>> http://people.apache.org/~ggregory/
>>> http://twitter.com/GaryGregory
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
>>> For additional commands, e-mail: dev-help@commons.apache.org
>>>
>>>
>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> For additional commands, e-mail: dev-help@commons.apache.org
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [codec] Generics added in a SVN branch

Posted by sebb <se...@gmail.com>.
On 14 August 2011 03:02, sebb <se...@gmail.com> wrote:
> On 12 August 2011 20:56, Gary Gregory <ga...@gmail.com> wrote:
>> Hello All:
>>
>> For a first cut at generics support in Codec, please checkout the
>> branch https://svn.apache.org/repos/asf/commons/proper/codec/branches/generics
>>
>> I wrote a migration guide in the root of the project called
>> Codec2-Migration.htm.
>>
>> Let's discuss.
>
> The original code used Encoder and Decoder interfaces which operated
> on Object, and extended these with
> BinaryEncoder (byte[])
> StringEncoder(String)
>
> So for example StringEncoder classes need to implement
> encode(Object) : Object
> encode(String) : String
>
> As far as I can tell, those interfaces cannot be directly generified,
> because type erasure causes a method signature clash.
> So adding generics here means breaking binary compatibilty.
>
> Question is, what does adding generics to these interfaces actually provide?
> Would it not be just as effective to deprecate Decoder and Encoder,
> and code can then use either BinaryEncoder or StringEncoder (etc)?
>
> At the moment test code is of the form:
>
> Encoder enc = new Base64();
> ...
> byte [] ba = (byte[]) enc.encode(binary)); // unchecked cast
>
> However this could be written:
>
> BinaryEncoder enc = new Base64();
> ...
> byte [] ba = enc.encode(binary)); // no cast needed
>

Note that the Encoder/Decoder interfaces do not _require_ generification.

The only non-generified part of trunk is StringEncoderComparator.

This appears from the name as if it compares Strings, but in fact it
compares Objects in the current code.
Making it implement Comparator<Object> fixes the warning without
compromising binary (or source?) compatibility.

The StringEncoders generate an EncoderException if the parameter is
not a String.
This does mean that comparison with a non-String will return 0 (equal)
which is rather strange, but that is the way that the code current
works.

So what I suggest is - let's release Codec as a binary compatible
implementation, at least for now.

If we do decide to break binary compatibility in a later release, then
I think the Encoder/Decoder hierarchy needs a bit more work.
For example, several of the new xxxCodec classes only implement
Decoder, not Encoder; this is because of the need to avoid erasure
clashes.

I just wonder if trying to use generics here is not causing more
problems than it solves?
It's not as if there are lots of different types that are used here -
only String, byte[] and char[].
Perhaps we could just add CharEncoder/CharDecoder interfaces instead?

>> I plan on not touching trunk until the generics code is flushed out
>> and brought back to trunk.
>>
>> Thank you,
>> Gary
>>
>> --
>> http://garygregory.wordpress.com/
>> http://garygregory.com/
>> http://people.apache.org/~ggregory/
>> http://twitter.com/GaryGregory
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
>> For additional commands, e-mail: dev-help@commons.apache.org
>>
>>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [codec] Generics added in a SVN branch

Posted by sebb <se...@gmail.com>.
On 12 August 2011 20:56, Gary Gregory <ga...@gmail.com> wrote:
> Hello All:
>
> For a first cut at generics support in Codec, please checkout the
> branch https://svn.apache.org/repos/asf/commons/proper/codec/branches/generics
>
> I wrote a migration guide in the root of the project called
> Codec2-Migration.htm.
>
> Let's discuss.

The original code used Encoder and Decoder interfaces which operated
on Object, and extended these with
BinaryEncoder (byte[])
StringEncoder(String)

So for example StringEncoder classes need to implement
encode(Object) : Object
encode(String) : String

As far as I can tell, those interfaces cannot be directly generified,
because type erasure causes a method signature clash.
So adding generics here means breaking binary compatibilty.

Question is, what does adding generics to these interfaces actually provide?
Would it not be just as effective to deprecate Decoder and Encoder,
and code can then use either BinaryEncoder or StringEncoder (etc)?

At the moment test code is of the form:

Encoder enc = new Base64();
...
byte [] ba = (byte[]) enc.encode(binary)); // unchecked cast

However this could be written:

BinaryEncoder enc = new Base64();
...
byte [] ba = enc.encode(binary)); // no cast needed


> I plan on not touching trunk until the generics code is flushed out
> and brought back to trunk.
>
> Thank you,
> Gary
>
> --
> http://garygregory.wordpress.com/
> http://garygregory.com/
> http://people.apache.org/~ggregory/
> http://twitter.com/GaryGregory
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> For additional commands, e-mail: dev-help@commons.apache.org
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org