Posted to dev@commons.apache.org by Gilles Sadowski <gi...@gmail.com> on 2022/01/11 13:41:17 UTC

Re: [commons-io] branch master updated: Add CharsetEncoders.

On Tue, Jan 11, 2022 at 14:21, <gg...@apache.org> wrote:
>
> This is an automated email from the ASF dual-hosted git repository.
>
> ggregory pushed a commit to branch master
> in repository https://gitbox.apache.org/repos/asf/commons-io.git
>
>
> The following commit(s) were added to refs/heads/master by this push:
>      new 7ffb81b  Add CharsetEncoders.
> 7ffb81b is described below
>
> commit 7ffb81b956ff2c148cb27a1b05635a9ea7a98a6d
> Author: Gary Gregory <ga...@gmail.com>
> AuthorDate: Tue Jan 11 08:21:54 2022 -0500
>
>     Add CharsetEncoders.
> ---
>  src/changes/changes.xml                            |  3 ++
>  .../apache/commons/io/charset/CharsetEncoders.java | 36 +++++++++++++++
>  .../apache/commons/io/charset/package-info.java    | 22 +++++++++
>  .../commons/io/charset/CharsetEncodersTest.java    | 54 ++++++++++++++++++++++
>  4 files changed, 115 insertions(+)
>
> diff --git a/src/changes/changes.xml b/src/changes/changes.xml
> index 5fd6789..c6d4778 100644
> --- a/src/changes/changes.xml
> +++ b/src/changes/changes.xml
> @@ -293,6 +293,9 @@ The <action> type attribute can be add,update,fix,remove.
>        <action dev="ggregory" type="add" due-to="Gary Gregory">
>          Add and reuse IOConsumer.forEach(T[], IOConsumer) and forEachIndexed(Stream, IOConsumer).
>        </action>
> +      <action dev="ggregory" type="add" due-to="Gary Gregory">
> +        Add CharsetEncoders.
> +      </action>
>        <!-- UPDATE -->
>        <action dev="ggregory" type="add" due-to="Gary Gregory">
>          Update FileEntry to use FileTime instead of long for file time stamps.
> diff --git a/src/main/java/org/apache/commons/io/charset/CharsetEncoders.java b/src/main/java/org/apache/commons/io/charset/CharsetEncoders.java
> new file mode 100644
> index 0000000..815aaef
> --- /dev/null
> +++ b/src/main/java/org/apache/commons/io/charset/CharsetEncoders.java
> @@ -0,0 +1,36 @@
> +/*
> + * Licensed to the Apache Software Foundation (ASF) under one or more
> + * contributor license agreements.  See the NOTICE file distributed with
> + * this work for additional information regarding copyright ownership.
> + * The ASF licenses this file to You under the Apache License, Version 2.0
> + * (the "License"); you may not use this file except in compliance with
> + * the License.  You may obtain a copy of the License at
> + *
> + *      http://www.apache.org/licenses/LICENSE-2.0
> + *
> + * Unless required by applicable law or agreed to in writing, software
> + * distributed under the License is distributed on an "AS IS" BASIS,
> + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
> + * See the License for the specific language governing permissions and
> + * limitations under the License.
> + */
> +
> +package org.apache.commons.io.charset;
> +
> +import java.nio.charset.Charset;
> +import java.nio.charset.CharsetEncoder;
> +
> +public class CharsetEncoders {
> +
> +    /**
> +     * Returns the given non-null CharsetEncoder or a new default CharsetEncoder.
> +     *
> +     * @param charsetEncoder The CharsetEncoder to test.
> +     * @return the given non-null CharsetEncoder or a new default CharsetEncoder.
> +     * @since 2.12.0
> +     */
> +    public static CharsetEncoder toCharsetEncoder(CharsetEncoder charsetEncoder) {
> +        return charsetEncoder != null ? charsetEncoder : Charset.defaultCharset().newEncoder();
> +    }

What's the use-case for such a function?

void userFunction(CharsetEncoder charsetEncoder) { /* ... */ }

Not using Commons IO:
---CUT---
  userFunction(csEnc == null ? Charset.defaultCharset().newEncoder() : csEnc);
---CUT---
vs using Commons IO:
---CUT---
  userFunction(CharsetEncoders.toCharsetEncoder(csEnc));
---CUT---

IMO, the former call is clearer (self-documenting) and safer (explicit request
for a default).
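
To make the comparison concrete, here is a minimal, self-contained sketch of the two call styles. It is not part of the commit: the class name EncoderCallSites and the body of userFunction are hypothetical, and toCharsetEncoder is copied locally from the diff above so the example compiles without Commons IO.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

final class EncoderCallSites {

    // Local copy of the helper from the diff above (assumption: same behavior as the committed method).
    static CharsetEncoder toCharsetEncoder(CharsetEncoder charsetEncoder) {
        return charsetEncoder != null ? charsetEncoder : Charset.defaultCharset().newEncoder();
    }

    // Hypothetical consumer: reports which charset the encoder targets.
    static String userFunction(CharsetEncoder charsetEncoder) {
        return charsetEncoder.charset().name();
    }

    public static void main(String[] args) {
        CharsetEncoder csEnc = null; // the caller has no encoder at hand

        // Explicit style: the fallback is visible at the call site.
        String explicit = userFunction(csEnc == null ? Charset.defaultCharset().newEncoder() : csEnc);

        // Helper style: same result, but the fallback is hidden behind the method name.
        String viaHelper = userFunction(toCharsetEncoder(csEnc));

        // Both styles name the platform default charset.
        System.out.println(explicit.equals(viaHelper));

        // A non-null encoder is passed through untouched.
        CharsetEncoder utf8 = StandardCharsets.UTF_8.newEncoder();
        System.out.println(toCharsetEncoder(utf8) == utf8);
    }
}
```

Both styles behave identically; the difference under discussion is only whether the reader of the call site can see that a default is being requested.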

> [...]

Gilles

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [commons-io] branch master updated: Add CharsetEncoders.

Posted by Gilles Sadowski <gi...@gmail.com>.
Hello.

On Tue, Jan 11, 2022 at 16:05, Gary Gregory <ga...@gmail.com> wrote:
>
> On Tue, Jan 11, 2022 at 9:48 AM Gilles Sadowski <gi...@gmail.com>
> wrote:
>
> > Hello.
> >
> > On Tue, Jan 11, 2022 at 15:22, Gary Gregory <ga...@gmail.com> wrote:
> > >
> > > Hello Gilles and Happy New Year,
> >
> > Thanks. Best wishes to you, and to all contributors and reviewers.
> >
> > >
> > > We have most input streams, output streams, readers, and writers in Commons
> > > IO that already convert null Charset names and null Charset to the platform
> > > default through convenience APIs; but we do not do this consistently
> > > everywhere, and not for CharsetEncoder and CharsetDecoder.
> > >
> > > I aim to normalize this behavior to be more consistent. See the commits
> > > since this one.
> >
> > I do not have the overall picture of what is required for "consistency".
> > However, does the public API allow a "null" argument passed from
> > user code being silently turned into a non-null default value, thus
> > potentially hiding a programming error?
>
>
> The convenience allows for default behavior to kick in for call sites
> without having to jump through hoops when some input is unspecified (null)
> or you want a way to say "give me the default behavior" by passing a null
> or an actual default object.
>
> [...]

I get the "convenience" part, but it is not an answer to my question
(that goes beyond that specific commit): If a user passes "null" by
mistake, is it better that a low-level library raises NPE, or arbitrarily
chooses to do something that may not be expected by the caller?

IOW, I think that the application is where "null" could be turned into
a meaningful/safe default value.
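
The two policies being contrasted can be sketched as follows. This is an illustration only, not Commons IO code: the class NullPolicy and the method names strict and lenient are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.util.Objects;

final class NullPolicy {

    // Fail-fast: a null argument is reported as a programming error.
    static CharsetEncoder strict(CharsetEncoder encoder) {
        return Objects.requireNonNull(encoder, "encoder");
    }

    // Lenient: a null argument silently becomes the platform default.
    static CharsetEncoder lenient(CharsetEncoder encoder) {
        return encoder != null ? encoder : Charset.defaultCharset().newEncoder();
    }

    public static void main(String[] args) {
        CharsetEncoder forgotten = null; // e.g. a field that was never initialized

        // The lenient policy carries on; the charset printed depends on the platform.
        System.out.println(lenient(forgotten).charset());

        // The strict policy surfaces the mistake at the point of the call.
        try {
            strict(forgotten);
        } catch (NullPointerException e) {
            System.out.println("NPE: " + e.getMessage());
        }
    }
}
```

With the strict policy, the application remains free to choose its own default before calling the library; with the lenient policy, the library makes that choice for it.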

Perhaps I missed that this type of "convenience" is the purpose of
Commons IO (i.e. users are aware that there is such a default
behaviour).

Gilles

>>> [...]



Re: [commons-io] branch master updated: Add CharsetEncoders.

Posted by Gary Gregory <ga...@gmail.com>.
On Tue, Jan 11, 2022 at 9:48 AM Gilles Sadowski <gi...@gmail.com>
wrote:

> Hello.
>
> On Tue, Jan 11, 2022 at 15:22, Gary Gregory <ga...@gmail.com> wrote:
> >
> > Hello Gilles and Happy New Year,
>
> Thanks. Best wishes to you, and to all contributors and reviewers.
>
> >
> > We have most input streams, output streams, readers, and writers in Commons
> > IO that already convert null Charset names and null Charset to the platform
> > default through convenience APIs; but we do not do this consistently
> > everywhere, and not for CharsetEncoder and CharsetDecoder.
> >
> > I aim to normalize this behavior to be more consistent. See the commits
> > since this one.
>
> I do not have the overall picture of what is required for "consistency".
> However, does the public API allow a "null" argument passed from
> user code being silently turned into a non-null default value, thus
> potentially hiding a programming error?


The convenience allows for default behavior to kick in for call sites
without having to jump through hoops when some input is unspecified (null)
or you want a way to say "give me the default behavior" by passing a null
or an actual default object. For example:

Foo foo = some Foo or null;

The convenient: new Something(foo);

The not convenient #1: return foo == null ? new Something(Foo.getDefault()) : new Something(foo);
The not convenient #2: return foo == null ? new Something() : new Something(foo);
The not convenient #3: new Something(foo == null ? Foo.getDefault() : foo);

Rinse and repeat for many call sites.
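
As a runnable illustration of the "convenient" variant, here is a minimal sketch in which Charset plays the role of Foo. The class Something is hypothetical and only mirrors the pattern described above; it is not an actual Commons IO class.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class Something {

    private final Charset charset;

    // Null means "use the platform default", decided once here
    // instead of with a ternary at every call site.
    Something(Charset charset) {
        this.charset = charset != null ? charset : Charset.defaultCharset();
    }

    Charset charset() {
        return charset;
    }

    public static void main(String[] args) {
        Charset foo = null; // some Charset, or null

        // The call site stays the same whether or not foo is null.
        System.out.println(new Something(foo).charset());
        System.out.println(new Something(StandardCharsets.UTF_8).charset());
    }
}
```

The design choice is to centralize the null handling in the constructor, so that callers holding an optional value do not each repeat the fallback logic.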

HTH,
Gary


Re: [commons-io] branch master updated: Add CharsetEncoders.

Posted by Gilles Sadowski <gi...@gmail.com>.
Hello.

On Tue, Jan 11, 2022 at 15:22, Gary Gregory <ga...@gmail.com> wrote:
>
> Hello Gilles and Happy New Year,

Thanks. Best wishes to you, and to all contributors and reviewers.

>
> We have most input streams, output streams, readers, and writers in Commons
> IO that already convert null Charset names and null Charset to the platform
> default through convenience APIs; but we do not do this consistently
> everywhere, and not for CharsetEncoder and CharsetDecoder.
>
> I aim to normalize this behavior to be more consistent. See the commits
> since this one.

I do not have the overall picture of what is required for "consistency".
However, does the public API allow a "null" argument passed from
user code to be silently turned into a non-null default value, thus
potentially hiding a programming error?

>
> I find it much simpler to maintain, document, and explain the code base with
> these new methods.

If it is for internal consistency (of Commons IO), shouldn't such methods
(and utility classes) be defined in an "internal" package (or be private)?

>
> Of course, you are most welcome to keep writing ternary expressions in your
> call sites ;-)

Depending on the answer to the first question above, the issue I'd see
is that user code, instead of raising NPE consistently, could behave
differently on different platforms (due to different defaults).
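
A small sketch of why the default matters: the same text encodes to different bytes depending on the charset, so a null that silently becomes the platform default can yield different output on different machines. The class DefaultCharsetDrift and its encode method are hypothetical and only serve this illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class DefaultCharsetDrift {

    // Encodes text with the given charset, or with the platform default when charset is null.
    static byte[] encode(String text, Charset charset) {
        return text.getBytes(charset != null ? charset : Charset.defaultCharset());
    }

    public static void main(String[] args) {
        String text = "\u00e9"; // LATIN SMALL LETTER E WITH ACUTE

        System.out.println(encode(text, StandardCharsets.UTF_8).length);      // 2 bytes
        System.out.println(encode(text, StandardCharsets.ISO_8859_1).length); // 1 byte

        // With null, the result depends on the JVM's default charset.
        System.out.println(encode(text, null).length);
    }
}
```

An NPE on the null argument would instead be the same on every platform, which is the consistency referred to above.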

Regards,
Gilles



Re: [commons-io] branch master updated: Add CharsetEncoders.

Posted by Gary Gregory <ga...@gmail.com>.
Hello Gilles and Happy New Year,

We have most input streams, output streams, readers, and writers in Commons
IO that already convert null Charset names and null Charset to the platform
default through convenience APIs; but we do not do this consistently
everywhere, and not for CharsetEncoder and CharsetDecoder.

I aim to normalize this behavior to be more consistent. See the commits
since this one.

I find it much simpler to maintain, document, and explain the code base with
these new methods.

Of course, you are most welcome to keep writing ternary expressions in your
call sites ;-)

Gary


Re: [commons-io] branch master updated: Add CharsetEncoders.

Posted by Rob Tompkins <ch...@gmail.com>.

> On Jan 11, 2022, at 8:41 AM, Gilles Sadowski <gi...@gmail.com> wrote:
> 
> On Tue, Jan 11, 2022 at 14:21, <ggregory@apache.org> wrote:
>> [...]
> 
> What's the use-case for such a function?
> 
> void userFunction(CharsetEncoder charsetEncoder) { /* ... */ }
> 
> Not using Commons IO:
> ---CUT---
>  userFunction(csEnc == null ? Charset.defaultCharset().newEncoder() : csEnc);
> ---CUT---
> vs using Commons IO:
> ---CUT---
>  userFunction(CharsetEncoders.toCharsetEncoder(csEnc));
> ---CUT---
> 
> IMO, the former call is clearer (self-documenting) and safer (explicit request
> for a default).


If things are unclear in the code, why not check in comments explaining why we’re doing what we’re doing? I’ve found that comments about what’s going on are over the top, but "why" comments are of great value.

-Rob

> 
>> [...]
> 
> Gilles