You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@commons.apache.org by Gary Gregory <ga...@gmail.com> on 2018/12/13 18:33:45 UTC

[CSV] Records as Lists

Hi All,

I am looking for opinions on turning a CSV record into a list, as opposed
to the minimal current implementation. There would be side-effects like a
record becoming writable instead of read-only as the current implementation.

Memory footprint would also be a concern.

Please see https://github.com/apache/commons-csv/pull/35

Gary

Re: [CSV] Records as Lists

Posted by Gilles <gi...@harfang.homelinux.org>.
On Thu, 13 Dec 2018 11:33:45 -0700, Gary Gregory wrote:
> Hi All,
>
> I am looking for opinions on turning a CSV record into a list, as 
> opposed
> to the minimal current implementation. There would be side-effects 
> like a
> record becoming writable instead of read-only as the current 
> implementation.

IIUC, the patch referred to below does not add those side-effects.

>
> Memory footprint would also be a concern.

The patch does not change that either.

Gilles

>
> Please see https://github.com/apache/commons-csv/pull/35
>
> Gary


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [CSV] Records as Lists

Posted by Gilles <gi...@harfang.homelinux.org>.
Hi.

On Tue, 18 Dec 2018 21:33:42 +0000 (UTC), Bruno P. Kinoshita wrote:
> Hi Gilles!
>
> Sorry, just came back from a long holiday, speaking Portuguese only.
> Semantic, vocabulary, etc, in English is a bit laggy right now.
>
>
>>>On Tue, 18 Dec 2018 07:22:00 +0000 (UTC), Bruno P. Kinoshita wrote:
>>> From what I understood from the previous messages & discussion on
>>> GitHub, it would be more convenient for users to be able to have a
>>> List instead of an Iterable,
>>
>>Why "instead"?
>>The patch makes the class a subclass of "List"; and "List"
>>implements "Iterable".
>
>
> That's right, I meant to say that it would be an AbstractList, and
> not only an Iterable any more.

I think that's the reporter's purpose: get more functionality...

>
>>> However, I think we are delivering an Iterable that's fully capable
>
>>> to be used as an Iterable now. Whereas the proposal would make it a
>>> read-only list, as that returned from unmodifiableList,
>>
>>That's not what the patch does:
>>https://docs.oracle.com/javase/7/docs/api/java/util/AbstractList.html
>
>
> I think tis another semantic/communication issue. From the link 
> above:
>
> "Note that this implementation throws an
> UnsupportedOperationException unless remove(int index) or
> removeRange(int fromIndex, int toIndex) is overridden."
>
>
> That's what I meant. By extending AbstractList, users could now call
> remove and other methods, that would result in the exception above
> (similarly to the list returned via Collections.unmodifiableList).

"remove()" can also be called on the instance return by the
"iterator()" method; and it will throw just the same.

>
>
>>How?
>
>>[In the original code, "toList" is private.]
>
> Oh, good point!!! I wrote the e-mail from memory, a few days after
> reading the code, and after just arriving at home. Mea culpa.
>
>
>>> by accidentally
>>> trying to reuse CSVRecord while reading from one input and writing 
>>> to
>>> an output stream.
>>
>>I don't understand this.
>
>
> That was a contrived example. Say you read the records, transform the
> Iterable into a List, and add/remove/clear columns/rows while
> processing the CSVRecord (e.g. you could be using commons-text and
> other libs to filter/process the entries). Then you get these values
> and create a new CSV.

IIUC, the current code cannot do that because it uses "Arrays.asList"
to return a _view_ of the underlying array (fixed length, remove not
supported).

>
> When users realize they now have a List instead, they could rely on
> the CSVRecord object, instead of creating new Lists, and get some
> runtime errors. Corner case, and - again - a contrived example.

Still not sure I get it; the behaviour does not seem to change apart
from providing more functionality.

>
> But I'm not opposing the change, just don't see the need for using
> AbstractList (though I've been writing Python for the past couple of
> months, so my Java-fu isn't really sharp right now).

The reporter mentioned "subList".

Regards,
Gilles

>
>
> Cheers
> Bruno
>
> ________________________________
> From: Gilles <gi...@harfang.homelinux.org>
> To: dev@commons.apache.org
> Sent: Wednesday, 19 December 2018 1:28 AM
> Subject: Re: [CSV] Records as Lists
>
>
>
> Hi.
>
> On Tue, 18 Dec 2018 07:22:00 +0000 (UTC), Bruno P. Kinoshita wrote:
>> From what I understood from the previous messages & discussion on
>> GitHub, it would be more convenient for users to be able to have a
>> List instead of an Iterable,
>
> Why "instead"?
> The patch makes the class a subclass of "List"; and "List"
> implements "Iterable".
>
>> or instead of having to call the
>> #toList() or convert to a List in some other way.
>> I commented in the pull request, that I don't think there would be a
>> performance penalty in doing so (at least I don't think so, as the
>> values are not streamed, but rather kept in the private values
>> array).
>> However, I think we are delivering an Iterable that's fully capable
>> to be used as an Iterable now. Whereas the proposal would make it a
>> read-only list, as that returned from unmodifiableList,
>
> That's not what the patch does:
>   
> https://docs.oracle.com/javase/7/docs/api/java/util/AbstractList.html
>
>> i.e. throwing
>> exceptions for add/clear/etc operations.
>
> The original code (using "Arrays.asList") could not do those
> operations:
>
> 
> https://docs.oracle.com/javase/7/docs/api/java/util/Arrays.html#asList(T...)
>
>> In my opinion, I prefer to keep it as an Iterable, leave the toList
>> method, as I think current users could be affected
>
> How?
> [In the original code, "toList" is private.]
>
>> by accidentally
>> trying to reuse CSVRecord while reading from one input and writing 
>> to
>> an output stream.
>
> I don't understand this.
>
> Regards,
> Gilles
>
>
>> So I'm -0 for it.
>> Bruno
>>
>>       From: sebb <se...@gmail.com>
>>  To: Commons Developers List <de...@commons.apache.org>
>>  Sent: Tuesday, 18 December 2018 12:24 AM
>>  Subject: Re: [CSV] Records as Lists
>>
>> What is the use-case for using lists?
>>
>> On Thu, 13 Dec 2018 at 18:34, Gary Gregory <ga...@gmail.com>
>> wrote:
>>>
>>> Hi All,
>>>
>>> I am looking for opinions on turning a CSV record into a list, as
>>> opposed
>>> to the minimal current implementation. There would be side-effects
>>> like a
>>> record becoming writable instead of read-only as the current
>>> implementation.
>>>
>>> Memory footprint would also be a concern.
>>>
>>> Please see https://github.com/apache/commons-csv/pull/35
>>>
>>> Gary


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [CSV] Records as Lists

Posted by "Bruno P. Kinoshita" <br...@yahoo.com.br.INVALID>.
Hi Gilles!

Sorry, just came back from a long holiday, speaking Portuguese only. Semantic, vocabulary, etc, in English is a bit laggy right now.


>>On Tue, 18 Dec 2018 07:22:00 +0000 (UTC), Bruno P. Kinoshita wrote:
>> From what I understood from the previous messages & discussion on
>> GitHub, it would be more convenient for users to be able to have a
>> List instead of an Iterable,
>
>Why "instead"?
>The patch makes the class a subclass of "List"; and "List"
>implements "Iterable".


That's right, I meant to say that it would be an AbstractList, and not only an Iterable any more.



>> However, I think we are delivering an Iterable that's fully capable

>> to be used as an Iterable now. Whereas the proposal would make it a
>> read-only list, as that returned from unmodifiableList,
>
>That's not what the patch does:
>https://docs.oracle.com/javase/7/docs/api/java/util/AbstractList.html


I think tis another semantic/communication issue. From the link above:

"Note that this implementation throws an UnsupportedOperationException unless remove(int index) or removeRange(int fromIndex, int toIndex) is overridden."


That's what I meant. By extending AbstractList, users could now call remove and other methods, that would result in the exception above (similarly to the list returned via Collections.unmodifiableList).



>How?

>[In the original code, "toList" is private.]

Oh, good point!!! I wrote the e-mail from memory, a few days after reading the code, and after just arriving at home. Mea culpa.


>> by accidentally
>> trying to reuse CSVRecord while reading from one input and writing to
>> an output stream.
>
>I don't understand this.


That was a contrived example. Say you read the records, transform the Iterable into a List, and add/remove/clear columns/rows while processing the CSVRecord (e.g. you could be using commons-text and other libs to filter/process the entries). Then you get these values and create a new CSV.

When users realize they now have a List instead, they could rely on the CSVRecord object, instead of creating new Lists, and get some runtime errors. Corner case, and - again - a contrived example.

But I'm not opposing the change, just don't see the need for using AbstractList (though I've been writing Python for the past couple of months, so my Java-fu isn't really sharp right now).


Cheers
Bruno

________________________________
From: Gilles <gi...@harfang.homelinux.org>
To: dev@commons.apache.org 
Sent: Wednesday, 19 December 2018 1:28 AM
Subject: Re: [CSV] Records as Lists



Hi.

On Tue, 18 Dec 2018 07:22:00 +0000 (UTC), Bruno P. Kinoshita wrote:
> From what I understood from the previous messages & discussion on
> GitHub, it would be more convenient for users to be able to have a
> List instead of an Iterable,

Why "instead"?
The patch makes the class a subclass of "List"; and "List"
implements "Iterable".

> or instead of having to call the
> #toList() or convert to a List in some other way.
> I commented in the pull request, that I don't think there would be a
> performance penalty in doing so (at least I don't think so, as the
> values are not streamed, but rather kept in the private values 
> array).
> However, I think we are delivering an Iterable that's fully capable
> to be used as an Iterable now. Whereas the proposal would make it a
> read-only list, as that returned from unmodifiableList,

That's not what the patch does:
  https://docs.oracle.com/javase/7/docs/api/java/util/AbstractList.html

> i.e. throwing
> exceptions for add/clear/etc operations.

The original code (using "Arrays.asList") could not do those 
operations:
  
https://docs.oracle.com/javase/7/docs/api/java/util/Arrays.html#asList(T...)

> In my opinion, I prefer to keep it as an Iterable, leave the toList
> method, as I think current users could be affected

How?
[In the original code, "toList" is private.]

> by accidentally
> trying to reuse CSVRecord while reading from one input and writing to
> an output stream.

I don't understand this.

Regards,
Gilles


> So I'm -0 for it.
> Bruno
>
>       From: sebb <se...@gmail.com>
>  To: Commons Developers List <de...@commons.apache.org>
>  Sent: Tuesday, 18 December 2018 12:24 AM
>  Subject: Re: [CSV] Records as Lists
>
> What is the use-case for using lists?
>
> On Thu, 13 Dec 2018 at 18:34, Gary Gregory <ga...@gmail.com> 
> wrote:
>>
>> Hi All,
>>
>> I am looking for opinions on turning a CSV record into a list, as 
>> opposed
>> to the minimal current implementation. There would be side-effects 
>> like a
>> record becoming writable instead of read-only as the current 
>> implementation.
>>
>> Memory footprint would also be a concern.
>>
>> Please see https://github.com/apache/commons-csv/pull/35
>>
>> Gary
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> For additional commands, e-mail: dev-help@commons.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [CSV] Records as Lists

Posted by Gilles <gi...@harfang.homelinux.org>.
Hi.

On Tue, 18 Dec 2018 07:22:00 +0000 (UTC), Bruno P. Kinoshita wrote:
> From what I understood from the previous messages & discussion on
> GitHub, it would be more convenient for users to be able to have a
> List instead of an Iterable,

Why "instead"?
The patch makes the class a subclass of "List"; and "List"
implements "Iterable".

> or instead of having to call the
> #toList() or convert to a List in some other way.
> I commented in the pull request, that I don't think there would be a
> performance penalty in doing so (at least I don't think so, as the
> values are not streamed, but rather kept in the private values 
> array).
> However, I think we are delivering an Iterable that's fully capable
> to be used as an Iterable now. Whereas the proposal would make it a
> read-only list, as that returned from unmodifiableList,

That's not what the patch does:
   https://docs.oracle.com/javase/7/docs/api/java/util/AbstractList.html

> i.e. throwing
> exceptions for add/clear/etc operations.

The original code (using "Arrays.asList") could not do those 
operations:
   
https://docs.oracle.com/javase/7/docs/api/java/util/Arrays.html#asList(T...)

> In my opinion, I prefer to keep it as an Iterable, leave the toList
> method, as I think current users could be affected

How?
[In the original code, "toList" is private.]

> by accidentally
> trying to reuse CSVRecord while reading from one input and writing to
> an output stream.

I don't understand this.

Regards,
Gilles


> So I'm -0 for it.
> Bruno
>
>       From: sebb <se...@gmail.com>
>  To: Commons Developers List <de...@commons.apache.org>
>  Sent: Tuesday, 18 December 2018 12:24 AM
>  Subject: Re: [CSV] Records as Lists
>
> What is the use-case for using lists?
>
> On Thu, 13 Dec 2018 at 18:34, Gary Gregory <ga...@gmail.com> 
> wrote:
>>
>> Hi All,
>>
>> I am looking for opinions on turning a CSV record into a list, as 
>> opposed
>> to the minimal current implementation. There would be side-effects 
>> like a
>> record becoming writable instead of read-only as the current 
>> implementation.
>>
>> Memory footprint would also be a concern.
>>
>> Please see https://github.com/apache/commons-csv/pull/35
>>
>> Gary
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> For additional commands, e-mail: dev-help@commons.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Re: [CSV] Records as Lists

Posted by "Bruno P. Kinoshita" <br...@yahoo.com.br.INVALID>.
From what I understood from the previous messages & discussion on GitHub, it would be more convenient for users to be able to have a List instead of an Iterable, or instead of having to call the #toList() or convert to a List in some other way.
I commented in the pull request, that I don't think there would be a performance penalty in doing so (at least I don't think so, as the values are not streamed, but rather kept in the private values array).
However, I think we are delivering an Iterable that's fully capable to be used as an Iterable now. Whereas the proposal would make it a read-only list, as that returned from unmodifiableList, i.e. throwing exceptions for add/clear/etc operations.
In my opinion, I prefer to keep it as an Iterable, leave the toList method, as I think current users could be affected by accidentally trying to reuse CSVRecord while reading from one input and writing to an output stream.
So I'm -0 for it.
Bruno

      From: sebb <se...@gmail.com>
 To: Commons Developers List <de...@commons.apache.org> 
 Sent: Tuesday, 18 December 2018 12:24 AM
 Subject: Re: [CSV] Records as Lists
   
What is the use-case for using lists?

On Thu, 13 Dec 2018 at 18:34, Gary Gregory <ga...@gmail.com> wrote:
>
> Hi All,
>
> I am looking for opinions on turning a CSV record into a list, as opposed
> to the minimal current implementation. There would be side-effects like a
> record becoming writable instead of read-only as the current implementation.
>
> Memory footprint would also be a concern.
>
> Please see https://github.com/apache/commons-csv/pull/35
>
> Gary

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org



   

Re: [CSV] Records as Lists

Posted by sebb <se...@gmail.com>.
What is the use-case for using lists?

On Thu, 13 Dec 2018 at 18:34, Gary Gregory <ga...@gmail.com> wrote:
>
> Hi All,
>
> I am looking for opinions on turning a CSV record into a list, as opposed
> to the minimal current implementation. There would be side-effects like a
> record becoming writable instead of read-only as the current implementation.
>
> Memory footprint would also be a concern.
>
> Please see https://github.com/apache/commons-csv/pull/35
>
> Gary

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org