Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/08/06 21:23:22 UTC
[GitHub] [arrow] westonpace opened a new pull request #10897: ARROW-13580: [C++] quoted_strings_can_be_null only applied to string columns
westonpace opened a new pull request #10897:
URL: https://github.com/apache/arrow/pull/10897
Allow quoted nulls if quoted_strings_can_be_null is true
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] pitrou commented on pull request #10897: ARROW-13580: [C++] quoted_strings_can_be_null only applied to string columns
Posted by GitBox <gi...@apache.org>.
pitrou commented on pull request #10897:
URL: https://github.com/apache/arrow/pull/10897#issuecomment-908556559
AppVeyor build on my fork: https://ci.appveyor.com/project/pitrou/arrow/builds/40577420
[GitHub] [arrow] westonpace edited a comment on pull request #10897: ARROW-13580: [C++] quoted_strings_can_be_null only applied to string columns
Posted by GitBox <gi...@apache.org>.
westonpace edited a comment on pull request #10897:
URL: https://github.com/apache/arrow/pull/10897#issuecomment-895450555
Ah, if this is a behavior change then I can update that docstring as well. I'll add, in support of my argument, that we parse quoted non-nulls as integers. For example:
```
>>> pyarrow.csv.read_csv(io.BytesIO(b'A\n"1"\n"2"\n'))
pyarrow.Table
A: int64
```
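As a rough analogy for the point above (quotes are stripped before type conversion, so a quoted `"1"` still converts to an integer), here is a minimal sketch using Python's standard `csv` module. It is not Arrow's implementation, and `read_int_column` is a hypothetical helper written only for this illustration.

```python
import csv
import io


def read_int_column(text):
    """Parse a one-column CSV and convert every data cell to int.

    Hypothetical helper for illustration; it only mimics the idea that
    quoting is removed by the parser before values are converted.
    """
    rows = list(csv.reader(io.StringIO(text)))
    header, body = rows[0], rows[1:]
    # csv.reader has already dropped the surrounding quotes here,
    # so '"1"' arrives as '1' and int() accepts it.
    return header[0], [int(row[0]) for row in body]


name, values = read_int_column('A\n"1"\n"2"\n')
print(name, values)  # prints: A [1, 2]
```

If quoted cells already flow through numeric conversion like this, it seems consistent for quoted null markers to be considered for numeric columns too, which is the argument being made in the comment.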
[GitHub] [arrow] github-actions[bot] commented on pull request #10897: ARROW-13580: [C++] quoted_strings_can_be_null only applied to string columns
Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #10897:
URL: https://github.com/apache/arrow/pull/10897#issuecomment-894525563
https://issues.apache.org/jira/browse/ARROW-13580
[GitHub] [arrow] kkraus14 commented on pull request #10897: ARROW-13580: [C++] quoted_strings_can_be_null only applied to string columns
Posted by GitBox <gi...@apache.org>.
kkraus14 commented on pull request #10897:
URL: https://github.com/apache/arrow/pull/10897#issuecomment-895417384
Agreed that it's beneficial to change the behavior and docstring to allow for treating empty quoted strings as null in the case of numeric columns.
Bigger picture, CSVs are a mess: ideally we should allow controlling options like `null_values`, `true_values`, `false_values`, and `quoted_strings_can_be_null` on a per-column level in addition to globally.
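To make the per-column idea concrete, here is a minimal sketch of how global defaults could be combined with per-column overrides. Everything in it is hypothetical: `NullRules`, `ColumnAwareOptions`, and `rules_for` are names invented for this illustration and are not part of the Arrow or pyarrow API.

```python
from dataclasses import dataclass, field, replace


@dataclass(frozen=True)
class NullRules:
    # Hypothetical subset of the options named in the comment above.
    null_values: frozenset = frozenset({""})
    quoted_strings_can_be_null: bool = True


@dataclass
class ColumnAwareOptions:
    # Global defaults, used for any column without an override.
    defaults: NullRules = field(default_factory=NullRules)
    # Maps a column name to a dict of fields that replace the defaults.
    per_column: dict = field(default_factory=dict)

    def rules_for(self, column):
        """Return the effective rules for one column."""
        overrides = self.per_column.get(column, {})
        return replace(self.defaults, **overrides)


opts = ColumnAwareOptions(
    per_column={
        "price": {"null_values": frozenset({"", "N/A"})},
        "note": {"quoted_strings_can_be_null": False},
    }
)
print(opts.rules_for("price"))
print(opts.rules_for("other"))
```

The design choice here is that an override only needs to name the fields it changes, so the global settings remain the single source of defaults.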
[GitHub] [arrow] westonpace commented on pull request #10897: ARROW-13580: [C++] quoted_strings_can_be_null only applied to string columns
Posted by GitBox <gi...@apache.org>.
westonpace commented on pull request #10897:
URL: https://github.com/apache/arrow/pull/10897#issuecomment-895450555
Ah, if this is a behavior change then I can update that docstring as well. I'll add, in support of my argument, that we parsed quoted non-nulls as integers. For example:
```
>>> pyarrow.csv.read_csv(io.BytesIO(b'A\n"1"\n"2"\n'))
pyarrow.Table
A: int64
```
[GitHub] [arrow] pitrou closed pull request #10897: ARROW-13580: [C++] quoted_strings_can_be_null only applied to string columns
Posted by GitBox <gi...@apache.org>.
pitrou closed pull request #10897:
URL: https://github.com/apache/arrow/pull/10897
[GitHub] [arrow] pitrou commented on pull request #10897: ARROW-13580: [C++] quoted_strings_can_be_null only applied to string columns
Posted by GitBox <gi...@apache.org>.
pitrou commented on pull request #10897:
URL: https://github.com/apache/arrow/pull/10897#issuecomment-894645853
Well, the docstring looks quite clear to me:
```c++
/// Whether string / binary columns can have quoted null values.
///
/// If true *and* `strings_can_be_null` is true, then quoted strings in
/// "null_values" are also considered null for string columns. Otherwise,
/// quoted strings are never considered null.
```
Perhaps it's beneficial to change the behaviour?
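For readers comparing the two behaviours, here is a small Python model of the rule in the docstring quoted above, next to one possible relaxed rule in which the quoted flag also applies to non-string columns. This is a sketch under assumptions, not the Arrow C++ code: `NullOptions`, `is_null_documented`, and `is_null_proposed` are hypothetical names, the default `null_values` set is made up, and the model assumes that unquoted cells listed in `null_values` are null for non-string columns.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NullOptions:
    # Hypothetical stand-in for the relevant conversion options.
    null_values: frozenset = frozenset({"", "NULL", "NaN"})
    strings_can_be_null: bool = False
    quoted_strings_can_be_null: bool = True


def is_null_documented(cell, quoted, is_string_column, opts):
    """Null test following the docstring as written."""
    if cell not in opts.null_values:
        return False
    if is_string_column:
        if not opts.strings_can_be_null:
            return False
        # Quoted cells count only when the quoted flag is also set.
        return (not quoted) or opts.quoted_strings_can_be_null
    # Non-string column: quoted strings are never considered null.
    return not quoted


def is_null_proposed(cell, quoted, is_string_column, opts):
    """Relaxed variant: the quoted flag applies to every column type."""
    if cell not in opts.null_values:
        return False
    if is_string_column and not opts.strings_can_be_null:
        return False
    return (not quoted) or opts.quoted_strings_can_be_null


opts = NullOptions(strings_can_be_null=True)
# A quoted empty cell in a numeric column is where the two rules differ.
print(is_null_documented("", True, False, opts))  # prints: False
print(is_null_proposed("", True, False, opts))    # prints: True
```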