Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/08/06 21:23:22 UTC

[GitHub] [arrow] westonpace opened a new pull request #10897: ARROW-13580: [C++] quoted_strings_can_be_null only applied to string columns

westonpace opened a new pull request #10897:
URL: https://github.com/apache/arrow/pull/10897


   Allow quoted nulls in columns of any type, not only string columns, if `quoted_strings_can_be_null` is true.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou commented on pull request #10897: ARROW-13580: [C++] quoted_strings_can_be_null only applied to string columns

Posted by GitBox <gi...@apache.org>.
pitrou commented on pull request #10897:
URL: https://github.com/apache/arrow/pull/10897#issuecomment-908556559


   AppVeyor build on my fork: https://ci.appveyor.com/project/pitrou/arrow/builds/40577420





[GitHub] [arrow] westonpace edited a comment on pull request #10897: ARROW-13580: [C++] quoted_strings_can_be_null only applied to string columns

Posted by GitBox <gi...@apache.org>.
westonpace edited a comment on pull request #10897:
URL: https://github.com/apache/arrow/pull/10897#issuecomment-895450555


   Ah, if this is a behavior change then I can update that docstring as well. I'll add, in support of my argument, that we parse quoted non-null values as integers. For example:
   
   ```
   >>> import io
   >>> import pyarrow.csv
   >>> pyarrow.csv.read_csv(io.BytesIO(b'A\n"1"\n"2"\n'))
   pyarrow.Table
   A: int64
   ```
   
   





[GitHub] [arrow] github-actions[bot] commented on pull request #10897: ARROW-13580: [C++] quoted_strings_can_be_null only applied to string columns

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #10897:
URL: https://github.com/apache/arrow/pull/10897#issuecomment-894525563


   https://issues.apache.org/jira/browse/ARROW-13580





[GitHub] [arrow] kkraus14 commented on pull request #10897: ARROW-13580: [C++] quoted_strings_can_be_null only applied to string columns

Posted by GitBox <gi...@apache.org>.
kkraus14 commented on pull request #10897:
URL: https://github.com/apache/arrow/pull/10897#issuecomment-895417384


   Agreed that it's beneficial to change the behavior and docstring to allow for treating empty quoted strings as null in the case of numeric columns.
   
   Bigger picture, CSVs are a mess, and ideally we should allow options like `null_values`, `true_values`, `false_values`, and `quoted_strings_can_be_null` to be controlled per column in addition to globally.
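
As a rough illustration of what per-column control could look like, the sketch below layers optional per-column overrides on top of global defaults. Nothing here is an existing Arrow API: `NullOptions`, `resolve_options`, the default value sets, and the column names are all hypothetical and chosen only for the example.

```python
from dataclasses import dataclass, replace
from typing import Dict, FrozenSet


@dataclass(frozen=True)
class NullOptions:
    """Hypothetical bundle of the options named above (not an Arrow class)."""
    null_values: FrozenSet[str] = frozenset({"", "NA"})
    true_values: FrozenSet[str] = frozenset({"true", "1"})
    false_values: FrozenSet[str] = frozenset({"false", "0"})
    quoted_strings_can_be_null: bool = True


def resolve_options(column: str,
                    global_opts: NullOptions,
                    per_column: Dict[str, dict]) -> NullOptions:
    """Return the global options with any overrides for `column` applied."""
    return replace(global_opts, **per_column.get(column, {}))


global_opts = NullOptions()
# Assumed overrides: only the listed fields differ from the global defaults.
per_column = {
    "notes": {"quoted_strings_can_be_null": False},
    "flag": {"true_values": frozenset({"Y"}), "false_values": frozenset({"N"})},
}

print(resolve_options("notes", global_opts, per_column).quoted_strings_can_be_null)  # False
print(resolve_options("id", global_opts, per_column).quoted_strings_can_be_null)     # True
```

Columns without an entry fall back to the global settings, so existing callers that only set global options would see no change under this design.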





[GitHub] [arrow] westonpace commented on pull request #10897: ARROW-13580: [C++] quoted_strings_can_be_null only applied to string columns

Posted by GitBox <gi...@apache.org>.
westonpace commented on pull request #10897:
URL: https://github.com/apache/arrow/pull/10897#issuecomment-895450555


   Ah, if this is a behavior change then I can update that docstring as well. I'll add, in support of my argument, that we parse quoted non-null values as integers. For example:
   
   ```
   >>> import io
   >>> import pyarrow.csv
   >>> pyarrow.csv.read_csv(io.BytesIO(b'A\n"1"\n"2"\n'))
   pyarrow.Table
   A: int64
   ```
   
   





[GitHub] [arrow] pitrou closed pull request #10897: ARROW-13580: [C++] quoted_strings_can_be_null only applied to string columns

Posted by GitBox <gi...@apache.org>.
pitrou closed pull request #10897:
URL: https://github.com/apache/arrow/pull/10897


   





[GitHub] [arrow] pitrou commented on pull request #10897: ARROW-13580: [C++] quoted_strings_can_be_null only applied to string columns

Posted by GitBox <gi...@apache.org>.
pitrou commented on pull request #10897:
URL: https://github.com/apache/arrow/pull/10897#issuecomment-894645853


   Well, the docstring looks quite clear to me:
   ```c++
     /// Whether string / binary columns can have quoted null values.
     ///
     /// If true *and* `strings_can_be_null` is true, then quoted strings in
     /// "null_values" are also considered null for string columns.  Otherwise,
     /// quoted strings are never considered null.
   ```
   
   Perhaps it's beneficial to change the behaviour?
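
To make the distinction concrete, here is a minimal pure-Python sketch of the rule as the docstring above describes it, next to the relaxed rule this PR proposes. It is not Arrow's implementation: `is_null_documented`, `is_null_proposed`, and the `NULL_VALUES` set are hypothetical names made up for illustration, and the handling of unquoted values is an assumption based on a plain reading of `strings_can_be_null`.

```python
# Illustrative model only; not Arrow's C++ code. All names are hypothetical.
NULL_VALUES = {"", "NA", "null"}  # assumed null spellings for the example


def is_null_documented(value, quoted, is_string_column,
                       strings_can_be_null=False,
                       quoted_strings_can_be_null=True):
    """Null test following the docstring quoted above."""
    if value not in NULL_VALUES:
        return False
    if is_string_column:
        if not strings_can_be_null:
            return False
        return (not quoted) or quoted_strings_can_be_null
    # Non-string column: quoted values are never considered null.
    return not quoted


def is_null_proposed(value, quoted, is_string_column,
                     strings_can_be_null=False,
                     quoted_strings_can_be_null=True):
    """Same test, but the quoted flag is honoured for every column type."""
    if value not in NULL_VALUES:
        return False
    if quoted and not quoted_strings_can_be_null:
        return False
    if is_string_column:
        return strings_can_be_null
    return True


# A quoted empty cell ("") in an integer column.
print(is_null_documented("", quoted=True, is_string_column=False))  # False
print(is_null_proposed("", quoted=True, is_string_column=False))    # True
```

In this model the two rules agree for string columns and for unquoted cells; they differ only for quoted null spellings in non-string columns, which is the case the PR title describes.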




