Posted to github@arrow.apache.org by "haohuaijin (via GitHub)" <gi...@apache.org> on 2023/10/16 10:48:37 UTC

[I] Should we make blank values and empty string to `None` in csv? [arrow-rs]

haohuaijin opened a new issue, #4939:
URL: https://github.com/apache/arrow-rs/issues/4939

   **Which part is this question about**
   arrow-csv
   https://github.com/apache/arrow-rs/blob/31bc84c91e7d6c509443f6e73bda0df32a0a5cba/arrow-csv/src/reader/mod.rs#L792-L795
   
   **Describe your question**
   Related to https://github.com/apache/arrow-datafusion/issues/7797
   In the current implementation, both blank values and empty strings are read as empty strings.
   But in [Spark](https://mrpowers.medium.com/sparks-treatment-of-empty-strings-and-null-values-in-csv-files-80748893451f), blank values and empty strings are both treated as NULL. Should we also read blank values and empty strings as NULL?
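
To make the two behaviours concrete, here is a minimal sketch using only the Rust standard library. It is not arrow-csv code: `string_field_to_option` and the `is_null` predicate are hypothetical names introduced for illustration, with the predicate standing in for whatever null pattern the reader is configured with. It also assumes the CSV parser hands over a blank value and a quoted empty string as the same empty field.

```rust
/// Hypothetical helper, not part of arrow-csv: turns one already-parsed
/// field of a string column into an optional value. `is_null` stands in
/// for the reader's null pattern.
fn string_field_to_option<'a>(
    field: &'a str,
    is_null: impl Fn(&str) -> bool,
) -> Option<&'a str> {
    if is_null(field) {
        None
    } else {
        Some(field)
    }
}

fn main() {
    // Assumed parser output for the row `a,,""`: the blank value and the
    // quoted empty string both arrive as an empty field.
    let fields = ["a", "", ""];

    // Behaviour described in this issue: string columns never become null.
    let current: Vec<Option<&str>> = fields
        .iter()
        .map(|&f| string_field_to_option(f, |_| false))
        .collect();

    // Behaviour asked about: empty fields in string columns become null.
    let proposed: Vec<Option<&str>> = fields
        .iter()
        .map(|&f| string_field_to_option(f, |s| s.is_empty()))
        .collect();

    println!("current:  {:?}", current);
    println!("proposed: {:?}", proposed);
}
```

The only difference between the two runs is whether the null check is applied to string columns at all, which is the question raised here.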
   
   **Additional context**
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [I] Should we make blank values and empty string to `None` in csv? [arrow-rs]

Posted by "tustvold (via GitHub)" <gi...@apache.org>.
tustvold commented on issue #4939:
URL: https://github.com/apache/arrow-rs/issues/4939#issuecomment-1768076363

   `label_issue.py` automatically added labels {'arrow'} from #4942




Re: [I] Should we make blank values and empty string to `None` in csv? [arrow-rs]

Posted by "tustvold (via GitHub)" <gi...@apache.org>.
tustvold closed issue #4939: Should we make blank values and empty string to `None` in csv?
URL: https://github.com/apache/arrow-rs/issues/4939




Re: [I] Should we make blank values and empty string to `None` in csv? [arrow-rs]

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on issue #4939:
URL: https://github.com/apache/arrow-rs/issues/4939#issuecomment-1764753835

   This also makes sense to me -- thank you @haohuaijin 




Re: [I] Should we make blank values and empty string to `None` in csv? [arrow-rs]

Posted by "tustvold (via GitHub)" <gi...@apache.org>.
tustvold commented on issue #4939:
URL: https://github.com/apache/arrow-rs/issues/4939#issuecomment-1764249630

   Making the null_regex apply to string columns makes sense to me; I honestly thought that was already the case.




Re: [I] Should we make blank values and empty string to `None` in csv? [arrow-rs]

Posted by "haohuaijin (via GitHub)" <gi...@apache.org>.
haohuaijin commented on issue #4939:
URL: https://github.com/apache/arrow-rs/issues/4939#issuecomment-1764298554

   I found that we don't apply null_regex to string columns. If you don't mind, I would like to fix it.

