Posted to github@arrow.apache.org by "ByteBaker (via GitHub)" <gi...@apache.org> on 2023/04/08 14:41:16 UTC

[GitHub] [arrow-rs] ByteBaker commented on issue #3373: Handle BYTE_ARRAY physical type in arrow-json (be able to load files output from pandas with no dtypes)

ByteBaker commented on issue #3373:
URL: https://github.com/apache/arrow-rs/issues/3373#issuecomment-1500904949

   I faced the same problem a few days ago and came here to create an issue, then found this one.
   
   To confirm, I loaded the original file into pandas and then saved it to a new file, this time using `pyarrow` as the engine, and the problem was gone.
   
   The problem is that our dataset is quite large (hundreds of GB of Parquet files), and rewriting everything would be a daunting task. What should I do to handle this issue?
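   
   For reference, the round-trip workaround described above looks roughly like this (a sketch; the column names and file path are placeholders, and it assumes pandas with pyarrow installed):
   
   ```python
   import pandas as pd
   
   # Illustrative data standing in for a file originally written by pandas
   # with a non-pyarrow engine.
   df = pd.DataFrame({"name": ["a", "b"], "value": [1, 2]})
   
   # Rewrite with the pyarrow engine so string columns are written as
   # UTF8-annotated BYTE_ARRAY instead of raw bytes.
   df.to_parquet("rewritten.parquet", engine="pyarrow")
   
   # Reading the rewritten file back yields the same frame.
   check = pd.read_parquet("rewritten.parquet", engine="pyarrow")
   assert check.equals(df)
   ```
   
   This works per file, but as noted it does not scale well to hundreds of GB of existing data.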


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org