Posted to issues@arrow.apache.org by "FlyTOmeLight (via GitHub)" <gi...@apache.org> on 2024/03/30 15:52:09 UTC

[I] [Python]pyarrow.json.read_json when read indent json file with report error [arrow]

FlyTOmeLight opened a new issue, #40912:
URL: https://github.com/apache/arrow/issues/40912

   ### Describe the bug, including details regarding any error messages, version, and platform.
   
   pyarrow version: 14.0.2
   
       pajson.read_json("indent.json")
     File "pyarrow/_json.pyx", line 308, in pyarrow._json.read_json
     File "pyarrow/error.pxi", line 154, in pyarrow.lib.pyarrow_internal_check_status
     File "pyarrow/error.pxi", line 91, in pyarrow.lib.check_status
   pyarrow.lib.ArrowInvalid: JSON parse error: Column() changed from object to string in row 0
   
   <code>
   import pyarrow.json as pajson
   pajson.read_json("indent.json")
   </code>
   
   I write indent.json with json.dump(raw_data, fp, ensure_ascii=False, indent=4).
   When I then call pajson.read_json on that file, the error above is raised. I would like to know how to fix it.
   Here is the JSON file that fails:
   [wrong.json](https://github.com/apache/arrow/files/14812530/wrong.json)
   
   ### Component(s)
   
   Python


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [I] [Python]pyarrow.json.read_json when read indent json file with report error [arrow]

Posted by "martsec (via GitHub)" <gi...@apache.org>.
martsec commented on issue #40912:
URL: https://github.com/apache/arrow/issues/40912#issuecomment-2047372738

   As far as I am aware, Arrow only supports reading line-delimited JSON files ([see the docs and the note there](https://arrow.apache.org/docs/python/json.html)).
   
   That said, there seem to be a couple of options that could help with reading your JSON: https://arrow.apache.org/docs/python/generated/pyarrow.json.ParseOptions.html#pyarrow.json.ParseOptions
   
   > newlines_in_values: [bool](https://docs.python.org/3/library/stdtypes.html#bltin-boolean-values), optional (default [False](https://docs.python.org/3/library/constants.html#False))
   >
   > Whether objects may be printed across multiple lines (for example pretty printed). If false, input must end with an empty line.
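
One way to act on the line-delimited requirement is to rewrite the indented file with one record per line before handing it to Arrow. The sketch below uses only the standard library. The helper name `to_json_lines` and the output name `indent.jsonl` are hypothetical, and it assumes the file holds either a top-level list of records or a single object, which the issue does not confirm.

```python
import json


def to_json_lines(src: str, dst: str) -> int:
    """Rewrite a JSON file as line-delimited JSON; return the number of rows."""
    with open(src, encoding="utf-8") as fp:
        data = json.load(fp)
    # Assumption: a top-level list holds one record per element;
    # any other top-level value is treated as a single record.
    rows = data if isinstance(data, list) else [data]
    with open(dst, "w", encoding="utf-8") as fp:
        for row in rows:
            # No indent: json.dumps then keeps each record on one line.
            fp.write(json.dumps(row, ensure_ascii=False) + "\n")
    return len(rows)
```

After `to_json_lines("indent.json", "indent.jsonl")`, calling `pajson.read_json("indent.jsonl")` should see one object per line. If the file is instead a sequence of pretty-printed objects with no enclosing array, passing `parse_options=pajson.ParseOptions(newlines_in_values=True)` to `read_json` may be enough on its own, per the option quoted above.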

