Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/12/12 00:30:32 UTC

[GitHub] [arrow-rs] chadbrewbaker edited a comment on issue #703: Empty or null list of struct cannot be written to parquet

chadbrewbaker edited a comment on issue #703:
URL: https://github.com/apache/arrow-rs/issues/703#issuecomment-991809095


   This line of JSON makes json2parquet panic with:
   
   ```text
   thread 'main' panicked at 'Cannot filter indices on a non-primitive array, found List(true)'
   ```
   https://github.com/apache/arrow-rs/blob/e0abda2c178be0c38d4257d22de2e4a3bfafde82/parquet/src/arrow/levels.rs#L757
   
   ```json
   {"ts":1331901001.88,"fuid":"Fd3cGk2agqUftBeFx4","tx_hosts":["192.168.229.251"],"rx_hosts":["192.168.202.79"],"conn_uids":["CaJMZy195M8cuXfxn4"],"source":"HTTP","depth":0,"analyzers":[],"mime_type":"text/html","duration":0.0,"is_orig":false,"seen_bytes":1433,"total_bytes":1433,"missing_bytes":0,"overflow_bytes":0,"timedout":false}
   ```
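The only list in that record with no elements is `analyzers`, so it is the likeliest trigger, though that is an inference from the data rather than something the panic message states. A minimal sketch for spotting such fields in a record before conversion, using only the standard library (`empty_list_fields` is a hypothetical helper name, and the record is trimmed to a few fields for brevity):

```python
import json

# The record from the report, trimmed to a few fields for brevity.
line = (
    '{"ts":1331901001.88,"fuid":"Fd3cGk2agqUftBeFx4",'
    '"tx_hosts":["192.168.229.251"],"analyzers":[],"depth":0}'
)


def empty_list_fields(record):
    """Return names of top-level fields whose value is an empty list.

    Hypothetical helper: an empty list gives a schema-inferring reader
    no element to derive the item type from.
    """
    return [
        name
        for name, value in record.items()
        if isinstance(value, list) and not value
    ]


record = json.loads(line)
print(empty_list_fields(record))
```

Running this over each line of the input file would show which columns never receive a concrete element type.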
   
   The Python bindings (pyarrow) read the same line without error and infer `analyzers` as `list<item: null>`:
   
   ```python 
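   from pyarrow import json  # pyarrow's JSON reader, not the stdlib json module

   fn = 'mini.json'  # assumed to hold the single JSON line shown above
   table = json.read_json(fn)  # infers the schema from the data
   print(table)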
   ```
   ```text
   pyarrow.Table
   ts: double
   fuid: string
   tx_hosts: list<item: string>
     child 0, item: string
   rx_hosts: list<item: string>
     child 0, item: string
   conn_uids: list<item: string>
     child 0, item: string
   source: string
   depth: int64
   analyzers: list<item: null>
     child 0, item: null
   mime_type: string
   duration: double
   is_orig: bool
   seen_bytes: int64
   total_bytes: int64
   missing_bytes: int64
   overflow_bytes: int64
   timedout: bool
   ----
   ts: [[1331901001.88]]
   fuid: [["Fd3cGk2agqUftBeFx4"]]
   tx_hosts: [[["192.168.229.251"]]]
   rx_hosts: [[["192.168.202.79"]]]
   conn_uids: [[["CaJMZy195M8cuXfxn4"]]]
   source: [["HTTP"]]
   depth: [[0]]
   analyzers: [[0 nulls]]
   mime_type: [["text/html"]]
   duration: [[0]]
   ...
   ```
   
   
   
   

