Posted to issues@arrow.apache.org by "pkrefta (via GitHub)" <gi...@apache.org> on 2023/03/01 11:47:13 UTC

[GitHub] [arrow] pkrefta opened a new issue, #34397: How to handle None values when writing a parquet using pyarrow

pkrefta opened a new issue, #34397:
URL: https://github.com/apache/arrow/issues/34397

   ### Describe the usage question you have. Please include as many useful details as possible.
   
   
   Hi,
   
   I'm trying to create a very simple Parquet file that contains None values:
   
   ```
   import pyarrow as pa
   import pyarrow.parquet as pq
   
   
   schema = pa.schema([('field', pa.int64())])
   
   # THIS WORKS
   with pq.ParquetWriter('out.parquet', schema=schema) as writer:
       table = pa.Table.from_pydict({'field': [None, 1]})
       writer.write_table(table)
   
   # THIS DOESN'T
   with pq.ParquetWriter('out.parquet', schema=schema) as writer:
       table = pa.Table.from_pydict({'field': [None, None]})
       writer.write_table(table)
   
   with pq.ParquetWriter('out.parquet', schema=schema) as writer:
       table = pa.Table.from_pydict({'field': [None,]})
       writer.write_table(table)
   ```
   
   When I try to write a column that contains only None values, it raises an error:
   ```
   ValueError: Table schema does not match schema used to create file: 
   table:
   field: null vs. 
   file:
   field: int64
   ```
   
   Is there any way to write only None values to a column?
   
   ### Component(s)
   
   Parquet, Python


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] AlenkaF commented on issue #34397: How to handle None values when writing a parquet using pyarrow

Posted by "AlenkaF (via GitHub)" <gi...@apache.org>.
AlenkaF commented on issue #34397:
URL: https://github.com/apache/arrow/issues/34397#issuecomment-1450058994

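   You will need to pass `schema` to `from_pydict()` when constructing the pyarrow Table. Without it, a column that contains only None values is inferred as type `null`, which does not match the `int64` field in the writer's schema. Pass it like so: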
   
   ```python
   >>> pa.Table.from_pydict({'field': [None, None]}, schema=schema)
   pyarrow.Table
   field: int64
   ----
   field: [[null,null]]
   ```


[GitHub] [arrow] pkrefta closed issue #34397: [Python] How to handle None values when writing a parquet using pyarrow

Posted by "pkrefta (via GitHub)" <gi...@apache.org>.
pkrefta closed issue #34397: [Python] How to handle None values when writing a parquet using pyarrow
URL: https://github.com/apache/arrow/issues/34397


[GitHub] [arrow] pkrefta commented on issue #34397: [Python] How to handle None values when writing a parquet using pyarrow

Posted by "pkrefta (via GitHub)" <gi...@apache.org>.
pkrefta commented on issue #34397:
URL: https://github.com/apache/arrow/issues/34397#issuecomment-1450099103

   Thank you @AlenkaF for your help - it works now 😃

