Posted to github@arrow.apache.org by "paolosartiprom (via GitHub)" <gi...@apache.org> on 2023/09/06 10:59:54 UTC

[GitHub] [arrow] paolosartiprom commented on issue #37583: [Python] pyarrow.dataset.write_dataset doesn't write empty datasets, even if they had a schema, thus losing it when reading back the dataset

paolosartiprom commented on issue #37583:
URL: https://github.com/apache/arrow/issues/37583#issuecomment-1708124712

   Well, what I expected was that if I write a table with a certain schema to a dataset, reading it back would give me the same schema (and metadata). Instead, when the table is empty, write_dataset just skips writing altogether.
   
   The workaround I found is to write a Parquet file directly into the dataset folder, since a Parquet file can hold an empty table while keeping its schema and metadata.
   
   The important part for me is that the schema and metadata are preserved when the dataset is read back, even if it contains no data.

