Posted to github@arrow.apache.org by "sidneymau (via GitHub)" <gi...@apache.org> on 2023/11/16 00:15:04 UTC

Re: [I] [Python] write_dataset does not preserve field metadata from schema [arrow]

sidneymau commented on issue #37054:
URL: https://github.com/apache/arrow/issues/37054#issuecomment-1813501412

   Apologies for my own late reply! In pyarrow 13.0.0, running this code shows that the dataset writer preserves the field metadata:
   ```
   Dataset schema
   field_1: double
     -- field metadata --
     field_1: 'description of field 1'
   field_2: int32
     -- field metadata --
     field_2: 'description of field 2'
   -- schema metadata --
   pyarrow version: '13.0.0'
   
   Writer schema
   field_1: double
     -- field metadata --
     field_1: 'description of field 1'
   field_2: int32
     -- field metadata --
     field_2: 'description of field 2'
   -- schema metadata --
   pyarrow version: '13.0.0'
   
   Dataset schema == Writer schema? True
   ```
   The comment above about schema equality not checking metadata still holds, but that appears to be desired behavior (and it's easy enough to compare `schema.metadata` between two schemas if one wants to check that).
   
   Closing to reduce noise!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
