Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/12/15 22:28:55 UTC

[GitHub] [arrow] westonpace commented on issue #11967: Parquet schema / data type for entire null object DataFrame columns

westonpace commented on issue #11967:
URL: https://github.com/apache/arrow/issues/11967#issuecomment-995263886


   Copying my question from the gist:
   
   What is parquet_file.schema[0].logical_type? For me, if I do not specify a schema, it is Null (which is different from None). In your first snippet the logical type is None, so I assume you are specifying the schema when writing.
   
   Perhaps you have some files with a Null logical type and some with a None logical type? That could explain the behavior, since the new datasets API infers the schema from a single file (picked more or less at random). If it picked one of the Null files, you could end up with the behavior you are describing.
   

