Posted to jira@arrow.apache.org by "Joris Van den Bossche (Jira)" <ji...@apache.org> on 2022/09/14 06:57:00 UTC

[jira] [Commented] (ARROW-17719) [Python] Improve error message when all values in a column are null in a parquet partition

    [ https://issues.apache.org/jira/browse/ARROW-17719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17603912#comment-17603912 ] 

Joris Van den Bossche commented on ARROW-17719:
-----------------------------------------------

Yes, that's a consequence of the default behavior of inferring the schema only from the first file.
If such a cast error happens for a subsequent file, it would indeed be good to hint to the user that the schemas are mismatched and that a possible solution is to provide a schema manually.

> [Python] Improve error message when all values in a column are null in a parquet partition
> ------------------------------------------------------------------------------------------
>
>                 Key: ARROW-17719
>                 URL: https://issues.apache.org/jira/browse/ARROW-17719
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: Python
>    Affects Versions: 9.0.0
>            Reporter: Philipp Moritz
>            Priority: Minor
>             Fix For: 10.0.0
>
>
> There is a good bug report about this in [https://stackoverflow.com/a/70568419/10891801] and it still seems to be a problem.
> Basically the error message is pretty bad if all values in a given column of a parquet partition are null. We should either handle this case better or give a better error message.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)