Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/07/15 16:00:15 UTC

[GitHub] [arrow] jorisvandenbossche commented on pull request #8912: ARROW-8221: [Python][Dataset] Expose schema inference/validation factory options through the validate_schema keyword

jorisvandenbossche commented on pull request #8912:
URL: https://github.com/apache/arrow/pull/8912#issuecomment-880815839


   Yeah, sorry for the slow follow-up here. It was on my to-do list to look at this today.
   
   > Naming it unify_schemas or infer_schemas or similar makes sense to me. I see only three main cases:
   > ...
   > - Schema is inferred (nothing to validate against!)
   
   But for this last case, you might still have the option of inferring the schema from the first fragment only, or of reading the schemas of all fragments and unifying them (or erroring when they can't be unified).
   
   So if we have e.g. an `infer_schemas` keyword, I suppose we want to keep the different argument types that I now support for `validate_schema` (`True` meaning infer the schema from all fragments, and an integer meaning the number of fragments to check).
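
To make the argument types concrete, here is a rough pure-Python sketch of the semantics described above. It is not the pyarrow implementation: the helper names (`fragments_to_inspect`, `unify`, `infer_schema`) are hypothetical, schemas are modelled as plain dicts mapping field name to a type string, and treating `False` as "inspect only the first fragment" is an assumption on my side.

```python
def fragments_to_inspect(option, n_fragments):
    """Map a validate_schema-style option to the number of fragments to read.

    Assumed semantics (hypothetical, for illustration only):
      True  -> inspect all fragments
      False -> inspect only the first fragment
      int N -> inspect the first N fragments
    """
    # bool is a subclass of int in Python, so check the booleans first.
    if option is True:
        return n_fragments
    if option is False:
        return min(1, n_fragments)
    if isinstance(option, int) and option > 0:
        return min(option, n_fragments)
    raise ValueError(f"expected a bool or a positive integer, got {option!r}")


def unify(schemas):
    """Merge dict-based schemas, erroring when a field has conflicting types."""
    unified = {}
    for schema in schemas:
        for name, type_name in schema.items():
            # setdefault returns the existing type if the field was seen before.
            if unified.setdefault(name, type_name) != type_name:
                raise TypeError(
                    f"cannot unify field {name!r}: "
                    f"{unified[name]} vs {type_name}"
                )
    return unified


def infer_schema(fragment_schemas, option=False):
    """Infer a schema from the first N fragments, with N chosen by `option`."""
    n = fragments_to_inspect(option, len(fragment_schemas))
    return unify(fragment_schemas[:n])


fragment_schemas = [
    {"a": "int64"},
    {"a": "int64", "b": "string"},
    {"a": "float64"},
]
first_only = infer_schema(fragment_schemas, False)
first_two = infer_schema(fragment_schemas, 2)
```

With these example fragments, passing `True` would raise, because the third fragment declares `a` with a different type than the first two, which is the "erroring when they can't be unified" case.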

