
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/04/21 10:13:31 UTC

[GitHub] [arrow-datafusion] thinkharderdev commented on issue #2292: Add SchemaAdapterExec

thinkharderdev commented on issue #2292:
URL: https://github.com/apache/arrow-datafusion/issues/2292#issuecomment-1105010711

   I have some concerns about this. The proposal assumes that we know, at planning time, the schema of each individual file in a `ListingScan`. That holds if you infer the per-file schemas during planning and merge them together to get the table schema. But inference happens during planning and can be quite expensive, so I suspect that real-world use cases will rely on some sort of metadata catalog to get the merged schema for a logical table instead of re-deriving it for each query. In that case we have no idea what the individual file schemas are.
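
   To make the distinction concrete, here is a minimal sketch of the two planning paths. Everything in it is hypothetical and uses only the standard library: `FileSchema`, `TableSchemaSource`, `schema`, and `merge_schemas` are stand-ins for illustration, not DataFusion or Arrow APIs, and a real merge would also have to deal with nullability and type coercion.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a file schema: column name -> type name.
type FileSchema = BTreeMap<String, String>;

/// Small helper for building a schema from literal (name, type) pairs.
fn schema(cols: &[(&str, &str)]) -> FileSchema {
    cols.iter()
        .map(|(name, ty)| (name.to_string(), ty.to_string()))
        .collect()
}

/// Merge per-file schemas into one table schema.
/// Assumption: a column with two different types is an error.
fn merge_schemas(files: &[FileSchema]) -> Result<FileSchema, String> {
    let mut merged = FileSchema::new();
    for file in files {
        for (name, ty) in file {
            // Insert the column if it is new, otherwise check the types agree.
            let existing = merged.entry(name.clone()).or_insert_with(|| ty.clone());
            if *existing != *ty {
                return Err(format!(
                    "column `{}` has conflicting types: {} vs {}",
                    name, existing, ty
                ));
            }
        }
    }
    Ok(merged)
}

/// Where the planner obtained the table schema.
enum TableSchemaSource {
    /// Every file was inspected during planning (potentially expensive).
    Inferred { per_file: Vec<FileSchema> },
    /// The merged schema came from a metadata catalog; no file was inspected.
    Catalog { merged: FileSchema },
}

impl TableSchemaSource {
    /// The table schema is available on both paths.
    fn table_schema(&self) -> Result<FileSchema, String> {
        match self {
            TableSchemaSource::Inferred { per_file } => merge_schemas(per_file),
            TableSchemaSource::Catalog { merged } => Ok(merged.clone()),
        }
    }

    /// The schema of one file is only known on the inference path.
    fn file_schema(&self, index: usize) -> Option<&FileSchema> {
        match self {
            TableSchemaSource::Inferred { per_file } => per_file.get(index),
            TableSchemaSource::Catalog { .. } => None,
        }
    }
}
```

   On the `Catalog` path `file_schema` always returns `None`, which is the case I am worried about: anything that needs per-file schemas at planning time has nothing to work with there.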

