Posted to github@arrow.apache.org by "metesynnada (via GitHub)" <gi...@apache.org> on 2023/07/24 12:49:41 UTC

[GitHub] [arrow-datafusion] metesynnada opened a new issue, #7067: Allow providing file schema for parquet files

metesynnada opened a new issue, #7067:
URL: https://github.com/apache/arrow-datafusion/issues/7067

   ### Is your feature request related to a problem or challenge?
   
   Currently, [manually specifying the schema for a Parquet file will error](https://github.com/apache/arrow-datafusion/blob/49c91b563ad894b2f368690d85402895bdeaa73a/datafusion/sql/src/statement.rs#L636). There is no particular reason for this restriction, so we could allow providing a schema for Parquet files as well.
   
   This is a blocker for https://github.com/apache/arrow-datafusion/issues/7036.
   
   ### Describe the solution you'd like
   
   - Add a `schema` field to `ParquetReadOptions`.
   - Remove the error that is currently returned when a schema is specified.
   - Add tests for the case where a schema is provided.
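
As a rough illustration of the first bullet, here is a minimal, standalone sketch of what an optional schema on the read options might look like. Everything in it is a hypothetical stand-in: `DataType`, `Field`, `Schema`, this `ParquetReadOptions` struct, its `schema` builder method, and `resolve_schema` are simplified placeholders written for this example, not the actual DataFusion or Arrow definitions. The assumed behavior is that a user-provided schema takes precedence over the one inferred from the file.

```rust
// Hypothetical stand-ins for Arrow's schema types (assumptions, not the real API).
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Int64,
    Utf8,
}

#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
    data_type: DataType,
}

#[derive(Debug, Clone, PartialEq)]
struct Schema {
    fields: Vec<Field>,
}

// Hypothetical, reduced version of the read options: only the proposed
// optional `schema` field is modeled here.
#[derive(Debug, Clone, Default)]
struct ParquetReadOptions<'a> {
    schema: Option<&'a Schema>,
}

impl<'a> ParquetReadOptions<'a> {
    // Builder-style setter for the user-provided schema.
    fn schema(mut self, schema: &'a Schema) -> Self {
        self.schema = Some(schema);
        self
    }
}

// Assumed resolution rule: use the provided schema when present,
// otherwise fall back to the schema inferred from the file.
fn resolve_schema(options: &ParquetReadOptions<'_>, inferred: &Schema) -> Schema {
    match options.schema {
        Some(provided) => provided.clone(),
        None => inferred.clone(),
    }
}
```

With this shape, `ParquetReadOptions::default()` keeps today's inference-only behavior, and callers opt in by chaining `.schema(&my_schema)`. Whether a provided schema should be validated against the file's own metadata is left open here.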
   
   ### Describe alternatives you've considered
   
   NA
   
   ### Additional context
   
   NA


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] alamb commented on issue #7067: Allow providing file schema for parquet files

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on issue #7067:
URL: https://github.com/apache/arrow-datafusion/issues/7067#issuecomment-1657116790

   Another use case for manually defining a Parquet file's schema is to provide constraints like `PRIMARY KEY`, which are not encoded in the Parquet file itself.

