Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/05/09 12:31:13 UTC

[GitHub] [arrow-rs] alamb commented on issue #1666: Handling Unsupported Arrow Types in Parquet

alamb commented on issue #1666:
URL: https://github.com/apache/arrow-rs/issues/1666#issuecomment-1121038516

   > I think that we should come up with a consistent approach to dealing with this kind of Arrow type.
   
   100% agree with @viirya on this one.
   
   > The final option is the simplest to implement, the least surprising to users, and what I would propose implementing. It would break ecosystem interoperability in certain cases, but I think it is more important that we faithfully round-trip data than maintain maximal compatibility.
   
   I agree with @jorgecarleitao and @tustvold that this (option 3) would be best for some users (such as IOx). However, as @viirya mentions, it may not be the "least surprising" behavior for people who use a broader selection of ecosystem tools.
   
   Maybe we could add an option to `WriterProperties` that lets users choose the tradeoff they want. Something like:
   
   ```rust
     /// Converts data prior to writing to maximize compatibility with other
     /// parquet implementations. This may lose precision on data types that
     /// not every implementation can read, such as `Timestamp(Nanosecond)`:
     /// nanosecond timestamps are a newer addition to the parquet format,
     /// and older readers only understand millisecond or microsecond ones.
     maximize_ecosystem_compatibility: bool,
   ```
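
To make the tradeoff concrete, here is a minimal, self-contained sketch of how such a flag might steer the write path. Everything in it is hypothetical: `SketchWriterProperties`, `Unit`, and `prepare_timestamps` are illustrative stand-ins, not the real `WriterProperties` or any existing arrow-rs API, and it assumes that "compatibility" means downcasting nanosecond timestamps to milliseconds.

```rust
/// Hypothetical stand-in for the writer configuration. This is NOT the
/// real `WriterProperties` from the parquet crate.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct SketchWriterProperties {
    /// The flag proposed in the comment above.
    pub maximize_ecosystem_compatibility: bool,
}

/// Simplified timestamp unit, standing in for Arrow's `TimeUnit`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Unit {
    Millisecond,
    Nanosecond,
}

/// Returns the unit and values that would be handed to the writer.
pub fn prepare_timestamps(
    props: &SketchWriterProperties,
    unit: Unit,
    values: &[i64],
) -> (Unit, Vec<i64>) {
    match (props.maximize_ecosystem_compatibility, unit) {
        // Compatibility mode: floor nanoseconds to whole milliseconds.
        // This is lossy, so the original values cannot be recovered.
        (true, Unit::Nanosecond) => (
            Unit::Millisecond,
            values.iter().map(|v| v.div_euclid(1_000_000)).collect(),
        ),
        // Otherwise keep the data exactly as given so it round-trips.
        _ => (unit, values.to_vec()),
    }
}
```

With the flag set, a nanosecond value such as `1_500_000_123` would be written as `1_500` milliseconds; with it unset, the value and unit pass through unchanged, which is the faithful round-trip behavior of option 3.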

