Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2020/11/19 18:30:05 UTC

[GitHub] [arrow] jorgecarleitao edited a comment on pull request #8715: ARROW-10656: [Rust] Use DataType comparison without values

jorgecarleitao edited a comment on pull request #8715:
URL: https://github.com/apache/arrow/pull/8715#issuecomment-730555090


   Hey @ch-sc , thanks for your PR!
   
   @nevi-me, could you help here? I am a bit worried about introducing another comparison of datatypes, but I was unable to find anything in the specification stating that a DataType of a ListArray should have a field name.
   
   My main concern is that this change would allow the following: I receive a `RecordBatch` and want to verify that it is consistent. While doing so, I reach the conclusion that `batch.schema().field(0).data_type() != batch.column(0).data_type()`. IMO this goes against the whole idea of having a `RecordBatch` in the first place.
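
To make the concern concrete, here is a minimal sketch that uses toy types, not the real `arrow` crate. The `DataType` and `RecordBatch` definitions and the `first_mismatch` and `list_of` helpers are hypothetical stand-ins, and the child names `"item"` and `"element"` are only examples. With a derived `PartialEq` that includes the list child's name, a batch whose schema and column disagree only on that name is reported as inconsistent.

```rust
// Toy model: a list type that records the name of its child.
// The derived `PartialEq` compares that name along with the child type.
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Int32,
    List(String, Box<DataType>),
}

// Toy batch: the type declared by the schema and the type carried by
// the array, one entry per column.
struct RecordBatch {
    schema_types: Vec<DataType>,
    column_types: Vec<DataType>,
}

// Index of the first column whose declared type differs from the type
// of its array, or `None` if the batch is consistent.
fn first_mismatch(batch: &RecordBatch) -> Option<usize> {
    batch
        .schema_types
        .iter()
        .zip(batch.column_types.iter())
        .position(|(declared, actual)| declared != actual)
}

// Convenience constructor for a list type with a named child.
fn list_of(child_name: &str, child: DataType) -> DataType {
    DataType::List(child_name.to_string(), Box::new(child))
}
```

In this sketch, a batch whose schema declares `list_of("item", DataType::Int32)` while the column carries `list_of("element", DataType::Int32)` fails the check, even though both describe a list of 32-bit integers.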
   
   OTOH, I also understand the motivation for this change: if the field name is irrelevant, then it should not be used in the comparison.
   
   My feeling is that needing a second comparison usually hints that the `DataType` carries irrelevant information that we should eliminate. If it is irrelevant but needs to stay for some reason, then my suggestion is that we implement a custom `PartialEq` that ignores it, so that there is a single source of truth about whether two datatypes are equal.
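
A custom `PartialEq` along these lines could look like the sketch below. It uses toy `Field` and `DataType` types rather than the crate's definitions, and `list_of` is a hypothetical helper. Whether the child's nullability should still take part in the comparison is an assumption of this sketch, not something settled in this thread.

```rust
// Toy stand-ins for `Field` and `DataType`; not the crate's definitions.
// A top-level `Field` keeps the derived comparison, so its name matters.
#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
    data_type: DataType,
    nullable: bool,
}

#[derive(Debug, Clone)]
enum DataType {
    Int32,
    Utf8,
    // A list carries a child field whose name is treated as irrelevant.
    List(Box<Field>),
}

// Hand-written equality: the single place that decides whether two
// datatypes are equal.
impl PartialEq for DataType {
    fn eq(&self, other: &Self) -> bool {
        match (self, other) {
            (DataType::Int32, DataType::Int32) => true,
            (DataType::Utf8, DataType::Utf8) => true,
            // Compare the child's type (recursively) and nullability,
            // but deliberately skip `name`.
            (DataType::List(a), DataType::List(b)) => {
                a.nullable == b.nullable && a.data_type == b.data_type
            }
            _ => false,
        }
    }
}

// Convenience constructor for a nullable list child with the given name.
fn list_of(child_name: &str, child_type: DataType) -> DataType {
    DataType::List(Box::new(Field {
        name: child_name.to_string(),
        data_type: child_type,
        nullable: true,
    }))
}
```

With this implementation, `list_of("item", DataType::Int32) == list_of("element", DataType::Int32)` holds, so a schema-versus-column check keeps using plain `==` and no second comparison function is needed.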
   
   What are your thoughts @nevi-me, @alamb and @ch-sc ?
   

