Posted to dev@arrow.apache.org by "Wes McKinney (Jira)" <ji...@apache.org> on 2019/08/26 15:28:00 UTC
[jira] [Created] (ARROW-6359) [C++] Raw data equality in arrays vs. semantic value equality
Wes McKinney created ARROW-6359:
-----------------------------------
Summary: [C++] Raw data equality in arrays vs. semantic value equality
Key: ARROW-6359
URL: https://issues.apache.org/jira/browse/ARROW-6359
Project: Apache Arrow
Issue Type: Improvement
Components: C++
Reporter: Wes McKinney
I have observed a conflict in requirements / expectations in our {{Equals}} functions. The initial implementations of these functions compared the raw bytes found in non-null data slots, along with the validity bitmaps of each array and of its children, taking slice offsets and so forth into account.
Recently we have been adding type-specific value comparison semantics to these comparisons, notably propagating {{NaN != NaN}}. This has led to such issues as ARROW-6043.
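To make the distinction concrete, here is a minimal standalone sketch in plain C++. It is not Arrow code; {{RawEqual}} and {{SemanticEqual}} are hypothetical helper names used only for illustration. It shows the two cases where the approaches disagree for floating-point data: a NaN is bytewise identical to itself but not equal by value, while +0.0 and -0.0 are equal by value but differ bytewise.

```cpp
#include <cstring>
#include <limits>

// Hypothetical helper: compares the raw bytes of two doubles.
bool RawEqual(double a, double b) {
  return std::memcmp(&a, &b, sizeof(double)) == 0;
}

// Hypothetical helper: compares by IEEE 754 value semantics,
// under which NaN is never equal to anything, including itself.
bool SemanticEqual(double a, double b) {
  return a == b;
}

// Convenience for the illustration: a quiet NaN with a fixed bit pattern.
double QuietNaN() {
  return std::numeric_limits<double>::quiet_NaN();
}
```

With these helpers, {{RawEqual(QuietNaN(), QuietNaN())}} holds while {{SemanticEqual(QuietNaN(), QuietNaN())}} does not, and the results are reversed for {{0.0}} versus {{-0.0}}. This assumes the code is built without options such as {{-ffast-math}} that relax NaN handling.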
Rather than creating "one true way" to compare array contents, I would suggest introducing functions that perform slightly different comparisons:
* Raw data comparison, skipping masked null values
* Raw data comparison, comparing all buffer contents (up to the semantic "extent" of an array -- so ignoring the contents of padding, or excess buffer contents when dealing with slices)
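As a rough sketch of how the two raw comparisons listed above might differ, consider the following standalone C++. Everything here is an assumption made for illustration: {{Int32Slice}} is a hypothetical stand-in for a sliced primitive array (validity is modeled as one {{bool}} per slot rather than a packed bitmap), and neither function is an existing Arrow API.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a sliced int32 array; not an Arrow class.
struct Int32Slice {
  std::vector<int32_t> values;  // full underlying data buffer
  std::vector<bool> valid;      // one validity flag per buffer slot
  std::size_t offset;           // first slot belonging to the slice
  std::size_t length;           // number of slots in the slice
};

// Raw comparison that skips masked null values: validity must match,
// and bytes are compared only where both slots are non-null.
bool RawEqualsSkippingNulls(const Int32Slice& a, const Int32Slice& b) {
  if (a.length != b.length) return false;
  for (std::size_t i = 0; i < a.length; ++i) {
    const bool a_valid = a.valid[a.offset + i];
    const bool b_valid = b.valid[b.offset + i];
    if (a_valid != b_valid) return false;
    if (a_valid && std::memcmp(&a.values[a.offset + i],
                               &b.values[b.offset + i],
                               sizeof(int32_t)) != 0) {
      return false;
    }
  }
  return true;
}

// Raw comparison of all buffer contents within the slice extent:
// bytes under null slots count too, but data outside
// [offset, offset + length) is ignored.
bool RawEqualsAllSlots(const Int32Slice& a, const Int32Slice& b) {
  if (a.length != b.length) return false;
  for (std::size_t i = 0; i < a.length; ++i) {
    if (a.valid[a.offset + i] != b.valid[b.offset + i]) return false;
  }
  if (a.length == 0) return true;
  return std::memcmp(a.values.data() + a.offset,
                     b.values.data() + b.offset,
                     a.length * sizeof(int32_t)) == 0;
}
```

Two slices that agree in every valid slot but hold different bytes under a null slot would be equal under the first function and unequal under the second. In both cases, values that lie outside the slice extent do not affect the result.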
Thoughts?
--
This message was sent by Atlassian Jira
(v8.3.2#803003)