Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/05/06 15:57:38 UTC

[GitHub] [arrow-rs] crepererum edited a comment on issue #264: Include NaN in Parquet stats (again)

crepererum edited a comment on issue #264:
URL: https://github.com/apache/arrow-rs/issues/264#issuecomment-833629444


   # Prior Art
   
   ## PostgreSQL
   > In most implementations of the “not-a-number” concept, NaN is not considered equal to any other numeric value (including NaN). In order to allow numeric values to be sorted and used in tree-based indexes, PostgreSQL treats NaN values as equal, and greater than all non-NaN values.
   
   https://www.postgresql.org/docs/13/datatype-numeric.html
   
   **=> Follows this proposal**
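
To make the quoted rule concrete, here is a minimal Rust sketch of a comparator with those semantics: every NaN equals every other NaN, and NaN is greater than any non-NaN value. The name `cmp_nan_greatest` is hypothetical and only serves this illustration; it is not taken from PostgreSQL or from arrow-rs.

```rust
use std::cmp::Ordering;

/// Hypothetical helper (an assumption for this sketch): compares two floats
/// so that all NaNs are equal to each other and greater than any non-NaN value.
fn cmp_nan_greatest(a: f64, b: f64) -> Ordering {
    match (a.is_nan(), b.is_nan()) {
        (true, true) => Ordering::Equal,
        (true, false) => Ordering::Greater,
        (false, true) => Ordering::Less,
        // Neither value is NaN, so partial_cmp always returns Some.
        (false, false) => a.partial_cmp(&b).unwrap(),
    }
}

fn main() {
    let mut values = vec![f64::NAN, 1.0, f64::INFINITY, -2.5, f64::NEG_INFINITY];
    values.sort_by(|a, b| cmp_nan_greatest(*a, *b));
    // NaN ends up last, after +Infinity.
    println!("{:?}", values);
}
```

Because the comparator never fails, it can back a sort or a tree-based index, which is the motivation the PostgreSQL docs give.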
   
   ## CockroachDB
   Follows PostgreSQL, see https://github.com/cockroachdb/cockroach/issues/18860
   
   **=> Follows this proposal**
   
   ## Oracle
   > The floating-point value NaN is greater than any other numeric value and is equal to itself.
   
   https://docs.oracle.com/database/121/TTSQL/types.htm#TTSQL165
   
   **=> Follows this proposal**
   
   ## Snowflake
   > Comparison semantics for `'NaN'` differ from the IEEE 754 standard in the following ways:
   >
   > Condition | Snowflake | IEEE 754 | Comment
   > -- | -- | -- | --
   > `'NaN' = 'NaN'` | `TRUE` | `FALSE` | In Snowflake, `'NaN'` values are all equal.
   > `'NaN' > X`  where `X` is any FLOAT value, including  infinity (other than `NaN` itself). | `TRUE` | `FALSE` | Snowflake treats `'NaN'` as greater  than any other FLOAT value,  including infinity.
   
   
   https://docs.snowflake.com/en/sql-reference/data-types-numeric.html#float-float4-float8
   
   **=> Follows this proposal**
   
   ## Vertica
   > Vertica follows the IEEE definition of NaNs (IEEE 754). The SQL standards do not specify how floating point works in detail. [...]
   > 
   > ### Rules
   > -  -0 == +0
   > - 1/0 = Infinity
   > - 0/0 == Nan
   > - NaN != anything (even NaN)
   >
   > [...]
   >
   > ### Sort Order (Ascending)
   > - NaN
   > - -Inf
   > - numbers
   > - +Inf
   > - NULL
   
   https://www.vertica.com/docs/9.2.x/HTML/Content/Authoring/SQLReferenceManual/DataTypes/Numeric/DOUBLEPRECISIONFLOAT.htm
   
   **=> Does NOT follow this proposal**
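
For contrast, the ascending sort order quoted above can be sketched in Rust as follows. This is only an illustration under two assumptions: NULL is modelled as `None`, and NaNs are treated as tied with each other for sorting purposes (the quoted rules say NaN never compares equal, but a sort needs some consistent answer). The function name is hypothetical.

```rust
use std::cmp::Ordering;

/// Hypothetical comparator for the quoted ascending order:
/// NaN, -Inf, numbers, +Inf, NULL (modelled here as `None`).
fn cmp_nan_first_null_last(a: &Option<f64>, b: &Option<f64>) -> Ordering {
    match (a, b) {
        (None, None) => Ordering::Equal,
        (None, Some(_)) => Ordering::Greater,
        (Some(_), None) => Ordering::Less,
        (Some(x), Some(y)) => match (x.is_nan(), y.is_nan()) {
            // Assumption: NaNs tie with each other when sorting.
            (true, true) => Ordering::Equal,
            (true, false) => Ordering::Less,
            (false, true) => Ordering::Greater,
            // Neither side is NaN, so partial_cmp always returns Some.
            (false, false) => x.partial_cmp(y).unwrap(),
        },
    }
}

fn main() {
    let mut values: Vec<Option<f64>> = vec![
        Some(1.0),
        None,
        Some(f64::NAN),
        Some(f64::INFINITY),
        Some(f64::NEG_INFINITY),
    ];
    values.sort_by(cmp_nan_first_null_last);
    println!("{:?}", values);
}
```

The only difference from the proposal is the position of NaN: here it sorts below `-Inf` instead of above `+Inf`.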
   
   ## SQL Standard
   The standard is not openly available, and the drafts I could find say nothing more than "implementation defined". I trust Vertica when they say:
   
   > The SQL standards do not specify how floating point works in detail.
   
   https://www.vertica.com/docs/9.2.x/HTML/Content/Authoring/SQLReferenceManual/DataTypes/Numeric/DOUBLEPRECISIONFLOAT.htm
   
   **=> Undefined**
   
   ## Rust Std
   Floats normally have only a partial order (`PartialOrd`), but a proposed nightly API, `total_cmp`, defines a total order over them:
   
   > The values are ordered in following order:
   >
   > - Negative quiet NaN
   > - Negative signaling NaN
   > - Negative infinity
   > - Negative numbers
   > - Negative subnormal numbers
   > - Negative zero
   > - Positive zero
   > - Positive subnormal numbers
   > - Positive numbers
   > - Positive infinity
   > - Positive signaling NaN
   > - Positive quiet NaN
   
   
   https://doc.rust-lang.org/nightly/std/primitive.f64.html#method.total_cmp
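
A short sketch of how that ordering behaves in practice, assuming a toolchain where `f64::total_cmp` is available (it was nightly-only when this was written and has since been stabilized in Rust 1.62). The NaN bit patterns and the helper name `sorted_by_total_order` are choices made for this example.

```rust
/// Hypothetical helper: sorts floats using the total order from `total_cmp`.
fn sorted_by_total_order(mut values: Vec<f64>) -> Vec<f64> {
    values.sort_by(|a, b| a.total_cmp(b));
    values
}

fn main() {
    // Build NaNs from explicit bit patterns so that the sign bit is known.
    let pos_nan = f64::from_bits(0x7ff8_0000_0000_0000); // positive quiet NaN
    let neg_nan = f64::from_bits(0xfff8_0000_0000_0000); // negative quiet NaN

    let sorted = sorted_by_total_order(vec![
        1.0,
        pos_nan,
        f64::INFINITY,
        0.0,
        -0.0,
        f64::NEG_INFINITY,
        neg_nan,
    ]);

    // The negative NaN comes first and the positive NaN comes last.
    for v in &sorted {
        println!("{:?} (sign bit set: {})", v, v.is_sign_negative());
    }
}
```

Note that this only matches the "NaN is greater than everything" rule for NaNs with a positive sign bit; a NaN with the sign bit set sorts below negative infinity.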
   
   ## Rust `ordered-float`
   > NaN is sorted as _greater_ than all other values and _equal_ to itself, in contradiction with the IEEE standard.
   
   https://docs.rs/ordered-float/2.2.0/ordered_float/struct.OrderedFloat.html
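
To show why such a wrapper is convenient for min/max statistics, here is a std-only sketch of a newtype with the same documented semantics (NaN greatest, NaN equal to itself). It is not the `ordered-float` implementation; `NanGreatest` and `min_max` are hypothetical names made up for this example.

```rust
use std::cmp::Ordering;

/// Hypothetical std-only stand-in for a wrapper such as `OrderedFloat`.
#[derive(Debug, Clone, Copy)]
struct NanGreatest(f64);

impl Ord for NanGreatest {
    fn cmp(&self, other: &Self) -> Ordering {
        match self.0.partial_cmp(&other.0) {
            Some(ord) => ord,
            // partial_cmp is None only if at least one side is NaN;
            // comparing the is_nan flags puts NaN above everything else.
            None => self.0.is_nan().cmp(&other.0.is_nan()),
        }
    }
}

impl PartialOrd for NanGreatest {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

impl PartialEq for NanGreatest {
    fn eq(&self, other: &Self) -> bool {
        self.cmp(other) == Ordering::Equal
    }
}

impl Eq for NanGreatest {}

/// Hypothetical helper: min and max of a slice under the NaN-greatest order.
fn min_max(values: &[f64]) -> Option<(f64, f64)> {
    let min = values.iter().copied().map(NanGreatest).min()?.0;
    let max = values.iter().copied().map(NanGreatest).max()?.0;
    Some((min, max))
}

fn main() {
    // With a NaN present, the max is NaN and the min is the smallest number.
    println!("{:?}", min_max(&[1.0, f64::NAN, -3.0]));
}
```

With a total order in place, the standard `Iterator::min` and `Iterator::max` work directly, which is not possible on bare `f64` values.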

