Posted to commits@spark.apache.org by ya...@apache.org on 2020/04/25 00:03:08 UTC
[spark] branch master updated: [SPARK-31491][SQL][DOCS] Re-arrange
Data Types page to document Floating Point Special Values
This is an automated email from the ASF dual-hosted git repository.
yamamuro pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/master by this push:
new 054bef9 [SPARK-31491][SQL][DOCS] Re-arrange Data Types page to document Floating Point Special Values
054bef9 is described below
commit 054bef94ca7e84ff8e2e27af65e00e183f7be6da
Author: Huaxin Gao <hu...@us.ibm.com>
AuthorDate: Sat Apr 25 09:02:16 2020 +0900
[SPARK-31491][SQL][DOCS] Re-arrange Data Types page to document Floating Point Special Values
### What changes were proposed in this pull request?
Re-arrange the Data Types page to document Floating Point Special Values.
### Why are the changes needed?
To complete the SQL Reference.
### Does this PR introduce any user-facing change?
Yes
- Add Floating Point Special Values to the Data Types page.
- Move NaN Semantics to the Data Types page.
<img width="1050" alt="Screen Shot 2020-04-24 at 9 14 57 AM" src="https://user-images.githubusercontent.com/13592258/80233996-3da25600-860c-11ea-8285-538efc16e431.png">
<img width="1050" alt="Screen Shot 2020-04-24 at 9 15 22 AM" src="https://user-images.githubusercontent.com/13592258/80234001-4004b000-860c-11ea-8954-72f63c92d50d.png">
<img width="1049" alt="Screen Shot 2020-04-24 at 9 15 44 AM" src="https://user-images.githubusercontent.com/13592258/80234006-41ce7380-860c-11ea-96bf-15e1aa2102ff.png">
### How was this patch tested?
Manually built and checked.
Closes #28264 from huaxingao/datatypes.
Authored-by: Huaxin Gao <hu...@us.ibm.com>
Signed-off-by: Takeshi Yamamuro <ya...@apache.org>
---
docs/_data/menu-sql.yaml | 2 -
docs/sql-ref-datatypes.md | 119 ++++++++++++++++++++++++++++++++++++++++++
docs/sql-ref-nan-semantics.md | 29 ----------
3 files changed, 119 insertions(+), 31 deletions(-)
diff --git a/docs/_data/menu-sql.yaml b/docs/_data/menu-sql.yaml
index 26cca61..1097079 100644
--- a/docs/_data/menu-sql.yaml
+++ b/docs/_data/menu-sql.yaml
@@ -84,8 +84,6 @@
url: sql-ref-literals.html
- text: Null Semantics
url: sql-ref-null-semantics.html
- - text: NaN Semantics
- url: sql-ref-nan-semantics.html
- text: ANSI Compliance
url: sql-ref-ansi-compliance.html
subitems:
diff --git a/docs/sql-ref-datatypes.md b/docs/sql-ref-datatypes.md
index 150e194..0d49f6f 100644
--- a/docs/sql-ref-datatypes.md
+++ b/docs/sql-ref-datatypes.md
@@ -19,6 +19,8 @@ license: |
limitations under the License.
---
+### Supported Data Types
+
Spark SQL and DataFrames support the following data types:
* Numeric types
@@ -706,3 +708,120 @@ The following table shows the type names as well as aliases used in Spark SQL pa
</table>
</div>
</div>
+
+### Floating Point Special Values
+
+Spark SQL supports several special floating point values in a case-insensitive manner:
+
+ * Inf/+Inf/Infinity/+Infinity: positive infinity
+ * ```FloatType```: equivalent to Scala <code>Float.PositiveInfinity</code>.
+ * ```DoubleType```: equivalent to Scala <code>Double.PositiveInfinity</code>.
+ * -Inf/-Infinity: negative infinity
+ * ```FloatType```: equivalent to Scala <code>Float.NegativeInfinity</code>.
+ * ```DoubleType```: equivalent to Scala <code>Double.NegativeInfinity</code>.
+ * NaN: not a number
+ * ```FloatType```: equivalent to Scala <code>Float.NaN</code>.
+ * ```DoubleType```: equivalent to Scala <code>Double.NaN</code>.
+
+#### Positive/Negative Infinity Semantics
+
+There is special handling for positive and negative infinity. They have the following semantics:
+
+ * Positive infinity multiplied by any positive value returns positive infinity.
+ * Negative infinity multiplied by any positive value returns negative infinity.
+ * Positive infinity multiplied by any negative value returns negative infinity.
+ * Negative infinity multiplied by any negative value returns positive infinity.
+ * Positive/negative infinity multiplied by 0 returns NaN.
+ * Positive/negative infinity is equal to itself.
+ * In aggregations, all positive infinity values are grouped together. Similarly, all negative infinity values are grouped together.
+ * Positive infinity and negative infinity are treated as normal values in join keys.
+ * Positive infinity sorts lower than NaN and higher than any other values.
+ * Negative infinity sorts lower than any other values.
+
+#### NaN Semantics
+
+There is special handling for not-a-number (NaN) when dealing with `float` or `double` types that
+do not exactly match standard floating point semantics.
+Specifically:
+
+ * NaN = NaN returns true.
+ * In aggregations, all NaN values are grouped together.
+ * NaN is treated as a normal value in join keys.
+ * NaN values go last when in ascending order, larger than any other numeric value.
+
+#### Examples
+
+{% highlight sql %}
+SELECT double('infinity') AS col;
++--------+
+| col|
++--------+
+|Infinity|
++--------+
+
+SELECT float('-inf') AS col;
++---------+
+| col|
++---------+
+|-Infinity|
++---------+
+
+SELECT float('NaN') AS col;
++---+
+|col|
++---+
+|NaN|
++---+
+
+SELECT double('infinity') * 0 AS col;
++---+
+|col|
++---+
+|NaN|
++---+
+
+SELECT double('-infinity') * (-1234567) AS col;
++--------+
+| col|
++--------+
+|Infinity|
++--------+
+
+SELECT double('infinity') < double('NaN') AS col;
++----+
+| col|
++----+
+|true|
++----+
+
+SELECT double('NaN') = double('NaN') AS col;
++----+
+| col|
++----+
+|true|
++----+
+
+SELECT double('inf') = double('infinity') AS col;
++----+
+| col|
++----+
+|true|
++----+
+
+CREATE TABLE test (c1 int, c2 double);
+INSERT INTO test VALUES (1, double('infinity'));
+INSERT INTO test VALUES (2, double('infinity'));
+INSERT INTO test VALUES (3, double('inf'));
+INSERT INTO test VALUES (4, double('-inf'));
+INSERT INTO test VALUES (5, double('NaN'));
+INSERT INTO test VALUES (6, double('NaN'));
+INSERT INTO test VALUES (7, double('-infinity'));
+SELECT COUNT(*), c2 FROM test GROUP BY c2;
++---------+---------+
+| count(1)| c2|
++---------+---------+
+| 2| NaN|
+| 2|-Infinity|
+| 3| Infinity|
++---------+---------+
+{% endhighlight %}
\ No newline at end of file
diff --git a/docs/sql-ref-nan-semantics.md b/docs/sql-ref-nan-semantics.md
deleted file mode 100644
index f6a8572..0000000
--- a/docs/sql-ref-nan-semantics.md
+++ /dev/null
@@ -1,29 +0,0 @@
----
-layout: global
-title: Nan Semantics
-displayTitle: NaN Semantics
-license: |
- Licensed to the Apache Software Foundation (ASF) under one or more
- contributor license agreements. See the NOTICE file distributed with
- this work for additional information regarding copyright ownership.
- The ASF licenses this file to You under the Apache License, Version 2.0
- (the "License"); you may not use this file except in compliance with
- the License. You may obtain a copy of the License at
-
- http://www.apache.org/licenses/LICENSE-2.0
-
- Unless required by applicable law or agreed to in writing, software
- distributed under the License is distributed on an "AS IS" BASIS,
- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- See the License for the specific language governing permissions and
- limitations under the License.
----
-
-There is specially handling for not-a-number (NaN) when dealing with `float` or `double` types that
-does not exactly match standard floating point semantics.
-Specifically:
-
- - NaN = NaN returns true.
- - In aggregations, all NaN values are grouped together.
- - NaN is treated as a normal value in join keys.
- - NaN values go last when in ascending order, larger than any other numeric value.
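
The semantics documented in this commit differ from standard IEEE 754 comparison mainly in how NaN is treated. The Python sketch below is only an emulation of the documented rules, not Spark's implementation; the helper names `spark_like_eq`, `spark_like_cmp`, and `group_label` are hypothetical and exist purely for illustration. It reuses the `c2` values from the `test` table example in the patch.

```python
import math
from collections import Counter
from functools import cmp_to_key


def spark_like_eq(a: float, b: float) -> bool:
    """Equality as documented: NaN = NaN is true (IEEE 754 says false)."""
    if math.isnan(a) and math.isnan(b):
        return True
    return a == b


def spark_like_cmp(a: float, b: float) -> int:
    """Ordering as documented: NaN sorts above every other value,
    including positive infinity; -Infinity sorts lowest."""
    a_nan, b_nan = math.isnan(a), math.isnan(b)
    if a_nan and b_nan:
        return 0
    if a_nan:
        return 1
    if b_nan:
        return -1
    return (a > b) - (a < b)


def group_label(x: float) -> str:
    """Grouping key: all NaNs share one group, as do equal infinities."""
    if math.isnan(x):
        return "NaN"
    if math.isinf(x):
        return "Infinity" if x > 0 else "-Infinity"
    return repr(x)


# Python's float() also parses these spellings case-insensitively.
inf, ninf, nan = float("Infinity"), float("-inf"), float("NaN")

# Ascending sort: -Infinity first, then finite values, Infinity, NaN last.
ordered = sorted([nan, inf, 1.5, ninf, nan, 0.0], key=cmp_to_key(spark_like_cmp))

# Same c2 values as the seven INSERT statements in the patch.
c2 = [inf, inf, inf, ninf, nan, nan, ninf]
counts = dict(Counter(group_label(v) for v in c2))

print(ordered)
print(counts)
```

Plain IEEE 754 arithmetic already gives the multiplication results listed in the patch (for example, infinity times 0 is NaN), so only equality, ordering, and grouping need the special handling sketched here.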