You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@spark.apache.org by "Dr. Kent Yao" <ya...@qq.com> on 2019/12/05 11:13:54 UTC

[PROPOSAL] Support ANSI type real/numeric as synonyms for float/decimal

Hi all,
   
For better SQL standard support, I recently opened a pull request
https://github.com/apache/spark/pull/26537 to support real type as float and
numeric as decimal. We have researched a bit and discussed it among several
contributors/committers. Sending the email to the dev list to welcome more
comments.

Strictly speaking, in SQL standard, the real and the float, the numeric and
the decimal are not exactly the same.

> FLOAT specifies the data type approximate numeric, with binary precision
> equal to or greater than the value of the specified 
> <precision>
> . The maximum value of 
> <precision>
>  is implementation-defined.
> <precision>
>  shall not be greater than this value.
> 
> REAL specifies the data type approximate numeric, with
> implementation-defined precision.
> 
> For the exact numeric types DECIMAL and NUMERIC:
> a) The maximum value of precision is implementation-defined. precision
> shall not be greater than this value.
> b) The maximum value of scale is implementation-defined. scale shall not
> be greater than this maximum value.
> 
> NUMERIC specifies the data type exact numeric, with the decimal precision
> and scale specified by the precision and scale.
> 
> DECIMAL specifies the data type exact numeric, with the decimal scale
> specified by the scale and the implementation-defined decimal precision
> equal to or greater than the value of the specified precision.

Many modern databases do not follow this exactly, and treat real/float(4),
numeric/decimal as the same as their "implementation-defined".


> An SQL-implementation is permitted to regard certain s as equivalent, if
> they have the same precision, scale, and radix, as permitted by the Syntax
> Rules of Subclause 6.1. When two
> or more s are equivalent, the SQL-implementation chooses one of these
> equivalent s as the normal form representing that equivalence class of s.
> The normal form determines the name of the exact numeric type in the
> numeric type descriptor.
> Similarly, an SQL-implementation is permitted to regard certain s as
> equivalent, as permitted by the Syntax Rules of Subclause 6.1, in which
> case the SQL-implementation chooses a normal form to represent each
> equivalence class of and the normal form determines the name of the
> approximate numeric type.

In Spark's "implementation-defined", the float satisfies the meaning of SQL
standard' real, and the numeric satisfies the decimal. So I propose we
support real/numeric types as synonyms for float/decimal for better ANSI
support.

How do other modern databases deal with these types? Please forgive me for
not doing the copy-paste work, and you may refer to the link here:
https://github.com/apache/spark/pull/26537#discussion_r346870158, including
PostgreSQL, Mimer SQL Engine, PrestoSQL, MS SQL Server, etc.

All comments are welcome!

Best regards,
Kent



--
Sent from: http://apache-spark-developers-list.1001551.n3.nabble.com/

---------------------------------------------------------------------
To unsubscribe e-mail: dev-unsubscribe@spark.apache.org