Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2021/02/14 22:36:28 UTC

[GitHub] [spark] planga82 opened a new pull request #29837: [SPARK-32463][SQL][DOCS] Add "Type Conversion" section in "Supported Data Types" of SQL docs

planga82 opened a new pull request #29837:
URL: https://github.com/apache/spark/pull/29837


   
   ### What changes were proposed in this pull request?
   Add new documentation about how Spark treats data type compatibility.
   
   ![image](https://user-images.githubusercontent.com/12819544/98147843-ab391400-1ec4-11eb-9bc9-bd929079ecc3.png)
   ![image](https://user-images.githubusercontent.com/12819544/98148062-b9873000-1ec4-11eb-9dc6-a70af8b46450.png)
   ![image](https://user-images.githubusercontent.com/12819544/98148229-c4da5b80-1ec4-11eb-96b8-423d5a3ed87f.png)
   ![image](https://user-images.githubusercontent.com/12819544/98148410-d1f74a80-1ec4-11eb-9640-50ab78187f45.png)
   ![image](https://user-images.githubusercontent.com/12819544/98148560-dcb1df80-1ec4-11eb-919b-0c25b1c39ad6.png)
   
   
   
   ### Why are the changes needed?
   This documentation is useful for end users.
   
   ### Does this PR introduce _any_ user-facing change?
   Only documentation
   
   ### How was this patch tested?
   Not needed


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] maropu commented on a change in pull request #29837: [SPARK-32463][SQL][DOCS] Add "Type Conversion" section in "Supported Data Types" of SQL docs

Posted by GitBox <gi...@apache.org>.
maropu commented on a change in pull request #29837:
URL: https://github.com/apache/spark/pull/29837#discussion_r502787239



##########
File path: docs/sql-ref-datatypes.md
##########
@@ -314,3 +314,128 @@ SELECT COUNT(*), c2 FROM test GROUP BY c2;
 |        3| Infinity|
 +---------+---------+
 ```
+
+### Type conversion
+
+In general, an expression can contain different data types, type conversion is the transformation of some data types into others in order to solve the expressions. 
+Spark supports both implicit conversions by type coercion and explicit conversions by explicit casting and store assignment casting.
+
+#### Type coercion in operations between different types 
+
+The following matrix shows the resulting type to which they are implicitly converted to resolve an expression involving different data types 
+
+Numeric expresions:
+
+|               |ByteType   |ShortType  |IntegerType |LongType  |FloatType |DoubleType |StringType |
+|---------------|-----------|-----------|------------|----------|----------|-----------|-----------|
+|**ByteType**   |--         |ShortType  |IntegerType |LongType  |FloatType |DoubleType |DoubleType |
+|**ShortType**  |ShortType  |--         |IntegerType |LongType  |FloatType |DoubleType |DoubleType |
+|**IntegerType**|IntegerType|IntegerType|--          |LongType  |FloatType |DoubleType |DoubleType |
+|**LongType**   |LongType   |LongType   |LongType    |--        |FloatType |DoubleType |DoubleType |
+|**FloatType**  |FloatType  |FloatType  |FloatType   |FloatType |--        |DoubleType |DoubleType |
+|**DoubleType** |DoubleType |DoubleType |DoubleType  |DoubleType|DoubleType|--         |DoubleType |
+|**StringType** |DoubleType |DoubleType |DoubleType  |DoubleType|DoubleType|DoubleType |--         |
+
+The case of DecimalType, is treated differently, for example, there is no common type for double and decimal because double's range is larger than decimal, and yet decimal is more precise than double, but in an expresion, we would cast the decimal into double.
+
+Time expresions:
+
+|                  |DateType     |TimestampType |
+|------------------|-------------|--------------|
+|**DateType**      |--           |TimestampType |
+|**TimestampType** |TimestampType|--            |
+
+#### Type coercion examples
+
+```sql
+DESCRIBE TABLE numericTable;
++-------------+---------+-------+
+|col_name     |data_type|comment|
++-------------+---------+-------+
+|integerColumn|int      |null   |
+|doubleColumn |double   |null   |
++-------------+---------+-------+
+
+DESCRIBE SELECT integerColumn + doubleColumn as result FROM numericTable;
++--------+---------+-------+
+|col_name|data_type|comment|
++--------+---------+-------+
+|  result|   double|   null|
++--------+---------+-------+
+
+```
+
+```sql
+DESCRIBE dateTable;
++---------------+---------+-------+
+|       col_name|data_type|comment|
++---------------+---------+-------+
+|     dateColumn|     date|   null|
+|timestampColumn|timestamp|   null|
++---------------+---------+-------+
+
+SELECT MONTHS_BETWEEN(dateColumn,timestampColumn) FROM dateTable;
+
+```
+
+#### Explicit casting and store assignment casting

Review comment:
       `Explicit casting and store assignment casting` -> `Explicit Casting and Store Assignment Casting`






[GitHub] [spark] cloud-fan commented on a change in pull request #29837: [SPARK-32463][SQL][DOCS] Add "Type Conversion" section in "Supported Data Types" of SQL docs

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #29837:
URL: https://github.com/apache/spark/pull/29837#discussion_r517131842



##########
File path: docs/sql-ref-datatypes.md
##########
@@ -314,3 +314,206 @@ SELECT COUNT(*), c2 FROM test GROUP BY c2;
 |        3| Infinity|
 +---------+---------+
 ```
+
+### Type Conversion
+
+In general, an expression can contain different data types and type conversion is the transformation of some data types into others in order to resolve type mismatches. 
+Spark supports both implicit conversions by type coercion and explicit conversions by explicit casting and store assignment casting.
+
+#### Type Coercion in Operations between Different Types 
+
+Type Coercion refers to the automatic or implicit conversion of values from one type to another when you need to to resolve type mismatches.
+The following matrix shows the resulting type to which they are implicitly converted to resolve an expression involving different data types.
+
+**Numeric Expressions**:
+
+|               |ByteType   |ShortType  |IntegerType |LongType   |FloatType             |DoubleType            |DecimalType                  |
+|---------------|-----------|-----------|------------|-----------|----------------------|----------------------|-----------------------------|
+|**ByteType**   |--         |ShortType  |IntegerType |LongType   |FloatType             |DoubleType            |DecimalType(3,0)<sup>1</sup> |
+|**ShortType**  |ShortType  |--         |IntegerType |LongType   |FloatType             |DoubleType            |DecimalType(5,0)<sup>1</sup> |
+|**IntegerType**|IntegerType|IntegerType|--          |LongType   |FloatType             |DoubleType            |DecimalType(10,0)<sup>1</sup>|
+|**LongType**   |LongType   |LongType   |LongType    |--         |FloatType             |DoubleType            |DecimalType(20,0)<sup>1</sup>|
+|**FloatType**  |FloatType  |FloatType  |FloatType   |FloatType  |--                    |DoubleType            |DoubleType                   |
+|**DoubleType** |DoubleType |DoubleType |DoubleType  |DoubleType |DoubleType            |--                    |DoubleType                   |
+|**DecimalType**|DecimalType|DecimalType|DecimalType |DecimalType|DoubleType<sup>2</sup>|DoubleType<sup>2</sup>|--                           |
+
+**Note 1**: DecimalType(precision,scale)   
+**Note 2**: In these cases DecimalType can lose precision, there is no common type for decimal and double because double's range is larger than decimal, and yet decimal is more precise than double so when we cast Decimaltype into DobleType it could lose precision.
+
+**StringType Behavior**  
+* Arithmetic Expressions: When we have an arithmetic expression with one operand of type StringType, both operands will be implicitly casted to DoubleType.
+
+    |               |ByteType   |ShortType  |IntegerType |LongType   |FloatType    |DoubleType  |
+    |---------------|-----------|-----------|------------|-----------|-------------|------------|
+    |**StringType** |DoubleType |DoubleType |DoubleType  |DoubleType |DoubleType   |DoubleType  |
+
+* Comparison: When we have a comparison expression with an operand of type StringType, the operand StringType will be casted implicitly according to the following table.
+
+    |               |ByteType   |ShortType  |IntegerType |LongType   |FloatType    |DoubleType  |DecimalType |DateType             |TimestampType             |
+    |---------------|-----------|-----------|------------|-----------|-------------|------------|------------|---------------------|--------------------------|
+    |**StringType** |ByteType   |ShortType  |IntegerType |LongType   |FloatType    |DoubleType  |DoubleType  |DateType<sup>1</sup> |TimestampType<sup>1</sup> |
+
+    **Note 1**: If `spark.sql.legacy.typeCoercion.datetimeToString` is true, DateType and TimestampType will be casted to StringType
+    
+* in, except, intersect, union, array: If the list of values has a StringType element, all the elements will be casted to StringType.
+ 
+* concat, concat_ws, array_join: All elements will be casted to StringType.
+
+* map_concat: If the list of key has a StringType element, all the keys will be casted to StringType. The same goes for the values.
+
+* if, when: If any of the results has StringType, all the results will be casted to StringType.
+
+**Time Expressions**:
+
+|                  |DateType     |TimestampType |
+|------------------|-------------|--------------|
+|**DateType**      |--           |TimestampType |
+|**TimestampType** |TimestampType|--            |
+
+
+**Possible implicit conversions**:
+
+|                  |ByteType  |ShortType |IntegerType |LongType |FloatType |DoubleType |DecimalType|StringType |BinaryType |BooleanType |TimestampType |DateType|
+|------------------|----------|----------|------------|---------|----------|-----------|-----------|-----------|-----------|------------|--------------|--------|
+|**ByteType**      |--        |X         |X           |X        |X         |X          |X          |X          |           |            |              |        |
+|**ShortType**     |X         |--        |X           |X        |X         |X          |X          |X          |           |            |              |        |
+|**IntegerType**   |X         |X         |--          |X        |X         |X          |X          |X          |           |            |              |        |
+|**LongType**      |X         |X         |X           |--       |X         |X          |X          |X          |           |            |              |        |
+|**FloatType**     |X         |X         |X           |X        |--        |X          |X          |X          |           |            |              |        |
+|**DoubleType**    |X         |X         |X           |X        |X         |--         |X          |X          |           |            |              |        |
+|**DecimalType**   |X         |X         |X           |X        |X         |X          |--         |X          |           |            |              |        |
+|**StringType**    |X         |X         |X           |X        |X         |X          |X          |--         |X          |X           |X             |X       |
+|**BinaryType**    |          |          |            |         |          |           |           |X          |--         |            |              |        |
+|**BooleanType**   |          |          |            |         |          |           |           |X          |           |--          |              |        |
+|**TimestampType** |          |          |            |         |          |           |           |X          |           |            |--            |X       |
+|**DateType**      |          |          |            |         |          |           |           |X          |           |            |X             |--      |
+
+#### Type Coercion Examples
+
+```sql
+DESCRIBE TABLE numericTable;
++-------------+---------+-------+
+|col_name     |data_type|comment|
++-------------+---------+-------+
+|integerColumn|int      |null   |
+|doubleColumn |double   |null   |
++-------------+---------+-------+
+
+DESCRIBE SELECT integerColumn + doubleColumn as result FROM numericTable;
++--------+---------+-------+
+|col_name|data_type|comment|
++--------+---------+-------+
+|  result|   double|   null|
++--------+---------+-------+
+
+```
+
+```sql
+DESCRIBE SELECT MONTHS_BETWEEN(CAST('2020-10-10' AS Date),CAST('2020-08-13' AS timestamp))
+
++------------------------------------------------------------------------------------------------+---------+-------+
+|col_name                                                                                        |data_type|comment|
++------------------------------------------------------------------------------------------------+---------+-------+
+|months_between(CAST(CAST(2020-10-10 AS DATE) AS TIMESTAMP), CAST(2020-08-13 AS TIMESTAMP), true)|double   |null   |
++------------------------------------------------------------------------------------------------+---------+-------+
+
+```
+
+```sql
+DESCRIBE SELECT 1 + '2'
+
++---------------------------------------+---------+-------+
+|col_name                               |data_type|comment|
++---------------------------------------+---------+-------+
+|(CAST(1 AS DOUBLE) + CAST(2 AS DOUBLE))|double   |null   |
++---------------------------------------+---------+-------+
+
+```
+
+```sql
+DESCRIBE SELECT 1 = '2'
+
++--------------------+---------+-------+
+|col_name            |data_type|comment|
++--------------------+---------+-------+
+|(1 = CAST(2 AS INT))|boolean  |null   |
++--------------------+---------+-------+
+
+```
+
+```sql
+DESCRIBE SELECT 1 IN ('2', 3)
+
++-------------------------------------------------------------+---------+-------+
+|col_name                                                     |data_type|comment|
++-------------------------------------------------------------+---------+-------+
+|(CAST(1 AS STRING) IN (CAST(2 AS STRING), CAST(3 AS STRING)))|boolean  |null   |
++-------------------------------------------------------------+---------+-------+
+
+```
+
+
+
+#### Explicit Casting and Store Assignment Casting

Review comment:
       ANSI mode is off by default, while store assignment uses the ANSI policy by default. As a result, the two behave differently by default.
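
The difference between the two defaults can be sketched with a toy model in Python. This is not Spark code: the function names are hypothetical, the string parsing is only an approximation, and the defaults shown are the ones assumed for the Spark 3.x line this thread targets (`spark.sql.ansi.enabled=false`, `spark.sql.storeAssignmentPolicy=ANSI`).

```python
from typing import Optional

# Defaults this sketch assumes for the Spark 3.x line discussed in the thread.
ASSUMED_DEFAULTS = {
    "spark.sql.ansi.enabled": False,
    "spark.sql.storeAssignmentPolicy": "ANSI",
}


def explicit_cast_to_int(value: str, ansi_enabled: bool) -> Optional[int]:
    """Toy stand-in for CAST(value AS INT) on a string input."""
    try:
        return int(value)
    except ValueError:
        if ansi_enabled:
            raise  # ANSI mode on: an invalid cast is an error
        return None  # ANSI mode off: an invalid cast yields NULL


def string_to_int_insert_allowed(policy: str) -> bool:
    """Toy stand-in for writing a string value into an INT column."""
    allowed = {"ANSI": False, "STRICT": False, "LEGACY": True}
    return allowed[policy.upper()]


ansi = ASSUMED_DEFAULTS["spark.sql.ansi.enabled"]
policy = ASSUMED_DEFAULTS["spark.sql.storeAssignmentPolicy"]
print(explicit_cast_to_int("abc", ansi))     # None: the cast silently produces NULL
print(string_to_int_insert_allowed(policy))  # False: the insert is rejected
```

Under these assumed defaults an explicit cast of a malformed string is tolerated, while the same conversion in an `INSERT` is refused, which is why a single "Explicit Casting and Store Assignment Casting" section would need to describe the two separately.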






[GitHub] [spark] planga82 commented on a change in pull request #29837: [SPARK-32463][SQL][DOCS] Add "Type Conversion" section in "Supported Data Types" of SQL docs

Posted by GitBox <gi...@apache.org>.
planga82 commented on a change in pull request #29837:
URL: https://github.com/apache/spark/pull/29837#discussion_r502800457



##########
File path: docs/sql-ref-datatypes.md
##########
@@ -314,3 +314,128 @@ SELECT COUNT(*), c2 FROM test GROUP BY c2;
 |        3| Infinity|
 +---------+---------+
 ```
+
+### Type conversion
+
+In general, an expression can contain different data types, type conversion is the transformation of some data types into others in order to solve the expressions. 
+Spark supports both implicit conversions by type coercion and explicit conversions by explicit casting and store assignment casting.
+
+#### Type coercion in operations between different types 
+
+The following matrix shows the resulting type to which they are implicitly converted to resolve an expression involving different data types 
+
+Numeric expresions:
+
+|               |ByteType   |ShortType  |IntegerType |LongType  |FloatType |DoubleType |StringType |
+|---------------|-----------|-----------|------------|----------|----------|-----------|-----------|
+|**ByteType**   |--         |ShortType  |IntegerType |LongType  |FloatType |DoubleType |DoubleType |
+|**ShortType**  |ShortType  |--         |IntegerType |LongType  |FloatType |DoubleType |DoubleType |
+|**IntegerType**|IntegerType|IntegerType|--          |LongType  |FloatType |DoubleType |DoubleType |
+|**LongType**   |LongType   |LongType   |LongType    |--        |FloatType |DoubleType |DoubleType |
+|**FloatType**  |FloatType  |FloatType  |FloatType   |FloatType |--        |DoubleType |DoubleType |
+|**DoubleType** |DoubleType |DoubleType |DoubleType  |DoubleType|DoubleType|--         |DoubleType |
+|**StringType** |DoubleType |DoubleType |DoubleType  |DoubleType|DoubleType|DoubleType |--         |
+
+The case of DecimalType, is treated differently, for example, there is no common type for double and decimal because double's range is larger than decimal, and yet decimal is more precise than double, but in an expresion, we would cast the decimal into double.
+
+Time expresions:
+
+|                  |DateType     |TimestampType |
+|------------------|-------------|--------------|
+|**DateType**      |--           |TimestampType |
+|**TimestampType** |TimestampType|--            |
+
+#### Type coercion examples
+
+```sql
+DESCRIBE TABLE numericTable;
++-------------+---------+-------+
+|col_name     |data_type|comment|
++-------------+---------+-------+
+|integerColumn|int      |null   |
+|doubleColumn |double   |null   |
++-------------+---------+-------+
+
+DESCRIBE SELECT integerColumn + doubleColumn as result FROM numericTable;
++--------+---------+-------+
+|col_name|data_type|comment|
++--------+---------+-------+
+|  result|   double|   null|
++--------+---------+-------+
+
+```
+
+```sql
+DESCRIBE dateTable;
++---------------+---------+-------+
+|       col_name|data_type|comment|
++---------------+---------+-------+
+|     dateColumn|     date|   null|
+|timestampColumn|timestamp|   null|
++---------------+---------+-------+
+
+SELECT MONTHS_BETWEEN(dateColumn,timestampColumn) FROM dateTable;
+

Review comment:
       I didn't add it because the result is an integer and it doesn't add anything interesting for the reader about type coercion between timestamp and date types, don't you think?






[GitHub] [spark] gengliangwang commented on a change in pull request #29837: [SPARK-32463][SQL][DOCS] Add "Type Conversion" section in "Supported Data Types" of SQL docs

Posted by GitBox <gi...@apache.org>.
gengliangwang commented on a change in pull request #29837:
URL: https://github.com/apache/spark/pull/29837#discussion_r513462200



##########
File path: docs/sql-ref-datatypes.md
##########
@@ -314,3 +314,200 @@ SELECT COUNT(*), c2 FROM test GROUP BY c2;
 |        3| Infinity|
 +---------+---------+
 ```
+
+### Type Conversion
+
+In general, an expression can contain different data types and type conversion is the transformation of some data types into others in order to resolve type mismatches. 
+Spark supports both implicit conversions by type coercion and explicit conversions by explicit casting and store assignment casting.
+
+#### Type Coercion in Operations between Different Types 
+
+Type Coercion refers to the automatic or implicit conversion of values from one type to another when you need to to resolve type mismatches.
+The following matrix shows the resulting type to which they are implicitly converted to resolve an expression involving different data types.
+
+**Numeric Expressions**:
+
+|               |ByteType   |ShortType  |IntegerType |LongType   |FloatType             |DoubleType            |DecimalType                  |
+|---------------|-----------|-----------|------------|-----------|----------------------|----------------------|-----------------------------|
+|**ByteType**   |--         |ShortType  |IntegerType |LongType   |FloatType             |DoubleType            |DecimalType(3,0)<sup>1</sup> |
+|**ShortType**  |ShortType  |--         |IntegerType |LongType   |FloatType             |DoubleType            |DecimalType(5,0)<sup>1</sup> |
+|**IntegerType**|IntegerType|IntegerType|--          |LongType   |FloatType             |DoubleType            |DecimalType(10,0)<sup>1</sup>|
+|**LongType**   |LongType   |LongType   |LongType    |--         |FloatType             |DoubleType            |DecimalType(20,0)<sup>1</sup>|
+|**FloatType**  |FloatType  |FloatType  |FloatType   |FloatType  |--                    |DoubleType            |DoubleType                   |
+|**DoubleType** |DoubleType |DoubleType |DoubleType  |DoubleType |DoubleType            |--                    |DoubleType                   |
+|**DecimalType**|DecimalType|DecimalType|DecimalType |DecimalType|DoubleType<sup>2</sup>|DoubleType<sup>2</sup>|--                           |
+
+**Note 1**: DecimalType(precision,scale)   
+**Note 2**: In these cases DecimalType can lose precision, there is no common type for decimal and double because double's range is larger than decimal, and yet decimal is more precise than double so when we cast Decimaltype into DobleType it could lose precision.
+
+**StringType Behavior**  
+* Arithmetic Expressions:
+
+    |               |ByteType   |ShortType  |IntegerType |LongType   |FloatType    |DoubleType  |
+    |---------------|-----------|-----------|------------|-----------|-------------|------------|
+    |**StringType** |DoubleType |DoubleType |DoubleType  |DoubleType |DoubleType   |DoubleType  |
+
+* Comparison:
+
+    |               |ByteType   |ShortType  |IntegerType |LongType   |FloatType    |DoubleType  |DecimalType |DateType             |TimestampType             |
+    |---------------|-----------|-----------|------------|-----------|-------------|------------|------------|---------------------|--------------------------|
+    |**StringType** |ByteType   |ShortType  |IntegerType |LongType   |FloatType    |DoubleType  |DoubleType  |DateType<sup>1</sup> |TimestampType<sup>1</sup> |
+
+    **Note 1**: If `spark.sql.legacy.typeCoercion.datetimeToString` is true, DateType and TimestampType will be casted to StringType
+    
+* IN Expressions: Expressions like `x IN list_values`.  If the list of values has a StringType element, all the elements will be casted to StringType  

Review comment:
       I think we need to write down all the expressions here. 
   ```
   In
   Except
   Intersect
   Union
   CreateArray
   Concat
   Sequence
   MapConcat
   CreateMap
   CaseWhen
   If
   ```
   Please check whether any are missing.
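
The rule stated in the quoted hunk for `IN` (if any element of the list is a StringType, every element is cast to StringType) can be sketched as a small Python function. This is a hypothetical illustration of that one sentence, not Spark's implementation, and whether each expression listed above follows exactly this rule is an assumption that would need checking against the analyzer.

```python
from typing import List


def promote_to_string(element_types: List[str]) -> List[str]:
    """Return the element types after the string-promotion rule is applied.

    If any element is StringType, all elements become StringType;
    otherwise the types are returned unchanged for other rules to handle.
    """
    if "StringType" in element_types:
        return ["StringType"] * len(element_types)
    return list(element_types)


# Mirrors the shape of `1 IN ('2', 3)`: the tested value plus the list elements.
print(promote_to_string(["IntegerType", "StringType", "IntegerType"]))
```

A list with no StringType element is passed through untouched here, on the assumption that the numeric rules decide its common type.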
   






[GitHub] [spark] planga82 commented on a change in pull request #29837: [SPARK-32463][DOCS] SQL data type compatibility

Posted by GitBox <gi...@apache.org>.
planga82 commented on a change in pull request #29837:
URL: https://github.com/apache/spark/pull/29837#discussion_r492821724



##########
File path: docs/sql-ref-datatypes.md
##########
@@ -314,3 +314,33 @@ SELECT COUNT(*), c2 FROM test GROUP BY c2;
 |        3| Infinity|
 +---------+---------+
 ```
+
+#### Data type compatibility
+
+The following is the hierarchy of data type compatibility. In an operation involving different and compatible data types, these will be promoted to the lowest common top type to perform the operation.
+
+For example, if you have an add operation between an integer and a float, the integer will be treated as a float, the least common compatible type, resulting the operation in a float.
+
+The most common operations where this hierarchy is applied are:

Review comment:
       Yes, I was trying to simplify by listing only the most common "operations", but I can extend the list. I'm going to review it. Thanks!

##########
File path: docs/sql-ref-datatypes.md
##########
@@ -314,3 +314,33 @@ SELECT COUNT(*), c2 FROM test GROUP BY c2;
 |        3| Infinity|
 +---------+---------+
 ```
+
+#### Data type compatibility
+
+The following is the hierarchy of data type compatibility. In an operation involving different and compatible data types, these will be promoted to the lowest common top type to perform the operation.
+
+For example, if you have an add operation between an integer and a float, the integer will be treated as a float, the least common compatible type, resulting the operation in a float.
+
+The most common operations where this hierarchy is applied are:

Review comment:
       I'm not sure that explaining these cases in a lot of detail is useful. It may be better to remove this list to avoid misunderstanding. @HyukjinKwon, what do you think?
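
For reference, the integer-plus-float example from the quoted hunk can be sketched in Python as a lookup over an ordered list of numeric types. The ordering is taken from the numeric matrix quoted elsewhere in this thread; the function name is hypothetical and the sketch deliberately ignores DecimalType and StringType.

```python
# Numeric types ordered from narrowest to widest, as in the quoted matrix.
NUMERIC_PRECEDENCE = (
    "ByteType",
    "ShortType",
    "IntegerType",
    "LongType",
    "FloatType",
    "DoubleType",
)


def wider_numeric_type(left: str, right: str) -> str:
    """Return whichever of the two numeric types comes later in the ordering."""
    rank = {name: position for position, name in enumerate(NUMERIC_PRECEDENCE)}
    return left if rank[left] >= rank[right] else right


# An add between an integer and a float is resolved as a float operation.
print(wider_numeric_type("IntegerType", "FloatType"))
```

Note that under this ordering a LongType paired with a FloatType also resolves to FloatType, matching the quoted matrix.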






[GitHub] [spark] AmplabJenkins commented on pull request #29837: [SPARK-32463][DOCS] SQL data type compatibility

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29837:
URL: https://github.com/apache/spark/pull/29837#issuecomment-696660088


   Can one of the admins verify this patch?




[GitHub] [spark] planga82 edited a comment on pull request #29837: [SPARK-32463][SQL][DOCS] Add "Type Conversion" section in "Supported Data Types" of SQL docs

Posted by GitBox <gi...@apache.org>.
planga82 edited a comment on pull request #29837:
URL: https://github.com/apache/spark/pull/29837#issuecomment-712418140


   I have updated the document with the StringType behavior and also added new examples.
   




[GitHub] [spark] huaxingao commented on a change in pull request #29837: [SPARK-32463][SQL][DOCS] Add "Type Conversion" section in "Supported Data Types" of SQL docs

Posted by GitBox <gi...@apache.org>.
huaxingao commented on a change in pull request #29837:
URL: https://github.com/apache/spark/pull/29837#discussion_r505898587



##########
File path: docs/sql-ref-datatypes.md
##########
@@ -314,3 +314,151 @@ SELECT COUNT(*), c2 FROM test GROUP BY c2;
 |        3| Infinity|
 +---------+---------+
 ```
+
+### Type Conversion
+
+In general, an expression can contain different data types and type conversion is the transformation of some data types into others in order to resolve type mismatches. 
+Spark supports both implicit conversions by type coercion and explicit conversions by explicit casting and store assignment casting.
+
+#### Type Coercion in Operations between Different Types 
+
+Type Coercion refers to the automatic or implicit conversion of values from one type to another when you need to resolve type mismatches.
+The following matrix shows the type to which the operands are implicitly converted to resolve an expression involving different data types.
+
+**Numeric expressions**:

Review comment:
       super nit: Numeric expressions -> Numeric Expressions?

##########
File path: docs/sql-ref-datatypes.md
##########
@@ -314,3 +314,151 @@ SELECT COUNT(*), c2 FROM test GROUP BY c2;
 |        3| Infinity|
 +---------+---------+
 ```
+
+### Type Conversion
+
+In general, an expression can contain different data types and type conversion is the transformation of some data types into others in order to resolve type mismatches. 
+Spark supports both implicit conversions by type coercion and explicit conversions by explicit casting and store assignment casting.
+
+#### Type Coercion in Operations between Different Types 
+
+Type Coercion refers to the automatic or implicit conversion of values from one type to another when you need to resolve type mismatches.
+The following matrix shows the type to which the operands are implicitly converted to resolve an expression involving different data types.
+
+**Numeric expressions**:
+
+|               |ByteType   |ShortType  |IntegerType |LongType   |FloatType             |DoubleType            |StringType |DecimalType                  |
+|---------------|-----------|-----------|------------|-----------|----------------------|----------------------|-----------|-----------------------------|
+|**ByteType**   |--         |ShortType  |IntegerType |LongType   |FloatType             |DoubleType            |DoubleType |DecimalType(3,0)<sup>1</sup> |
+|**ShortType**  |ShortType  |--         |IntegerType |LongType   |FloatType             |DoubleType            |DoubleType |DecimalType(5,0)<sup>1</sup> |
+|**IntegerType**|IntegerType|IntegerType|--          |LongType   |FloatType             |DoubleType            |DoubleType |DecimalType(10,0)<sup>1</sup>|
+|**LongType**   |LongType   |LongType   |LongType    |--         |FloatType             |DoubleType            |DoubleType |DecimalType(20,0)<sup>1</sup>|
+|**FloatType**  |FloatType  |FloatType  |FloatType   |FloatType  |--                    |DoubleType            |DoubleType |DoubleType                   |
+|**DoubleType** |DoubleType |DoubleType |DoubleType  |DoubleType |DoubleType            |--                    |DoubleType |DoubleType                   |
+|**StringType** |DoubleType |DoubleType |DoubleType  |DoubleType |DoubleType            |DoubleType            |--         |DoubleType                   |
+|**DecimalType**|DecimalType|DecimalType|DecimalType |DecimalType|DoubleType<sup>2</sup>|DoubleType<sup>2</sup>|DoubleType |--                           |
+
+**Note 1**: DecimalType(precision,scale)   
+**Note 2**: In these cases DecimalType can lose precision. There is no common type for decimal and double because double's range is larger than decimal's, yet decimal is more precise than double, so casting DecimalType to DoubleType can lose precision.
+
+**Time expressions**:

Review comment:
       super nit: Time expressions -> Time Expressions?

##########
File path: docs/sql-ref-datatypes.md
##########
@@ -314,3 +314,151 @@ SELECT COUNT(*), c2 FROM test GROUP BY c2;
 |        3| Infinity|
 +---------+---------+
 ```
+
+### Type Conversion
+
+In general, an expression can contain different data types and type conversion is the transformation of some data types into others in order to resolve type mismatches. 
+Spark supports both implicit conversions by type coercion and explicit conversions by explicit casting and store assignment casting.
+
+#### Type Coercion in Operations between Different Types 
+
+Type Coercion refers to the automatic or implicit conversion of values from one type to another when you need to resolve type mismatches.
+The following matrix shows the type to which the operands are implicitly converted to resolve an expression involving different data types.
+
+**Numeric expressions**:
+
+|               |ByteType   |ShortType  |IntegerType |LongType   |FloatType             |DoubleType            |StringType |DecimalType                  |
+|---------------|-----------|-----------|------------|-----------|----------------------|----------------------|-----------|-----------------------------|
+|**ByteType**   |--         |ShortType  |IntegerType |LongType   |FloatType             |DoubleType            |DoubleType |DecimalType(3,0)<sup>1</sup> |
+|**ShortType**  |ShortType  |--         |IntegerType |LongType   |FloatType             |DoubleType            |DoubleType |DecimalType(5,0)<sup>1</sup> |
+|**IntegerType**|IntegerType|IntegerType|--          |LongType   |FloatType             |DoubleType            |DoubleType |DecimalType(10,0)<sup>1</sup>|
+|**LongType**   |LongType   |LongType   |LongType    |--         |FloatType             |DoubleType            |DoubleType |DecimalType(20,0)<sup>1</sup>|
+|**FloatType**  |FloatType  |FloatType  |FloatType   |FloatType  |--                    |DoubleType            |DoubleType |DoubleType                   |
+|**DoubleType** |DoubleType |DoubleType |DoubleType  |DoubleType |DoubleType            |--                    |DoubleType |DoubleType                   |
+|**StringType** |DoubleType |DoubleType |DoubleType  |DoubleType |DoubleType            |DoubleType            |--         |DoubleType                   |
+|**DecimalType**|DecimalType|DecimalType|DecimalType |DecimalType|DoubleType<sup>2</sup>|DoubleType<sup>2</sup>|DoubleType |--                           |
+
+**Note 1**: DecimalType(precision,scale)   
+**Note 2**: In these cases DecimalType can lose precision. There is no common type for decimal and double because double's range is larger than decimal's, yet decimal is more precise than double, so casting DecimalType to DoubleType can lose precision.
+
+**Time expressions**:
+
+|                  |DateType     |TimestampType |
+|------------------|-------------|--------------|
+|**DateType**      |--           |TimestampType |
+|**TimestampType** |TimestampType|--            |
+
+
+**Possible implicit conversions**:

Review comment:
       super nit: Possible implicit conversions -> Possible Implicit Conversions?






[GitHub] [spark] huaxingao commented on a change in pull request #29837: [SPARK-32463][SQL][DOCS] SQL data type compatibility

Posted by GitBox <gi...@apache.org>.
huaxingao commented on a change in pull request #29837:
URL: https://github.com/apache/spark/pull/29837#discussion_r496069671



##########
File path: docs/sql-ref-datatypes.md
##########
@@ -314,3 +314,33 @@ SELECT COUNT(*), c2 FROM test GROUP BY c2;
 |        3| Infinity|
 +---------+---------+
 ```
+
+#### Data type compatibility
+
+The following is the hierarchy of data type compatibility. In an operation involving different and compatible data types, these will be promoted to the lowest common top type to perform the operation.
+
+For example, in an addition between an integer and a float, the integer is treated as a float, the least common compatible type, so the result of the operation is a float.
+
+The most common operations where this hierarchy is applied are:

Review comment:
       I guess we could keep the current hierarchy of compatible types table, because it shows the best-to-worst order, and also add a data conversion table in matrix format like the Oracle doc? The IBM doc actually has two tables: one is the Data Type Conversion Precedence List, the other is for casting between data types.
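
To illustrate how a precedence list and a matrix relate, here is a minimal sketch in plain Python. It is an illustrative model only, not Spark's implementation: the names `NUMERIC_PRECEDENCE`, `wider_numeric_type`, and `coercion_matrix` are hypothetical, and the ordering is simply read off the numeric matrix proposed in this PR. DecimalType and StringType are left out because their rules depend on precision and on the kind of expression.

```python
# Illustrative model only; not Spark's TypeCoercion implementation.
# The order below is read off the numeric matrix proposed in this PR.
NUMERIC_PRECEDENCE = [
    "ByteType",
    "ShortType",
    "IntegerType",
    "LongType",
    "FloatType",
    "DoubleType",
]


def wider_numeric_type(left, right):
    """Return whichever type appears later in the precedence list."""
    return max(left, right, key=NUMERIC_PRECEDENCE.index)


def coercion_matrix():
    """Derive the full matrix from the precedence list ('--' on the diagonal)."""
    return {
        (left, right): "--" if left == right else wider_numeric_type(left, right)
        for left in NUMERIC_PRECEDENCE
        for right in NUMERIC_PRECEDENCE
    }


if __name__ == "__main__":
    matrix = coercion_matrix()
    for left in NUMERIC_PRECEDENCE:
        print(left, [matrix[(left, right)] for right in NUMERIC_PRECEDENCE])
```

Under this model the matrix carries no information beyond the precedence list for these six types, which is one argument for showing both: the list gives the ordering at a glance, and the matrix is easier to look up.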






[GitHub] [spark] maropu commented on a change in pull request #29837: [SPARK-32463][DOCS] SQL data type compatibility

Posted by GitBox <gi...@apache.org>.
maropu commented on a change in pull request #29837:
URL: https://github.com/apache/spark/pull/29837#discussion_r494072159



##########
File path: docs/sql-ref-datatypes.md
##########
@@ -314,3 +314,33 @@ SELECT COUNT(*), c2 FROM test GROUP BY c2;
 |        3| Infinity|
 +---------+---------+
 ```
+
+#### Data type compatibility
+
+The following is the hierarchy of data type compatibility. In an operation involving different and compatible data types, these will be promoted to the lowest common top type to perform the operation.
+
+For example, in an addition between an integer and a float, the integer is treated as a float, the least common compatible type, so the result of the operation is a float.
+
+The most common operations where this hierarchy is applied are:

Review comment:
       Have you checked the other docs, e.g., Oracle, SQL Server, ...? If they already describe type coercion, I think we can refer to them.






[GitHub] [spark] planga82 commented on a change in pull request #29837: [SPARK-32463][SQL][DOCS] Add "Type Conversion" section in "Supported Data Types" of SQL docs

Posted by GitBox <gi...@apache.org>.
planga82 commented on a change in pull request #29837:
URL: https://github.com/apache/spark/pull/29837#discussion_r516958046



##########
File path: docs/sql-ref-datatypes.md
##########
@@ -314,3 +314,200 @@ SELECT COUNT(*), c2 FROM test GROUP BY c2;
 |        3| Infinity|
 +---------+---------+
 ```
+
+### Type Conversion
+
+In general, an expression can contain different data types and type conversion is the transformation of some data types into others in order to resolve type mismatches. 
+Spark supports both implicit conversions by type coercion and explicit conversions by explicit casting and store assignment casting.
+
+#### Type Coercion in Operations between Different Types 
+
+Type Coercion refers to the automatic or implicit conversion of values from one type to another when you need to resolve type mismatches.
+The following matrix shows the type to which the operands are implicitly converted to resolve an expression involving different data types.
+
+**Numeric Expressions**:
+
+|               |ByteType   |ShortType  |IntegerType |LongType   |FloatType             |DoubleType            |DecimalType                  |
+|---------------|-----------|-----------|------------|-----------|----------------------|----------------------|-----------------------------|
+|**ByteType**   |--         |ShortType  |IntegerType |LongType   |FloatType             |DoubleType            |DecimalType(3,0)<sup>1</sup> |
+|**ShortType**  |ShortType  |--         |IntegerType |LongType   |FloatType             |DoubleType            |DecimalType(5,0)<sup>1</sup> |
+|**IntegerType**|IntegerType|IntegerType|--          |LongType   |FloatType             |DoubleType            |DecimalType(10,0)<sup>1</sup>|
+|**LongType**   |LongType   |LongType   |LongType    |--         |FloatType             |DoubleType            |DecimalType(20,0)<sup>1</sup>|
+|**FloatType**  |FloatType  |FloatType  |FloatType   |FloatType  |--                    |DoubleType            |DoubleType                   |
+|**DoubleType** |DoubleType |DoubleType |DoubleType  |DoubleType |DoubleType            |--                    |DoubleType                   |
+|**DecimalType**|DecimalType|DecimalType|DecimalType |DecimalType|DoubleType<sup>2</sup>|DoubleType<sup>2</sup>|--                           |
+
+**Note 1**: DecimalType(precision,scale)   
+**Note 2**: In these cases DecimalType can lose precision. There is no common type for decimal and double because double's range is larger than decimal's, yet decimal is more precise than double, so casting DecimalType to DoubleType can lose precision.
+
+**StringType Behavior**  
+* Arithmetic Expressions:
+
+    |               |ByteType   |ShortType  |IntegerType |LongType   |FloatType    |DoubleType  |
+    |---------------|-----------|-----------|------------|-----------|-------------|------------|
+    |**StringType** |DoubleType |DoubleType |DoubleType  |DoubleType |DoubleType   |DoubleType  |
+
+* Comparison:
+
+    |               |ByteType   |ShortType  |IntegerType |LongType   |FloatType    |DoubleType  |DecimalType |DateType             |TimestampType             |
+    |---------------|-----------|-----------|------------|-----------|-------------|------------|------------|---------------------|--------------------------|
+    |**StringType** |ByteType   |ShortType  |IntegerType |LongType   |FloatType    |DoubleType  |DoubleType  |DateType<sup>1</sup> |TimestampType<sup>1</sup> |
+
+    **Note 1**: If `spark.sql.legacy.typeCoercion.datetimeToString` is true, DateType and TimestampType will be cast to StringType.
+    
+* IN Expressions: Expressions like `x IN list_values`. If the list of values has a StringType element, all the elements will be cast to StringType.

Review comment:
        I have included the expressions where I have checked that type casting to StringType is done. 
   I have left out the following:
   * greatest, array_intersect, array_union, array_except: They don't support different types, so there will be no casting.
   * Sequence: It doesn't support StringType.
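
The StringType rules in the hunk above can be summarized in a small lookup sketch. This is plain Python for illustration only, assuming nothing beyond the two tables and Note 1 in the diff; the function names and the `legacy_datetime_to_string` parameter (standing in for `spark.sql.legacy.typeCoercion.datetimeToString`) are hypothetical and do not exist in Spark.

```python
# Illustrative model of the StringType tables in the diff; not Spark code.
ARITHMETIC_TYPES = {
    "ByteType", "ShortType", "IntegerType",
    "LongType", "FloatType", "DoubleType",
}
DATETIME_TYPES = {"DateType", "TimestampType"}


def string_arithmetic_type(other):
    """Type both operands are cast to in `string <op> other` arithmetic."""
    if other in ARITHMETIC_TYPES:
        return "DoubleType"
    raise ValueError(f"not covered by the arithmetic table: {other}")


def string_comparison_type(other, legacy_datetime_to_string=False):
    """Type used to compare a StringType operand with `other`."""
    if other in DATETIME_TYPES:
        # Note 1: with the legacy flag set, the date/timestamp side is cast
        # to StringType instead of the string being cast to date/timestamp.
        return "StringType" if legacy_datetime_to_string else other
    if other == "DecimalType":
        return "DoubleType"
    if other in ARITHMETIC_TYPES:
        return other
    raise ValueError(f"not covered by the comparison table: {other}")


if __name__ == "__main__":
    print(string_arithmetic_type("IntegerType"))
    print(string_comparison_type("IntegerType"))
    print(string_comparison_type("DateType", legacy_datetime_to_string=True))
```

The sketch makes the asymmetry explicit: arithmetic always goes through DoubleType, while comparison follows the type of the non-string operand.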






[GitHub] [spark] planga82 commented on a change in pull request #29837: [SPARK-32463][SQL][DOCS] Add "Type Conversion" section in "Supported Data Types" of SQL docs

Posted by GitBox <gi...@apache.org>.
planga82 commented on a change in pull request #29837:
URL: https://github.com/apache/spark/pull/29837#discussion_r517453525



##########
File path: docs/sql-ref-datatypes.md
##########
@@ -314,3 +314,206 @@ SELECT COUNT(*), c2 FROM test GROUP BY c2;
 |        3| Infinity|
 +---------+---------+
 ```
+
+### Type Conversion
+
+In general, an expression can contain different data types and type conversion is the transformation of some data types into others in order to resolve type mismatches. 
+Spark supports both implicit conversions by type coercion and explicit conversions by explicit casting and store assignment casting.
+
+#### Type Coercion in Operations between Different Types 
+
+Type Coercion refers to the automatic or implicit conversion of values from one type to another when you need to resolve type mismatches.
+The following matrix shows the type to which the operands are implicitly converted to resolve an expression involving different data types.
+
+**Numeric Expressions**:
+
+|               |ByteType   |ShortType  |IntegerType |LongType   |FloatType             |DoubleType            |DecimalType                  |
+|---------------|-----------|-----------|------------|-----------|----------------------|----------------------|-----------------------------|
+|**ByteType**   |--         |ShortType  |IntegerType |LongType   |FloatType             |DoubleType            |DecimalType(3,0)<sup>1</sup> |
+|**ShortType**  |ShortType  |--         |IntegerType |LongType   |FloatType             |DoubleType            |DecimalType(5,0)<sup>1</sup> |
+|**IntegerType**|IntegerType|IntegerType|--          |LongType   |FloatType             |DoubleType            |DecimalType(10,0)<sup>1</sup>|
+|**LongType**   |LongType   |LongType   |LongType    |--         |FloatType             |DoubleType            |DecimalType(20,0)<sup>1</sup>|
+|**FloatType**  |FloatType  |FloatType  |FloatType   |FloatType  |--                    |DoubleType            |DoubleType                   |
+|**DoubleType** |DoubleType |DoubleType |DoubleType  |DoubleType |DoubleType            |--                    |DoubleType                   |
+|**DecimalType**|DecimalType|DecimalType|DecimalType |DecimalType|DoubleType<sup>2</sup>|DoubleType<sup>2</sup>|--                           |
+
+**Note 1**: DecimalType(precision,scale)   
+**Note 2**: In these cases DecimalType can lose precision. There is no common type for decimal and double because double's range is larger than decimal's, yet decimal is more precise than double, so casting DecimalType to DoubleType can lose precision.
+
+**StringType Behavior**  
+* Arithmetic Expressions: When an arithmetic expression has one operand of type StringType, both operands will be implicitly cast to DoubleType.
+
+    |               |ByteType   |ShortType  |IntegerType |LongType   |FloatType    |DoubleType  |
+    |---------------|-----------|-----------|------------|-----------|-------------|------------|
+    |**StringType** |DoubleType |DoubleType |DoubleType  |DoubleType |DoubleType   |DoubleType  |
+
+* Comparison: When a comparison expression has an operand of type StringType, the StringType operand will be implicitly cast according to the following table.
+
+    |               |ByteType   |ShortType  |IntegerType |LongType   |FloatType    |DoubleType  |DecimalType |DateType             |TimestampType             |
+    |---------------|-----------|-----------|------------|-----------|-------------|------------|------------|---------------------|--------------------------|
+    |**StringType** |ByteType   |ShortType  |IntegerType |LongType   |FloatType    |DoubleType  |DoubleType  |DateType<sup>1</sup> |TimestampType<sup>1</sup> |
+
+    **Note 1**: If `spark.sql.legacy.typeCoercion.datetimeToString` is true, DateType and TimestampType will be cast to StringType.
+    
+* in, except, intersect, union, array: If the list of values has a StringType element, all the elements will be cast to StringType.
+ 
+* concat, concat_ws, array_join: All elements will be cast to StringType.
+
+* map_concat: If the list of keys has a StringType element, all the keys will be cast to StringType. The same goes for the values.
+
+* if, when: If any of the results has StringType, all the results will be cast to StringType.
+
+**Time Expressions**:
+
+|                  |DateType     |TimestampType |
+|------------------|-------------|--------------|
+|**DateType**      |--           |TimestampType |
+|**TimestampType** |TimestampType|--            |
+
+
+**Possible implicit conversions**:
+
+|                  |ByteType  |ShortType |IntegerType |LongType |FloatType |DoubleType |DecimalType|StringType |BinaryType |BooleanType |TimestampType |DateType|
+|------------------|----------|----------|------------|---------|----------|-----------|-----------|-----------|-----------|------------|--------------|--------|
+|**ByteType**      |--        |X         |X           |X        |X         |X          |X          |X          |           |            |              |        |
+|**ShortType**     |X         |--        |X           |X        |X         |X          |X          |X          |           |            |              |        |
+|**IntegerType**   |X         |X         |--          |X        |X         |X          |X          |X          |           |            |              |        |
+|**LongType**      |X         |X         |X           |--       |X         |X          |X          |X          |           |            |              |        |
+|**FloatType**     |X         |X         |X           |X        |--        |X          |X          |X          |           |            |              |        |
+|**DoubleType**    |X         |X         |X           |X        |X         |--         |X          |X          |           |            |              |        |
+|**DecimalType**   |X         |X         |X           |X        |X         |X          |--         |X          |           |            |              |        |
+|**StringType**    |X         |X         |X           |X        |X         |X          |X          |--         |X          |X           |X             |X       |
+|**BinaryType**    |          |          |            |         |          |           |           |X          |--         |            |              |        |
+|**BooleanType**   |          |          |            |         |          |           |           |X          |           |--          |              |        |
+|**TimestampType** |          |          |            |         |          |           |           |X          |           |            |--            |X       |
+|**DateType**      |          |          |            |         |          |           |           |X          |           |            |X             |--      |
+
+#### Type Coercion Examples
+
+```sql
+DESCRIBE TABLE numericTable;
++-------------+---------+-------+
+|col_name     |data_type|comment|
++-------------+---------+-------+
+|integerColumn|int      |null   |
+|doubleColumn |double   |null   |
++-------------+---------+-------+
+
+DESCRIBE SELECT integerColumn + doubleColumn as result FROM numericTable;
++--------+---------+-------+
+|col_name|data_type|comment|
++--------+---------+-------+
+|  result|   double|   null|
++--------+---------+-------+
+
+```
+
+```sql
+DESCRIBE SELECT MONTHS_BETWEEN(CAST('2020-10-10' AS Date),CAST('2020-08-13' AS timestamp))
+
++------------------------------------------------------------------------------------------------+---------+-------+
+|col_name                                                                                        |data_type|comment|
++------------------------------------------------------------------------------------------------+---------+-------+
+|months_between(CAST(CAST(2020-10-10 AS DATE) AS TIMESTAMP), CAST(2020-08-13 AS TIMESTAMP), true)|double   |null   |
++------------------------------------------------------------------------------------------------+---------+-------+
+
+```
+
+```sql
+DESCRIBE SELECT 1 + '2'
+
++---------------------------------------+---------+-------+
+|col_name                               |data_type|comment|
++---------------------------------------+---------+-------+
+|(CAST(1 AS DOUBLE) + CAST(2 AS DOUBLE))|double   |null   |
++---------------------------------------+---------+-------+
+
+```
+
+```sql
+DESCRIBE SELECT 1 = '2'
+
++--------------------+---------+-------+
+|col_name            |data_type|comment|
++--------------------+---------+-------+
+|(1 = CAST(2 AS INT))|boolean  |null   |
++--------------------+---------+-------+
+
+```
+
+```sql
+DESCRIBE SELECT 1 IN ('2', 3)
+
++-------------------------------------------------------------+---------+-------+
+|col_name                                                     |data_type|comment|
++-------------------------------------------------------------+---------+-------+
+|(CAST(1 AS STRING) IN (CAST(2 AS STRING), CAST(3 AS STRING)))|boolean  |null   |
++-------------------------------------------------------------+---------+-------+
+
+```
+
+
+
+#### Explicit Casting and Store Assignment Casting

Review comment:
       Ok, thanks, I'll explain it in the table notes






[GitHub] [spark] github-actions[bot] commented on pull request #29837: [SPARK-32463][SQL][DOCS] Add "Type Conversion" section in "Supported Data Types" of SQL docs

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #29837:
URL: https://github.com/apache/spark/pull/29837#issuecomment-778531123


   We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
   If you'd like to revive this PR, please reopen it and ask a committer to remove the Stale tag!




[GitHub] [spark] maropu commented on a change in pull request #29837: [SPARK-32463][SQL][DOCS] SQL data type compatibility

Posted by GitBox <gi...@apache.org>.
maropu commented on a change in pull request #29837:
URL: https://github.com/apache/spark/pull/29837#discussion_r501381733



##########
File path: docs/sql-ref-datatypes.md
##########
@@ -314,3 +314,51 @@ SELECT COUNT(*), c2 FROM test GROUP BY c2;
 |        3| Infinity|
 +---------+---------+
 ```
+
+#### Data type compatibility

Review comment:
       nit: `Data type compatibility` => `Type Conversion`?






[GitHub] [spark] AmplabJenkins commented on pull request #29837: [SPARK-32463][DOCS] SQL data type compatibility

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29837:
URL: https://github.com/apache/spark/pull/29837#issuecomment-696659480


   Can one of the admins verify this patch?




[GitHub] [spark] planga82 commented on pull request #29837: [SPARK-32463][SQL][DOCS] Add "Type Conversion" section in "Supported Data Types" of SQL docs

Posted by GitBox <gi...@apache.org>.
planga82 commented on pull request #29837:
URL: https://github.com/apache/spark/pull/29837#issuecomment-706521610


   Thanks @huaxingao !
   > I quickly tried a few explicit casting, seems I can cast Boolean to Timestamp OK?
   I noticed it, but I didn't put it in because even though it was allowed, I thought it didn't make sense. Shall we put it in anyway?
   




[GitHub] [spark] maropu commented on a change in pull request #29837: [SPARK-32463][SQL][DOCS] Add "Type Conversion" section in "Supported Data Types" of SQL docs

Posted by GitBox <gi...@apache.org>.
maropu commented on a change in pull request #29837:
URL: https://github.com/apache/spark/pull/29837#discussion_r502785507



##########
File path: docs/sql-ref-datatypes.md
##########
@@ -314,3 +314,128 @@ SELECT COUNT(*), c2 FROM test GROUP BY c2;
 |        3| Infinity|
 +---------+---------+
 ```
+
+### Type conversion
+
+In general, an expression can contain different data types; type conversion is the transformation of some data types into others in order to resolve the expression. 
+Spark supports both implicit conversions by type coercion and explicit conversions by explicit casting and store assignment casting.
+
+#### Type coercion in operations between different types 
+

Review comment:
       Could you describe what type coercion is at the beginning?

##########
File path: docs/sql-ref-datatypes.md
##########
@@ -314,3 +314,128 @@ SELECT COUNT(*), c2 FROM test GROUP BY c2;
 |        3| Infinity|
 +---------+---------+
 ```
+
+### Type conversion
+
+In general, an expression can contain different data types; type conversion is the transformation of some data types into others in order to resolve the expression. 
+Spark supports both implicit conversions by type coercion and explicit conversions by explicit casting and store assignment casting.
+
+#### Type coercion in operations between different types 
+
+The following matrix shows the resulting type to which they are implicitly converted to resolve an expression involving different data types 
+
+Numeric expresions:
+
+|               |ByteType   |ShortType  |IntegerType |LongType  |FloatType |DoubleType |StringType |
+|---------------|-----------|-----------|------------|----------|----------|-----------|-----------|
+|**ByteType**   |--         |ShortType  |IntegerType |LongType  |FloatType |DoubleType |DoubleType |
+|**ShortType**  |ShortType  |--         |IntegerType |LongType  |FloatType |DoubleType |DoubleType |
+|**IntegerType**|IntegerType|IntegerType|--          |LongType  |FloatType |DoubleType |DoubleType |
+|**LongType**   |LongType   |LongType   |LongType    |--        |FloatType |DoubleType |DoubleType |
+|**FloatType**  |FloatType  |FloatType  |FloatType   |FloatType |--        |DoubleType |DoubleType |
+|**DoubleType** |DoubleType |DoubleType |DoubleType  |DoubleType|DoubleType|--         |DoubleType |
+|**StringType** |DoubleType |DoubleType |DoubleType  |DoubleType|DoubleType|DoubleType |--         |

Review comment:
       This matrix describes the type coercion rules specific to binary arithmetic operations, right? Could you add a matrix for type coercion in more general cases, too, like `Table 2-10 Implicit Type Conversion Matrix` in the Oracle doc? You could refer to the rules in `TypeCoercion.scala` https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala#L983-L1064
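
To make the binary-arithmetic reading of the matrix concrete, here is a minimal Python sketch that reproduces only the "Numeric expressions" matrix quoted in the diff above. It is a toy model, not Spark's `TypeCoercion` logic; the names `NUMERIC_PRECEDENCE` and `binary_arithmetic_type` are hypothetical and exist only for illustration.

```python
# Toy model of the "Numeric expressions" matrix quoted in the diff above.
# This is NOT Spark's TypeCoercion implementation; names are illustrative.

# Numeric types ordered from narrowest to widest, as in the matrix header.
NUMERIC_PRECEDENCE = [
    "ByteType", "ShortType", "IntegerType",
    "LongType", "FloatType", "DoubleType",
]


def binary_arithmetic_type(left: str, right: str) -> str:
    """Return the result type the matrix gives for `left <op> right`."""
    if left == right:
        # The matrix marks the diagonal with "--": no conversion is needed.
        return left
    if "StringType" in (left, right):
        # Every StringType row/column entry in the matrix is DoubleType.
        return "DoubleType"
    # Otherwise the wider of the two numeric types wins.
    return max(left, right, key=NUMERIC_PRECEDENCE.index)
```

For example, `binary_arithmetic_type("LongType", "FloatType")` gives `"FloatType"`, matching the LongType row of the matrix. The more general cases asked for in the comment (function arguments, `IN`, `CASE WHEN`, and so on) are not covered by this sketch.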

##########
File path: docs/sql-ref-datatypes.md
##########
@@ -314,3 +314,128 @@ SELECT COUNT(*), c2 FROM test GROUP BY c2;
 |        3| Infinity|
 +---------+---------+
 ```
+
+### Type conversion
+
+In general, an expression can contain different data types, type conversion is the transformation of some data types into others in order to solve the expressions. 

Review comment:
       `in order to solve the expressions.` -> `in order to resolve type mismatches.`?

##########
File path: docs/sql-ref-datatypes.md
##########
@@ -314,3 +314,128 @@ SELECT COUNT(*), c2 FROM test GROUP BY c2;
 |        3| Infinity|
 +---------+---------+
 ```
+
+### Type conversion
+
+In general, an expression can contain different data types, type conversion is the transformation of some data types into others in order to solve the expressions. 
+Spark supports both implicit conversions by type coercion and explicit conversions by explicit casting and store assignment casting.
+
+#### Type coercion in operations between different types 
+
+The following matrix shows the resulting type to which they are implicitly converted to resolve an expression involving different data types 
+
+Numeric expresions:
+
+|               |ByteType   |ShortType  |IntegerType |LongType  |FloatType |DoubleType |StringType |
+|---------------|-----------|-----------|------------|----------|----------|-----------|-----------|
+|**ByteType**   |--         |ShortType  |IntegerType |LongType  |FloatType |DoubleType |DoubleType |
+|**ShortType**  |ShortType  |--         |IntegerType |LongType  |FloatType |DoubleType |DoubleType |
+|**IntegerType**|IntegerType|IntegerType|--          |LongType  |FloatType |DoubleType |DoubleType |
+|**LongType**   |LongType   |LongType   |LongType    |--        |FloatType |DoubleType |DoubleType |
+|**FloatType**  |FloatType  |FloatType  |FloatType   |FloatType |--        |DoubleType |DoubleType |
+|**DoubleType** |DoubleType |DoubleType |DoubleType  |DoubleType|DoubleType|--         |DoubleType |
+|**StringType** |DoubleType |DoubleType |DoubleType  |DoubleType|DoubleType|DoubleType |--         |
+
+The case of DecimalType, is treated differently, for example, there is no common type for double and decimal because double's range is larger than decimal, and yet decimal is more precise than double, but in an expresion, we would cast the decimal into double.
+
+Time expresions:
+
+|                  |DateType     |TimestampType |
+|------------------|-------------|--------------|
+|**DateType**      |--           |TimestampType |
+|**TimestampType** |TimestampType|--            |
+
+#### Type coercion examples
+
+```sql
+DESCRIBE TABLE numericTable;
++-------------+---------+-------+
+|col_name     |data_type|comment|
++-------------+---------+-------+
+|integerColumn|int      |null   |
+|doubleColumn |double   |null   |
++-------------+---------+-------+
+
+DESCRIBE SELECT integerColumn + doubleColumn as result FROM numericTable;
++--------+---------+-------+
+|col_name|data_type|comment|
++--------+---------+-------+
+|  result|   double|   null|
++--------+---------+-------+
+
+```
+
+```sql
+DESCRIBE dateTable;
++---------------+---------+-------+
+|       col_name|data_type|comment|
++---------------+---------+-------+
+|     dateColumn|     date|   null|
+|timestampColumn|timestamp|   null|
++---------------+---------+-------+
+
+SELECT MONTHS_BETWEEN(dateColumn,timestampColumn) FROM dateTable;
+
+```
+
+#### Explicit casting and store assignment casting
+
+When you are using explicit casting by CAST or doing INSERT INTO operations that need to cast types to different store types, the following matrix shows if the conversion is allowed
+
+|             |ByteType  |ShortType |IntegerType |LongType |FloatType |DoubleType |StringType |BinaryType |BooleanType |TimestampType |DateType|
+|-------------|----------|----------|------------|---------|----------|-----------|-----------|-----------|------------|--------------|--------|
+|**ByteType** |--        |X         |X           |X        |X         |X          |X          |X          |X           |X             |        |
+|**ShortType**|*         |--        |X           |X        |X         |X          |X          |X          |X           |X             |        |
+|**IntegerType**|*       |*         |--          |X        |X         |X          |X          |X          |X           |X             |        |
+|**LongType** |*         |*         |*           |--       |X         |X          |X          |X          |X           |X             |        |
+|**FloatType** |*        |*         |*           |*        |--        |X          |X          |           |X           |X             |        |
+|**DoubleType** |*       |*         |*           |*        |*         |--         |X          |           |X           |X             |        |
+|**StringType** |*       |*         |*           |*        |*         |*          |--         |X          |X           |X             |X       |
+|**BinaryType** |        |          |            |         |          |           |           |--         |            |              |        |
+|**BooleanType** |X      |X         |X           |X        |X         |X          |X          |           |--          |              |        |
+|**TimestampType** |*    |*         |*           |X        |X         |X          |X          |           |            |--            |X       |
+|**DateType** |*         |*         |X           |X        |X         |X          |X          |           |            |X             |--      |
+
+X: Conversion allowed (cast ByteType in ShortType)  
+*: An overflow can occur (cast ShortType in ByteType)
+
+If an overflow occurs and ANSI compliance is activated (spark.sql.ansi.enabled is set to true for casting or spark.sql.storeAssignmentPolicy=ANSI for store assignment casting) an exception will be thrown. 

Review comment:
       spark.sql.storeAssignmentPolicy=ANSI -> \`spark.sql.storeAssignmentPolicy=ANSI\`

##########
File path: docs/sql-ref-datatypes.md
##########
@@ -314,3 +314,128 @@ SELECT COUNT(*), c2 FROM test GROUP BY c2;
 |        3| Infinity|
 +---------+---------+
 ```
+
+### Type conversion
+
+In general, an expression can contain different data types, type conversion is the transformation of some data types into others in order to solve the expressions. 

Review comment:
       `... can contain different data types, type conversion ...` -> `... can contain different data types and type conversion ...`?

##########
File path: docs/sql-ref-datatypes.md
##########
@@ -314,3 +314,128 @@ SELECT COUNT(*), c2 FROM test GROUP BY c2;
 |        3| Infinity|
 +---------+---------+
 ```
+
+### Type conversion

Review comment:
       `conversion` -> `Conversion`

##########
File path: docs/sql-ref-datatypes.md
##########
@@ -314,3 +314,128 @@ SELECT COUNT(*), c2 FROM test GROUP BY c2;
 |        3| Infinity|
 +---------+---------+
 ```
+
+### Type conversion
+
+In general, an expression can contain different data types, type conversion is the transformation of some data types into others in order to solve the expressions. 
+Spark supports both implicit conversions by type coercion and explicit conversions by explicit casting and store assignment casting.
+
+#### Type coercion in operations between different types 
+
+The following matrix shows the resulting type to which they are implicitly converted to resolve an expression involving different data types 
+
+Numeric expresions:
+
+|               |ByteType   |ShortType  |IntegerType |LongType  |FloatType |DoubleType |StringType |
+|---------------|-----------|-----------|------------|----------|----------|-----------|-----------|
+|**ByteType**   |--         |ShortType  |IntegerType |LongType  |FloatType |DoubleType |DoubleType |
+|**ShortType**  |ShortType  |--         |IntegerType |LongType  |FloatType |DoubleType |DoubleType |
+|**IntegerType**|IntegerType|IntegerType|--          |LongType  |FloatType |DoubleType |DoubleType |
+|**LongType**   |LongType   |LongType   |LongType    |--        |FloatType |DoubleType |DoubleType |
+|**FloatType**  |FloatType  |FloatType  |FloatType   |FloatType |--        |DoubleType |DoubleType |
+|**DoubleType** |DoubleType |DoubleType |DoubleType  |DoubleType|DoubleType|--         |DoubleType |
+|**StringType** |DoubleType |DoubleType |DoubleType  |DoubleType|DoubleType|DoubleType |--         |
+
+The case of DecimalType, is treated differently, for example, there is no common type for double and decimal because double's range is larger than decimal, and yet decimal is more precise than double, but in an expresion, we would cast the decimal into double.
+
+Time expresions:
+
+|                  |DateType     |TimestampType |
+|------------------|-------------|--------------|
+|**DateType**      |--           |TimestampType |
+|**TimestampType** |TimestampType|--            |
+
+#### Type coercion examples
+
+```sql
+DESCRIBE TABLE numericTable;
++-------------+---------+-------+
+|col_name     |data_type|comment|
++-------------+---------+-------+
+|integerColumn|int      |null   |
+|doubleColumn |double   |null   |
++-------------+---------+-------+
+
+DESCRIBE SELECT integerColumn + doubleColumn as result FROM numericTable;
++--------+---------+-------+
+|col_name|data_type|comment|
++--------+---------+-------+
+|  result|   double|   null|
++--------+---------+-------+
+
+```
+
+```sql
+DESCRIBE dateTable;
++---------------+---------+-------+
+|       col_name|data_type|comment|
++---------------+---------+-------+
+|     dateColumn|     date|   null|
+|timestampColumn|timestamp|   null|
++---------------+---------+-------+
+
+SELECT MONTHS_BETWEEN(dateColumn,timestampColumn) FROM dateTable;
+
+```
+
+#### Explicit casting and store assignment casting
+
+When you are using explicit casting by CAST or doing INSERT INTO operations that need to cast types to different store types, the following matrix shows if the conversion is allowed
+
+|             |ByteType  |ShortType |IntegerType |LongType |FloatType |DoubleType |StringType |BinaryType |BooleanType |TimestampType |DateType|
+|-------------|----------|----------|------------|---------|----------|-----------|-----------|-----------|------------|--------------|--------|
+|**ByteType** |--        |X         |X           |X        |X         |X          |X          |X          |X           |X             |        |
+|**ShortType**|*         |--        |X           |X        |X         |X          |X          |X          |X           |X             |        |
+|**IntegerType**|*       |*         |--          |X        |X         |X          |X          |X          |X           |X             |        |
+|**LongType** |*         |*         |*           |--       |X         |X          |X          |X          |X           |X             |        |
+|**FloatType** |*        |*         |*           |*        |--        |X          |X          |           |X           |X             |        |
+|**DoubleType** |*       |*         |*           |*        |*         |--         |X          |           |X           |X             |        |
+|**StringType** |*       |*         |*           |*        |*         |*          |--         |X          |X           |X             |X       |
+|**BinaryType** |        |          |            |         |          |           |           |--         |            |              |        |
+|**BooleanType** |X      |X         |X           |X        |X         |X          |X          |           |--          |              |        |
+|**TimestampType** |*    |*         |*           |X        |X         |X          |X          |           |            |--            |X       |
+|**DateType** |*         |*         |X           |X        |X         |X          |X          |           |            |X             |--      |
+
+X: Conversion allowed (cast ByteType in ShortType)  
+*: An overflow can occur (cast ShortType in ByteType)
+
+If an overflow occurs and ANSI compliance is activated (spark.sql.ansi.enabled is set to true for casting or spark.sql.storeAssignmentPolicy=ANSI for store assignment casting) an exception will be thrown. 

Review comment:
       spark.sql.ansi.enabled -> \`spark.sql.ansi.enabled\`
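
As a rough illustration of the sentence under review, the Python sketch below models a cast to the ByteType range with and without ANSI compliance. The function `cast_to_byte` and its `ansi_enabled` flag are hypothetical stand-ins for `spark.sql.ansi.enabled`. The wraparound in the non-ANSI branch is an assumption (Java-style narrowing) that the doc text itself does not state, so it should be checked before being documented.

```python
# Hypothetical model of an integral cast to ByteType; not Spark code.
BYTE_MIN, BYTE_MAX = -128, 127


def cast_to_byte(value: int, ansi_enabled: bool) -> int:
    """Cast an integer into the ByteType range [-128, 127]."""
    if BYTE_MIN <= value <= BYTE_MAX:
        # The value fits, so the cast is lossless in either mode.
        return value
    if ansi_enabled:
        # Mirrors the doc text: an overflow under ANSI compliance raises.
        raise OverflowError(f"{value} does not fit in ByteType")
    # Assumption: non-ANSI mode wraps like a Java narrowing conversion.
    return (value - BYTE_MIN) % 256 + BYTE_MIN
```

The same shape would apply to the other cells marked `*` in the matrix, with the bounds of the respective target type.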

##########
File path: docs/sql-ref-datatypes.md
##########
@@ -314,3 +314,128 @@ SELECT COUNT(*), c2 FROM test GROUP BY c2;
 |        3| Infinity|
 +---------+---------+
 ```
+
+### Type conversion
+
+In general, an expression can contain different data types, type conversion is the transformation of some data types into others in order to solve the expressions. 
+Spark supports both implicit conversions by type coercion and explicit conversions by explicit casting and store assignment casting.
+
+#### Type coercion in operations between different types 
+
+The following matrix shows the resulting type to which they are implicitly converted to resolve an expression involving different data types 
+
+Numeric expresions:
+
+|               |ByteType   |ShortType  |IntegerType |LongType  |FloatType |DoubleType |StringType |
+|---------------|-----------|-----------|------------|----------|----------|-----------|-----------|
+|**ByteType**   |--         |ShortType  |IntegerType |LongType  |FloatType |DoubleType |DoubleType |
+|**ShortType**  |ShortType  |--         |IntegerType |LongType  |FloatType |DoubleType |DoubleType |
+|**IntegerType**|IntegerType|IntegerType|--          |LongType  |FloatType |DoubleType |DoubleType |
+|**LongType**   |LongType   |LongType   |LongType    |--        |FloatType |DoubleType |DoubleType |
+|**FloatType**  |FloatType  |FloatType  |FloatType   |FloatType |--        |DoubleType |DoubleType |
+|**DoubleType** |DoubleType |DoubleType |DoubleType  |DoubleType|DoubleType|--         |DoubleType |
+|**StringType** |DoubleType |DoubleType |DoubleType  |DoubleType|DoubleType|DoubleType |--         |
+
+The case of DecimalType, is treated differently, for example, there is no common type for double and decimal because double's range is larger than decimal, and yet decimal is more precise than double, but in an expresion, we would cast the decimal into double.

Review comment:
       Could you describe more about decimal type coercion by referring to `DecimalPrecision`?  https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/DecimalPrecision.scala#L29-L63
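
For reference, a minimal Python sketch of the result-type arithmetic for decimal addition and multiplication, as I understand the rules summarized in the header comment of `DecimalPrecision` linked above. The function names are hypothetical, a decimal type is modeled as a plain `(precision, scale)` tuple, and the cap at the maximum precision plus any precision-loss adjustment are deliberately left out, so please verify against the linked file.

```python
from typing import Tuple

# A decimal type modeled as (precision, scale); hypothetical alias.
DecimalType = Tuple[int, int]


def add_result_type(left: DecimalType, right: DecimalType) -> DecimalType:
    """Result type of e1 + e2, ignoring the maximum-precision cap."""
    p1, s1 = left
    p2, s2 = right
    scale = max(s1, s2)
    # Widest integral part of either operand, plus one digit for a carry.
    precision = scale + max(p1 - s1, p2 - s2) + 1
    return precision, scale


def multiply_result_type(left: DecimalType, right: DecimalType) -> DecimalType:
    """Result type of e1 * e2, ignoring the maximum-precision cap."""
    p1, s1 = left
    p2, s2 = right
    return p1 + p2 + 1, s1 + s2
```

Under these assumptions, adding a `DECIMAL(10, 2)` to a `DECIMAL(5, 3)` yields `(12, 3)`, and multiplying them yields `(16, 5)`.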

##########
File path: docs/sql-ref-datatypes.md
##########
@@ -314,3 +314,128 @@ SELECT COUNT(*), c2 FROM test GROUP BY c2;
 |        3| Infinity|
 +---------+---------+
 ```
+
+### Type conversion
+
+In general, an expression can contain different data types, type conversion is the transformation of some data types into others in order to solve the expressions. 
+Spark supports both implicit conversions by type coercion and explicit conversions by explicit casting and store assignment casting.
+
+#### Type coercion in operations between different types 
+
+The following matrix shows the resulting type to which they are implicitly converted to resolve an expression involving different data types 
+
+Numeric expresions:
+
+|               |ByteType   |ShortType  |IntegerType |LongType  |FloatType |DoubleType |StringType |
+|---------------|-----------|-----------|------------|----------|----------|-----------|-----------|
+|**ByteType**   |--         |ShortType  |IntegerType |LongType  |FloatType |DoubleType |DoubleType |
+|**ShortType**  |ShortType  |--         |IntegerType |LongType  |FloatType |DoubleType |DoubleType |
+|**IntegerType**|IntegerType|IntegerType|--          |LongType  |FloatType |DoubleType |DoubleType |
+|**LongType**   |LongType   |LongType   |LongType    |--        |FloatType |DoubleType |DoubleType |
+|**FloatType**  |FloatType  |FloatType  |FloatType   |FloatType |--        |DoubleType |DoubleType |
+|**DoubleType** |DoubleType |DoubleType |DoubleType  |DoubleType|DoubleType|--         |DoubleType |
+|**StringType** |DoubleType |DoubleType |DoubleType  |DoubleType|DoubleType|DoubleType |--         |
+
+The case of DecimalType, is treated differently, for example, there is no common type for double and decimal because double's range is larger than decimal, and yet decimal is more precise than double, but in an expresion, we would cast the decimal into double.
+
+Time expresions:
+
+|                  |DateType     |TimestampType |
+|------------------|-------------|--------------|
+|**DateType**      |--           |TimestampType |
+|**TimestampType** |TimestampType|--            |
+
+#### Type coercion examples
+
+```sql
+DESCRIBE TABLE numericTable;
++-------------+---------+-------+
+|col_name     |data_type|comment|
++-------------+---------+-------+
+|integerColumn|int      |null   |
+|doubleColumn |double   |null   |
++-------------+---------+-------+
+
+DESCRIBE SELECT integerColumn + doubleColumn as result FROM numericTable;
++--------+---------+-------+
+|col_name|data_type|comment|
++--------+---------+-------+
+|  result|   double|   null|
++--------+---------+-------+
+
+```
+
+```sql
+DESCRIBE dateTable;
++---------------+---------+-------+
+|       col_name|data_type|comment|
++---------------+---------+-------+
+|     dateColumn|     date|   null|
+|timestampColumn|timestamp|   null|
++---------------+---------+-------+
+
+SELECT MONTHS_BETWEEN(dateColumn,timestampColumn) FROM dateTable;
+

Review comment:
       Could you include the output, following the other doc examples?

##########
File path: docs/sql-ref-datatypes.md
##########
@@ -314,3 +314,128 @@ SELECT COUNT(*), c2 FROM test GROUP BY c2;
 |        3| Infinity|
 +---------+---------+
 ```
+
+### Type conversion
+
+In general, an expression can contain different data types, type conversion is the transformation of some data types into others in order to solve the expressions. 
+Spark supports both implicit conversions by type coercion and explicit conversions by explicit casting and store assignment casting.
+
+#### Type coercion in operations between different types 
+
+The following matrix shows the resulting type to which they are implicitly converted to resolve an expression involving different data types 
+
+Numeric expresions:
+
+|               |ByteType   |ShortType  |IntegerType |LongType  |FloatType |DoubleType |StringType |
+|---------------|-----------|-----------|------------|----------|----------|-----------|-----------|
+|**ByteType**   |--         |ShortType  |IntegerType |LongType  |FloatType |DoubleType |DoubleType |
+|**ShortType**  |ShortType  |--         |IntegerType |LongType  |FloatType |DoubleType |DoubleType |
+|**IntegerType**|IntegerType|IntegerType|--          |LongType  |FloatType |DoubleType |DoubleType |
+|**LongType**   |LongType   |LongType   |LongType    |--        |FloatType |DoubleType |DoubleType |
+|**FloatType**  |FloatType  |FloatType  |FloatType   |FloatType |--        |DoubleType |DoubleType |
+|**DoubleType** |DoubleType |DoubleType |DoubleType  |DoubleType|DoubleType|--         |DoubleType |
+|**StringType** |DoubleType |DoubleType |DoubleType  |DoubleType|DoubleType|DoubleType |--         |
+
+The case of DecimalType, is treated differently, for example, there is no common type for double and decimal because double's range is larger than decimal, and yet decimal is more precise than double, but in an expresion, we would cast the decimal into double.
+
+Time expresions:
+
+|                  |DateType     |TimestampType |
+|------------------|-------------|--------------|
+|**DateType**      |--           |TimestampType |
+|**TimestampType** |TimestampType|--            |
+
+#### Type coercion examples
+
+```sql
+DESCRIBE TABLE numericTable;
++-------------+---------+-------+
+|col_name     |data_type|comment|
++-------------+---------+-------+
+|integerColumn|int      |null   |
+|doubleColumn |double   |null   |
++-------------+---------+-------+
+
+DESCRIBE SELECT integerColumn + doubleColumn as result FROM numericTable;
++--------+---------+-------+
+|col_name|data_type|comment|
++--------+---------+-------+
+|  result|   double|   null|
++--------+---------+-------+
+
+```
+
+```sql
+DESCRIBE dateTable;
++---------------+---------+-------+
+|       col_name|data_type|comment|
++---------------+---------+-------+
+|     dateColumn|     date|   null|
+|timestampColumn|timestamp|   null|
++---------------+---------+-------+
+
+SELECT MONTHS_BETWEEN(dateColumn,timestampColumn) FROM dateTable;
+
+```
+
+#### Explicit casting and store assignment casting

Review comment:
       cc: @gengliangwang 






[GitHub] [spark] huaxingao commented on pull request #29837: [SPARK-32463][SQL][DOCS] Add "Type Conversion" section in "Supported Data Types" of SQL docs

Posted by GitBox <gi...@apache.org>.
huaxingao commented on pull request #29837:
URL: https://github.com/apache/spark/pull/29837#issuecomment-709623604


   @gatorsmile Could you please take a quick look?




[GitHub] [spark] maropu commented on a change in pull request #29837: [SPARK-32463][DOCS] SQL data type compatibility

Posted by GitBox <gi...@apache.org>.
maropu commented on a change in pull request #29837:
URL: https://github.com/apache/spark/pull/29837#discussion_r494072159



##########
File path: docs/sql-ref-datatypes.md
##########
@@ -314,3 +314,33 @@ SELECT COUNT(*), c2 FROM test GROUP BY c2;
 |        3| Infinity|
 +---------+---------+
 ```
+
+#### Data type compatibility
+
+The following is the hierarchy of data type compatibility. In an operation involving different and compatible data types, these will be promoted to the lowest common top type to perform the operation.
+
+For example, if you have an add operation between an integer and a float, the integer will be treated as a float, the least common compatible type, resulting the operation in a float.
+
+The most common operations where this hierarchy is applied are:

Review comment:
       Have you checked the other docs, e.g., Oracle, SQL Server, ...? If they already describe type coercion, I think we can refer to them.






[GitHub] [spark] HyukjinKwon commented on a change in pull request #29837: [SPARK-32463][DOCS] SQL data type compatibility

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on a change in pull request #29837:
URL: https://github.com/apache/spark/pull/29837#discussion_r492769769



##########
File path: docs/sql-ref-datatypes.md
##########
@@ -314,3 +314,33 @@ SELECT COUNT(*), c2 FROM test GROUP BY c2;
 |        3| Infinity|
 +---------+---------+
 ```
+
+#### Data type compatibility
+
+The following is the hierarchy of data type compatibility. In an operation involving different and compatible data types, these will be promoted to the lowest common top type to perform the operation.
+
+For example, if you have an add operation between an integer and a float, the integer will be treated as a float, the least common compatible type, resulting the operation in a float.
+
+The most common operations where this hierarchy is applied are:

Review comment:
       I believe we have much more complicated rules than the ones described here; see `TypeCoercion.scala`.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29837: [SPARK-32463][DOCS] SQL data type compatibility

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29837:
URL: https://github.com/apache/spark/pull/29837#issuecomment-696659480


   Can one of the admins verify this patch?




[GitHub] [spark] planga82 commented on pull request #29837: [SPARK-32463][SQL][DOCS] SQL data type compatibility

Posted by GitBox <gi...@apache.org>.
planga82 commented on pull request #29837:
URL: https://github.com/apache/spark/pull/29837#issuecomment-705062210


   I have done a refactor and tried to explain the concepts more clearly.
   
   ![image](https://user-images.githubusercontent.com/12819544/95361866-2f4bac00-08c5-11eb-8375-efdab32e8440.png)
   ![image](https://user-images.githubusercontent.com/12819544/95361898-38d51400-08c5-11eb-94cd-5aa22509528e.png)
   
   




[GitHub] [spark] planga82 edited a comment on pull request #29837: [SPARK-32463][SQL][DOCS] Add "Type Conversion" section in "Supported Data Types" of SQL docs

Posted by GitBox <gi...@apache.org>.
planga82 edited a comment on pull request #29837:
URL: https://github.com/apache/spark/pull/29837#issuecomment-706521610


   Thanks @huaxingao !
   > I quickly tried a few explicit casts, and it seems I can cast Boolean to Timestamp OK?
   
   I noticed it, but I left it out because, even though it is allowed, I thought it didn't make sense. Shall we put it in anyway?
   




[GitHub] [spark] gengliangwang commented on pull request #29837: [SPARK-32463][SQL][DOCS] Add "Type Conversion" section in "Supported Data Types" of SQL docs

Posted by GitBox <gi...@apache.org>.
gengliangwang commented on pull request #29837:
URL: https://github.com/apache/spark/pull/29837#issuecomment-709818254


   > Ok, I see, do you think it's better to drop StringType from this matrix? Do we have other differences in other types?
   
   Yes, but we still need to describe the behavior of type conversion between the string type and the numeric/date/timestamp types.




[GitHub] [spark] maropu commented on a change in pull request #29837: [SPARK-32463][SQL][DOCS] SQL data type compatibility

Posted by GitBox <gi...@apache.org>.
maropu commented on a change in pull request #29837:
URL: https://github.com/apache/spark/pull/29837#discussion_r501390470



##########
File path: docs/sql-ref-datatypes.md
##########
@@ -314,3 +314,51 @@ SELECT COUNT(*), c2 FROM test GROUP BY c2;
 |        3| Infinity|
 +---------+---------+
 ```
+
+#### Data type compatibility
+
+#### Type Coercion in operations between different types 
+
+The following is the hierarchy of data type compatibility and the possible implicit conversions that can be made. In an operation involving different and compatible data types, these will be promoted to the lowest common top type to perform the operation.
+
+For example, if you have an add operation between an integer and a float, the integer will be treated as a float, the least common compatible type, resulting the operation in a float.
+
+|Data type|Hierarchy compatible types|
+|---------|--------------------------|
+|ByteType |ByteType, ShortType, IntegerType, LongType, FloatType, DoubleType|
+|ShortType |ShortType, IntegerType, LongType, FloatType, DoubleType|
+|IntegerType |IntegerType, LongType, FloatType, DoubleType|
+|LongType |LongType, FloatType, DoubleType|
+|FloatType |FloatType, DoubleType|
+|DoubleType |DoubleType|
+|StringType |DoubleType (in numeric operations), StringType |
+|BinaryType |BinaryType|
+|BooleanType |BooleanType|
+|TimestampType |TimestampType, DateType|
+|DateType |DateType|
+
+The case of DecimalType, is treated differently, for example, there is no common type for double and decimal because double's range is larger than decimal, and yet decimal is more precise than double, but in an operation, we would cast the decimal into double.
+
+#### Explicit casting and store assignment casting
+
+When you are using explicit casting by CAST or doing INSERT INTO operations that need to cast types to different store types, the following matrix shows if the conversion is allowed
+
+|         |ByteType  |ShortType |IntegerType |LongType |FloatType |DoubleType |StringType |BinaryType |BooleanType |TimestampType |DateType|
+|---------|----------|----------|------------|---------|----------|-----------|-----------|-----------|------------|--------------|--------|
+|ByteType |--        |X         |X           |X        |X         |X          |X          |X          |X           |X             |        |
+|ShortType|*         |--        |X           |X        |X         |X          |X          |X          |X           |X             |        |
+|IntegerType|*       |*         |--          |X        |X         |X          |X          |X          |X           |X             |        |
+|LongType |*         |*         |*           |--       |X         |X          |X          |X          |X           |X             |        |
+|FloatType |*        |*         |*           |*        |--        |X          |X          |           |X           |X             |        |
+|DoubleType |*       |*         |*           |*        |*         |--         |X          |           |X           |X             |        |
+|StringType |*       |*         |*           |*        |*         |*          |--         |X          |X           |X             |X       |
+|BinaryType |        |          |            |         |          |           |           |--         |            |              |        |
+|BooleanType |X      |X         |X           |X        |X         |X          |X          |           |--          |              |        |
+|TimestampType |*    |*         |*           |X        |X         |X          |X          |           |            |--            |X       |
+|DateType |*         |*         |X           |X        |X         |X          |X          |           |            |X             |--      |
+
+X: Conversion allowed (cast ByteType in ShortType)  
+*: An overflow can occur, check ANSI compliance for the result in this case (cast ShortType in ByteType)

Review comment:
       The overflow behaviour differs between ANSI and non-ANSI modes. Could you explain it here and add a link to the ANSI page? https://github.com/apache/spark/blob/master/docs/sql-ref-ansi-compliance.md#type-conversion We might need a subsection for the explanation.
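
To make the difference concrete, here is a minimal sketch in plain Python of the two overflow behaviours when a 64-bit value is narrowed to a 32-bit integer. It is a toy model, not Spark code: the function name `cast_long_to_int` and the `ansi` flag are hypothetical, and the assumed behaviour (wrap-around in non-ANSI mode, an error in ANSI mode) is my reading of the ANSI compliance page linked above.

```python
# Toy model of the two overflow behaviours; not Spark's implementation.
# cast_long_to_int and the ansi flag are hypothetical names for illustration.
INT_MIN, INT_MAX = -2**31, 2**31 - 1


def cast_long_to_int(value: int, ansi: bool) -> int:
    """Narrow a 64-bit integer to 32 bits."""
    if INT_MIN <= value <= INT_MAX:
        return value  # fits, so both modes agree
    if ansi:
        # Assumed ANSI behaviour: overflow is reported as an error.
        raise OverflowError(f"Casting {value} to int causes overflow")
    # Assumed non-ANSI behaviour: keep the low 32 bits (two's-complement wrap).
    return (value - INT_MIN) % 2**32 + INT_MIN


print(cast_long_to_int(2147483648, ansi=False))  # -2147483648
```

With `ansi=True` the same call raises `OverflowError` instead of returning a wrapped value, which is the distinction the `*` cells in the matrix would need to point at.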

##########
File path: docs/sql-ref-datatypes.md
##########
@@ -314,3 +314,51 @@ SELECT COUNT(*), c2 FROM test GROUP BY c2;
 |        3| Infinity|
 +---------+---------+
 ```
+
+#### Data type compatibility

Review comment:
       How about organizing the structure of this section by referring to the Oracle doc like this? https://docs.oracle.com/cd/B28359_01/server.111/b28286/sql_elements002.htm#SQLRF51043
   ```
   #### Type Conversion
   <What does "Type Conversion" mean? How does Spark handle type conversion? blah blah blah...>
   
   #### Type Coercion
   <Coercion Matrix> 
   
   ##### Type Coercion Examples
   <examples...>
   
   #### Explicit Casting and Store Assignment Casting
   <Casting Matrix> 
   
   ##### Type Casting Examples
   <examples...>
   ```
   cc: @gatorsmile @HyukjinKwon @huaxingao 

##########
File path: docs/sql-ref-datatypes.md
##########
@@ -314,3 +314,51 @@ SELECT COUNT(*), c2 FROM test GROUP BY c2;
 |        3| Infinity|
 +---------+---------+
 ```
+
+#### Data type compatibility

Review comment:
       `Data type compatibility` => `Type Conversion`?

##########
File path: docs/sql-ref-datatypes.md
##########
@@ -314,3 +314,51 @@ SELECT COUNT(*), c2 FROM test GROUP BY c2;
 |        3| Infinity|
 +---------+---------+
 ```
+
+#### Data type compatibility
+
+#### Type Coercion in operations between different types 
+
+The following is the hierarchy of data type compatibility and the possible implicit conversions that can be made. In an operation involving different and compatible data types, these will be promoted to the lowest common top type to perform the operation.
+
+For example, if you have an add operation between an integer and a float, the integer will be treated as a float, the least common compatible type, resulting the operation in a float.
+
+|Data type|Hierarchy compatible types|

Review comment:
       Could we use a matrix form for type coercion, too?
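
As a rough illustration of how the hierarchy table maps onto a matrix, the sketch below derives one from the widening order listed in the table (ByteType up to DoubleType). It is a toy model in plain Python, not Spark's implementation; `WIDENING_ORDER`, `common_type`, and `coercion_matrix` are hypothetical names, and DecimalType and StringType are deliberately left out because they do not follow a simple ordering.

```python
# Toy model: derive the numeric coercion matrix from the widening order.
# WIDENING_ORDER and both function names are hypothetical, not Spark APIs.
WIDENING_ORDER = [
    "ByteType", "ShortType", "IntegerType",
    "LongType", "FloatType", "DoubleType",
]


def common_type(left: str, right: str) -> str:
    """Return whichever operand type comes later in the widening order."""
    return max(left, right, key=WIDENING_ORDER.index)


def coercion_matrix() -> dict:
    """Map each (row, column) pair to its common type; '--' on the diagonal."""
    return {
        row: {
            col: "--" if row == col else common_type(row, col)
            for col in WIDENING_ORDER
        }
        for row in WIDENING_ORDER
    }


for row, cells in coercion_matrix().items():
    print(row.ljust(12), " | ".join(cells[col] for col in WIDENING_ORDER))
```

Each cell is simply the wider of the two operand types, so the matrix is symmetric and carries the same information as the hierarchy table in a form that is quicker to look up.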






[GitHub] [spark] planga82 commented on pull request #29837: [SPARK-32463][DOCS] SQL data type compatibility

Posted by GitBox <gi...@apache.org>.
planga82 commented on pull request #29837:
URL: https://github.com/apache/spark/pull/29837#issuecomment-696660627


   @huaxingao Any comments are welcome. Thanks




[GitHub] [spark] gatorsmile commented on pull request #29837: [SPARK-32463][SQL][DOCS] Add "Type Conversion" section in "Supported Data Types" of SQL docs

Posted by GitBox <gi...@apache.org>.
gatorsmile commented on pull request #29837:
URL: https://github.com/apache/spark/pull/29837#issuecomment-778853446


   cc @gengliangwang Could you take a look? We should add it to our doc.




[GitHub] [spark] planga82 commented on pull request #29837: [SPARK-32463][SQL][DOCS] Add "Type Conversion" section in "Supported Data Types" of SQL docs

Posted by GitBox <gi...@apache.org>.
planga82 commented on pull request #29837:
URL: https://github.com/apache/spark/pull/29837#issuecomment-706028741


   I have restructured the sections as you proposed; I think it is more complete now. Any more ideas? Thank you!
   
   ![image](https://user-images.githubusercontent.com/12819544/95556788-c3279000-0a0b-11eb-8e3a-e75528c4d7b1.png)
   ![image](https://user-images.githubusercontent.com/12819544/95556827-d2a6d900-0a0b-11eb-84fa-5dd07c3042c3.png)
   ![image](https://user-images.githubusercontent.com/12819544/95556843-de929b00-0a0b-11eb-87c9-cfe1a03b82a2.png)
   ![image](https://user-images.githubusercontent.com/12819544/95556871-e7836c80-0a0b-11eb-9b0c-a81de0a744c4.png)
   
   




[GitHub] [spark] planga82 edited a comment on pull request #29837: [SPARK-32463][SQL][DOCS] Add "Type Conversion" section in "Supported Data Types" of SQL docs

Posted by GitBox <gi...@apache.org>.
planga82 edited a comment on pull request #29837:
URL: https://github.com/apache/spark/pull/29837#issuecomment-709815971


   > This is **not true**. The type conversion rules are more complex than that.
   > 
   > ```
   > spark-sql> explain select 1 in (2, 'a');
   > *(1) Project [false AS (CAST(1 AS STRING) IN (CAST(2 AS STRING), CAST(a AS STRING)))#19]
   > 
   > spark-sql> explain select 1 = '2';
   > *(1) Project [false AS (1 = CAST(2 AS INT))#11]
   > 
   > spark-sql> explain select 1 + '2';
   > *(1) Project [3.0 AS (CAST(1 AS DOUBLE) + CAST(2 AS DOUBLE))#17]
   > ```
   
   Ok, I see, do you think it's better to drop StringType from this matrix? Do we have other differences in other types? Thanks!
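
The three plans quoted above can be summarised as a small lookup. This is only a sketch in plain Python of what those `explain` outputs show, not Spark's coercion logic; the function name `string_coercion` is hypothetical, and applying the `=` rule to types other than INT is an assumption that goes beyond the quoted output.

```python
# Toy summary of the three explain outputs quoted above; not Spark code.
# string_coercion is a hypothetical helper used only for illustration.
def string_coercion(op: str, other_type: str) -> str:
    """Return the type both sides are cast to when one operand is a string."""
    if op == "in":
        # select 1 in (2, 'a'): every operand is cast to STRING
        return "STRING"
    if op == "=":
        # select 1 = '2': the string is cast to the other operand's type
        return other_type
    if op == "+":
        # select 1 + '2': both operands are cast to DOUBLE
        return "DOUBLE"
    raise ValueError(f"operator {op!r} is not covered by the quoted examples")


for op in ("in", "=", "+"):
    print(op, "->", string_coercion(op, "INT"))
```

The point is that the target type depends on the operator, not only on the operand types, which is why a single StringType row in the matrix cannot describe the behavior.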




[GitHub] [spark] gengliangwang commented on a change in pull request #29837: [SPARK-32463][SQL][DOCS] Add "Type Conversion" section in "Supported Data Types" of SQL docs

Posted by GitBox <gi...@apache.org>.
gengliangwang commented on a change in pull request #29837:
URL: https://github.com/apache/spark/pull/29837#discussion_r513460544



##########
File path: docs/sql-ref-datatypes.md
##########
@@ -314,3 +314,200 @@ SELECT COUNT(*), c2 FROM test GROUP BY c2;
 |        3| Infinity|
 +---------+---------+
 ```
+
+### Type Conversion
+
+In general, an expression can contain different data types and type conversion is the transformation of some data types into others in order to resolve type mismatches. 
+Spark supports both implicit conversions by type coercion and explicit conversions by explicit casting and store assignment casting.
+
+#### Type Coercion in Operations between Different Types 
+
+Type Coercion refers to the automatic or implicit conversion of values from one type to another when you need to resolve type mismatches.
+The following matrix shows the resulting type to which they are implicitly converted to resolve an expression involving different data types.
+
+**Numeric Expressions**:
+
+|               |ByteType   |ShortType  |IntegerType |LongType   |FloatType             |DoubleType            |DecimalType                  |
+|---------------|-----------|-----------|------------|-----------|----------------------|----------------------|-----------------------------|
+|**ByteType**   |--         |ShortType  |IntegerType |LongType   |FloatType             |DoubleType            |DecimalType(3,0)<sup>1</sup> |
+|**ShortType**  |ShortType  |--         |IntegerType |LongType   |FloatType             |DoubleType            |DecimalType(5,0)<sup>1</sup> |
+|**IntegerType**|IntegerType|IntegerType|--          |LongType   |FloatType             |DoubleType            |DecimalType(10,0)<sup>1</sup>|
+|**LongType**   |LongType   |LongType   |LongType    |--         |FloatType             |DoubleType            |DecimalType(20,0)<sup>1</sup>|
+|**FloatType**  |FloatType  |FloatType  |FloatType   |FloatType  |--                    |DoubleType            |DoubleType                   |
+|**DoubleType** |DoubleType |DoubleType |DoubleType  |DoubleType |DoubleType            |--                    |DoubleType                   |
+|**DecimalType**|DecimalType|DecimalType|DecimalType |DecimalType|DoubleType<sup>2</sup>|DoubleType<sup>2</sup>|--                           |
+
+**Note 1**: DecimalType(precision,scale)   
+**Note 2**: In these cases DecimalType can lose precision. There is no common type for decimal and double because double's range is larger than decimal's, yet decimal is more precise than double, so casting DecimalType to DoubleType can lose precision.
+
+**StringType Behavior**  
+* Arithmetic Expressions:
+
+    |               |ByteType   |ShortType  |IntegerType |LongType   |FloatType    |DoubleType  |
+    |---------------|-----------|-----------|------------|-----------|-------------|------------|
+    |**StringType** |DoubleType |DoubleType |DoubleType  |DoubleType |DoubleType   |DoubleType  |

Review comment:
       I think we need to make it clear that the String type will be implicitly cast to the Double type.

##########
File path: docs/sql-ref-datatypes.md
##########
@@ -314,3 +314,200 @@ SELECT COUNT(*), c2 FROM test GROUP BY c2;
 |        3| Infinity|
 +---------+---------+
 ```
+
+### Type Conversion
+
+In general, an expression can contain different data types and type conversion is the transformation of some data types into others in order to resolve type mismatches. 
+Spark supports both implicit conversions by type coercion and explicit conversions by explicit casting and store assignment casting.
+
+#### Type Coercion in Operations between Different Types 
+
+Type Coercion refers to the automatic or implicit conversion of values from one type to another when you need to resolve type mismatches.
+The following matrix shows the resulting type to which they are implicitly converted to resolve an expression involving different data types.
+
+**Numeric Expressions**:
+
+|               |ByteType   |ShortType  |IntegerType |LongType   |FloatType             |DoubleType            |DecimalType                  |
+|---------------|-----------|-----------|------------|-----------|----------------------|----------------------|-----------------------------|
+|**ByteType**   |--         |ShortType  |IntegerType |LongType   |FloatType             |DoubleType            |DecimalType(3,0)<sup>1</sup> |
+|**ShortType**  |ShortType  |--         |IntegerType |LongType   |FloatType             |DoubleType            |DecimalType(5,0)<sup>1</sup> |
+|**IntegerType**|IntegerType|IntegerType|--          |LongType   |FloatType             |DoubleType            |DecimalType(10,0)<sup>1</sup>|
+|**LongType**   |LongType   |LongType   |LongType    |--         |FloatType             |DoubleType            |DecimalType(20,0)<sup>1</sup>|
+|**FloatType**  |FloatType  |FloatType  |FloatType   |FloatType  |--                    |DoubleType            |DoubleType                   |
+|**DoubleType** |DoubleType |DoubleType |DoubleType  |DoubleType |DoubleType            |--                    |DoubleType                   |
+|**DecimalType**|DecimalType|DecimalType|DecimalType |DecimalType|DoubleType<sup>2</sup>|DoubleType<sup>2</sup>|--                           |
+
+**Note 1**: DecimalType(precision,scale)   
+**Note 2**: In these cases DecimalType can lose precision. There is no common type for decimal and double because double's range is larger than decimal's, yet decimal is more precise than double, so casting DecimalType to DoubleType can lose precision.
+
+**StringType Behavior**  
+* Arithmetic Expressions:
+
+    |               |ByteType   |ShortType  |IntegerType |LongType   |FloatType    |DoubleType  |
+    |---------------|-----------|-----------|------------|-----------|-------------|------------|
+    |**StringType** |DoubleType |DoubleType |DoubleType  |DoubleType |DoubleType   |DoubleType  |
+
+* Comparison:
+
+    |               |ByteType   |ShortType  |IntegerType |LongType   |FloatType    |DoubleType  |DecimalType |DateType             |TimestampType             |
+    |---------------|-----------|-----------|------------|-----------|-------------|------------|------------|---------------------|--------------------------|
+    |**StringType** |ByteType   |ShortType  |IntegerType |LongType   |FloatType    |DoubleType  |DoubleType  |DateType<sup>1</sup> |TimestampType<sup>1</sup> |

Review comment:
       ditto






[GitHub] [spark] planga82 commented on a change in pull request #29837: [SPARK-32463][DOCS] SQL data type compatibility

Posted by GitBox <gi...@apache.org>.
planga82 commented on a change in pull request #29837:
URL: https://github.com/apache/spark/pull/29837#discussion_r492821724



##########
File path: docs/sql-ref-datatypes.md
##########
@@ -314,3 +314,33 @@ SELECT COUNT(*), c2 FROM test GROUP BY c2;
 |        3| Infinity|
 +---------+---------+
 ```
+
+#### Data type compatibility
+
+The following is the hierarchy of data type compatibility. In an operation involving different and compatible data types, these will be promoted to the lowest common top type to perform the operation.
+
+For example, if you have an add operation between an integer and a float, the integer will be treated as a float, the least common compatible type, resulting the operation in a float.
+
+The most common operations where this hierarchy is applied are:

Review comment:
       Yes, I was trying to simplify and put in this list the most popular "operations" but I can extend it. I'm going to review it. Thanks!






[GitHub] [spark] huaxingao commented on a change in pull request #29837: [SPARK-32463][DOCS] SQL data type compatibility

Posted by GitBox <gi...@apache.org>.
huaxingao commented on a change in pull request #29837:
URL: https://github.com/apache/spark/pull/29837#discussion_r493965798



##########
File path: docs/sql-ref-datatypes.md
##########
@@ -314,3 +314,33 @@ SELECT COUNT(*), c2 FROM test GROUP BY c2;
 |        3| Infinity|
 +---------+---------+
 ```
+
+#### Data type compatibility
+
+The following is the hierarchy of data type compatibility. In an operation involving different and compatible data types, these will be promoted to the lowest common top type to perform the operation.
+
+For example, if you have an add operation between an integer and a float, the integer will be treated as a float, the least common compatible type, resulting the operation in a float.
+
+The most common operations where this hierarchy is applied are:

Review comment:
       @gatorsmile WDYT? Do you want to have a table to list the most common rules like the one in https://www.ibm.com/support/knowledgecenter/SSEPGG_10.1.0/com.ibm.db2.luw.sql.ref.doc/doc/r0008477.html?






[GitHub] [spark] planga82 commented on a change in pull request #29837: [SPARK-32463][DOCS] SQL data type compatibility

Posted by GitBox <gi...@apache.org>.
planga82 commented on a change in pull request #29837:
URL: https://github.com/apache/spark/pull/29837#discussion_r494431837



##########
File path: docs/sql-ref-datatypes.md
##########
@@ -314,3 +314,33 @@ SELECT COUNT(*), c2 FROM test GROUP BY c2;
 |        3| Infinity|
 +---------+---------+
 ```
+
+#### Data type compatibility
+
+The following is the hierarchy of data type compatibility. In an operation involving different and compatible data types, these will be promoted to the lowest common top type to perform the operation.
+
+For example, if you have an add operation between an integer and a float, the integer will be treated as a float, the least common compatible type, resulting the operation in a float.
+
+The most common operations where this hierarchy is applied are:

Review comment:
       I have checked Oracle and SQL server documentation.
   SQL Server doesn't detail all implicit conversions:
   https://docs.microsoft.com/en-us/sql/t-sql/data-types/data-type-conversion-database-engine?view=sql-server-ver15
   https://docs.microsoft.com/en-us/sql/t-sql/data-types/data-type-precedence-transact-sql?view=sql-server-ver15
   But Oracle has more detailed rules (see "Implicit Data Conversion"):
   https://docs.oracle.com/cd/B28359_01/server.111/b28286/sql_elements002.htm#SQLRF00214
   








[GitHub] [spark] planga82 commented on pull request #29837: [SPARK-32463][SQL][DOCS] Add "Type Conversion" section in "Supported Data Types" of SQL docs

Posted by GitBox <gi...@apache.org>.
planga82 commented on pull request #29837:
URL: https://github.com/apache/spark/pull/29837#issuecomment-706685246


   I have updated the PR to address all comments. Very useful, thanks!




[GitHub] [spark] planga82 edited a comment on pull request #29837: [SPARK-32463][SQL][DOCS] Add "Type Conversion" section in "Supported Data Types" of SQL docs

Posted by GitBox <gi...@apache.org>.
planga82 edited a comment on pull request #29837:
URL: https://github.com/apache/spark/pull/29837#issuecomment-706685246


   I have updated the PR to address all comments. Very useful, thanks!
   
   ![image](https://user-images.githubusercontent.com/12819544/95687803-db5f0100-0bfd-11eb-952a-c714f880556d.png)
   ![image](https://user-images.githubusercontent.com/12819544/95687812-e580ff80-0bfd-11eb-8442-3f531b4f489a.png)
   ![image](https://user-images.githubusercontent.com/12819544/95687822-f6317580-0bfd-11eb-8877-888056b268f3.png)
   ![image](https://user-images.githubusercontent.com/12819544/95687904-5fb18400-0bfe-11eb-935b-84e78871a5f0.png)
   ![image](https://user-images.githubusercontent.com/12819544/95687839-0b0e0900-0bfe-11eb-9242-d2ac718d30f4.png)
   




[GitHub] [spark] maropu commented on a change in pull request #29837: [SPARK-32463][DOCS] SQL data type compatibility

Posted by GitBox <gi...@apache.org>.
maropu commented on a change in pull request #29837:
URL: https://github.com/apache/spark/pull/29837#discussion_r495709929



##########
File path: docs/sql-ref-datatypes.md
##########
@@ -314,3 +314,33 @@ SELECT COUNT(*), c2 FROM test GROUP BY c2;
 |        3| Infinity|
 +---------+---------+
 ```
+
+#### Data type compatibility
+
+The following is the hierarchy of data type compatibility. In an operation involving different and compatible data types, these will be promoted to the lowest common top type to perform the operation.
+
+For example, if you have an add operation between an integer and a float, the integer will be treated as a float, the least common compatible type, resulting the operation in a float.
+
+The most common operations where this hierarchy is applied are:

Review comment:
       Ah, I see. IMO the matrix format of the Oracle doc looks better.






[GitHub] [spark] gatorsmile commented on pull request #29837: [SPARK-32463][SQL][DOCS] Add "Type Conversion" section in "Supported Data Types" of SQL docs

Posted by GitBox <gi...@apache.org>.
gatorsmile commented on pull request #29837:
URL: https://github.com/apache/spark/pull/29837#issuecomment-709694710


   cc @gengliangwang @cloud-fan, who have been working on this area recently.






[GitHub] [spark] planga82 commented on pull request #29837: [SPARK-32463][SQL][DOCS] Add "Type Conversion" section in "Supported Data Types" of SQL docs

Posted by GitBox <gi...@apache.org>.
planga82 commented on pull request #29837:
URL: https://github.com/apache/spark/pull/29837#issuecomment-721376889


   @gengliangwang I have added descriptions to clarify the behavior and listed the expressions.
   
   ![image](https://user-images.githubusercontent.com/12819544/98040246-da457c00-1e17-11eb-8875-8dc1a466c204.png)
   ![image](https://user-images.githubusercontent.com/12819544/98040320-f8ab7780-1e17-11eb-8bce-bd3fd51ac140.png)
   ![image](https://user-images.githubusercontent.com/12819544/98040346-082ac080-1e18-11eb-8632-5ab8c4855b45.png)
   ![image](https://user-images.githubusercontent.com/12819544/98040369-11b42880-1e18-11eb-883d-9c581372f451.png)
   ![image](https://user-images.githubusercontent.com/12819544/98040508-4a540200-1e18-11eb-883d-cf2bc120016d.png)
   ![image](https://user-images.githubusercontent.com/12819544/98040534-52ac3d00-1e18-11eb-951d-eec1013c729f.png)




[GitHub] [spark] planga82 commented on pull request #29837: [SPARK-32463][SQL][DOCS] Add "Type Conversion" section in "Supported Data Types" of SQL docs

Posted by GitBox <gi...@apache.org>.
planga82 commented on pull request #29837:
URL: https://github.com/apache/spark/pull/29837#issuecomment-712418140


   I have updated the document with the StringType behavior and also added new examples.
   
   ![image](https://user-images.githubusercontent.com/12819544/96506836-34f3ab00-1250-11eb-9e52-f03dacab29b0.png)
   ![image](https://user-images.githubusercontent.com/12819544/96506863-45a42100-1250-11eb-9265-94b962ea153c.png)
   ![image](https://user-images.githubusercontent.com/12819544/96506896-52287980-1250-11eb-95ac-5b139698d067.png)
   ![image](https://user-images.githubusercontent.com/12819544/96506952-5fddff00-1250-11eb-80d5-2d19ee362b83.png)
   ![image](https://user-images.githubusercontent.com/12819544/96506976-69fffd80-1250-11eb-8191-25ecc2c1cde4.png)
   




[GitHub] [spark] gengliangwang edited a comment on pull request #29837: [SPARK-32463][SQL][DOCS] Add "Type Conversion" section in "Supported Data Types" of SQL docs

Posted by GitBox <gi...@apache.org>.
gengliangwang edited a comment on pull request #29837:
URL: https://github.com/apache/spark/pull/29837#issuecomment-709809437


   ![image](https://user-images.githubusercontent.com/12819544/95687803-db5f0100-0bfd-11eb-952a-c714f880556d.png)
   This is **not true**. The type conversion rules are more complex than that.
   
   ```
   spark-sql> explain select 1 in (2, 'a');
   *(1) Project [false AS (CAST(1 AS STRING) IN (CAST(2 AS STRING), CAST(a AS STRING)))#19]
   
   spark-sql> explain select 1 = '2';
   *(1) Project [false AS (1 = CAST(2 AS INT))#11]
   
   spark-sql> explain select 1 + '2';
   *(1) Project [3.0 AS (CAST(1 AS DOUBLE) + CAST(2 AS DOUBLE))#17]
   ```
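
The three plans above show that the cast Spark inserts depends on the expression, not only on the operand types. A minimal sketch in plain Python (not Spark code; `OBSERVED_CASTS`, `coerce_int_with_string`, and the context names are hypothetical) that merely tabulates the casts visible in those plans:

```python
from typing import Tuple

# Toy lookup of the implicit casts visible in the three EXPLAIN outputs above.
# It records observed behaviour only; it is not how Spark's coercion rules are written.
OBSERVED_CASTS = {
    "in": ("string", "string"),    # 1 IN (2, 'a'): both sides are cast to STRING
    "equals": ("int", "int"),      # 1 = '2': the string side is cast to INT
    "add": ("double", "double"),   # 1 + '2': both sides are cast to DOUBLE
}

def coerce_int_with_string(context: str) -> Tuple[str, str]:
    """Return (type used for the INT operand, type used for the STRING operand)."""
    return OBSERVED_CASTS[context]

for ctx in ("in", "equals", "add"):
    print(ctx, coerce_int_with_string(ctx))
```

A single "lowest common type" hierarchy cannot reproduce all three rows, which is the point of the objection.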
   
   
   




[GitHub] [spark] maropu commented on pull request #29837: [SPARK-32463][SQL][DOCS] SQL data type compatibility

Posted by GitBox <gi...@apache.org>.
maropu commented on pull request #29837:
URL: https://github.com/apache/spark/pull/29837#issuecomment-705259146


   Could you move the screenshots into the PR description?




[GitHub] [spark] maropu commented on a change in pull request #29837: [SPARK-32463][SQL][DOCS] Add "Type Conversion" section in "Supported Data Types" of SQL docs

Posted by GitBox <gi...@apache.org>.
maropu commented on a change in pull request #29837:
URL: https://github.com/apache/spark/pull/29837#discussion_r502787171



##########
File path: docs/sql-ref-datatypes.md
##########
@@ -314,3 +314,128 @@ SELECT COUNT(*), c2 FROM test GROUP BY c2;
 |        3| Infinity|
 +---------+---------+
 ```
+
+### Type conversion
+
+In general, an expression can contain different data types, type conversion is the transformation of some data types into others in order to solve the expressions. 
+Spark supports both implicit conversions by type coercion and explicit conversions by explicit casting and store assignment casting.
+
+#### Type coercion in operations between different types 

Review comment:
       `Type coercion in operations between different types ` -> `Type Coercion in Operations between Different Types `






[GitHub] [spark] maropu commented on a change in pull request #29837: [SPARK-32463][SQL][DOCS] SQL data type compatibility

Posted by GitBox <gi...@apache.org>.
maropu commented on a change in pull request #29837:
URL: https://github.com/apache/spark/pull/29837#discussion_r499217171



##########
File path: docs/sql-ref-datatypes.md
##########
@@ -314,3 +314,49 @@ SELECT COUNT(*), c2 FROM test GROUP BY c2;
 |        3| Infinity|
 +---------+---------+
 ```
+
+#### Data type compatibility
+
+The following is the hierarchy of data type compatibility and the possible implicit conversions that can be made. In an operation involving different and compatible data types, these will be promoted to the lowest common top type to perform the operation.

Review comment:
       The current description looks ambiguous, and I think several topics are mixed up. Which topics would you like to describe in this section? If you want to describe type conversion in the default mode (ansi=false), I think we need to cover three categories: [explicit casting, type coercion, and store assignment casting](https://spark.apache.org/docs/latest/sql-ref-ansi-compliance.html#type-conversion). In any case, we need a clearer structure for describing type behaviours so that the user documentation is easy to read.






[GitHub] [spark] maropu commented on a change in pull request #29837: [SPARK-32463][SQL][DOCS] Add "Type Conversion" section in "Supported Data Types" of SQL docs

Posted by GitBox <gi...@apache.org>.
maropu commented on a change in pull request #29837:
URL: https://github.com/apache/spark/pull/29837#discussion_r502787271



##########
File path: docs/sql-ref-datatypes.md
##########
@@ -314,3 +314,128 @@ SELECT COUNT(*), c2 FROM test GROUP BY c2;
 |        3| Infinity|
 +---------+---------+
 ```
+
+### Type conversion
+
+In general, an expression can contain different data types, type conversion is the transformation of some data types into others in order to solve the expressions. 
+Spark supports both implicit conversions by type coercion and explicit conversions by explicit casting and store assignment casting.
+
+#### Type coercion in operations between different types 
+
+The following matrix shows the resulting type to which they are implicitly converted to resolve an expression involving different data types 
+
+Numeric expresions:
+
+|               |ByteType   |ShortType  |IntegerType |LongType  |FloatType |DoubleType |StringType |
+|---------------|-----------|-----------|------------|----------|----------|-----------|-----------|
+|**ByteType**   |--         |ShortType  |IntegerType |LongType  |FloatType |DoubleType |DoubleType |
+|**ShortType**  |ShortType  |--         |IntegerType |LongType  |FloatType |DoubleType |DoubleType |
+|**IntegerType**|IntegerType|IntegerType|--          |LongType  |FloatType |DoubleType |DoubleType |
+|**LongType**   |LongType   |LongType   |LongType    |--        |FloatType |DoubleType |DoubleType |
+|**FloatType**  |FloatType  |FloatType  |FloatType   |FloatType |--        |DoubleType |DoubleType |
+|**DoubleType** |DoubleType |DoubleType |DoubleType  |DoubleType|DoubleType|--         |DoubleType |
+|**StringType** |DoubleType |DoubleType |DoubleType  |DoubleType|DoubleType|DoubleType |--         |
+
+The case of DecimalType, is treated differently, for example, there is no common type for double and decimal because double's range is larger than decimal, and yet decimal is more precise than double, but in an expresion, we would cast the decimal into double.
+
+Time expresions:
+
+|                  |DateType     |TimestampType |
+|------------------|-------------|--------------|
+|**DateType**      |--           |TimestampType |
+|**TimestampType** |TimestampType|--            |
+
+#### Type coercion examples
+
+```sql
+DESCRIBE TABLE numericTable;
++-------------+---------+-------+
+|col_name     |data_type|comment|
++-------------+---------+-------+
+|integerColumn|int      |null   |
+|doubleColumn |double   |null   |
++-------------+---------+-------+
+
+DESCRIBE SELECT integerColumn + doubleColumn as result FROM numericTable;
++--------+---------+-------+
+|col_name|data_type|comment|
++--------+---------+-------+
+|  result|   double|   null|
++--------+---------+-------+
+
+```
+
+```sql
+DESCRIBE dateTable;
++---------------+---------+-------+
+|       col_name|data_type|comment|
++---------------+---------+-------+
+|     dateColumn|     date|   null|
+|timestampColumn|timestamp|   null|
++---------------+---------+-------+
+
+SELECT MONTHS_BETWEEN(dateColumn,timestampColumn) FROM dateTable;
+
+```
+
+#### Explicit casting and store assignment casting
+
+When you are using explicit casting by CAST or doing INSERT INTO operations that need to cast types to different store types, the following matrix shows if the conversion is allowed
+
+|             |ByteType  |ShortType |IntegerType |LongType |FloatType |DoubleType |StringType |BinaryType |BooleanType |TimestampType |DateType|
+|-------------|----------|----------|------------|---------|----------|-----------|-----------|-----------|------------|--------------|--------|
+|**ByteType** |--        |X         |X           |X        |X         |X          |X          |X          |X           |X             |        |
+|**ShortType**|*         |--        |X           |X        |X         |X          |X          |X          |X           |X             |        |
+|**IntegerType**|*       |*         |--          |X        |X         |X          |X          |X          |X           |X             |        |
+|**LongType** |*         |*         |*           |--       |X         |X          |X          |X          |X           |X             |        |
+|**FloatType** |*        |*         |*           |*        |--        |X          |X          |           |X           |X             |        |
+|**DoubleType** |*       |*         |*           |*        |*         |--         |X          |           |X           |X             |        |
+|**StringType** |*       |*         |*           |*        |*         |*          |--         |X          |X           |X             |X       |
+|**BinaryType** |        |          |            |         |          |           |           |--         |            |              |        |
+|**BooleanType** |X      |X         |X           |X        |X         |X          |X          |           |--          |              |        |
+|**TimestampType** |*    |*         |*           |X        |X         |X          |X          |           |            |--            |X       |
+|**DateType** |*         |*         |X           |X        |X         |X          |X          |           |            |X             |--      |
+
+X: Conversion allowed (cast ByteType in ShortType)  
+*: An overflow can occur (cast ShortType in ByteType)
+
+If an overflow occurs and ANSI compliance is activated (spark.sql.ansi.enabled is set to true for casting or spark.sql.storeAssignmentPolicy=ANSI for store assignment casting) an exception will be thrown. 
+Otherwise, a truncate value will be used. See more on [Ansi Compliance](sql-ref-ansi-compliance.html#type-conversion).
+
+#### Type Casting examples

Review comment:
       `Type Casting examples` -> `Type Casting Examples`
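
The overflow note in the quoted diff (an exception under the ANSI settings, a truncated value otherwise) can be illustrated with a plain-Python sketch. It assumes that the non-ANSI cast to ByteType wraps around like a two's-complement narrowing conversion; `cast_to_byte` and its `ansi` flag are hypothetical names, not a Spark API:

```python
def cast_to_byte(value: int, ansi: bool = False) -> int:
    """Model casting an integer to an 8-bit signed ByteType."""
    if -128 <= value <= 127:
        return value
    if ansi:
        # Models the exception the diff describes for ANSI mode.
        raise OverflowError(f"{value} does not fit in ByteType")
    # Assumed non-ANSI behaviour: keep the low 8 bits and reinterpret them as signed.
    return (value + 128) % 256 - 128

print(cast_to_byte(100))   # in range, returned unchanged
print(cast_to_byte(128))   # out of range, wraps around in this model
```

This is the kind of case the `*` cells in the quoted matrix are meant to flag.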






[GitHub] [spark] HyukjinKwon commented on a change in pull request #29837: [SPARK-32463][DOCS] SQL data type compatibility

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on a change in pull request #29837:
URL: https://github.com/apache/spark/pull/29837#discussion_r492769769



##########
File path: docs/sql-ref-datatypes.md
##########
@@ -314,3 +314,33 @@ SELECT COUNT(*), c2 FROM test GROUP BY c2;
 |        3| Infinity|
 +---------+---------+
 ```
+
+#### Data type compatibility
+
+The following is the hierarchy of data type compatibility. In an operation involving different and compatible data types, these will be promoted to the lowest common top type to perform the operation.
+
+For example, if you have an add operation between an integer and a float, the integer will be treated as a float, the least common compatible type, resulting the operation in a float.
+
+The most common operations where this hierarchy is applied are:

Review comment:
       I believe the actual rules are much more complicated than what is described here; see `TypeCoercion.scala`.






[GitHub] [spark] planga82 edited a comment on pull request #29837: [SPARK-32463][SQL][DOCS] Add "Type Conversion" section in "Supported Data Types" of SQL docs

Posted by GitBox <gi...@apache.org>.
planga82 edited a comment on pull request #29837:
URL: https://github.com/apache/spark/pull/29837#issuecomment-721376889


   @gengliangwang I have added descriptions to clarify the behavior and listed the expressions.
   
   




[GitHub] [spark] github-actions[bot] closed pull request #29837: [SPARK-32463][SQL][DOCS] Add "Type Conversion" section in "Supported Data Types" of SQL docs

Posted by GitBox <gi...@apache.org>.
github-actions[bot] closed pull request #29837:
URL: https://github.com/apache/spark/pull/29837


   




[GitHub] [spark] planga82 commented on a change in pull request #29837: [SPARK-32463][SQL][DOCS] Add "Type Conversion" section in "Supported Data Types" of SQL docs

Posted by GitBox <gi...@apache.org>.
planga82 commented on a change in pull request #29837:
URL: https://github.com/apache/spark/pull/29837#discussion_r502897968



##########
File path: docs/sql-ref-datatypes.md
##########
@@ -314,3 +314,128 @@ SELECT COUNT(*), c2 FROM test GROUP BY c2;
 |        3| Infinity|
 +---------+---------+
 ```
+
+### Type conversion
+
+In general, an expression can contain different data types, type conversion is the transformation of some data types into others in order to solve the expressions. 
+Spark supports both implicit conversions by type coercion and explicit conversions by explicit casting and store assignment casting.
+
+#### Type coercion in operations between different types 
+
+The following matrix shows the resulting type to which they are implicitly converted to resolve an expression involving different data types 
+
+Numeric expresions:
+
+|               |ByteType   |ShortType  |IntegerType |LongType  |FloatType |DoubleType |StringType |
+|---------------|-----------|-----------|------------|----------|----------|-----------|-----------|
+|**ByteType**   |--         |ShortType  |IntegerType |LongType  |FloatType |DoubleType |DoubleType |
+|**ShortType**  |ShortType  |--         |IntegerType |LongType  |FloatType |DoubleType |DoubleType |
+|**IntegerType**|IntegerType|IntegerType|--          |LongType  |FloatType |DoubleType |DoubleType |
+|**LongType**   |LongType   |LongType   |LongType    |--        |FloatType |DoubleType |DoubleType |
+|**FloatType**  |FloatType  |FloatType  |FloatType   |FloatType |--        |DoubleType |DoubleType |
+|**DoubleType** |DoubleType |DoubleType |DoubleType  |DoubleType|DoubleType|--         |DoubleType |
+|**StringType** |DoubleType |DoubleType |DoubleType  |DoubleType|DoubleType|DoubleType |--         |
+
+The case of DecimalType, is treated differently, for example, there is no common type for double and decimal because double's range is larger than decimal, and yet decimal is more precise than double, but in an expresion, we would cast the decimal into double.

Review comment:
       Great information. I have included DecimalType in all the matrices.
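
For reference, the numeric matrix in the quoted diff follows a simple pattern: the result is the wider of the two types in the order Byte, Short, Integer, Long, Float, Double, and any pairing with StringType yields DoubleType. A minimal plain-Python sketch of that draft matrix only (`NUMERIC_ORDER` and `result_type` are hypothetical names; as reviewers note elsewhere in this thread, Spark's real coercion rules are more complex and DecimalType is not modelled here):

```python
# Widening order read off the numeric matrix in the quoted diff.
NUMERIC_ORDER = [
    "ByteType", "ShortType", "IntegerType",
    "LongType", "FloatType", "DoubleType",
]

def result_type(left: str, right: str) -> str:
    """Result type for two different operand types, per the draft matrix."""
    if left == right:
        raise ValueError("the matrix only covers pairs of different types")
    if "StringType" in (left, right):
        # Every StringType row and column entry in the matrix is DoubleType.
        return "DoubleType"
    # Otherwise pick whichever type appears later in the widening order.
    return max(left, right, key=NUMERIC_ORDER.index)

print(result_type("ByteType", "ShortType"))
print(result_type("LongType", "FloatType"))
print(result_type("StringType", "IntegerType"))
```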






[GitHub] [spark] planga82 commented on pull request #29837: [SPARK-32463][DOCS] SQL data type compatibility

Posted by GitBox <gi...@apache.org>.
planga82 commented on pull request #29837:
URL: https://github.com/apache/spark/pull/29837#issuecomment-696660627


   @huaxingao Any comments are welcome. Thanks




[GitHub] [spark] maropu commented on a change in pull request #29837: [SPARK-32463][SQL][DOCS] Add "Type Conversion" section in "Supported Data Types" of SQL docs

Posted by GitBox <gi...@apache.org>.
maropu commented on a change in pull request #29837:
URL: https://github.com/apache/spark/pull/29837#discussion_r502787284



##########
File path: docs/sql-ref-datatypes.md
##########
@@ -314,3 +314,128 @@ SELECT COUNT(*), c2 FROM test GROUP BY c2;
 |        3| Infinity|
 +---------+---------+
 ```
+
+### Type conversion
+
+In general, an expression can contain different data types, type conversion is the transformation of some data types into others in order to solve the expressions. 
+Spark supports both implicit conversions by type coercion and explicit conversions by explicit casting and store assignment casting.
+
+#### Type coercion in operations between different types 
+
+The following matrix shows the resulting type to which they are implicitly converted to resolve an expression involving different data types 
+
+Numeric expresions:
+
+|               |ByteType   |ShortType  |IntegerType |LongType  |FloatType |DoubleType |StringType |
+|---------------|-----------|-----------|------------|----------|----------|-----------|-----------|
+|**ByteType**   |--         |ShortType  |IntegerType |LongType  |FloatType |DoubleType |DoubleType |
+|**ShortType**  |ShortType  |--         |IntegerType |LongType  |FloatType |DoubleType |DoubleType |
+|**IntegerType**|IntegerType|IntegerType|--          |LongType  |FloatType |DoubleType |DoubleType |
+|**LongType**   |LongType   |LongType   |LongType    |--        |FloatType |DoubleType |DoubleType |
+|**FloatType**  |FloatType  |FloatType  |FloatType   |FloatType |--        |DoubleType |DoubleType |
+|**DoubleType** |DoubleType |DoubleType |DoubleType  |DoubleType|DoubleType|--         |DoubleType |
+|**StringType** |DoubleType |DoubleType |DoubleType  |DoubleType|DoubleType|DoubleType |--         |
+
+The case of DecimalType, is treated differently, for example, there is no common type for double and decimal because double's range is larger than decimal, and yet decimal is more precise than double, but in an expresion, we would cast the decimal into double.
+
+Time expresions:
+
+|                  |DateType     |TimestampType |
+|------------------|-------------|--------------|
+|**DateType**      |--           |TimestampType |
+|**TimestampType** |TimestampType|--            |
+
+#### Type coercion examples
+
+```sql
+DESCRIBE TABLE numericTable;
++-------------+---------+-------+
+|col_name     |data_type|comment|
++-------------+---------+-------+
+|integerColumn|int      |null   |
+|doubleColumn |double   |null   |
++-------------+---------+-------+
+
+DESCRIBE SELECT integerColumn + doubleColumn as result FROM numericTable;
++--------+---------+-------+
+|col_name|data_type|comment|
++--------+---------+-------+
+|  result|   double|   null|
++--------+---------+-------+
+
+```
+
+```sql
+DESCRIBE dateTable;
++---------------+---------+-------+
+|       col_name|data_type|comment|
++---------------+---------+-------+
+|     dateColumn|     date|   null|
+|timestampColumn|timestamp|   null|
++---------------+---------+-------+
+
+SELECT MONTHS_BETWEEN(dateColumn,timestampColumn) FROM dateTable;
+
+```
+
+#### Explicit casting and store assignment casting
+
+When you are using explicit casting by CAST or doing INSERT INTO operations that need to cast types to different store types, the following matrix shows if the conversion is allowed
+
+|             |ByteType  |ShortType |IntegerType |LongType |FloatType |DoubleType |StringType |BinaryType |BooleanType |TimestampType |DateType|
+|-------------|----------|----------|------------|---------|----------|-----------|-----------|-----------|------------|--------------|--------|
+|**ByteType** |--        |X         |X           |X        |X         |X          |X          |X          |X           |X             |        |
+|**ShortType**|*         |--        |X           |X        |X         |X          |X          |X          |X           |X             |        |
+|**IntegerType**|*       |*         |--          |X        |X         |X          |X          |X          |X           |X             |        |
+|**LongType** |*         |*         |*           |--       |X         |X          |X          |X          |X           |X             |        |
+|**FloatType** |*        |*         |*           |*        |--        |X          |X          |           |X           |X             |        |
+|**DoubleType** |*       |*         |*           |*        |*         |--         |X          |           |X           |X             |        |
+|**StringType** |*       |*         |*           |*        |*         |*          |--         |X          |X           |X             |X       |
+|**BinaryType** |        |          |            |         |          |           |           |--         |            |              |        |
+|**BooleanType** |X      |X         |X           |X        |X         |X          |X          |           |--          |              |        |
+|**TimestampType** |*    |*         |*           |X        |X         |X          |X          |           |            |--            |X       |
+|**DateType** |*         |*         |X           |X        |X         |X          |X          |           |            |X             |--      |
+
+X: Conversion allowed (cast ByteType in ShortType)  
+*: An overflow can occur (cast ShortType in ByteType)
+
+If an overflow occurs and ANSI compliance is activated (spark.sql.ansi.enabled is set to true for casting or spark.sql.storeAssignmentPolicy=ANSI for store assignment casting) an exception will be thrown. 
+Otherwise, a truncate value will be used. See more on [Ansi Compliance](sql-ref-ansi-compliance.html#type-conversion).
+
+#### Type Casting examples
+
+```sql
+DESCRIBE castTable;
++-------------+---------+-------+
+|     col_name|data_type|comment|
++-------------+---------+-------+
+|IntegerColumn|      int|   null|
+|   longColumn|   bigint|   null|
+|  FloatColumn|    float|   null|
++-------------+---------+-------+
+
+DESCRIBE SELECT CAST(IntegerColumn AS LONG), CAST(longColumn AS DOUBLE), CAST(FloatColumn AS INTEGER) FROM castTable;
++-------------+---------+-------+
+|     col_name|data_type|comment|
++-------------+---------+-------+
+|IntegerColumn|   bigint|   null|
+|   longColumn|   double|   null|
+|  FloatColumn|      int|   null|
++-------------+---------+-------+
+
+```
+
+#### Store assignment casting examples

Review comment:
       `Store assignment casting examples` -> `Store Assignment Casting Examples`






[GitHub] [spark] planga82 commented on pull request #29837: [SPARK-32463][SQL][DOCS] Add "Type Conversion" section in "Supported Data Types" of SQL docs

Posted by GitBox <gi...@apache.org>.
planga82 commented on pull request #29837:
URL: https://github.com/apache/spark/pull/29837#issuecomment-721877012


   Image updated in the description of the PR




[GitHub] [spark] planga82 commented on a change in pull request #29837: [SPARK-32463][DOCS] SQL data type compatibility

Posted by GitBox <gi...@apache.org>.
planga82 commented on a change in pull request #29837:
URL: https://github.com/apache/spark/pull/29837#discussion_r492870967



##########
File path: docs/sql-ref-datatypes.md
##########
@@ -314,3 +314,33 @@ SELECT COUNT(*), c2 FROM test GROUP BY c2;
 |        3| Infinity|
 +---------+---------+
 ```
+
+#### Data type compatibility
+
+The following is the hierarchy of data type compatibility. In an operation involving different and compatible data types, these will be promoted to the lowest common top type to perform the operation.
+
+For example, if you have an add operation between an integer and a float, the integer will be treated as a float, the least common compatible type, resulting the operation in a float.
+
+The most common operations where this hierarchy is applied are:

Review comment:
       I'm not sure whether explaining these cases in so much detail is useful. It may be better to remove this list to avoid misunderstanding. @HyukjinKwon, what do you think?






[GitHub] [spark] cloud-fan commented on a change in pull request #29837: [SPARK-32463][SQL][DOCS] Add "Type Conversion" section in "Supported Data Types" of SQL docs

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #29837:
URL: https://github.com/apache/spark/pull/29837#discussion_r517125806



##########
File path: docs/sql-ref-datatypes.md
##########
@@ -314,3 +314,206 @@ SELECT COUNT(*), c2 FROM test GROUP BY c2;
 |        3| Infinity|
 +---------+---------+
 ```
+
+### Type Conversion
+
+In general, an expression can contain different data types and type conversion is the transformation of some data types into others in order to resolve type mismatches. 

Review comment:
       ```
   Type conversion turns the values of one data type into another data type. Spark needs to perform
   type conversions when users explicitly ask for them via the CAST operator, or implicitly to resolve
   data type mismatches in operators, functions, and table writes.
   ```






[GitHub] [spark] planga82 edited a comment on pull request #29837: [SPARK-32463][SQL][DOCS] Add "Type Conversion" section in "Supported Data Types" of SQL docs

Posted by GitBox <gi...@apache.org>.
planga82 edited a comment on pull request #29837:
URL: https://github.com/apache/spark/pull/29837#issuecomment-706521610


   Thanks @huaxingao !
   > I quickly tried a few explicit casts; it seems I can cast Boolean to Timestamp OK?
   
   I noticed it, but I didn't put it in because, even though it is allowed, I thought it didn't make sense. Shall we put it in anyway?
   In the end, I have included it. Thanks!
   




[GitHub] [spark] planga82 commented on pull request #29837: [SPARK-32463][SQL][DOCS] Add "Type Conversion" section in "Supported Data Types" of SQL docs

Posted by GitBox <gi...@apache.org>.
planga82 commented on pull request #29837:
URL: https://github.com/apache/spark/pull/29837#issuecomment-717816263


   @gengliangwang Do you think the explanation is correct now? Thanks




[GitHub] [spark] AmplabJenkins removed a comment on pull request #29837: [SPARK-32463][DOCS] SQL data type compatibility

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29837:
URL: https://github.com/apache/spark/pull/29837#issuecomment-696659480


   Can one of the admins verify this patch?




[GitHub] [spark] maropu commented on pull request #29837: [SPARK-32463][SQL][DOCS] SQL data type compatibility

Posted by GitBox <gi...@apache.org>.
maropu commented on pull request #29837:
URL: https://github.com/apache/spark/pull/29837#issuecomment-703181387


   @planga82 Could you put a screenshot of the updated doc in the PR description for review?




[GitHub] [spark] planga82 commented on pull request #29837: [SPARK-32463][SQL][DOCS] Add "Type Conversion" section in "Supported Data Types" of SQL docs

Posted by GitBox <gi...@apache.org>.
planga82 commented on pull request #29837:
URL: https://github.com/apache/spark/pull/29837#issuecomment-709815971


   > This is **not true**. The type conversion rules are more complex than that.
   > 
   > ```
   > spark-sql> explain select 1 in (2, 'a');
   > *(1) Project [false AS (CAST(1 AS STRING) IN (CAST(2 AS STRING), CAST(a AS STRING)))#19]
   > 
   > spark-sql> explain select 1 = '2';
   > *(1) Project [false AS (1 = CAST(2 AS INT))#11]
   > 
   > spark-sql> explain select 1 + '2';
   > *(1) Project [3.0 AS (CAST(1 AS DOUBLE) + CAST(2 AS DOUBLE))#17]
   > ```
   
   OK, I see. Do you think it would be better to drop StringType from this matrix? Are there similar differences for other types?




[GitHub] [spark] gengliangwang commented on pull request #29837: [SPARK-32463][SQL][DOCS] Add "Type Conversion" section in "Supported Data Types" of SQL docs

Posted by GitBox <gi...@apache.org>.
gengliangwang commented on pull request #29837:
URL: https://github.com/apache/spark/pull/29837#issuecomment-709809437


   ![image](https://user-images.githubusercontent.com/12819544/95687803-db5f0100-0bfd-11eb-952a-c714f880556d.png)
   This is not true. The type conversion rules are more complex than that.
   
   ```
   spark-sql> explain select 1 in (2, 'a');
   *(1) Project [false AS (CAST(1 AS STRING) IN (CAST(2 AS STRING), CAST(a AS STRING)))#19]
   
   spark-sql> explain select 1 = '2';
   *(1) Project [false AS (1 = CAST(2 AS INT))#11]
   
   spark-sql> explain select 1 + '2';
   *(1) Project [3.0 AS (CAST(1 AS DOUBLE) + CAST(2 AS DOUBLE))#17]
   ```
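
The three plans above can be summarized in a small lookup. This is only a plain-Python sketch of what the quoted EXPLAIN output shows for an INT literal combined with a STRING literal; the names `OBSERVED_INT_STRING_COERCION` and `coerced_type` are hypothetical and are not part of Spark.

```python
# Casts read off the three EXPLAIN plans quoted above, for an INT literal
# combined with a STRING literal. Keys are operators; values are the type
# both operands have once Spark has inserted its casts.
# Hypothetical illustration only, not Spark's coercion implementation.
OBSERVED_INT_STRING_COERCION = {
    "IN": "STRING",  # 1 IN (2, 'a'): every operand is cast to STRING
    "=": "INT",      # 1 = '2': the string side is cast to INT
    "+": "DOUBLE",   # 1 + '2': both sides are cast to DOUBLE
}


def coerced_type(operator):
    """Return the operand type observed in the quoted plans for `operator`."""
    try:
        return OBSERVED_INT_STRING_COERCION[operator]
    except KeyError:
        # Anything not shown in the quoted plans is deliberately unsupported.
        raise ValueError(f"no plan quoted for operator {operator!r}") from None


# Three operators give three different target types for the same operand pair.
distinct_targets = set(OBSERVED_INT_STRING_COERCION.values())
```

Because `distinct_targets` has three members, a single matrix cell for IntegerType with StringType cannot describe all three operators, which is the point being made here.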
   
   
   




[GitHub] [spark] planga82 commented on a change in pull request #29837: [SPARK-32463][SQL][DOCS] SQL data type compatibility

Posted by GitBox <gi...@apache.org>.
planga82 commented on a change in pull request #29837:
URL: https://github.com/apache/spark/pull/29837#discussion_r499373510



##########
File path: docs/sql-ref-datatypes.md
##########
@@ -314,3 +314,49 @@ SELECT COUNT(*), c2 FROM test GROUP BY c2;
 |        3| Infinity|
 +---------+---------+
 ```
+
+#### Data type compatibility
+
+The following is the hierarchy of data type compatibility and the possible implicit conversions that can be made. In an operation involving different and compatible data types, these will be promoted to the lowest common top type to perform the operation.

Review comment:
       Thank you very much! I didn't know about the ANSI property. I will redo the PR with this information, try to explain it with a better structure, and also include a screenshot.






[GitHub] [spark] huaxingao commented on pull request #29837: [SPARK-32463][SQL][DOCS] Add "Type Conversion" section in "Supported Data Types" of SQL docs

Posted by GitBox <gi...@apache.org>.
huaxingao commented on pull request #29837:
URL: https://github.com/apache/spark/pull/29837#issuecomment-706456541


   This looks good to me overall. Thanks for the good work! @planga82 
   A couple of comments: 
   1. Could you please put a `;` at the end of the SQL statements in the examples?
   2. I quickly tried a few explicit casts; it seems I can cast Boolean to Timestamp OK? 




[GitHub] [spark] planga82 commented on a change in pull request #29837: [SPARK-32463][SQL][DOCS] SQL data type compatibility

Posted by GitBox <gi...@apache.org>.
planga82 commented on a change in pull request #29837:
URL: https://github.com/apache/spark/pull/29837#discussion_r496089738



##########
File path: docs/sql-ref-datatypes.md
##########
@@ -314,3 +314,33 @@ SELECT COUNT(*), c2 FROM test GROUP BY c2;
 |        3| Infinity|
 +---------+---------+
 ```
+
+#### Data type compatibility
+
+The following is the hierarchy of data type compatibility. In an operation involving different and compatible data types, these will be promoted to the lowest common top type to perform the operation.
+
+For example, if you have an add operation between an integer and a float, the integer will be treated as a float, the least common compatible type, resulting the operation in a float.
+
+The most common operations where this hierarchy is applied are:

Review comment:
       I think it's fine to have both tables. I'm going to update the PR to include them.






[GitHub] [spark] maropu commented on a change in pull request #29837: [SPARK-32463][SQL][DOCS] Add "Type Conversion" section in "Supported Data Types" of SQL docs

Posted by GitBox <gi...@apache.org>.
maropu commented on a change in pull request #29837:
URL: https://github.com/apache/spark/pull/29837#discussion_r502787207



##########
File path: docs/sql-ref-datatypes.md
##########
@@ -314,3 +314,128 @@ SELECT COUNT(*), c2 FROM test GROUP BY c2;
 |        3| Infinity|
 +---------+---------+
 ```
+
+### Type conversion
+
+In general, an expression can contain different data types, type conversion is the transformation of some data types into others in order to solve the expressions. 
+Spark supports both implicit conversions by type coercion and explicit conversions by explicit casting and store assignment casting.
+
+#### Type coercion in operations between different types 
+
+The following matrix shows the resulting type to which they are implicitly converted to resolve an expression involving different data types 
+
+Numeric expresions:
+
+|               |ByteType   |ShortType  |IntegerType |LongType  |FloatType |DoubleType |StringType |
+|---------------|-----------|-----------|------------|----------|----------|-----------|-----------|
+|**ByteType**   |--         |ShortType  |IntegerType |LongType  |FloatType |DoubleType |DoubleType |
+|**ShortType**  |ShortType  |--         |IntegerType |LongType  |FloatType |DoubleType |DoubleType |
+|**IntegerType**|IntegerType|IntegerType|--          |LongType  |FloatType |DoubleType |DoubleType |
+|**LongType**   |LongType   |LongType   |LongType    |--        |FloatType |DoubleType |DoubleType |
+|**FloatType**  |FloatType  |FloatType  |FloatType   |FloatType |--        |DoubleType |DoubleType |
+|**DoubleType** |DoubleType |DoubleType |DoubleType  |DoubleType|DoubleType|--         |DoubleType |
+|**StringType** |DoubleType |DoubleType |DoubleType  |DoubleType|DoubleType|DoubleType |--         |
+
+The case of DecimalType, is treated differently, for example, there is no common type for double and decimal because double's range is larger than decimal, and yet decimal is more precise than double, but in an expresion, we would cast the decimal into double.
+
+Time expresions:
+
+|                  |DateType     |TimestampType |
+|------------------|-------------|--------------|
+|**DateType**      |--           |TimestampType |
+|**TimestampType** |TimestampType|--            |
+
+#### Type coercion examples

Review comment:
       `Type coercion examples` -> `Type Coercion Examples`
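
For readers checking the draft numeric matrix quoted in the diff above, the rule it encodes can be written as a short plain-Python sketch. It models only the table as drafted (the wider of two numeric types wins, and StringType paired with any numeric type gives DoubleType); another comment in this thread notes that the StringType column does not hold for every operator. The names `NUMERIC_ORDER` and `draft_result_type` are hypothetical and are not part of Spark.

```python
# Numeric types in widening order, as laid out in the draft matrix.
NUMERIC_ORDER = [
    "ByteType", "ShortType", "IntegerType",
    "LongType", "FloatType", "DoubleType",
]
MATRIX_TYPES = NUMERIC_ORDER + ["StringType"]


def draft_result_type(left, right):
    """Return the cell of the draft matrix for (left, right).

    Returns None for the '--' diagonal. Hypothetical helper that mirrors
    the table in the diff, not Spark's actual coercion rules.
    """
    for name in (left, right):
        if name not in MATRIX_TYPES:
            raise ValueError(f"{name!r} is not a row or column of the draft matrix")
    if left == right:
        return None  # the diagonal is shown as '--'
    if "StringType" in (left, right):
        return "DoubleType"  # every StringType cell in the draft is DoubleType
    # Otherwise the type that appears later in the widening order wins.
    return max(left, right, key=NUMERIC_ORDER.index)
```

For instance, `draft_result_type("LongType", "FloatType")` gives `"FloatType"`, matching the LongType row of the table.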






[GitHub] [spark] planga82 commented on pull request #29837: [SPARK-32463][SQL][DOCS] SQL data type compatibility

Posted by GitBox <gi...@apache.org>.
planga82 commented on pull request #29837:
URL: https://github.com/apache/spark/pull/29837#issuecomment-702944386


   Updated with a matrix like the one in the Oracle docs.




[GitHub] [spark] maropu commented on a change in pull request #29837: [SPARK-32463][DOCS] SQL data type compatibility

Posted by GitBox <gi...@apache.org>.
maropu commented on a change in pull request #29837:
URL: https://github.com/apache/spark/pull/29837#discussion_r495709929



##########
File path: docs/sql-ref-datatypes.md
##########
@@ -314,3 +314,33 @@ SELECT COUNT(*), c2 FROM test GROUP BY c2;
 |        3| Infinity|
 +---------+---------+
 ```
+
+#### Data type compatibility
+
+The following is the hierarchy of data type compatibility. In an operation involving different and compatible data types, these will be promoted to the lowest common top type to perform the operation.
+
+For example, if you have an add operation between an integer and a float, the integer will be treated as a float, the least common compatible type, resulting the operation in a float.
+
+The most common operations where this hierarchy is applied are:

Review comment:
       Ah, I see. IMO the matrix format of the Oracle doc looks better than the current statement in this PR.



