Posted to issues@spark.apache.org by "Hyukjin Kwon (JIRA)" <ji...@apache.org> on 2018/10/06 10:59:00 UTC

[jira] [Created] (SPARK-25666) Internally document type conversion between Python data and SQL types in UDFs

Hyukjin Kwon created SPARK-25666:
------------------------------------

             Summary: Internally document type conversion between Python data and SQL types in UDFs
                 Key: SPARK-25666
                 URL: https://issues.apache.org/jira/browse/SPARK-25666
             Project: Spark
          Issue Type: Documentation
          Components: PySpark
    Affects Versions: 2.4.0
            Reporter: Hyukjin Kwon


Currently, type coercion in Python UDFs is not clearly defined. See also https://github.com/apache/spark/pull/22610

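This JIRA aims to document the type conversion logic internally. For instance, in the table below each row is the SQL return type declared for the UDF, each column is the Python value the UDF returns, and 'X' means the conversion throws an exception: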

{code}
    +---------------+----+----+----+----+----+------------------------------+------------------------------+----+---------------+---------+----------------------------+------------+----+------------+---------+---------+  # noqa
    |   Type \ Value|None|True|   1|   a|   a|                    1970-01-01|           1970-01-01 00:00:00| 1.0|array('i', [1])|      [1]|                        (1,)|         ABC|   1|    {'a': 1}| Row(a=1)| Row(a=1)|  # noqa
    +---------------+----+----+----+----+----+------------------------------+------------------------------+----+---------------+---------+----------------------------+------------+----+------------+---------+---------+  # noqa
    |           null|None|None|None|None|None|                          None|                          None|None|           None|     None|                        None|        None|None|        None|        X|        X|  # noqa
    |        boolean|None|True|None|None|None|                          None|                          None|None|           None|     None|                        None|        None|None|        None|        X|        X|  # noqa
    |        tinyint|None|None|   1|None|None|                          None|                          None|None|           None|     None|                        None|        None|None|        None|        X|        X|  # noqa
    |       smallint|None|None|   1|None|None|                          None|                          None|None|           None|     None|                        None|        None|None|        None|        X|        X|  # noqa
    |            int|None|None|   1|None|None|                          None|                          None|None|           None|     None|                        None|        None|None|        None|        X|        X|  # noqa
    |         bigint|None|None|   1|None|None|                          None|                          None|None|           None|     None|                        None|        None|None|        None|        X|        X|  # noqa
    |         string|None|true|   1|   a|   a|java.util.GregorianCalendar...|java.util.GregorianCalendar...| 1.0|    [I@2d03fe27|      [1]|[Ljava.lang.Object;@5ae74a34| [B@6e96d01e|   1|       {a=1}|        X|        X|  # noqa
    |           date|None|   X|   X|   X|   X|                    1970-01-01|                    1970-01-01|   X|              X|        X|                           X|           X|   X|           X|        X|        X|  # noqa
    |      timestamp|None|   X|   X|   X|   X|                             X|           1970-01-01 00:00:00|   X|              X|        X|                           X|           X|   X|           X|        X|        X|  # noqa
    |          float|None|None|None|None|None|                          None|                          None| 1.0|           None|     None|                        None|        None|None|        None|        X|        X|  # noqa
    |         double|None|None|None|None|None|                          None|                          None| 1.0|           None|     None|                        None|        None|None|        None|        X|        X|  # noqa
    |     array<int>|None|None|None|None|None|                          None|                          None|None|            [1]|      [1]|                         [1]|[65, 66, 67]|None|        None|        X|        X|  # noqa
    |         binary|None|None|None|   a|   a|                          None|                          None|None|           None|     None|                        None|         ABC|None|        None|        X|        X|  # noqa
    |  decimal(10,0)|None|None|None|None|None|                          None|                          None|None|           None|     None|                        None|        None|   1|        None|        X|        X|  # noqa
    |map<string,int>|None|None|None|None|None|                          None|                          None|None|           None|     None|                        None|        None|None|   {u'a': 1}|        X|        X|  # noqa
    | struct<_1:int>|None|   X|   X|   X|   X|                             X|                             X|   X|              X|Row(_1=1)|                   Row(_1=1)|           X|   X|Row(_1=None)|Row(_1=1)|Row(_1=1)|  # noqa
    +---------------+----+----+----+----+----+------------------------------+------------------------------+----+---------------+---------+----------------------------+------------+----+------------+---------+---------+  # noqa
{code}
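
A table of this shape could be produced by trying every (type, value) pair and recording either the result or an 'X' when the conversion raises. The following is only a minimal sketch under stated assumptions: build_table and toy_convert are hypothetical names, and toy_convert is a pure-Python stand-in that mimics a handful of cells from the boolean, int and date rows above. It is not Spark's conversion logic; in a real run the convert callable would evaluate a UDF declared with the given return type.

```python
import datetime


def build_table(types, values, convert):
    """Render convert(value, sql_type) for every pair as a text grid.

    A cell shows 'X' when convert raises an exception.
    """
    header = ["Type \\ Value"] + [repr(v) for v in values]
    rows = [header]
    for sql_type in types:
        row = [sql_type]
        for value in values:
            try:
                row.append(repr(convert(value, sql_type)))
            except Exception:
                row.append("X")
        rows.append(row)

    # Right-align every cell to the widest entry in its column.
    widths = [max(len(r[i]) for r in rows) for i in range(len(header))]
    sep = "+" + "+".join("-" * w for w in widths) + "+"

    def fmt(row):
        return "|" + "|".join(c.rjust(w) for c, w in zip(row, widths)) + "|"

    lines = [sep, fmt(rows[0]), sep] + [fmt(r) for r in rows[1:]] + [sep]
    return "\n".join(lines)


def toy_convert(value, sql_type):
    """Hypothetical stand-in for running a UDF; NOT Spark's actual rules."""
    if value is None:
        return None
    if sql_type == "boolean":
        return value if isinstance(value, bool) else None
    if sql_type == "int":
        # bool is a subclass of int in Python, so exclude it explicitly.
        is_int = isinstance(value, int) and not isinstance(value, bool)
        return value if is_int else None
    if sql_type == "date":
        if isinstance(value, datetime.date):
            return value
        raise TypeError("cannot convert %r to date" % (value,))
    raise ValueError("unsupported type in this sketch: %s" % sql_type)


if __name__ == "__main__":
    print(build_table(["boolean", "int", "date"], [None, True, 1], toy_convert))
```

Swapping toy_convert for a function that actually runs a UDF would keep the rendering code unchanged, so the table could be regenerated whenever the conversion behavior changes.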


