Posted to issues@spark.apache.org by "Yin Huai (JIRA)" <ji...@apache.org> on 2014/08/05 01:15:15 UTC

[jira] [Comment Edited] (SPARK-2854) Finalize _acceptable_types in pyspark.sql

    [ https://issues.apache.org/jira/browse/SPARK-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14085472#comment-14085472 ] 

Yin Huai edited comment on SPARK-2854 at 8/4/14 11:15 PM:
----------------------------------------------------------

For ByteType, ShortType and IntegerType, I am not sure we want to support Python long values; it does not seem necessary for these data types. On current master, it seems an exception is raised if users pass long values when the data type is one of the above.
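
As a rough illustration of why long looks unnecessary here: ByteType, ShortType and IntegerType are 1-, 2- and 4-byte signed integers, and Python 2 documents sys.maxint as being at least 2**31 - 1, so every valid value already fits in a plain int. The sketch below is only that; INTEGRAL_RANGES, MIN_GUARANTEED_MAXINT and fits are hypothetical names, not part of pyspark.sql, and the type names are plain strings so the snippet runs without Spark.

```python
# Signed ranges of the Spark SQL fixed-width integral types (1, 2 and 4 bytes).
# This dict and the helper below are hypothetical, not part of pyspark.sql.
INTEGRAL_RANGES = {
    "ByteType": (-2 ** 7, 2 ** 7 - 1),
    "ShortType": (-2 ** 15, 2 ** 15 - 1),
    "IntegerType": (-2 ** 31, 2 ** 31 - 1),
}

# Python 2 documents sys.maxint as at least 2**31 - 1, so a plain int
# can represent every value in the ranges above.
MIN_GUARANTEED_MAXINT = 2 ** 31 - 1


def fits(type_name, value):
    """Return True if value lies within the signed range of the named type."""
    low, high = INTEGRAL_RANGES[type_name]
    return low <= value <= high
```

For example, fits("ByteType", 127) holds while fits("ByteType", 128) does not, and no upper bound in the table exceeds MIN_GUARANTEED_MAXINT.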

Also, for datetime.time and datetime.date values, should we automatically convert them to Timestamp values (by automatically filling in the missing date/time fields)? I feel we should let users do that explicitly, and we should not accept datetime.time and datetime.date.
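
For reference, the explicit conversion on the user side could look roughly like the sketch below, using the standard library's datetime.datetime.combine. The helper names date_to_datetime and time_to_datetime are hypothetical, and defaulting the time to midnight is an assumption for illustration, not something Spark SQL does.

```python
import datetime


def date_to_datetime(d, t=datetime.time.min):
    """Build a datetime from a date; the time defaults to midnight (an assumption)."""
    return datetime.datetime.combine(d, t)


def time_to_datetime(t, d):
    """Build a datetime from a time; the caller must choose the date explicitly."""
    return datetime.datetime.combine(d, t)


day = datetime.date(2014, 8, 4)
clock = datetime.time(23, 15)
midnight = date_to_datetime(day)       # time fields filled with 00:00:00
stamp = time_to_datetime(clock, day)   # date fields supplied by the caller
```

Doing this in user code makes the choice of the missing fields visible instead of leaving it to an implicit rule.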


was (Author: yhuai):
For ByteType, ShortType and IntegerType, I am not sure we want to support Python long values; it does not seem necessary for these data types.

Also, for datetime.time and datetime.date values, should we automatically convert them to Timestamp values (by automatically filling in the missing date/time fields)? I feel we should let users do that explicitly, and we should not accept datetime.time and datetime.date.

> Finalize _acceptable_types in pyspark.sql
> -----------------------------------------
>
>                 Key: SPARK-2854
>                 URL: https://issues.apache.org/jira/browse/SPARK-2854
>             Project: Spark
>          Issue Type: Task
>            Reporter: Yin Huai
>            Priority: Blocker
>
> In PySpark, _acceptable_types defines the accepted Python data types for every Spark SQL data type. The list is shown below.
> {code}
> _acceptable_types = {
>     BooleanType: (bool,),
>     ByteType: (int, long),
>     ShortType: (int, long),
>     IntegerType: (int, long),
>     LongType: (int, long),
>     FloatType: (float,),
>     DoubleType: (float,),
>     DecimalType: (decimal.Decimal,),
>     StringType: (str, unicode),
>     TimestampType: (datetime.datetime, datetime.time, datetime.date),
>     ArrayType: (list, tuple, array),
>     MapType: (dict,),
>     StructType: (tuple, list),
> }
> {code}
> Let's double check this mapping before the 1.1 release.
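
A mapping of this shape could be consulted roughly as in the sketch below. This is not the pyspark.sql implementation: the three classes are empty placeholders for the Spark SQL types, acceptable and is_acceptable are hypothetical names, the table is trimmed to three entries, and TimestampType is restricted to datetime.datetime to mirror the proposal in the comment above. The sketch matches on the exact type because bool is a subclass of int in Python, so an isinstance check would accept True for IntegerType.

```python
import datetime


# Placeholder classes standing in for Spark SQL data types (hypothetical).
class BooleanType(object):
    pass


class IntegerType(object):
    pass


class TimestampType(object):
    pass


# Trimmed-down mapping in the spirit of _acceptable_types.
acceptable = {
    BooleanType: (bool,),
    IntegerType: (int,),
    TimestampType: (datetime.datetime,),
}


def is_acceptable(data_type, value):
    """Exact-type check: True only if type(value) is listed for data_type."""
    return type(value) in acceptable[data_type]
```

With this table, is_acceptable(IntegerType, 1) holds, while True is rejected for IntegerType and a datetime.date is rejected for TimestampType.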


