Posted to issues@spark.apache.org by "Yin Huai (JIRA)" <ji...@apache.org> on 2014/08/05 01:11:11 UTC

[jira] [Updated] (SPARK-2854) Finalize _acceptable_types in pyspark.sql

     [ https://issues.apache.org/jira/browse/SPARK-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yin Huai updated SPARK-2854:
----------------------------

    Description: 
In PySpark, _acceptable_types defines the Python data types accepted for each Spark SQL data type. The current mapping is shown below.

{code}
_acceptable_types = {
    BooleanType: (bool,),
    ByteType: (int, long),
    ShortType: (int, long),
    IntegerType: (int, long),
    LongType: (int, long),
    FloatType: (float,),
    DoubleType: (float,),
    DecimalType: (decimal.Decimal,),
    StringType: (str, unicode),
    TimestampType: (datetime.datetime, datetime.time, datetime.date),
    ArrayType: (list, tuple, array),
    MapType: (dict,),
    StructType: (tuple, list),
}
{code}

Let's double-check this mapping before the 1.1 release.
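
As a rough illustration of why the mapping deserves a second look, here is a minimal sketch of how such a table might be consulted. It is not the PySpark implementation: the marker classes are stand-ins for the real pyspark.sql types, is_acceptable is a hypothetical helper, the table is a subset adapted to Python 3 (which has no long or unicode), and the exact-type comparison is an assumption.

```python
import datetime
from array import array


# Stand-in marker classes (assumption); the real ones live in pyspark.sql.
class BooleanType: pass
class IntegerType: pass
class DoubleType: pass
class StringType: pass
class TimestampType: pass
class ArrayType: pass


# Subset of the mapping above, adapted to Python 3 (no `long`, no `unicode`).
acceptable_types = {
    BooleanType: (bool,),
    IntegerType: (int,),
    DoubleType: (float,),
    StringType: (str,),
    TimestampType: (datetime.datetime, datetime.time, datetime.date),
    ArrayType: (list, tuple, array),
}


def is_acceptable(value, data_type):
    """Hypothetical helper: True if the exact type of `value` is listed for `data_type`.

    An exact match is assumed here instead of isinstance(), because bool is a
    subclass of int and would otherwise pass as an IntegerType value.
    """
    return type(value) in acceptable_types[data_type]


print(is_acceptable(1, IntegerType))     # int is listed for IntegerType
print(is_acceptable(True, IntegerType))  # bool is not listed for IntegerType
print(is_acceptable(1, DoubleType))      # int is not listed for DoubleType
```

Under these assumptions an int is rejected for DoubleType and a bool is rejected for IntegerType; whether that strictness is what we want is part of what needs checking.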

  was:
In PySpark, _acceptable_types defines accepted Python data types for every Spark SQL data type. The list is shown below. 

{code}
# Mapping Python types to Spark SQL DataType
_type_mappings = {
    bool: BooleanType,
    int: IntegerType,
    long: LongType,
    float: DoubleType,
    str: StringType,
    unicode: StringType,
    decimal.Decimal: DecimalType,
    datetime.datetime: TimestampType,
    datetime.date: TimestampType,
    datetime.time: TimestampType,
}
{code}

Let's double check this mapping before 1.1 release.


> Finalize _acceptable_types in pyspark.sql
> -----------------------------------------
>
>                 Key: SPARK-2854
>                 URL: https://issues.apache.org/jira/browse/SPARK-2854
>             Project: Spark
>          Issue Type: Task
>            Reporter: Yin Huai
>            Priority: Blocker


