Posted to issues@spark.apache.org by "Hyukjin Kwon (JIRA)" <ji...@apache.org> on 2016/03/28 07:11:25 UTC

[jira] [Created] (SPARK-14189) JSON data source infers a field type as StringType when some are inferred as DecimalType not capable of IntegralType.

Hyukjin Kwon created SPARK-14189:
------------------------------------

             Summary: JSON data source infers a field type as StringType when some are inferred as DecimalType not capable of IntegralType.
                 Key: SPARK-14189
                 URL: https://issues.apache.org/jira/browse/SPARK-14189
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 2.0.0
            Reporter: Hyukjin Kwon


When the types inferred for the same field while finding a compatible {{DataType}} are {{IntegralType}} and {{DecimalType}}, but the {{DecimalType}} cannot hold the given {{IntegralType}}, the JSON data source simply falls back to {{StringType}}.
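
The fallback described above can be sketched with a small, self-contained model. This is a hypothetical simplification, not the actual inference code; the type names mirror Spark's {{DataType}} hierarchy, and {{canHoldLong}} stands in for the real precision check:

{code}
// Hypothetical, simplified model of JSON schema inference fallback.
sealed trait DataType
case object LongType extends DataType
case object StringType extends DataType
case class DecimalType(precision: Int, scale: Int) extends DataType {
  // A decimal can hold any Long only if it has enough integer digits
  // (Long.MaxValue has 19 digits).
  def canHoldLong: Boolean = precision - scale >= 20
}

def compatibleType(t1: DataType, t2: DataType): DataType = (t1, t2) match {
  case (a, b) if a == b => a
  case (LongType, d: DecimalType) if d.canHoldLong => d
  case (d: DecimalType, LongType) if d.canHoldLong => d
  // No wider numeric type is chosen, so inference falls back to
  // StringType -- the behavior reported in this issue.
  case _ => StringType
}
{code}

In this model, {{compatibleType(LongType, DecimalType(3, 1))}} yields {{StringType}} because the decimal cannot represent every long value, which matches the schema shown below.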

This can be observed when {{floatAsBigDecimal}} is enabled.

{code}
def mixedIntegerAndDoubleRecords: RDD[String] =
  sqlContext.sparkContext.parallelize(
    """{"a": 3, "b": 1.1}""" ::
    """{"a": 3.1, "b": 1}""" :: Nil)

val jsonDF = sqlContext.read
  .option("floatAsBigDecimal", "true")
  .json(mixedIntegerAndDoubleRecords)
  .printSchema()
{code}

produces the following schema:

{code}
root
 |-- a: string (nullable = true)
 |-- b: string (nullable = true)
{code}

When {{floatAsBigDecimal}} is disabled,

{code}
def mixedIntegerAndDoubleRecords: RDD[String] =
  sqlContext.sparkContext.parallelize(
    """{"a": 3, "b": 1.1}""" ::
    """{"a": 3.1, "b": 1}""" :: Nil)

val jsonDF = sqlContext.read
  .option("floatAsBigDecimal", "false")
  .json(mixedIntegerAndDoubleRecords)
  .printSchema()
{code}

it correctly produces the following schema:

{code}
root
 |-- a: double (nullable = true)
 |-- b: double (nullable = true)
{code}




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
