Posted to issues@spark.apache.org by "Hyukjin Kwon (JIRA)" <ji...@apache.org> on 2019/05/21 04:12:33 UTC

[jira] [Resolved] (SPARK-20332) Avro/Parquet GenericFixed decimal is not read into Spark correctly

     [ https://issues.apache.org/jira/browse/SPARK-20332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon resolved SPARK-20332.
----------------------------------
    Resolution: Incomplete

> Avro/Parquet GenericFixed decimal is not read into Spark correctly
> ------------------------------------------------------------------
>
>                 Key: SPARK-20332
>                 URL: https://issues.apache.org/jira/browse/SPARK-20332
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.6.0
>            Reporter: Justin Pihony
>            Priority: Minor
>              Labels: bulk-closed
>
> Take the following code:
> spark-shell --packages org.apache.avro:avro:1.8.1
> import org.apache.avro.{Conversions, LogicalTypes, Schema}
> import java.math.BigDecimal
> val dc = new Conversions.DecimalConversion()
> val javaBD = BigDecimal.valueOf(643.85924958)
> // Record schema with a single nullable fixed(19) decimal(17,8) column
> val schema = Schema.parse(
>   "{\"type\":\"record\",\"name\":\"Header\",\"namespace\":\"org.apache.avro.file\",\"fields\":[" +
>   "{\"name\":\"COLUMN\",\"type\":[\"null\",{\"type\":\"fixed\",\"name\":\"COLUMN\"," +
>   "\"size\":19,\"precision\":17,\"scale\":8,\"logicalType\":\"decimal\"}]}]}")
> val schemaDec = schema.getField("COLUMN").schema()
> // Unwrap the nullable union to get the fixed schema itself
> val fieldSchema =
>   if (schemaDec.getType() == Schema.Type.UNION) schemaDec.getTypes.get(1) else schemaDec
> // Convert the BigDecimal into an Avro GenericFixed value
> val converted = dc.toFixed(javaBD, fieldSchema,
>   LogicalTypes.decimal(javaBD.precision, javaBD.scale))
> sqlContext.createDataFrame(List(("value", converted)))
> and you'll get this error:
> java.lang.UnsupportedOperationException: Schema for type
> org.apache.avro.generic.GenericFixed is not supported
> However, if you write out a Parquet file using the AvroParquetWriter and the
> above GenericFixed value (converted), then read it back in via the
> DataFrameReader, the decimal value that is retrieved is not accurate (i.e.
> 643... above comes back as -0.5...); a sketch of that write/read round trip
> is below.
> Even if GenericFixed is not supported, is there any way to at least have the
> read path throw an UnsupportedOperationException, as it does when you try to
> create the DataFrame directly, rather than silently returning a wrong value?
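> A minimal sketch of that round trip, assuming parquet-avro (e.g.
> org.apache.parquet:parquet-avro:1.8.1) is also on the classpath and reusing
> the schema and converted values defined above; the output path
> /tmp/decimal-repro is just an illustrative name:
> import org.apache.avro.generic.GenericData
> import org.apache.hadoop.fs.Path
> import org.apache.parquet.avro.AvroParquetWriter
> // Wrap the GenericFixed decimal in a record matching the Header schema
> val record = new GenericData.Record(schema)
> record.put("COLUMN", converted)
> // Write a single-row Parquet file through the Avro-aware writer
> val writer = AvroParquetWriter.builder[GenericData.Record](new Path("/tmp/decimal-repro"))
>   .withSchema(schema)
>   .build()
> writer.write(record)
> writer.close()
> // Reading it back through Spark silently yields a wrong decimal instead of an error
> sqlContext.read.parquet("/tmp/decimal-repro").show()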



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org