Posted to issues@spark.apache.org by "Liang-Chi Hsieh (JIRA)" <ji...@apache.org> on 2018/09/28 08:07:00 UTC
[jira] [Commented] (SPARK-25554) Avro logical types get ignored in SchemaConverters.toSqlType
[ https://issues.apache.org/jira/browse/SPARK-25554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16631505#comment-16631505 ]
Liang-Chi Hsieh commented on SPARK-25554:
-----------------------------------------
Hmm, I think Spark 2.4 should have comprehensive support for Avro logical types. For example, take an Avro file with this schema:
{code:java}
{
  "type" : "record",
  "name" : "name",
  "namespace" : "namespace",
  "doc" : "docs",
  "fields" : [ {
    "name" : "field1",
    "type" : [ "null", {
      "type" : "int",
      "logicalType" : "date"
    } ],
    "doc" : "doc"
  } ]
}
{code}
The DataFrame schema for the above Avro file is:
{code}
root
|-- field1: date (nullable = true)
{code}
From your attached Maven dependencies, it looks like you are using {{spark-avro}} with Spark 2.3, is that right? If so, I think this might be an issue in {{spark-avro}}.
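To make the difference concrete, here is a minimal Python sketch of a schema converter that either honors or ignores the {{date}} logical type. It is not the code of Spark or {{spark-avro}}; the names {{to_sql_type}}, {{field_types}} and the {{honor_logical_types}} flag are hypothetical, and only primitives and {{["null", T]}} unions are handled.

```python
import json

# Assumed mapping from Avro primitive types to the type names that
# DataFrame.printSchema() displays. Illustrative only.
PRIMITIVES = {
    "string": "string",
    "int": "integer",
    "long": "long",
    "float": "float",
    "double": "double",
    "boolean": "boolean",
    "bytes": "binary",
}

# Logical types covered by this sketch, keyed by (underlying type, logicalType).
LOGICAL = {("int", "date"): "date"}


def to_sql_type(avro_type, honor_logical_types=True):
    """Return (type_name, nullable) for a primitive or a ["null", T] union."""
    if isinstance(avro_type, list):
        non_null = [t for t in avro_type if t != "null"]
        if len(non_null) != 1:
            raise ValueError("only unions of the form [null, T] are handled")
        name, _ = to_sql_type(non_null[0], honor_logical_types)
        # The field is nullable only if "null" is one of the union branches.
        return name, "null" in avro_type
    if isinstance(avro_type, dict):
        base = avro_type["type"]
        logical = avro_type.get("logicalType")
        if honor_logical_types and (base, logical) in LOGICAL:
            return LOGICAL[(base, logical)], False
        # Logical type ignored or unknown: fall back to the underlying type.
        return to_sql_type(base, honor_logical_types)
    return PRIMITIVES[avro_type], False


def field_types(schema_json, honor_logical_types=True):
    """Map each field name of a record schema to (type_name, nullable)."""
    schema = json.loads(schema_json)
    return {
        f["name"]: to_sql_type(f["type"], honor_logical_types)
        for f in schema["fields"]
    }


# The schema from this comment.
SCHEMA = """
{
  "type": "record", "name": "name", "namespace": "namespace",
  "fields": [
    {"name": "field1",
     "type": ["null", {"type": "int", "logicalType": "date"}]}
  ]
}
"""

print(field_types(SCHEMA))
print(field_types(SCHEMA, honor_logical_types=False))
```

With the flag on, {{field1}} maps to a nullable {{date}}, matching the schema shown above. With it off, the field falls back to a nullable {{integer}}, which is the behavior described in the report.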
> Avro logical types get ignored in SchemaConverters.toSqlType
> ------------------------------------------------------------
>
> Key: SPARK-25554
> URL: https://issues.apache.org/jira/browse/SPARK-25554
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.3.0
> Environment: Below are the Maven dependencies:
> {code:java}
> <dependency>
>   <groupId>org.apache.avro</groupId>
>   <artifactId>avro</artifactId>
>   <version>1.8.2</version>
> </dependency>
> <dependency>
>   <groupId>com.databricks</groupId>
>   <artifactId>spark-avro_2.11</artifactId>
>   <version>4.0.0</version>
> </dependency>
> <!-- spark dependencies -->
> <dependency>
>   <groupId>org.apache.spark</groupId>
>   <artifactId>spark-core_2.11</artifactId>
>   <version>2.3.0</version>
> </dependency>
> <dependency>
>   <groupId>org.apache.spark</groupId>
>   <artifactId>spark-sql_2.11</artifactId>
>   <version>2.3.0</version>
> </dependency>
> {code}
> Reporter: Yanan Li
> Priority: Major
>
> Given an Avro schema defined as follows:
> {code:java}
> {
>   "namespace": "com.xxx.avro",
>   "name": "Book",
>   "type": "record",
>   "fields": [{
>       "name": "name",
>       "type": ["null", "string"],
>       "default": null
>     }, {
>       "name": "author",
>       "type": ["null", "string"],
>       "default": null
>     }, {
>       "name": "published_date",
>       "type": ["null", {"type": "int", "logicalType": "date"}],
>       "default": null
>     }
>   ]
> }
> {code}
> In the Spark schema converted from the above Avro schema, the logical type "date" is ignored and the field becomes a plain IntegerType:
> {code:java}
> StructType(StructField(name,StringType,true),StructField(author,StringType,true),StructField(published_date,IntegerType,true))
> {code}