Posted to issues@spark.apache.org by "Hyukjin Kwon (JIRA)" <ji...@apache.org> on 2019/05/21 04:21:18 UTC
[jira] [Updated] (SPARK-10153) Unable to query Avro data from Flume using SparkSQL
[ https://issues.apache.org/jira/browse/SPARK-10153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hyukjin Kwon updated SPARK-10153:
---------------------------------
Labels: bulk-closed (was: )
> Unable to query Avro data from Flume using SparkSQL
> ---------------------------------------------------
>
> Key: SPARK-10153
> URL: https://issues.apache.org/jira/browse/SPARK-10153
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 1.4.1, 1.5.0
> Reporter: mathias kluba
> Priority: Major
> Labels: bulk-closed
>
> I use the Avro event serializer of Flume.
> The schema is:
> {code}
> {
>   "type": "record",
>   "name": "Event",
>   "fields": [
>     {
>       "name": "headers",
>       "type": {"type": "map", "values": "string"}
>     },
>     {
>       "name": "body",
>       "type": "bytes"
>     }
>   ]
> }
> {code}
> I'm using HDP 2.2 with Hive 0.14 (on Tez), and Hive queries the data correctly.
> With Spark SQL, however, I have issues.
> I tested with 1.4.1 and 1.5.0 (latest snapshot) and get a different error message from each.
> In 1.4.1 I have:
> {code:sql}
> select body from mytable limit 10;
> {code}
> {code}
> conversion of string to map<string,string>not supported yet
> {code}
> It's related to the headers field, which is a map<string,string>, but I don't understand why it's trying to convert to String. Maybe to display it as a single column? Even if I run a select that leaves out the headers, I still get this error.
> With 1.5.0 I have:
> {code:sql}
> select body from mytable limit 10;
> {code}
> {code}
> java.lang.RuntimeException: java.lang.ClassCastException: java.lang.String cannot be cast to [B
> {code}
> It's clearly not the same error; 1.5.0 seems to have fixed the bug with the headers.
> So it seems there's a bug where Spark SQL tries to cast the body as a String, even though the column type (from the Avro schema) is a byte array.
> When I do the cast manually, it works:
> {code:sql}
> select cast(body as String) from mytable limit 10;
> {code}
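
For reference, here is a minimal Python sketch (standard library only, not Spark code) of what the schema in the report implies: `body` is declared as Avro `bytes`, so turning it into text means decoding the raw bytes. The sketch assumes the body holds UTF-8 text, and that decoding it is roughly what the manual cast does. The names `field_types`, `body_as_string`, and `sample_event` are hypothetical and do not come from the report.

```python
import json

# Schema copied from the report above.
FLUME_EVENT_SCHEMA = """
{
  "type": "record",
  "name": "Event",
  "fields": [
    {"name": "headers", "type": {"type": "map", "values": "string"}},
    {"name": "body", "type": "bytes"}
  ]
}
"""


def field_types(schema_json):
    """Return a {field name: Avro type} mapping for a record schema."""
    schema = json.loads(schema_json)
    return {field["name"]: field["type"] for field in schema["fields"]}


def body_as_string(event):
    """Decode the raw body bytes as text (assumption: UTF-8 payload)."""
    return event["body"].decode("utf-8")


# Hypothetical sample event for illustration; not data from the report.
sample_event = {"headers": {"host": "example"}, "body": b"hello flume"}

types = field_types(FLUME_EVENT_SCHEMA)
print(types["headers"])  # the map-of-strings type declared in the schema
print(types["body"])     # the declared type of the body field
print(body_as_string(sample_event))
```

The point of the sketch is only that the schema declares `body` as binary, so a string conversion has to be an explicit step. It does not reproduce the Spark SQL code path that raises the errors quoted above.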