Posted to issues@spark.apache.org by "Artem Kalchenko (JIRA)" <ji...@apache.org> on 2019/04/29 09:31:00 UTC

[jira] [Created] (SPARK-27591) A bug in UnivocityParser prevents using UDT

Artem Kalchenko created SPARK-27591:
---------------------------------------

             Summary: A bug in UnivocityParser prevents using UDT
                 Key: SPARK-27591
                 URL: https://issues.apache.org/jira/browse/SPARK-27591
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 2.4.2
            Reporter: Artem Kalchenko


I am trying to define a UserDefinedType based on String but distinct from StringType in Spark 2.4.1, but it looks like there is either a bug in Spark or I am doing something incorrectly.

I define my type as follows:
{code:java}
class MyType extends UserDefinedType[MyValue] {
  override def sqlType: DataType = StringType
  // The remaining overrides were elided ("...") in the original report; a
  // minimal completion, assuming MyValue wraps a single String field:
  override def userClass: Class[MyValue] = classOf[MyValue]
  override def serialize(obj: MyValue): Any = UTF8String.fromString(obj.value)
  override def deserialize(datum: Any): MyValue = MyValue(datum.toString)
}

@SQLUserDefinedType(udt = classOf[MyType])
case class MyValue(value: String)
{code}
I expect it to be read and stored as a String, just with a custom SQL type. In fact, Spark can't read the string at all:
{code:java}
java.lang.ClassCastException: org.apache.spark.sql.execution.datasources.csv.UnivocityParser$$anonfun$makeConverter$11 cannot be cast to org.apache.spark.unsafe.types.UTF8String
    at org.apache.spark.sql.catalyst.expressions.BaseGenericInternalRow$class.getUTF8String(rows.scala:46)
    at org.apache.spark.sql.catalyst.expressions.GenericInternalRow.getUTF8String(rows.scala:195)
    at org.apache.spark.sql.catalyst.expressions.JoinedRow.getUTF8String(JoinedRow.scala:102)
{code}
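The stack trace above can be triggered by reading any CSV with this UDT in the schema. A minimal reproduction sketch, assuming a SparkSession named spark; the column name and file path here are hypothetical, not from the original report:
{code:java}
import org.apache.spark.sql.types.{StructField, StructType}

// A UDT is a DataType, so it can appear directly in a read schema.
val schema = StructType(Seq(StructField("v", new MyType)))

spark.read
  .schema(schema)
  .csv("/tmp/values.csv") // hypothetical path
  .collect()              // throws the ClassCastException shown above
{code}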
The problem is in UnivocityParser.makeConverter: in the UDT case it returns not a (String => Any) function but a (String => (String => Any)) one; see UnivocityParser:184:
{code:java}
case udt: UserDefinedType[_] => (datum: String) =>
  // Bug: this returns the converter for udt.sqlType instead of applying it
  // to datum, so the outer lambda yields a function, not a parsed value.
  makeConverter(name, udt.sqlType, nullable, options)
{code}
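A plausible fix is to drop the extra lambda so the UDT branch evaluates to the underlying converter itself, i.e. a String => Any like every other case. This is a sketch of the intended behavior, not an actual patch:
{code:java}
// Sketch: delegate directly to the converter for the underlying SQL type.
case udt: UserDefinedType[_] =>
  makeConverter(name, udt.sqlType, nullable, options)
{code}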


