Posted to issues@spark.apache.org by "Takeshi Yamamuro (JIRA)" <ji...@apache.org> on 2015/07/07 02:33:05 UTC

[jira] [Updated] (SPARK-6747) Throw an AnalysisException when unsupported Java list types used in Hive UDF

     [ https://issues.apache.org/jira/browse/SPARK-6747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Takeshi Yamamuro updated SPARK-6747:
------------------------------------
    Summary: Throw an AnalysisException when unsupported Java list types used in Hive UDF  (was: Support List<> as a return type in Hive UDF)

> Throw an AnalysisException when unsupported Java list types used in Hive UDF
> ----------------------------------------------------------------------------
>
>                 Key: SPARK-6747
>                 URL: https://issues.apache.org/jira/browse/SPARK-6747
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.4.0
>            Reporter: Takeshi Yamamuro
>
> The current implementation can't handle List<> as a return type in a Hive UDF.
> Consider the UDF below:
> import java.util.Arrays;
> import java.util.List;
> import org.apache.hadoop.hive.ql.exec.UDF;
>
> public class UDFToListString extends UDF {
>     public List<String> evaluate(Object o) {
>         return Arrays.asList("xxx", "yyy", "zzz");
>     }
> }
> When this UDF is used, a scala.MatchError is thrown as follows:
> scala.MatchError: interface java.util.List (of class java.lang.Class)
> 	at org.apache.spark.sql.hive.HiveInspectors$class.javaClassToDataType(HiveInspectors.scala:174)
> 	at org.apache.spark.sql.hive.HiveSimpleUdf.javaClassToDataType(hiveUdfs.scala:76)
> 	at org.apache.spark.sql.hive.HiveSimpleUdf.dataType$lzycompute(hiveUdfs.scala:106)
> 	at org.apache.spark.sql.hive.HiveSimpleUdf.dataType(hiveUdfs.scala:106)
> 	at org.apache.spark.sql.catalyst.expressions.Alias.toAttribute(namedExpressions.scala:131)
> 	at org.apache.spark.sql.catalyst.planning.PhysicalOperation$$anonfun$collectAliases$1.applyOrElse(patterns.scala:95)
> 	at org.apache.spark.sql.catalyst.planning.PhysicalOperation$$anonfun$collectAliases$1.applyOrElse(patterns.scala:94)
> 	at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:33)
> 	at scala.collection.TraversableLike$$anonfun$collect$1.apply(TraversableLike.scala:278)
> ...
> To fix this problem, we need to add an entry for List<> in HiveInspectors#javaClassToDataType.
> However, type erasure on the JVM makes this difficult.
> Suppose the lines below are appended to HiveInspectors#javaClassToDataType:
>     // list type
>     case c: Class[_] if c == classOf[java.util.List[java.lang.Object]] =>
>       val tpe = c.getGenericInterfaces()(0).asInstanceOf[ParameterizedType]
>       println(tpe.getActualTypeArguments()(0).toString()) // prints E
> This logic fails to recover the component type of List<>: it yields only the type variable E, not String.
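
The erasure problem described above can be reproduced outside Spark with plain Java reflection. The sketch below is illustrative only: the class ErasureDemo and its two helper methods are hypothetical names, and the nested UDFToListString is a stand-in that does not extend Hive's UDF so the example has no Hive dependency. It contrasts what is visible from the List class object (only the type variable) with what is visible from the signature of the evaluate method (the declared element type). It is not meant as the fix adopted for this issue.

```java
import java.lang.reflect.Method;
import java.lang.reflect.ParameterizedType;
import java.util.Arrays;
import java.util.List;

public class ErasureDemo {
    // Stand-in for the UDF in the report; hypothetical, no Hive dependency.
    public static class UDFToListString {
        public List<String> evaluate(Object o) {
            return Arrays.asList("xxx", "yyy", "zzz");
        }
    }

    // Inspecting the List class object itself: the parent interface of List<E>
    // is parameterized by the type variable E, so that is all we can recover.
    public static String elementTypeFromClass() {
        ParameterizedType tpe =
            (ParameterizedType) List.class.getGenericInterfaces()[0];
        return tpe.getActualTypeArguments()[0].toString();
    }

    // Inspecting the method signature instead: the generic return type of
    // evaluate still records the declared type argument.
    public static String elementTypeFromMethod() {
        try {
            Method m = UDFToListString.class.getMethod("evaluate", Object.class);
            ParameterizedType tpe = (ParameterizedType) m.getGenericReturnType();
            return tpe.getActualTypeArguments()[0].getTypeName();
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(elementTypeFromClass());
        System.out.println(elementTypeFromMethod());
    }
}
```

The point of the comparison is that a java.lang.Class value carries no type arguments, whereas a java.lang.reflect.Method does keep the generic signature it was declared with.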