Posted to dev@spark.apache.org by Nicholas Chammas <ni...@gmail.com> on 2016/07/29 02:06:57 UTC

PySpark UDFs with a return type of FloatType can't handle int return values

If I define a UDF in PySpark with a return type of FloatType, but the
underlying function actually returns an int, the UDF silently discards
the int and returns None instead.
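
For anyone hitting this in the meantime, one caller-side workaround is to
coerce the result to a float before it reaches the UDF machinery. This is
only a minimal sketch, assuming the function returns numbers or None; the
names returns_float and half_or_whole are made up for illustration and are
not part of PySpark. The PySpark call is left as a comment so the snippet
runs without a Spark installation.

```python
from functools import wraps


def returns_float(func):
    """Wrap func so that any non-None result is coerced to a Python float."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        # Keep None as-is so that SQL NULLs still pass through unchanged.
        return None if result is None else float(result)
    return wrapper


@returns_float
def half_or_whole(x):
    # Illustrative function: yields an int for even input, a float for odd.
    if x is None:
        return None
    return x // 2 if x % 2 == 0 else x / 2


# Hypothetical registration with PySpark (not executed here):
#   from pyspark.sql.functions import udf
#   from pyspark.sql.types import FloatType
#   half_udf = udf(half_or_whole, FloatType())
```

With the wrapper in place, the function never hands an int to a UDF declared
as FloatType, so the question of whether the type machinery should cast it
does not arise.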

It seems that some machinery inside pyspark.sql.types is perhaps unaware
that it can always cast ints to floats.

Is this functionality that we would want to add in, or is it beyond the
scope of what UDFs should be expected to do?

Nick