Posted to issues@spark.apache.org by "Michael Chirico (JIRA)" <ji...@apache.org> on 2018/12/11 02:50:00 UTC
[jira] [Updated] (SPARK-26331) Allow SQL UDF registration to recognize default function values from Scala
[ https://issues.apache.org/jira/browse/SPARK-26331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Michael Chirico updated SPARK-26331:
------------------------------------
Description:
As described here:
[https://stackoverflow.com/q/53702727/3576984]
I have a UDF that should be flexible enough to accept 3 arguments (in general, n+k), although in most calls only 2 (in general, n) are needed. The natural approach is to implement the UDF with 3 parameters, one of which has a default value.
Copying a toy example from SO:
{code}
package myUDFs

import org.apache.spark.sql.api.java.UDF3

class my_udf extends UDF3[Int, Int, Int, Int] {
  override def call(a: Int, b: Int, c: Int = 6): Int = {
    c * (a + b)
  }
}
{code}
I would prefer the following to give the expected output of 18:
{code}
from pyspark.conf import SparkConf
from pyspark.sql import SparkSession
from pyspark.sql.types import IntegerType

spark_conf = SparkConf().setAll([('spark.jars', 'myUDFs-assembly-0.1.1.jar')])
spark = SparkSession.builder.appName('my_app').config(conf=spark_conf).enableHiveSupport().getOrCreate()

spark.udf.registerJavaFunction("my_udf", "myUDFs.my_udf", IntegerType())
spark.sql('select my_udf(1, 2)').collect()
{code}
But it seems this is currently impossible.
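Until registration recognizes Scala defaults, one possible workaround is to supply the default explicitly on the calling side. The sketch below is only an illustration under stated assumptions: `sql_call` is a hypothetical helper (not part of Spark or PySpark), and it assumes every argument is already a valid SQL fragment such as a numeric literal or a column name.

```python
def sql_call(name, args, arity, defaults):
    """Build the SQL text for a UDF call, padding missing trailing arguments.

    Hypothetical helper, not a Spark API.
    name     -- name the UDF was registered under
    args     -- arguments actually supplied (SQL literals or column names)
    arity    -- number of parameters the registered UDF expects
    defaults -- default values of the trailing parameters, in order
    """
    missing = arity - len(args)
    if missing < 0:
        raise ValueError("too many arguments for %s" % name)
    if missing > len(defaults):
        raise ValueError("not enough arguments for %s" % name)
    # Take only the last `missing` defaults; none are taken when missing == 0.
    padded = list(args) + list(defaults[len(defaults) - missing:])
    return "%s(%s)" % (name, ", ".join(str(a) for a in padded))


# The toy UDF above has three parameters, the last defaulting to 6.
query = "select " + sql_call("my_udf", [1, 2], arity=3, defaults=(6,))
print(query)  # select my_udf(1, 2, 6)
```

Passing `query` to `spark.sql(...)` would then call the three-argument UDF with c = 6, which for the toy example computes 6*(1 + 2) = 18. The obvious drawback is that the default value is duplicated outside the Scala class, which is what this issue asks to avoid.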
> Allow SQL UDF registration to recognize default function values from Scala
> --------------------------------------------------------------------------
>
> Key: SPARK-26331
> URL: https://issues.apache.org/jira/browse/SPARK-26331
> Project: Spark
> Issue Type: Improvement
> Components: PySpark, SQL
> Affects Versions: 2.4.0
> Reporter: Michael Chirico
> Priority: Minor
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)