
Posted to reviews@spark.apache.org by "allisonwang-db (via GitHub)" <gi...@apache.org> on 2023/05/04 05:53:25 UTC

[GitHub] [spark] allisonwang-db commented on a diff in pull request #40896: [SPARK-43229][ML][PYTHON][CONNECT] Introduce Barrier Python UDF

allisonwang-db commented on code in PR #40896:
URL: https://github.com/apache/spark/pull/40896#discussion_r1184564949


##########
connector/connect/common/src/main/protobuf/spark/connect/expressions.proto:
##########
@@ -333,6 +333,9 @@ message PythonUDF {
   bytes command = 3;
   // (Required) Python version being used in the client.
   string python_ver = 4;
+  // (Optional) Whether this PythonUDF should be executed in barrier mode.

Review Comment:
   Can you briefly describe in the comment what barrier mode is used for?



##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/PythonUDF.scala:
##########
@@ -73,9 +73,18 @@ case class PythonUDF(
     children: Seq[Expression],
     evalType: Int,
     udfDeterministic: Boolean,
-    resultId: ExprId = NamedExpression.newExprId)
+    resultId: ExprId = NamedExpression.newExprId,
+    isBarrier: Boolean = false)

Review Comment:
   This seems to be very specific to the ML use case and not generic for Python UDFs. Do we expect more configs like this to be introduced in the future? Have we considered using a metadata or options map here?
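
   For illustration only, a minimal sketch of what an options map could look like in place of a dedicated boolean parameter. `PythonUdfSketch`, the `options` field, and the `barrier` key are hypothetical names chosen for this example; they are not part of this PR or of Spark's actual `PythonUDF` expression.

```scala
// Hypothetical, simplified stand-in for the real PythonUDF expression.
// All names below are assumptions made for illustration only.
object UdfOptionsSketch {
  // Assumed option key; the PR does not define one.
  val BarrierKey = "barrier"

  final case class PythonUdfSketch(
      name: String,
      evalType: Int,
      udfDeterministic: Boolean,
      // Open-ended execution hints instead of one constructor flag per feature.
      options: Map[String, String] = Map.empty) {

    // Derived from the map; a missing or non-"true" value means non-barrier.
    def isBarrier: Boolean =
      options.get(BarrierKey).exists(_.equalsIgnoreCase("true"))
  }
}
```

   With this shape, a later execution hint would only need a new key rather than another constructor parameter, at the cost of losing compile-time checking of option names and values.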



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

