Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2022/11/25 07:28:45 UTC

[GitHub] [spark] HeartSaVioR commented on a diff in pull request #38796: [SPARK-41260][PYTHON][SS] Cast NumPy instances to Python primitive types in GroupState update

HeartSaVioR commented on code in PR #38796:
URL: https://github.com/apache/spark/pull/38796#discussion_r1032072872


##########
sql/core/src/test/scala/org/apache/spark/sql/streaming/FlatMapGroupsInPandasWithStateSuite.scala:
##########
@@ -803,4 +803,49 @@ class FlatMapGroupsInPandasWithStateSuite extends StateStoreMetricsTest {
         total = Seq(1), updated = Seq(1), droppedByWatermark = Seq(0), removed = Some(Seq(1)))
     )
   }
+
+  test("SPARK-41260: applyInPandasWithState - NumPy instances to JVM rows in state") {
+    assume(shouldTestPandasUDFs)
+
+    val pythonScript =
+    """
+      |import pandas as pd
+      |import numpy as np
+      |from pyspark.sql.types import StructType, StructField, StringType
+      |
+      |tpe = StructType([
+      |    StructField("key", StringType()),
+      |    StructField("valueAsString", StringType())])
+      |
+      |def func(key, pdf_iter, state):
+      |    np_value = np.int64(1)  # NumPy instance

Review Comment:
   Could we please think of a few more widely used types and add them to the test here as well? It doesn't need to be exhaustive, but a single check seems too narrow.
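
   To illustrate the kind of coverage meant here, below is a minimal sketch, not the PR's actual change. The helper name `to_python_primitive` and the particular list of types are assumptions made for illustration only; the sketch relies solely on the fact that NumPy scalars derive from `np.generic` and expose `.item()`.

```python
import numpy as np


# Hypothetical helper for illustration only; it is not part of PySpark.
def to_python_primitive(value):
    """Return a plain Python value for a NumPy scalar, else the value unchanged."""
    if isinstance(value, np.generic):
        # .item() converts a NumPy scalar to the closest Python built-in type.
        return value.item()
    return value


# A handful of commonly used NumPy scalar types (an assumed, non-exhaustive list).
samples = [
    np.int32(1),
    np.int64(1),
    np.float32(1.5),
    np.float64(1.5),
    np.bool_(True),
    np.str_("a"),
]

converted = [to_python_primitive(v) for v in samples]
```

   A test along these lines could update the state with each converted value and check that it round-trips through the JVM row as the expected primitive.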



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
For additional commands, e-mail: reviews-help@spark.apache.org