Posted to reviews@spark.apache.org by "ueshin (via GitHub)" <gi...@apache.org> on 2023/03/13 19:58:51 UTC

[GitHub] [spark] ueshin opened a new pull request, #40402: [SPARK-42020][CONNECT][PYTHON] Support UserDefinedType in Spark Connect

ueshin opened a new pull request, #40402:
URL: https://github.com/apache/spark/pull/40402

   ### What changes were proposed in this pull request?
   
   Supports `UserDefinedType` in Spark Connect.
   
   ### Why are the changes needed?
   
   Currently Spark Connect doesn't support UDTs.
   
   ### Does this PR introduce _any_ user-facing change?
   
   Yes, UDTs will be available in Spark Connect.
   
   ### How was this patch tested?
   
   Enabled the related tests.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] [spark] zhengruifeng commented on a diff in pull request #40402: [SPARK-42020][CONNECT][PYTHON] Support UserDefinedType in Spark Connect

Posted by "zhengruifeng (via GitHub)" <gi...@apache.org>.

zhengruifeng commented on code in PR #40402:
URL: https://github.com/apache/spark/pull/40402#discussion_r1134776117


##########
connector/connect/common/src/main/protobuf/spark/connect/base.proto:
##########
@@ -272,6 +272,9 @@ message ExecutePlanResponse {
   // The metrics observed during the execution of the query plan.
   repeated ObservedMetrics observed_metrics = 6;
 
+  // The Spark schema
+  DataType schema = 7;

Review Comment:
   Is it an optional field that is only available in `df.collect`?



##########
python/pyspark/sql/connect/dataframe.py:
##########
@@ -1344,9 +1344,9 @@ def collect(self) -> List[Row]:
         if self._session is None:
             raise Exception("Cannot collect on empty session.")
         query = self._plan.to_proto(self._session.client)
-        table = self._session.client.to_table(query)
+        table, schema = self._session.client.to_table(query)
 
-        schema = from_arrow_schema(table.schema)
+        schema = schema or from_arrow_schema(table.schema)

Review Comment:
   When I was implementing the collection, I was thinking about this dumb question: is it possible to make the Arrow schema 100% compatible with the Spark schema if we store the differing fields in Arrow metadata?
   E.g., for a Spark `MapType`, store `valueContainsNull` as metadata in the `pa.schema`.
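
A minimal sketch of the idea in this comment, written with the standard library only instead of pyarrow so that it runs anywhere. `ToyMapType`, `ToyArrowField`, `to_arrow_like`, and `from_arrow_like` are hypothetical names invented for illustration. They model an Arrow field whose string key/value metadata carries the Spark-only `valueContainsNull` flag, and they are not PySpark's actual converters.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class ToyMapType:
    """Toy stand-in for Spark's MapType (hypothetical, not the pyspark class)."""
    key_type: str
    value_type: str
    value_contains_null: bool = True


@dataclass
class ToyArrowField:
    """Toy stand-in for an Arrow map field carrying string key/value metadata."""
    name: str
    key_type: str
    value_type: str
    metadata: Dict[str, str] = field(default_factory=dict)


def to_arrow_like(name: str, spark_type: ToyMapType) -> ToyArrowField:
    # Stash the Spark-only flag in metadata so it survives the conversion.
    flag = "true" if spark_type.value_contains_null else "false"
    return ToyArrowField(name, spark_type.key_type, spark_type.value_type,
                         {"valueContainsNull": flag})


def from_arrow_like(arrow_field: ToyArrowField) -> ToyMapType:
    # Fall back to True (Spark's default) when the metadata entry is absent.
    flag = arrow_field.metadata.get("valueContainsNull", "true") == "true"
    return ToyMapType(arrow_field.key_type, arrow_field.value_type, flag)


original = ToyMapType("string", "int", value_contains_null=False)
restored = from_arrow_like(to_arrow_like("m", original))
```

In this toy model the round trip is lossless only because both directions agree on the metadata key. A field that lacks the entry falls back to the default.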




[GitHub] [spark] ueshin commented on pull request #40402: [SPARK-42020][CONNECT][PYTHON] Support UserDefinedType in Spark Connect

Posted by "ueshin (via GitHub)" <gi...@apache.org>.

ueshin commented on PR #40402:
URL: https://github.com/apache/spark/pull/40402#issuecomment-1480349942

   @zhengruifeng I submitted two PRs: #40526 and #40527.



[GitHub] [spark] zhengruifeng commented on pull request #40402: [SPARK-42020][CONNECT][PYTHON] Support UserDefinedType in Spark Connect

Posted by "zhengruifeng (via GitHub)" <gi...@apache.org>.

zhengruifeng commented on PR #40402:
URL: https://github.com/apache/spark/pull/40402#issuecomment-1477202593

   @ueshin it seems that `createDataFrame` always uses the underlying `sqlType` rather than the UDT itself:
   
   ```
   In [1]: from pyspark.ml.linalg import Vectors
   
   In [2]: df = spark.createDataFrame([(1.0, 1.0, Vectors.dense(0.0, 5.0)), (0.0, 2.0, Vectors.dense(1.0, 2.0)), (1.0, 3.0, Vectors.dense(2.0,
      ...: 1.0)), (0.0, 4.0, Vectors.dense(3.0, 3.0)),], ["label", "weight", "features"],)
   
   In [3]: df.schema
   Out[3]: StructType([StructField('label', DoubleType(), True), StructField('weight', DoubleType(), True), StructField('features', StructType([StructField('type', ByteType(), False), StructField('size', IntegerType(), True), StructField('indices', ArrayType(IntegerType(), False), True), StructField('values', ArrayType(DoubleType(), False), True)]), True)])
   
   In [4]: df.collect()
   Out[4]:
   [Row(label=1.0, weight=1.0, features=Row(type=1, size=None, indices=None, values=[0.0, 5.0])),
    Row(label=0.0, weight=2.0, features=Row(type=1, size=None, indices=None, values=[1.0, 2.0])),
    Row(label=1.0, weight=3.0, features=Row(type=1, size=None, indices=None, values=[2.0, 1.0])),
    Row(label=0.0, weight=4.0, features=Row(type=1, size=None, indices=None, values=[3.0, 3.0]))]
   ```
   
   while in vanilla PySpark:
   ```
   In [1]: from pyspark.ml.linalg import Vectors
   
   In [2]: df = spark.createDataFrame([(1.0, 1.0, Vectors.dense(0.0, 5.0)), (0.0, 2.0, Vectors.dense(1.0, 2.0)), (1.0, 3.0, Vectors.dense(2.0,
      ...: 1.0)), (0.0, 4.0, Vectors.dense(3.0, 3.0)),], ["label", "weight", "features"],)
   
   In [3]: df.schema
   Out[3]: StructType([StructField('label', DoubleType(), True), StructField('weight', DoubleType(), True), StructField('features', VectorUDT(), True)])
   
   In [4]: df.collect()
   Out[4]:                                                                         
   [Row(label=1.0, weight=1.0, features=DenseVector([0.0, 5.0])),
    Row(label=0.0, weight=2.0, features=DenseVector([1.0, 2.0])),
    Row(label=1.0, weight=3.0, features=DenseVector([2.0, 1.0])),
    Row(label=0.0, weight=4.0, features=DenseVector([3.0, 3.0]))]
   ```
   
   
   also cc @WeichenXu123 



[GitHub] [spark] zhengruifeng commented on pull request #40402: [SPARK-42020][CONNECT][PYTHON] Support UserDefinedType in Spark Connect

Posted by "zhengruifeng (via GitHub)" <gi...@apache.org>.

zhengruifeng commented on PR #40402:
URL: https://github.com/apache/spark/pull/40402#issuecomment-1477518340

   @ueshin Sure, thanks!
   
   `StructType().add("label", DoubleType()).add("weight", DoubleType()).add("features", VectorUDT(), False)` works, but `nullable` for the `features` column must be `False`; otherwise:
   ```
   AnalysisException: [NULLABLE_COLUMN_OR_FIELD] Column or field `features`.`type` is nullable while it's required to be non-nullable.
   ```



[GitHub] [spark] ueshin commented on a diff in pull request #40402: [SPARK-42020][CONNECT][PYTHON] Support UserDefinedType in Spark Connect

Posted by "ueshin (via GitHub)" <gi...@apache.org>.

ueshin commented on code in PR #40402:
URL: https://github.com/apache/spark/pull/40402#discussion_r1134799357


##########
python/pyspark/sql/connect/dataframe.py:
##########
@@ -1344,9 +1344,9 @@ def collect(self) -> List[Row]:
         if self._session is None:
             raise Exception("Cannot collect on empty session.")
         query = self._plan.to_proto(self._session.client)
-        table = self._session.client.to_table(query)
+        table, schema = self._session.client.to_table(query)
 
-        schema = from_arrow_schema(table.schema)
+        schema = schema or from_arrow_schema(table.schema)

Review Comment:
   Btw, for the collections, we can retrieve nullability from the schema.
   
   For example:
   
   ```py
   if pa.types.is_list(at):
     field = at.value_field
     spark_type = ArrayType(from_arrow_type(field.type), containsNull=field.nullable)
   ```
   
   I guess map type also works?
   
   ```py
   if pa.types.is_map(at):
     valueContainsNull = at.item_field.nullable
   ...
   ```




[GitHub] [spark] zhengruifeng commented on pull request #40402: [SPARK-42020][CONNECT][PYTHON] Support UserDefinedType in Spark Connect

Posted by "zhengruifeng (via GitHub)" <gi...@apache.org>.

zhengruifeng commented on PR #40402:
URL: https://github.com/apache/spark/pull/40402#issuecomment-1467212310

   also cc @WeichenXu123 since this PR supports `df.collect` with UDT



[GitHub] [spark] HyukjinKwon closed pull request #40402: [SPARK-42020][CONNECT][PYTHON] Support UserDefinedType in Spark Connect

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.

HyukjinKwon closed pull request #40402: [SPARK-42020][CONNECT][PYTHON] Support UserDefinedType in Spark Connect
URL: https://github.com/apache/spark/pull/40402



[GitHub] [spark] ueshin commented on a diff in pull request #40402: [SPARK-42020][CONNECT][PYTHON] Support UserDefinedType in Spark Connect

Posted by "ueshin (via GitHub)" <gi...@apache.org>.

ueshin commented on code in PR #40402:
URL: https://github.com/apache/spark/pull/40402#discussion_r1134749726


##########
connector/connect/common/src/main/protobuf/spark/connect/base.proto:
##########


Review Comment:
   @grundprinzip The actual Spark data type is necessary to rebuild the UDT objects.




[GitHub] [spark] ueshin commented on pull request #40402: [SPARK-42020][CONNECT][PYTHON] Support UserDefinedType in Spark Connect

Posted by "ueshin (via GitHub)" <gi...@apache.org>.

ueshin commented on PR #40402:
URL: https://github.com/apache/spark/pull/40402#issuecomment-1477260665

   @zhengruifeng Ah, it seems something goes wrong when the schema is given as a list of column names.
   Could you use `StructType` to specify the schema as a workaround?
   I'll take a look later.



[GitHub] [spark] zhengruifeng commented on pull request #40402: [SPARK-42020][CONNECT][PYTHON] Support UserDefinedType in Spark Connect

Posted by "zhengruifeng (via GitHub)" <gi...@apache.org>.

zhengruifeng commented on PR #40402:
URL: https://github.com/apache/spark/pull/40402#issuecomment-1477209035

   I saved a DataFrame with a UDT in PySpark and then read it in the Python client, and it works fine. So I guess something is wrong in `createDataFrame`.
   
   vanilla PySpark:
   ```
   In [1]: from pyspark.ml.linalg import Vectors
   
   In [2]: df = spark.createDataFrame([(1.0, 1.0, Vectors.dense(0.0, 5.0)), (0.0, 2.0, Vectors.dense(1.0, 2.0)), (1.0, 3.0, Vectors.dense(2.0,
      ...: 1.0)), (0.0, 4.0, Vectors.dense(3.0, 3.0)),], ["label", "weight", "features"],)
   
   In [3]: df.write.parquet("/tmp/tmp.pq")
   ```
   
   Python Client:
   ```
   In [6]: df = spark.read.parquet("/tmp/tmp.pq")
   
   In [7]: df.schema
   Out[7]: StructType([StructField('label', DoubleType(), True), StructField('weight', DoubleType(), True), StructField('features', VectorUDT(), True)])
   
   In [8]: df.collect()
   Out[8]: 
   [Row(label=0.0, weight=4.0, features=DenseVector([3.0, 3.0])),
    Row(label=0.0, weight=2.0, features=DenseVector([1.0, 2.0])),
    Row(label=1.0, weight=3.0, features=DenseVector([2.0, 1.0])),
    Row(label=1.0, weight=1.0, features=DenseVector([0.0, 5.0]))]
   ```



[GitHub] [spark] ueshin commented on a diff in pull request #40402: [SPARK-42020][CONNECT][PYTHON] Support UserDefinedType in Spark Connect

Posted by "ueshin (via GitHub)" <gi...@apache.org>.

ueshin commented on code in PR #40402:
URL: https://github.com/apache/spark/pull/40402#discussion_r1134793723


##########
connector/connect/common/src/main/protobuf/spark/connect/base.proto:
##########
@@ -272,6 +272,9 @@ message ExecutePlanResponse {
   // The metrics observed during the execution of the query plan.
   repeated ObservedMetrics observed_metrics = 6;
 
+  // The Spark schema
+  DataType schema = 7;

Review Comment:
   Yes, it's only for `df.collect` for now.





[GitHub] [spark] zhengruifeng commented on a diff in pull request #40402: [SPARK-42020][CONNECT][PYTHON] Support UserDefinedType in Spark Connect

Posted by "zhengruifeng (via GitHub)" <gi...@apache.org>.

zhengruifeng commented on code in PR #40402:
URL: https://github.com/apache/spark/pull/40402#discussion_r1134827687


##########
python/pyspark/sql/connect/dataframe.py:
##########
@@ -1344,9 +1344,9 @@ def collect(self) -> List[Row]:
         if self._session is None:
             raise Exception("Cannot collect on empty session.")
         query = self._plan.to_proto(self._session.client)
-        table = self._session.client.to_table(query)
+        table, schema = self._session.client.to_table(query)
 
-        schema = from_arrow_schema(table.schema)
+        schema = schema or from_arrow_schema(table.schema)

Review Comment:
   It seems that MapType's `valueContainsNull` is stored in `to_arrow_type` but discarded in `from_arrow_type`.
   
   




[GitHub] [spark] ueshin commented on a diff in pull request #40402: [SPARK-42020][CONNECT][PYTHON] Support UserDefinedType in Spark Connect

Posted by "ueshin (via GitHub)" <gi...@apache.org>.

ueshin commented on code in PR #40402:
URL: https://github.com/apache/spark/pull/40402#discussion_r1134796113


##########
python/pyspark/sql/connect/dataframe.py:
##########
@@ -1344,9 +1344,9 @@ def collect(self) -> List[Row]:
         if self._session is None:
             raise Exception("Cannot collect on empty session.")
         query = self._plan.to_proto(self._session.client)
-        table = self._session.client.to_table(query)
+        table, schema = self._session.client.to_table(query)
 
-        schema = from_arrow_schema(table.schema)
+        schema = schema or from_arrow_schema(table.schema)

Review Comment:
   That's interesting, and I was thinking about something similar, but I didn't take that approach because the metadata takes up space in every `RecordBatch`, which may be sent to the client repeatedly. The schema overhead could become huge if it is repeated many times.
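
To make the size concern concrete, here is a rough back-of-the-envelope sketch. The byte and batch counts are made-up illustrative numbers, not measurements from Spark Connect, and `schema_overhead_bytes` is a hypothetical helper that only multiplies the two.

```python
def schema_overhead_bytes(schema_bytes: int, num_batches: int, per_batch: bool) -> int:
    """Total bytes spent on schema information for one query result.

    per_batch=True models schema metadata repeated in every record batch;
    per_batch=False models a schema sent a single time alongside the result.
    """
    if schema_bytes < 0 or num_batches < 0:
        raise ValueError("sizes and counts must be non-negative")
    if num_batches == 0:
        return 0
    return schema_bytes * num_batches if per_batch else schema_bytes


# Hypothetical figures: a 2 KiB schema and 1,000 record batches.
repeated = schema_overhead_bytes(2048, 1000, per_batch=True)
sent_once = schema_overhead_bytes(2048, 1000, per_batch=False)
```

Under these assumed numbers the repeated form costs 1,000 times as much as sending the schema once, and the gap grows linearly with the number of batches.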





[GitHub] [spark] ueshin commented on a diff in pull request #40402: [SPARK-42020][CONNECT][PYTHON] Support UserDefinedType in Spark Connect

Posted by "ueshin (via GitHub)" <gi...@apache.org>.

ueshin commented on code in PR #40402:
URL: https://github.com/apache/spark/pull/40402#discussion_r1134750091


##########
connector/connect/common/src/main/protobuf/spark/connect/base.proto:
##########
@@ -272,6 +272,9 @@ message ExecutePlanResponse {
   // The metrics observed during the execution of the query plan.
   repeated ObservedMetrics observed_metrics = 6;
 
+  // The Spark schema
+  DataType schema = 7;

Review Comment:
   @grundprinzip The actual Spark data type in the execution result is necessary to rebuild the UDT objects.
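
A minimal sketch of why the Spark data type matters here, using only the standard library. `ToyDenseVector`, `ToyVectorUDT`, and `rebuild` are hypothetical stand-ins invented for illustration, not the `pyspark.ml.linalg` classes. The tuple layout mirrors the `type`, `size`, `indices`, `values` struct shown elsewhere in this thread.

```python
from typing import Any, List, Optional, Tuple

Raw = Tuple[int, Optional[int], Optional[List[int]], List[float]]


class ToyDenseVector:
    """Toy user-facing object (hypothetical stand-in for a dense vector)."""

    def __init__(self, values: List[float]) -> None:
        self.values = list(values)

    def __eq__(self, other: object) -> bool:
        return isinstance(other, ToyDenseVector) and self.values == other.values

    def __repr__(self) -> str:
        return f"ToyDenseVector({self.values})"


class ToyVectorUDT:
    """Toy UDT: converts between the object and its underlying storage struct."""

    def serialize(self, obj: ToyDenseVector) -> Raw:
        # (type, size, indices, values): the storage form that travels over the wire.
        return (1, None, None, list(obj.values))

    def deserialize(self, datum: Raw) -> ToyDenseVector:
        return ToyDenseVector(datum[3])


def rebuild(raw: Raw, udt: Optional[ToyVectorUDT] = None) -> Any:
    # Without the UDT, the client can only hand back the raw storage value.
    return udt.deserialize(raw) if udt is not None else raw


raw = ToyVectorUDT().serialize(ToyDenseVector([0.0, 5.0]))
without_udt = rebuild(raw)
with_udt = rebuild(raw, ToyVectorUDT())
```

The storage struct alone does not say which UDT produced it, so in this toy model the client needs the type information passed in separately to get the original object back.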




[GitHub] [spark] ueshin commented on pull request #40402: [SPARK-42020][CONNECT][PYTHON] Support UserDefinedType in Spark Connect

Posted by "ueshin (via GitHub)" <gi...@apache.org>.

ueshin commented on PR #40402:
URL: https://github.com/apache/spark/pull/40402#issuecomment-1474479783

   @zhengruifeng Sorry, I missed your comment:
   
   > will there be another PR for the support of UDT in `createDataFrame`?
   
   No, this also enables UDT in `createDataFrame`.



[GitHub] [spark] HyukjinKwon commented on pull request #40402: [SPARK-42020][CONNECT][PYTHON] Support UserDefinedType in Spark Connect

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.

HyukjinKwon commented on PR #40402:
URL: https://github.com/apache/spark/pull/40402#issuecomment-1475453518

   Merged to master and branch-3.4.

