Posted to reviews@spark.apache.org by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/11/24 01:12:22 UTC

[PR] [SPARK-46082][PYTHON][CONNECT] Fix protobuf string representation for Pandas Functions API with Spark Connect [spark]

HyukjinKwon opened a new pull request, #43991:
URL: https://github.com/apache/spark/pull/43991

   ### What changes were proposed in this pull request?
   
   This PR proposes to rename `_func` to `_functions` in the protobuf instances for Pandas Functions API with Spark Connect so that the string representation includes them (see also https://github.com/apache/spark/pull/39223).
   
   ### Why are the changes needed?
   
   To give protobuf messages a pretty string format on the Python side.
   
   ### Does this PR introduce _any_ user-facing change?
   
   Yes. The string representation of the plan now includes the function:
   
   ```bash
   ./bin/pyspark --remote local
   ```
   
   ```python
   df = spark.range(1)
   print(df.mapInPandas(lambda x: x, df.schema)._plan.print())
   ```
   
   **Before:**
   ```
   <MapPartitions is_barrier='False'>
     <Range start='0', end='1', step='1', num_partitions='None'>
   ```
     
   **After:**
   
   ```
   <MapPartitions function='<lambda>(id)', is_barrier='False'>
     <Range start='0', end='1', step='1', num_partitions='None'>
   ```
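
A minimal sketch of why an attribute name can decide whether a field shows up in the printed plan. It assumes the string representation is built by reading the constructor's parameter names and looking up attributes with the same name (optionally prefixed with `_`). The classes `Node`, `MapPartitionsBefore`, and `MapPartitionsAfter` and the helper `_printable_params` are hypothetical names for illustration, not the PySpark implementation.

```python
import inspect


class Node:
    """Hypothetical stand-in for a plan node; not the PySpark class."""

    def _printable_params(self):
        params = {}
        for name in inspect.signature(type(self).__init__).parameters:
            if name == "self":
                continue
            # Try self._<name> first, then self.<name>; skip if neither exists.
            for attr in ("_" + name, name):
                if hasattr(self, attr):
                    params[name] = getattr(self, attr)
                    break
        return params

    def print(self):
        args = ", ".join(f"{k}='{v}'" for k, v in self._printable_params().items())
        return f"<{type(self).__name__} {args}>"


class MapPartitionsBefore(Node):
    def __init__(self, function, is_barrier):
        # Attribute name does not match the parameter name, so it is not printed.
        self._func = function
        self._is_barrier = is_barrier


class MapPartitionsAfter(Node):
    def __init__(self, function, is_barrier):
        # Attribute name matches the parameter name (with a `_` prefix).
        self._function = function
        self._is_barrier = is_barrier


print(MapPartitionsBefore("<lambda>(id)", False).print())
print(MapPartitionsAfter("<lambda>(id)", False).print())
```

In this sketch, only the second class prints its `function` field, because the lookup by parameter name succeeds only when the attribute name matches.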
   
   ### How was this patch tested?
   
   Manually tested as above.
   
   ### Was this patch authored or co-authored using generative AI tooling?
   
   No.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-46082][PYTHON][CONNECT] Fix protobuf string representation for Pandas Functions API with Spark Connect [spark]

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon closed pull request #43991: [SPARK-46082][PYTHON][CONNECT] Fix protobuf string representation for Pandas Functions API with Spark Connect
URL: https://github.com/apache/spark/pull/43991




Re: [PR] [SPARK-46082][PYTHON][CONNECT] Fix protobuf string representation for Pandas Functions API with Spark Connect [spark]

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon commented on code in PR #43991:
URL: https://github.com/apache/spark/pull/43991#discussion_r1403812894


##########
python/pyspark/sql/connect/group.py:
##########
@@ -378,15 +378,13 @@ def applyInPandas(
             evalType=PythonEvalType.SQL_COGROUPED_MAP_PANDAS_UDF,
         )
 
-        all_cols = self._extract_cols(self._gd1) + self._extract_cols(self._gd2)

Review Comment:
   This variable is not used, so I am taking this chance to remove it while I am here. cc @xinrong-meng





Re: [PR] [SPARK-46082][PYTHON][CONNECT] Fix protobuf string representation for Pandas Functions API with Spark Connect [spark]

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon commented on PR #43991:
URL: https://github.com/apache/spark/pull/43991#issuecomment-1825038333

   Build: https://github.com/HyukjinKwon/spark/actions/runs/6975676691/job/18983131853




Re: [PR] [SPARK-46082][PYTHON][CONNECT] Fix protobuf string representation for Pandas Functions API with Spark Connect [spark]

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon commented on PR #43991:
URL: https://github.com/apache/spark/pull/43991#issuecomment-1825092637

   Merged to master.

