Posted to reviews@spark.apache.org by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/12/04 02:50:41 UTC

[PR] [WIP][SPARK-46229][PYTHON][CONNECT] Add applyInArrow to groupBy and cogroup in Spark Connect [spark]

HyukjinKwon opened a new pull request, #44146:
URL: https://github.com/apache/spark/pull/44146

   ### What changes were proposed in this pull request?
   
   This PR implements the Spark Connect version of https://github.com/apache/spark/pull/38624.
   
   ### Why are the changes needed?
   
   For feature parity.
   
   ### Does this PR introduce _any_ user-facing change?
   
   Yes, it adds a new API to the Python Spark Connect client.
   
   ### How was this patch tested?
   
   Reused the existing unit tests and doctests.
   
   ### Was this patch authored or co-authored using generative AI tooling?
   
   No.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-46229][PYTHON][CONNECT] Add applyInArrow to groupBy and cogroup in Spark Connect [spark]

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon closed pull request #44146: [SPARK-46229][PYTHON][CONNECT] Add applyInArrow to groupBy and cogroup in Spark Connect
URL: https://github.com/apache/spark/pull/44146




Re: [PR] [WIP][SPARK-46229][PYTHON][CONNECT] Add applyInArrow to groupBy and cogroup in Spark Connect [spark]

Posted by "zhengruifeng (via GitHub)" <gi...@apache.org>.
zhengruifeng commented on PR #44146:
URL: https://github.com/apache/spark/pull/44146#issuecomment-1837758466

   LGTM pending CI




Re: [PR] [SPARK-46229][PYTHON][CONNECT] Add applyInArrow to groupBy and cogroup in Spark Connect [spark]

Posted by "ueshin (via GitHub)" <gi...@apache.org>.
ueshin commented on code in PR #44146:
URL: https://github.com/apache/spark/pull/44146#discussion_r1414336068


##########
python/pyspark/sql/connect/_typing.py:
##########
@@ -14,14 +14,14 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 #
-
 import sys
 
 if sys.version_info >= (3, 8):
-    from typing import Protocol
+    from typing import Protocol, Tuple

Review Comment:
   nit: This seems to be a no-op, as `Tuple` will be overwritten by the following imports?
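
   A minimal sketch of the point above, assuming the imports that follow in `_typing.py` (not shown in this thread) import `Tuple` from `typing` again. The name `first_binding` is hypothetical and exists only to make the rebinding visible:

```python
import sys

if sys.version_info >= (3, 8):
    # The import added in the diff under review.
    from typing import Protocol, Tuple

# Remember which object the name is bound to after the conditional import.
first_binding = Tuple

# Assumed stand-in for "the following imports": an unconditional import
# that binds the same name again.
from typing import Tuple

# The later import rebinds `Tuple` to the same object, so adding it to the
# conditional import above changes nothing.
assert first_binding is Tuple
```

   If the later import does bind `Tuple` unconditionally, the earlier one can be dropped without changing behavior.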



##########
python/pyspark/sql/connect/_typing.py:
##########
@@ -14,14 +14,14 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 #
-
 import sys
 
 if sys.version_info >= (3, 8):
-    from typing import Protocol
+    from typing import Protocol, Tuple

Review Comment:
   btw, can we remove `if sys.version_info >= (3, 8):` and the following part, since we already dropped Python<3.8?
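
   A sketch of what the simplified import could look like, assuming only Python 3.8+ is supported. `SupportsName` and `describe` are hypothetical names used for illustration and do not come from `_typing.py`:

```python
# With Python < 3.8 dropped, `Protocol` is always available from `typing`,
# so no version check (or fallback import) is needed.
from typing import Protocol, Tuple


class SupportsName(Protocol):
    """Hypothetical protocol: any object exposing a `name` attribute."""

    name: str


def describe(pair: Tuple[str, int]) -> str:
    """Hypothetical helper that formats a (key, value) pair."""
    key, value = pair
    return f"{key}={value}"
```

   For example, `describe(("a", 1))` returns `"a=1"`.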





Re: [PR] [SPARK-46229][PYTHON][CONNECT] Add applyInArrow to groupBy and cogroup in Spark Connect [spark]

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon commented on PR #44146:
URL: https://github.com/apache/spark/pull/44146#issuecomment-1837974869

   I manually tested the latest changes.
   
   
   Merged to master.

