Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2021/10/29 10:39:43 UTC
[GitHub] [spark] dchvn opened a new pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
dchvn opened a new pull request #34439:
URL: https://github.com/apache/spark/pull/34439
### What changes were proposed in this pull request?
Inline type hints for python/pyspark/broadcast.py
### Why are the changes needed?
We can take advantage of static type checking within the functions by inlining the type hints.
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
Existing tests
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] zero323 commented on a change in pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
zero323 commented on a change in pull request #34439:
URL: https://github.com/apache/spark/pull/34439#discussion_r754106603
##########
File path: python/pyspark/broadcast.py
##########
@@ -62,35 +81,44 @@ class Broadcast(object):
>>> large_broadcast = sc.broadcast(range(10000))
"""
- def __init__(self, sc=None, value=None, pickle_registry=None, path=None, sock_file=None):
+ def __init__(
+ self,
+ sc: Optional["SparkContext"] = None,
+ value: Optional[T] = None,
+ pickle_registry: Optional["BroadcastPickleRegistry"] = None,
+ path: Optional[Any] = None,
+ sock_file: Optional[Any] = None,
+ ):
"""
Should not be called directly by users -- use :meth:`SparkContext.broadcast`
instead.
"""
if sc is not None:
# we're on the driver. We want the pickled data to end up in a file (maybe encrypted)
- f = NamedTemporaryFile(delete=False, dir=sc._temp_dir)
+ f = NamedTemporaryFile(delete=False, dir=sc._temp_dir) # type: ignore[attr-defined]
self._path = f.name
- self._sc = sc
- self._python_broadcast = sc._jvm.PythonRDD.setupBroadcast(self._path)
- if sc._encryption_enabled:
+ self._sc: Optional["SparkContext"] = sc
+ self._python_broadcast = sc._jvm.PythonRDD.setupBroadcast(self._path) # type: ignore[attr-defined]
+ if sc._encryption_enabled: # type: ignore[attr-defined]
# with encryption, we ask the jvm to do the encryption for us, we send it data
# over a socket
port, auth_secret = self._python_broadcast.setupEncryptionServer()
(encryption_sock_file, _) = local_connect_and_auth(port, auth_secret)
- broadcast_out = ChunkedStream(encryption_sock_file, 8192)
+ broadcast_out: Union[ChunkedStream, IO[bytes]] = ChunkedStream(
+ encryption_sock_file, 8192
+ )
else:
# no encryption, we can just write pickled data directly to the file from python
broadcast_out = f
- self.dump(value, broadcast_out)
- if sc._encryption_enabled:
+ self.dump(cast(T, value), broadcast_out)
Review comment:
@HyukjinKwon @itholic @ueshin @xinrong-databricks WDYT?
Do we need a fancy overload on `__init__` here? Something along these lines:
```python
@overload # On driver
def __init__(self: Broadcast[T], sc: SparkContext, value: T, pickle_registry: BroadcastPickleRegistry): ...
@overload # On worker without decryption server
def __init__(self: Broadcast[Any], *, path: str): ... # This is a placeholder for arbitrary value, so not Broadcast[None]
@overload # On worker with decryption server
def __init__(self: Broadcast[Any], *, sock_file: str): ... # Ditto
```
`cast` definitely seems wrong here, because we know that `value` can be `None` in this control flow (in contrast to the many optional fields we access that we know are not null under normal operating conditions). If anything, the type error should be ignored rather than cast away.
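To make the overload idea concrete, here is a minimal, self-contained sketch. `Box`, `ctx`, `driver_box` and `worker_box` are hypothetical stand-ins, not PySpark names; the only point is how annotating `self` in each overload lets a type checker choose `Box[T]` when a value is passed and `Box[Any]` otherwise.

```python
from typing import Any, Generic, Optional, TypeVar, overload

T = TypeVar("T")


class Box(Generic[T]):
    """Hypothetical stand-in for Broadcast, not the PySpark class."""

    @overload  # "driver" path: the value is known, so the instance is Box[T]
    def __init__(self: "Box[T]", ctx: object, value: T) -> None: ...

    @overload  # "worker" path: the value is loaded later, so fall back to Box[Any]
    def __init__(self: "Box[Any]", *, path: str) -> None: ...

    def __init__(
        self,
        ctx: Optional[object] = None,
        value: Optional[T] = None,
        path: Optional[str] = None,
    ) -> None:
        # Permissive runtime implementation; callers only see the overloads.
        self._value = value
        self._path = path
        self._on_driver = ctx is not None


driver_box = Box(object(), [1, 2, 3])  # a type checker should infer Box[List[int]]
worker_box = Box(path="/tmp/example")  # a type checker should infer Box[Any]
```

The implementation signature keeps every parameter optional, so the runtime behaviour is unchanged; the overloads only restrict which call shapes a type checker accepts.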
[GitHub] [spark] SparkQA commented on pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34439:
URL: https://github.com/apache/spark/pull/34439#issuecomment-992086510
Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50590/
[GitHub] [spark] SparkQA commented on pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34439:
URL: https://github.com/apache/spark/pull/34439#issuecomment-992107037
Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50590/
[GitHub] [spark] SparkQA commented on pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34439:
URL: https://github.com/apache/spark/pull/34439#issuecomment-983249533
Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50255/
[GitHub] [spark] HyukjinKwon commented on a change in pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on a change in pull request #34439:
URL: https://github.com/apache/spark/pull/34439#discussion_r759804329
##########
File path: python/pyspark/broadcast.py
##########
@@ -62,35 +81,44 @@ class Broadcast(object):
>>> large_broadcast = sc.broadcast(range(10000))
"""
- def __init__(self, sc=None, value=None, pickle_registry=None, path=None, sock_file=None):
+ def __init__(
+ self,
+ sc: Optional["SparkContext"] = None,
+ value: Optional[T] = None,
+ pickle_registry: Optional["BroadcastPickleRegistry"] = None,
+ path: Optional[Any] = None,
+ sock_file: Optional[Any] = None,
+ ):
"""
Should not be called directly by users -- use :meth:`SparkContext.broadcast`
instead.
"""
if sc is not None:
# we're on the driver. We want the pickled data to end up in a file (maybe encrypted)
- f = NamedTemporaryFile(delete=False, dir=sc._temp_dir)
+ f = NamedTemporaryFile(delete=False, dir=sc._temp_dir) # type: ignore[attr-defined]
self._path = f.name
- self._sc = sc
- self._python_broadcast = sc._jvm.PythonRDD.setupBroadcast(self._path)
- if sc._encryption_enabled:
+ self._sc: Optional["SparkContext"] = sc
+ self._python_broadcast = sc._jvm.PythonRDD.setupBroadcast(self._path) # type: ignore[attr-defined]
+ if sc._encryption_enabled: # type: ignore[attr-defined]
# with encryption, we ask the jvm to do the encryption for us, we send it data
# over a socket
port, auth_secret = self._python_broadcast.setupEncryptionServer()
(encryption_sock_file, _) = local_connect_and_auth(port, auth_secret)
- broadcast_out = ChunkedStream(encryption_sock_file, 8192)
+ broadcast_out: Union[ChunkedStream, IO[bytes]] = ChunkedStream(
+ encryption_sock_file, 8192
+ )
else:
# no encryption, we can just write pickled data directly to the file from python
broadcast_out = f
- self.dump(value, broadcast_out)
- if sc._encryption_enabled:
+ self.dump(cast(T, value), broadcast_out)
Review comment:
me too
[GitHub] [spark] ueshin commented on a change in pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
ueshin commented on a change in pull request #34439:
URL: https://github.com/apache/spark/pull/34439#discussion_r783469335
##########
File path: python/pyspark/broadcast.py
##########
@@ -175,27 +223,27 @@ def destroy(self, blocking=False):
self._jbroadcast.destroy(blocking)
os.unlink(self._path)
- def __reduce__(self):
+ def __reduce__(self) -> Tuple[Callable[[int], "Broadcast[T]"], Tuple[int]]:
if self._jbroadcast is None:
raise RuntimeError("Broadcast can only be serialized in driver")
- self._pickle_registry.add(self)
+ cast("BroadcastPickleRegistry", self._pickle_registry).add(self)
Review comment:
This should be:
```py
assert self._pickle_registry is not None
self._pickle_registry.add(self)
```
?
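A small sketch of the difference between the two spellings, under the assumption that the registry really can be `None` on some code paths. `Registry`, `add_with_cast` and `add_with_assert` are hypothetical names used only for illustration.

```python
from typing import Optional, Set, cast


class Registry:
    """Hypothetical stand-in for BroadcastPickleRegistry."""

    def __init__(self) -> None:
        self.items: Set[int] = set()

    def add(self, item: int) -> None:
        self.items.add(item)


def add_with_cast(registry: Optional[Registry], item: int) -> None:
    # cast() does nothing at runtime: a None passes straight through and
    # only fails on the attribute lookup, with an AttributeError.
    cast(Registry, registry).add(item)


def add_with_assert(registry: Optional[Registry], item: int) -> None:
    # The assert narrows Optional[Registry] to Registry for the type checker
    # and fails right away with an AssertionError if the invariant is broken.
    assert registry is not None
    registry.add(item)
```

Note that assertions are stripped when Python runs with `-O`, so in that mode the `assert` variant behaves like the `cast` variant at runtime while still narrowing the type for the checker.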
[GitHub] [spark] AmplabJenkins commented on pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34439:
URL: https://github.com/apache/spark/pull/34439#issuecomment-992092768
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/146115/
[GitHub] [spark] SparkQA removed a comment on pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #34439:
URL: https://github.com/apache/spark/pull/34439#issuecomment-954661653
**[Test build #144752 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144752/testReport)** for PR 34439 at commit [`c7d3417`](https://github.com/apache/spark/commit/c7d3417293c5de7e2dd8378891c766489016f246).
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34439:
URL: https://github.com/apache/spark/pull/34439#issuecomment-954735990
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/49221/
[GitHub] [spark] zero323 commented on a change in pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
zero323 commented on a change in pull request #34439:
URL: https://github.com/apache/spark/pull/34439#discussion_r753856072
##########
File path: python/pyspark/broadcast.py
##########
@@ -175,27 +203,27 @@ def destroy(self, blocking=False):
self._jbroadcast.destroy(blocking)
os.unlink(self._path)
- def __reduce__(self):
+ def __reduce__(self) -> Tuple[Callable[[int], T], Tuple[int]]:
if self._jbroadcast is None:
raise RuntimeError("Broadcast can only be serialized in driver")
- self._pickle_registry.add(self)
- return _from_id, (self._jbroadcast.id(),)
+ cast(Any, self._pickle_registry).add(self)
+ return cast(Tuple[Callable[[int], T], Tuple[int]], (_from_id, (self._jbroadcast.id(),)))
class BroadcastPickleRegistry(threading.local):
"""Thread-local registry for broadcast variables that have been pickled"""
- def __init__(self):
+ def __init__(self) -> None:
self.__dict__.setdefault("_registry", set())
- def __iter__(self):
+ def __iter__(self) -> Iterator[Broadcast]:
for bcast in self._registry:
yield bcast
- def add(self, bcast):
+ def add(self, bcast: Any) -> None:
Review comment:
`bcast: Broadcast`?
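For illustration, a thread-local registry with a typed `add` might look like the sketch below. `Item` and `ItemRegistry` are hypothetical stand-ins for `Broadcast` and `BroadcastPickleRegistry`, and the class-level `_registry` annotation is an assumption added here so the attribute created through `__dict__` is visible to a type checker.

```python
import threading
from typing import Any, Generic, Iterator, Set, TypeVar

T = TypeVar("T")


class Item(Generic[T]):
    """Hypothetical stand-in for Broadcast."""

    def __init__(self, value: T) -> None:
        self.value = value


class ItemRegistry(threading.local):
    """Thread-local registry, mirroring the shape of the class in the diff."""

    # Annotation only (no assignment): declares the attribute for the checker.
    _registry: Set["Item[Any]"]

    def __init__(self) -> None:
        # threading.local calls __init__ once per thread on first access.
        self.__dict__.setdefault("_registry", set())

    def __iter__(self) -> Iterator["Item[Any]"]:
        for item in self._registry:
            yield item

    def add(self, item: "Item[Any]") -> None:
        # Typed parameter: passing something that is not an Item is flagged.
        self._registry.add(item)
```

With `item: Any`, a call such as `registry.add("oops")` would type-check silently; with the narrower annotation it is reported.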
[GitHub] [spark] SparkQA removed a comment on pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #34439:
URL: https://github.com/apache/spark/pull/34439#issuecomment-983224101
**[Test build #145781 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145781/testReport)** for PR 34439 at commit [`1308016`](https://github.com/apache/spark/commit/13080168236f3084cb2250478215acd525b85701).
[GitHub] [spark] dchvn commented on a change in pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
dchvn commented on a change in pull request #34439:
URL: https://github.com/apache/spark/pull/34439#discussion_r759809276
##########
File path: python/pyspark/broadcast.py
##########
@@ -62,35 +81,44 @@ class Broadcast(object):
>>> large_broadcast = sc.broadcast(range(10000))
"""
- def __init__(self, sc=None, value=None, pickle_registry=None, path=None, sock_file=None):
+ def __init__(
+ self,
+ sc: Optional["SparkContext"] = None,
+ value: Optional[T] = None,
+ pickle_registry: Optional["BroadcastPickleRegistry"] = None,
+ path: Optional[Any] = None,
+ sock_file: Optional[Any] = None,
+ ):
"""
Should not be called directly by users -- use :meth:`SparkContext.broadcast`
instead.
"""
if sc is not None:
# we're on the driver. We want the pickled data to end up in a file (maybe encrypted)
- f = NamedTemporaryFile(delete=False, dir=sc._temp_dir)
+ f = NamedTemporaryFile(delete=False, dir=sc._temp_dir) # type: ignore[attr-defined]
self._path = f.name
- self._sc = sc
- self._python_broadcast = sc._jvm.PythonRDD.setupBroadcast(self._path)
- if sc._encryption_enabled:
+ self._sc: Optional["SparkContext"] = sc
+ self._python_broadcast = sc._jvm.PythonRDD.setupBroadcast(self._path) # type: ignore[attr-defined]
+ if sc._encryption_enabled: # type: ignore[attr-defined]
# with encryption, we ask the jvm to do the encryption for us, we send it data
# over a socket
port, auth_secret = self._python_broadcast.setupEncryptionServer()
(encryption_sock_file, _) = local_connect_and_auth(port, auth_secret)
- broadcast_out = ChunkedStream(encryption_sock_file, 8192)
+ broadcast_out: Union[ChunkedStream, IO[bytes]] = ChunkedStream(
+ encryption_sock_file, 8192
+ )
else:
# no encryption, we can just write pickled data directly to the file from python
broadcast_out = f
- self.dump(value, broadcast_out)
- if sc._encryption_enabled:
+ self.dump(cast(T, value), broadcast_out)
Review comment:
Updated: added the overloads on `__init__` and changed `cast` to `ignore`. Thanks!
[GitHub] [spark] SparkQA commented on pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34439:
URL: https://github.com/apache/spark/pull/34439#issuecomment-966060536
**[Test build #145092 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145092/testReport)** for PR 34439 at commit [`1b89310`](https://github.com/apache/spark/commit/1b893106f58b48d4240f94d8866620c0d5fc53e4).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
* `class Broadcast(Generic[T]):`
[GitHub] [spark] SparkQA commented on pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34439:
URL: https://github.com/apache/spark/pull/34439#issuecomment-983237879
**[Test build #145781 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145781/testReport)** for PR 34439 at commit [`1308016`](https://github.com/apache/spark/commit/13080168236f3084cb2250478215acd525b85701).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] SparkQA commented on pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34439:
URL: https://github.com/apache/spark/pull/34439#issuecomment-983239937
**[Test build #145782 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145782/testReport)** for PR 34439 at commit [`cc65b8c`](https://github.com/apache/spark/commit/cc65b8c26c9e33ada63ade7218d2ea869010d0a6).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] AmplabJenkins commented on pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34439:
URL: https://github.com/apache/spark/pull/34439#issuecomment-983276401
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50255/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34439:
URL: https://github.com/apache/spark/pull/34439#issuecomment-992092768
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/146115/
[GitHub] [spark] AmplabJenkins commented on pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34439:
URL: https://github.com/apache/spark/pull/34439#issuecomment-954735990
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/49221/
[GitHub] [spark] AmplabJenkins commented on pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34439:
URL: https://github.com/apache/spark/pull/34439#issuecomment-954695002
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/144752/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34439:
URL: https://github.com/apache/spark/pull/34439#issuecomment-971142238
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145302/
[GitHub] [spark] SparkQA commented on pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34439:
URL: https://github.com/apache/spark/pull/34439#issuecomment-971139468
Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49772/
[GitHub] [spark] zero323 commented on a change in pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
zero323 commented on a change in pull request #34439:
URL: https://github.com/apache/spark/pull/34439#discussion_r781536111
##########
File path: python/pyspark/context.py
##########
@@ -1183,7 +1183,7 @@ def union(self, rdds: List["RDD[T]"]) -> "RDD[T]":
jrdds[i] = rdds[i]._jrdd # type: ignore[attr-defined]
return RDD(self._jsc.union(jrdds), self, rdds[0]._jrdd_deserializer) # type: ignore[attr-defined]
- def broadcast(self, value: T) -> "Broadcast[T]":
+ def broadcast(self, value: T) -> "Broadcast":
Review comment:
After `Broadcast` is made `Generic` again, this should be `Broadcast[T]`.
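A short sketch of why the parametrized return type matters. `Wrapped`, `wrap` and `total` are hypothetical names; `wrap` plays the role of `SparkContext.broadcast` here.

```python
from typing import Generic, List, TypeVar

T = TypeVar("T")


class Wrapped(Generic[T]):
    """Hypothetical stand-in for Broadcast."""

    def __init__(self, value: T) -> None:
        self.value = value


def wrap(value: T) -> "Wrapped[T]":
    # Returning Wrapped[T] ties the wrapper's element type to the argument.
    return Wrapped(value)


def total(w: "Wrapped[List[int]]") -> int:
    # `.value` is known to be List[int] here. With a bare `Wrapped` return
    # type on wrap(), it would degrade to Any and misuse would go unchecked.
    return sum(w.value)


result = total(wrap([1, 2, 3]))
```

With the unparametrized annotation, a call like `total(wrap("abc"))` would pass type checking; with `Wrapped[T]` it should be rejected.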
[GitHub] [spark] HyukjinKwon commented on pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on pull request #34439:
URL: https://github.com/apache/spark/pull/34439#issuecomment-1011944338
Merged to master.
[GitHub] [spark] AmplabJenkins commented on pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34439:
URL: https://github.com/apache/spark/pull/34439#issuecomment-966172142
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/49561/
[GitHub] [spark] SparkQA commented on pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34439:
URL: https://github.com/apache/spark/pull/34439#issuecomment-954708644
Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49221/
[GitHub] [spark] zero323 commented on a change in pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
zero323 commented on a change in pull request #34439:
URL: https://github.com/apache/spark/pull/34439#discussion_r753855676
##########
File path: python/pyspark/broadcast.py
##########
@@ -175,27 +203,27 @@ def destroy(self, blocking=False):
self._jbroadcast.destroy(blocking)
os.unlink(self._path)
- def __reduce__(self):
+ def __reduce__(self) -> Tuple[Callable[[int], T], Tuple[int]]:
Review comment:
This signature seems to be wrong. Double checking the flow:
- `_from_id` returns `_broadcastRegistry[bid]`
- `_broadcastRegistry` is `_broadcastRegistry: Dict[int, "Broadcast[Any]"]`
- So `_from_id` is either `Callable[[int], Broadcast[T]]`, or if it doesn't type check, `Callable[[int], Broadcast[Any]]`
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34439:
URL: https://github.com/apache/spark/pull/34439#issuecomment-983276401
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50255/
[GitHub] [spark] SparkQA commented on pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34439:
URL: https://github.com/apache/spark/pull/34439#issuecomment-954678753
**[Test build #144752 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144752/testReport)** for PR 34439 at commit [`c7d3417`](https://github.com/apache/spark/commit/c7d3417293c5de7e2dd8378891c766489016f246).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
* `class Broadcast(Generic[T]):`
[GitHub] [spark] dchvn commented on pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
dchvn commented on pull request #34439:
URL: https://github.com/apache/spark/pull/34439#issuecomment-957181189
CC @HyukjinKwon @zero323 @ueshin too. Many thanks!
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34439:
URL: https://github.com/apache/spark/pull/34439#issuecomment-966075665
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145092/
[GitHub] [spark] SparkQA commented on pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34439:
URL: https://github.com/apache/spark/pull/34439#issuecomment-966165831
Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49561/
[GitHub] [spark] SparkQA removed a comment on pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #34439:
URL: https://github.com/apache/spark/pull/34439#issuecomment-966045479
**[Test build #145092 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145092/testReport)** for PR 34439 at commit [`1b89310`](https://github.com/apache/spark/commit/1b893106f58b48d4240f94d8866620c0d5fc53e4).
[GitHub] [spark] dchvn commented on a change in pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
dchvn commented on a change in pull request #34439:
URL: https://github.com/apache/spark/pull/34439#discussion_r754040175
##########
File path: python/pyspark/broadcast.py
##########
@@ -62,35 +81,44 @@ class Broadcast(object):
>>> large_broadcast = sc.broadcast(range(10000))
"""
- def __init__(self, sc=None, value=None, pickle_registry=None, path=None, sock_file=None):
+ def __init__(
+ self,
+ sc: Optional["SparkContext"] = None,
+ value: Optional[T] = None,
+ pickle_registry: Optional["BroadcastPickleRegistry"] = None,
+ path: Optional[Any] = None,
+ sock_file: Optional[Any] = None,
+ ):
"""
Should not be called directly by users -- use :meth:`SparkContext.broadcast`
instead.
"""
if sc is not None:
# we're on the driver. We want the pickled data to end up in a file (maybe encrypted)
- f = NamedTemporaryFile(delete=False, dir=sc._temp_dir)
+ f = NamedTemporaryFile(delete=False, dir=sc._temp_dir) # type: ignore[attr-defined]
self._path = f.name
- self._sc = sc
- self._python_broadcast = sc._jvm.PythonRDD.setupBroadcast(self._path)
- if sc._encryption_enabled:
+ self._sc: Optional["SparkContext"] = sc
+ self._python_broadcast = sc._jvm.PythonRDD.setupBroadcast(self._path) # type: ignore[attr-defined]
+ if sc._encryption_enabled: # type: ignore[attr-defined]
# with encryption, we ask the jvm to do the encryption for us, we send it data
# over a socket
port, auth_secret = self._python_broadcast.setupEncryptionServer()
(encryption_sock_file, _) = local_connect_and_auth(port, auth_secret)
- broadcast_out = ChunkedStream(encryption_sock_file, 8192)
+ broadcast_out: Union[ChunkedStream, IO[bytes]] = ChunkedStream(
+ encryption_sock_file, 8192
+ )
else:
# no encryption, we can just write pickled data directly to the file from python
broadcast_out = f
- self.dump(value, broadcast_out)
- if sc._encryption_enabled:
+ self.dump(cast(T, value), broadcast_out)
Review comment:
yes
[GitHub] [spark] zero323 commented on a change in pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
zero323 commented on a change in pull request #34439:
URL: https://github.com/apache/spark/pull/34439#discussion_r753858723
##########
File path: python/pyspark/broadcast.py
##########
@@ -62,35 +81,44 @@ class Broadcast(object):
>>> large_broadcast = sc.broadcast(range(10000))
"""
- def __init__(self, sc=None, value=None, pickle_registry=None, path=None, sock_file=None):
+ def __init__(
+ self,
+ sc: Optional["SparkContext"] = None,
+ value: Optional[T] = None,
+ pickle_registry: Optional["BroadcastPickleRegistry"] = None,
+ path: Optional[Any] = None,
Review comment:
`Optional[str]`?
[GitHub] [spark] SparkQA commented on pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34439:
URL: https://github.com/apache/spark/pull/34439#issuecomment-975400484
Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49979/
[GitHub] [spark] SparkQA commented on pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34439:
URL: https://github.com/apache/spark/pull/34439#issuecomment-983276375
Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50255/
[GitHub] [spark] SparkQA commented on pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34439:
URL: https://github.com/apache/spark/pull/34439#issuecomment-992069186
**[Test build #146115 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/146115/testReport)** for PR 34439 at commit [`6905265`](https://github.com/apache/spark/commit/6905265c2374e15479c241752eb19a24ac5e3589).
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34439:
URL: https://github.com/apache/spark/pull/34439#issuecomment-992116255
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50590/
[GitHub] [spark] zero323 commented on pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
zero323 commented on pull request #34439:
URL: https://github.com/apache/spark/pull/34439#issuecomment-1007619850
Could you please resolve the conflicts @dchvn?
[GitHub] [spark] zero323 commented on a change in pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
zero323 commented on a change in pull request #34439:
URL: https://github.com/apache/spark/pull/34439#discussion_r753859416
##########
File path: python/pyspark/broadcast.py
##########
@@ -62,35 +81,44 @@ class Broadcast(object):
>>> large_broadcast = sc.broadcast(range(10000))
"""
- def __init__(self, sc=None, value=None, pickle_registry=None, path=None, sock_file=None):
+ def __init__(
+ self,
+ sc: Optional["SparkContext"] = None,
+ value: Optional[T] = None,
+ pickle_registry: Optional["BroadcastPickleRegistry"] = None,
+ path: Optional[Any] = None,
+ sock_file: Optional[Any] = None,
+ ):
"""
Should not be called directly by users -- use :meth:`SparkContext.broadcast`
instead.
"""
if sc is not None:
# we're on the driver. We want the pickled data to end up in a file (maybe encrypted)
- f = NamedTemporaryFile(delete=False, dir=sc._temp_dir)
+ f = NamedTemporaryFile(delete=False, dir=sc._temp_dir) # type: ignore[attr-defined]
self._path = f.name
- self._sc = sc
- self._python_broadcast = sc._jvm.PythonRDD.setupBroadcast(self._path)
- if sc._encryption_enabled:
+ self._sc: Optional["SparkContext"] = sc
+ self._python_broadcast = sc._jvm.PythonRDD.setupBroadcast(self._path) # type: ignore[attr-defined]
+ if sc._encryption_enabled: # type: ignore[attr-defined]
# with encryption, we ask the jvm to do the encryption for us, we send it data
# over a socket
port, auth_secret = self._python_broadcast.setupEncryptionServer()
(encryption_sock_file, _) = local_connect_and_auth(port, auth_secret)
- broadcast_out = ChunkedStream(encryption_sock_file, 8192)
+ broadcast_out: Union[ChunkedStream, IO[bytes]] = ChunkedStream(
+ encryption_sock_file, 8192
+ )
else:
# no encryption, we can just write pickled data directly to the file from python
broadcast_out = f
- self.dump(value, broadcast_out)
- if sc._encryption_enabled:
+ self.dump(cast(T, value), broadcast_out)
Review comment:
Why do we need `cast` here? Is this because of `Optional`?
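To illustrate the question, here is a minimal sketch with a hypothetical `Holder` class (not part of PySpark): when a parameter is annotated `Optional[T]` but the method it is passed to expects `T`, a type checker reports a mismatch unless the value is narrowed. `typing.cast` silences the checker without any runtime check:

```python
from typing import Generic, Optional, TypeVar, cast

T = TypeVar("T")


class Holder(Generic[T]):
    # Hypothetical class mirroring the Optional[T] -> T hand-off in the diff.
    def __init__(self, value: Optional[T] = None) -> None:
        # Without cast, a checker sees Optional[T] passed where T is expected.
        # cast() only informs the checker; it performs no validation.
        self.dumped: T = self.dump(cast(T, value))

    def dump(self, value: T) -> T:
        return value
```

Because `cast` does nothing at runtime, `Holder()` still passes `None` through; an explicit `is not None` check would be the stricter alternative.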
[GitHub] [spark] zero323 commented on a change in pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
zero323 commented on a change in pull request #34439:
URL: https://github.com/apache/spark/pull/34439#discussion_r753859047
##########
File path: python/pyspark/broadcast.py
##########
@@ -113,11 +141,11 @@ def dump(self, value, f):
raise pickle.PicklingError(msg)
f.close()
- def load_from_path(self, path):
+ def load_from_path(self, path: Any) -> T:
with open(path, "rb", 1 << 20) as f:
return self.load(f)
- def load(self, file):
+ def load(self, file: Any) -> T:
Review comment:
Same as https://github.com/apache/spark/pull/34439/files#r753858983?
[GitHub] [spark] zero323 commented on a change in pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
zero323 commented on a change in pull request #34439:
URL: https://github.com/apache/spark/pull/34439#discussion_r753858864
##########
File path: python/pyspark/broadcast.py
##########
@@ -113,11 +141,11 @@ def dump(self, value, f):
raise pickle.PicklingError(msg)
f.close()
- def load_from_path(self, path):
+ def load_from_path(self, path: Any) -> T:
Review comment:
`path: str`?
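A minimal sketch of what the narrower annotation could look like, written as free functions for brevity. Using `BinaryIO` for the open file is an assumption made here, not something stated in the thread, and the return type is left as `Any` because these standalone functions have no `T` to bind:

```python
import os
import pickle
import tempfile
from typing import Any, BinaryIO


def load(file: BinaryIO) -> Any:
    # Unpickle a single object from an already-open binary stream.
    return pickle.load(file)


def load_from_path(path: str) -> Any:
    # Same buffered open call as in the diff above (1 MiB buffer).
    with open(path, "rb", 1 << 20) as f:
        return load(f)


# Usage: round-trip a value through a temporary file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    pickle.dump({"a": 1}, tmp)
restored = load_from_path(tmp.name)
os.unlink(tmp.name)
```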
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34439:
URL: https://github.com/apache/spark/pull/34439#issuecomment-975400554
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/49979/
[GitHub] [spark] dchvn commented on pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
dchvn commented on pull request #34439:
URL: https://github.com/apache/spark/pull/34439#issuecomment-1008560825
ping @zero323 :smile:
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34439:
URL: https://github.com/apache/spark/pull/34439#issuecomment-983275972
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50254/
[GitHub] [spark] SparkQA commented on pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34439:
URL: https://github.com/apache/spark/pull/34439#issuecomment-971118816
**[Test build #145302 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145302/testReport)** for PR 34439 at commit [`e304c6c`](https://github.com/apache/spark/commit/e304c6cd391a4f17d8410fc00de90d003d243552).
[GitHub] [spark] AmplabJenkins commented on pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34439:
URL: https://github.com/apache/spark/pull/34439#issuecomment-971142238
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145302/
[GitHub] [spark] SparkQA commented on pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34439:
URL: https://github.com/apache/spark/pull/34439#issuecomment-954661653
**[Test build #144752 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144752/testReport)** for PR 34439 at commit [`c7d3417`](https://github.com/apache/spark/commit/c7d3417293c5de7e2dd8378891c766489016f246).
[GitHub] [spark] SparkQA commented on pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34439:
URL: https://github.com/apache/spark/pull/34439#issuecomment-992081510
**[Test build #146115 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/146115/testReport)** for PR 34439 at commit [`6905265`](https://github.com/apache/spark/commit/6905265c2374e15479c241752eb19a24ac5e3589).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] AmplabJenkins commented on pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34439:
URL: https://github.com/apache/spark/pull/34439#issuecomment-992116255
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50590/
--
[GitHub] [spark] zero323 commented on a change in pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
zero323 commented on a change in pull request #34439:
URL: https://github.com/apache/spark/pull/34439#discussion_r753850759
##########
File path: python/pyspark/broadcast.py
##########
@@ -113,11 +141,11 @@ def dump(self, value, f):
raise pickle.PicklingError(msg)
f.close()
- def load_from_path(self, path):
+ def load_from_path(self, path: Any) -> T:
with open(path, "rb", 1 << 20) as f:
return self.load(f)
- def load(self, file):
+ def load(self, file: Any) -> T:
Review comment:
`path: str`?
--
[GitHub] [spark] SparkQA removed a comment on pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #34439:
URL: https://github.com/apache/spark/pull/34439#issuecomment-975290186
**[Test build #145507 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145507/testReport)** for PR 34439 at commit [`c1cb255`](https://github.com/apache/spark/commit/c1cb255a87f219e28315598c8820f4b1c9cdd765).
--
[GitHub] [spark] zero323 commented on a change in pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
zero323 commented on a change in pull request #34439:
URL: https://github.com/apache/spark/pull/34439#discussion_r754106603
##########
File path: python/pyspark/broadcast.py
##########
@@ -62,35 +81,44 @@ class Broadcast(object):
>>> large_broadcast = sc.broadcast(range(10000))
"""
- def __init__(self, sc=None, value=None, pickle_registry=None, path=None, sock_file=None):
+ def __init__(
+ self,
+ sc: Optional["SparkContext"] = None,
+ value: Optional[T] = None,
+ pickle_registry: Optional["BroadcastPickleRegistry"] = None,
+ path: Optional[Any] = None,
+ sock_file: Optional[Any] = None,
+ ):
"""
Should not be called directly by users -- use :meth:`SparkContext.broadcast`
instead.
"""
if sc is not None:
# we're on the driver. We want the pickled data to end up in a file (maybe encrypted)
- f = NamedTemporaryFile(delete=False, dir=sc._temp_dir)
+ f = NamedTemporaryFile(delete=False, dir=sc._temp_dir) # type: ignore[attr-defined]
self._path = f.name
- self._sc = sc
- self._python_broadcast = sc._jvm.PythonRDD.setupBroadcast(self._path)
- if sc._encryption_enabled:
+ self._sc: Optional["SparkContext"] = sc
+ self._python_broadcast = sc._jvm.PythonRDD.setupBroadcast(self._path) # type: ignore[attr-defined]
+ if sc._encryption_enabled: # type: ignore[attr-defined]
# with encryption, we ask the jvm to do the encryption for us, we send it data
# over a socket
port, auth_secret = self._python_broadcast.setupEncryptionServer()
(encryption_sock_file, _) = local_connect_and_auth(port, auth_secret)
- broadcast_out = ChunkedStream(encryption_sock_file, 8192)
+ broadcast_out: Union[ChunkedStream, IO[bytes]] = ChunkedStream(
+ encryption_sock_file, 8192
+ )
else:
# no encryption, we can just write pickled data directly to the file from python
broadcast_out = f
- self.dump(value, broadcast_out)
- if sc._encryption_enabled:
+ self.dump(cast(T, value), broadcast_out)
Review comment:
@HyukjinKwon @itholic @ueshin @xinrong-databricks WDYT?
Do we need a fancy overload on `__init__` here? Something along these lines:
```python
@overload # On driver
def __init__(self: Broadcast[T], sc: SparkContext, value: T, pickle_registry: BroadcastPickleRegistry): ...
@overload # On worker without decryption server
def __init__(self: Broadcast[Any], *, path: str): ... # This is a placeholder for arbitrary value, so not Broadcast[None]
@overload # On worker with decryption server
def __init__(self: Broadcast[Any], *, sock_file: str): ... # Ditto
```
`cast` definitely seems wrong here, because we know that `value` can be `None` in this control flow. This is in contrast to the many optional fields we access that we know are not null under normal operating conditions. If anything, the error should be ignored.
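For readers following along, the overload idea can be tried in isolation. The sketch below is only an illustration under stated assumptions: `Holder` and `FakeContext` are hypothetical stand-ins for `Broadcast` and `SparkContext`, and the pickle registry and socket arguments are left out. It shows one driver-side signature that ties `T` to the value and one worker-side signature that yields `Holder[Any]`.

```python
from typing import Any, Generic, Optional, TypeVar, overload

T = TypeVar("T")


class FakeContext:
    """Hypothetical stand-in for SparkContext; not part of PySpark."""


class Holder(Generic[T]):
    """Hypothetical stand-in for Broadcast with two construction paths."""

    @overload  # On driver: a context and a value are both required
    def __init__(self: "Holder[T]", sc: FakeContext, value: T) -> None: ...

    @overload  # On worker: only a path is known, so the value type is Any
    def __init__(self: "Holder[Any]", *, path: str) -> None: ...

    def __init__(
        self,
        sc: Optional[FakeContext] = None,
        value: Optional[T] = None,
        path: Optional[str] = None,
    ) -> None:
        # Only the implementation accepts None; callers see the overloads above.
        self._on_driver = sc is not None
        self._value = value
        self._path = path


on_driver = Holder(FakeContext(), [1, 2, 3])
on_worker = Holder(path="/tmp/broadcast.bin")
```

With signatures like these, a call such as `Holder(FakeContext())` with no value should be rejected by a type checker, which is the case the `cast` would otherwise paper over.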
--
[GitHub] [spark] zero323 commented on a change in pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
zero323 commented on a change in pull request #34439:
URL: https://github.com/apache/spark/pull/34439#discussion_r781535651
##########
File path: examples/src/main/python/als.py
##########
@@ -94,8 +94,8 @@ def update(i, mat, ratings):
msb = sc.broadcast(ms)
us_ = sc.parallelize(range(U), partitions) \
- .map(lambda x: update(x, msb.value, Rb.value.T)) \
- .collect()
+ .map(lambda x: update(x, msb.value, Rb.value.T)).collect() # type: ignore[attr-defined]
Review comment:
Once you bring back `Generic` this shouldn't be necessary.
--
[GitHub] [spark] zero323 commented on a change in pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
zero323 commented on a change in pull request #34439:
URL: https://github.com/apache/spark/pull/34439#discussion_r753859207
##########
File path: python/pyspark/broadcast.py
##########
@@ -21,28 +21,47 @@
from tempfile import NamedTemporaryFile
import threading
import pickle
+from typing import (
+ cast,
+ Any,
+ Callable,
+ Dict,
+ Generic,
+ IO,
+ Iterator,
+ Optional,
+ Tuple,
+ TypeVar,
+ TYPE_CHECKING,
+ Union,
+)
from pyspark.java_gateway import local_connect_and_auth
from pyspark.serializers import ChunkedStream, pickle_protocol
from pyspark.util import print_exec
+if TYPE_CHECKING:
+ from pyspark import SparkContext
+
__all__ = ["Broadcast"]
+T = TypeVar("T")
+
# Holds broadcasted data received from Java, keyed by its id.
-_broadcastRegistry = {}
+_broadcastRegistry: Dict[int, "Broadcast"] = {}
Review comment:
Let's avoid implicit `Any` ‒ `Dict[int, "Broadcast[Any]"]`
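A small sketch of the difference, assuming a hypothetical generic class `Box` in place of `Broadcast`. A bare `Box` in an annotation is treated as `Box[Any]` without saying so, and mypy's `--disallow-any-generics` option reports it; writing `Box[Any]` states the same thing explicitly.

```python
from typing import Any, Dict, Generic, TypeVar

T = TypeVar("T")


class Box(Generic[T]):
    """Hypothetical stand-in for Broadcast."""

    def __init__(self, value: T) -> None:
        self.value = value


# Implicit Any (flagged under --disallow-any-generics):
#     registry: Dict[int, "Box"] = {}
# Explicit Any, same runtime behavior:
registry: Dict[int, "Box[Any]"] = {}


def from_id(bid: int) -> "Box[Any]":
    # Mirrors the lookup-or-raise shape of _from_id in the diff above.
    if bid not in registry:
        raise RuntimeError("Broadcast variable '%s' not loaded!" % bid)
    return registry[bid]


registry[1] = Box("payload")
```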
--
[GitHub] [spark] zero323 commented on a change in pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
zero323 commented on a change in pull request #34439:
URL: https://github.com/apache/spark/pull/34439#discussion_r753859161
##########
File path: python/pyspark/broadcast.py
##########
@@ -21,28 +21,47 @@
from tempfile import NamedTemporaryFile
import threading
import pickle
+from typing import (
+ cast,
+ Any,
+ Callable,
+ Dict,
+ Generic,
+ IO,
+ Iterator,
+ Optional,
+ Tuple,
+ TypeVar,
+ TYPE_CHECKING,
+ Union,
+)
from pyspark.java_gateway import local_connect_and_auth
from pyspark.serializers import ChunkedStream, pickle_protocol
from pyspark.util import print_exec
+if TYPE_CHECKING:
+ from pyspark import SparkContext
+
__all__ = ["Broadcast"]
+T = TypeVar("T")
+
# Holds broadcasted data received from Java, keyed by its id.
-_broadcastRegistry = {}
+_broadcastRegistry: Dict[int, "Broadcast"] = {}
-def _from_id(bid):
+def _from_id(bid: int) -> "Broadcast":
Review comment:
`-> "Broadcast[Any]"`?
--
[GitHub] [spark] zero323 commented on a change in pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
zero323 commented on a change in pull request #34439:
URL: https://github.com/apache/spark/pull/34439#discussion_r753855694
##########
File path: python/pyspark/broadcast.py
##########
@@ -177,28 +201,28 @@ def destroy(self, blocking=False):
self._jbroadcast.destroy(blocking)
os.unlink(self._path)
- def __reduce__(self):
+ def __reduce__(self) -> Tuple[Callable[[int], T], Tuple[int]]:
if self._jbroadcast is None:
raise RuntimeError("Broadcast can only be serialized in driver")
- self._pickle_registry.add(self)
- return _from_id, (self._jbroadcast.id(),)
+ cast(Any, self._pickle_registry).add(self)
+ return cast(Tuple[Callable[[int], T], Tuple[int]], (_from_id, (self._jbroadcast.id(),)))
Review comment:
https://github.com/apache/spark/pull/34439/files#r753855676
--
[GitHub] [spark] SparkQA commented on pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34439:
URL: https://github.com/apache/spark/pull/34439#issuecomment-975347637
Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49979/
--
[GitHub] [spark] SparkQA commented on pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34439:
URL: https://github.com/apache/spark/pull/34439#issuecomment-983246898
Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50254/
--
[GitHub] [spark] ueshin commented on a change in pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
ueshin commented on a change in pull request #34439:
URL: https://github.com/apache/spark/pull/34439#discussion_r759746730
##########
File path: python/pyspark/broadcast.py
##########
@@ -62,35 +81,44 @@ class Broadcast(object):
>>> large_broadcast = sc.broadcast(range(10000))
"""
- def __init__(self, sc=None, value=None, pickle_registry=None, path=None, sock_file=None):
+ def __init__(
+ self,
+ sc: Optional["SparkContext"] = None,
+ value: Optional[T] = None,
+ pickle_registry: Optional["BroadcastPickleRegistry"] = None,
+ path: Optional[Any] = None,
+ sock_file: Optional[Any] = None,
+ ):
"""
Should not be called directly by users -- use :meth:`SparkContext.broadcast`
instead.
"""
if sc is not None:
# we're on the driver. We want the pickled data to end up in a file (maybe encrypted)
- f = NamedTemporaryFile(delete=False, dir=sc._temp_dir)
+ f = NamedTemporaryFile(delete=False, dir=sc._temp_dir) # type: ignore[attr-defined]
self._path = f.name
- self._sc = sc
- self._python_broadcast = sc._jvm.PythonRDD.setupBroadcast(self._path)
- if sc._encryption_enabled:
+ self._sc: Optional["SparkContext"] = sc
+ self._python_broadcast = sc._jvm.PythonRDD.setupBroadcast(self._path) # type: ignore[attr-defined]
+ if sc._encryption_enabled: # type: ignore[attr-defined]
# with encryption, we ask the jvm to do the encryption for us, we send it data
# over a socket
port, auth_secret = self._python_broadcast.setupEncryptionServer()
(encryption_sock_file, _) = local_connect_and_auth(port, auth_secret)
- broadcast_out = ChunkedStream(encryption_sock_file, 8192)
+ broadcast_out: Union[ChunkedStream, IO[bytes]] = ChunkedStream(
+ encryption_sock_file, 8192
+ )
else:
# no encryption, we can just write pickled data directly to the file from python
broadcast_out = f
- self.dump(value, broadcast_out)
- if sc._encryption_enabled:
+ self.dump(cast(T, value), broadcast_out)
Review comment:
I'm fine with adding the overloads.
##########
File path: python/pyspark/broadcast.py
##########
@@ -175,27 +205,27 @@ def destroy(self, blocking=False):
self._jbroadcast.destroy(blocking)
os.unlink(self._path)
- def __reduce__(self):
+ def __reduce__(self) -> Tuple[Callable[[int], "Broadcast[T]"], Tuple[int]]:
if self._jbroadcast is None:
raise RuntimeError("Broadcast can only be serialized in driver")
- self._pickle_registry.add(self)
+ cast(Any, self._pickle_registry).add(self)
return _from_id, (self._jbroadcast.id(),)
class BroadcastPickleRegistry(threading.local):
"""Thread-local registry for broadcast variables that have been pickled"""
- def __init__(self):
+ def __init__(self) -> None:
self.__dict__.setdefault("_registry", set())
- def __iter__(self):
+ def __iter__(self) -> Iterator[Broadcast]:
Review comment:
`Iterator[Broadcast[Any]]`?
##########
File path: python/pyspark/broadcast.py
##########
@@ -175,27 +205,27 @@ def destroy(self, blocking=False):
self._jbroadcast.destroy(blocking)
os.unlink(self._path)
- def __reduce__(self):
+ def __reduce__(self) -> Tuple[Callable[[int], "Broadcast[T]"], Tuple[int]]:
if self._jbroadcast is None:
raise RuntimeError("Broadcast can only be serialized in driver")
- self._pickle_registry.add(self)
+ cast(Any, self._pickle_registry).add(self)
Review comment:
`BroadcastPickleRegistry` instead of `Any`?
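To illustrate why the target of the `cast` matters, here is a sketch with hypothetical `Registry` and `Owner` classes standing in for `BroadcastPickleRegistry` and `Broadcast`. Casting to `Any` turns off checking for the whole expression, while casting to the concrete class only removes the `Optional` and keeps the `.add` call checked.

```python
from typing import Any, Optional, Set, cast


class Registry:
    """Hypothetical stand-in for BroadcastPickleRegistry."""

    def __init__(self) -> None:
        self._items: Set[str] = set()

    def add(self, item: str) -> None:
        self._items.add(item)


class Owner:
    """Hypothetical stand-in for Broadcast holding an optional registry."""

    def __init__(self, registry: Optional[Registry] = None) -> None:
        self._registry = registry

    def register_loose(self, item: str) -> None:
        # cast to Any: a misspelled method name here would not be reported.
        cast(Any, self._registry).add(item)

    def register_strict(self, item: str) -> None:
        # cast to the concrete type: only the Optional is stripped.
        cast("Registry", self._registry).add(item)


reg = Registry()
owner = Owner(reg)
owner.register_loose("a")
owner.register_strict("b")
```

Both calls behave identically at runtime, since `cast` simply returns its second argument; the difference is only in what the checker can still verify.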
##########
File path: python/pyspark/broadcast.py
##########
@@ -175,27 +205,27 @@ def destroy(self, blocking=False):
self._jbroadcast.destroy(blocking)
os.unlink(self._path)
- def __reduce__(self):
+ def __reduce__(self) -> Tuple[Callable[[int], "Broadcast[T]"], Tuple[int]]:
if self._jbroadcast is None:
raise RuntimeError("Broadcast can only be serialized in driver")
- self._pickle_registry.add(self)
+ cast(Any, self._pickle_registry).add(self)
return _from_id, (self._jbroadcast.id(),)
class BroadcastPickleRegistry(threading.local):
"""Thread-local registry for broadcast variables that have been pickled"""
- def __init__(self):
+ def __init__(self) -> None:
self.__dict__.setdefault("_registry", set())
- def __iter__(self):
+ def __iter__(self) -> Iterator[Broadcast]:
for bcast in self._registry:
yield bcast
- def add(self, bcast):
+ def add(self, bcast: Broadcast) -> None:
Review comment:
`Broadcast[Any]`?
--
[GitHub] [spark] AmplabJenkins commented on pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34439:
URL: https://github.com/apache/spark/pull/34439#issuecomment-983250327
--
[GitHub] [spark] SparkQA commented on pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34439:
URL: https://github.com/apache/spark/pull/34439#issuecomment-971131891
**[Test build #145302 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145302/testReport)** for PR 34439 at commit [`e304c6c`](https://github.com/apache/spark/commit/e304c6cd391a4f17d8410fc00de90d003d243552).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
--
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34439:
URL: https://github.com/apache/spark/pull/34439#issuecomment-971172257
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/49772/
--
[GitHub] [spark] SparkQA commented on pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34439:
URL: https://github.com/apache/spark/pull/34439#issuecomment-966045479
**[Test build #145092 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145092/testReport)** for PR 34439 at commit [`1b89310`](https://github.com/apache/spark/commit/1b893106f58b48d4240f94d8866620c0d5fc53e4).
--
[GitHub] [spark] SparkQA commented on pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34439:
URL: https://github.com/apache/spark/pull/34439#issuecomment-966106285
Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49561/
--
[GitHub] [spark] zero323 commented on a change in pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
zero323 commented on a change in pull request #34439:
URL: https://github.com/apache/spark/pull/34439#discussion_r781534745
##########
File path: python/pyspark/broadcast.py
##########
@@ -53,18 +51,18 @@
# Holds broadcasted data received from Java, keyed by its id.
-_broadcastRegistry: Dict[int, "Broadcast[Any]"] = {}
+_broadcastRegistry: Dict[int, "Broadcast"] = {}
-def _from_id(bid: int) -> "Broadcast[Any]":
+def _from_id(bid: int) -> "Broadcast":
from pyspark.broadcast import _broadcastRegistry
if bid not in _broadcastRegistry:
raise RuntimeError("Broadcast variable '%s' not loaded!" % bid)
return _broadcastRegistry[bid]
-class Broadcast(Generic[T]):
Review comment:
Oh, we cannot do that... `Generic[T]` has to go back.
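A sketch of what `Generic[T]` buys here, using hypothetical `Box` and `Matrix` classes rather than the real `Broadcast` and NumPy types. When the class is generic, the checker can carry the element type through `.value`, so attribute access on the result (like `Rb.value.T` in `als.py`) should not need an ignore comment.

```python
from typing import Generic, List, TypeVar

T = TypeVar("T")


class Matrix:
    """Hypothetical value type with an attribute to access after unwrapping."""

    def __init__(self, rows: List[List[int]]) -> None:
        self.rows = rows

    def transpose(self) -> "Matrix":
        return Matrix([list(col) for col in zip(*self.rows)])


class Box(Generic[T]):
    """Hypothetical stand-in for Broadcast."""

    def __init__(self, value: T) -> None:
        self._value = value

    @property
    def value(self) -> T:
        return self._value


boxed = Box(Matrix([[1, 2], [3, 4]]))  # inferred as Box[Matrix]
flipped = boxed.value.transpose()      # checked against Matrix, no ignore needed
```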
--
[GitHub] [spark] HyukjinKwon closed pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
HyukjinKwon closed pull request #34439:
URL: https://github.com/apache/spark/pull/34439
--
[GitHub] [spark] dchvn commented on pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
dchvn commented on pull request #34439:
URL: https://github.com/apache/spark/pull/34439#issuecomment-1009670737
Sorry about my mistakes, @zero323. Can you review this PR again? Many thanks :smiling_face_with_tear:
--
[GitHub] [spark] dchvn commented on a change in pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
dchvn commented on a change in pull request #34439:
URL: https://github.com/apache/spark/pull/34439#discussion_r783555825
##########
File path: python/pyspark/broadcast.py
##########
@@ -175,27 +223,27 @@ def destroy(self, blocking=False):
self._jbroadcast.destroy(blocking)
os.unlink(self._path)
- def __reduce__(self):
+ def __reduce__(self) -> Tuple[Callable[[int], "Broadcast[T]"], Tuple[int]]:
if self._jbroadcast is None:
raise RuntimeError("Broadcast can only be serialized in driver")
- self._pickle_registry.add(self)
+ cast("BroadcastPickleRegistry", self._pickle_registry).add(self)
Review comment:
Updated, thanks! :smile:
--
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34439:
URL: https://github.com/apache/spark/pull/34439#issuecomment-966172142
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/49561/
--
[GitHub] [spark] AmplabJenkins commented on pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34439:
URL: https://github.com/apache/spark/pull/34439#issuecomment-966075665
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145092/
--
[GitHub] [spark] zero323 commented on a change in pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
zero323 commented on a change in pull request #34439:
URL: https://github.com/apache/spark/pull/34439#discussion_r753851766
##########
File path: python/pyspark/broadcast.py
##########
@@ -102,7 +130,7 @@ def __init__(self, sc=None, value=None, pickle_registry=None, path=None, sock_fi
                 assert path is not None
                 self._path = path
-    def dump(self, value, f):
+    def dump(self, value: T, f: Any) -> None:
Review comment:
`f: BinaryIO`?
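For illustration, a minimal sketch of what `BinaryIO` annotations could look like on pickle-based dump and load helpers. The names `dump_value` and `load_value` are hypothetical stand-ins, not the methods in `broadcast.py`, and the pickle protocol used here is an assumption.

```python
import io
import pickle
from typing import Any, BinaryIO


def dump_value(value: Any, f: BinaryIO) -> None:
    # Hypothetical stand-in for a dump helper: pickle into any binary stream.
    pickle.dump(value, f, pickle.HIGHEST_PROTOCOL)


def load_value(f: BinaryIO) -> Any:
    # Hypothetical stand-in for a load helper: read one pickled object back.
    return pickle.load(f)


buf = io.BytesIO()  # an in-memory binary stream
dump_value({"a": 1}, buf)
buf.seek(0)  # rewind before reading
restored = load_value(buf)
```

With `f: Any`, a type checker would also accept a text-mode file here; `BinaryIO` narrows the parameter to binary streams, which is what `pickle` requires.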
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34439:
URL: https://github.com/apache/spark/pull/34439#issuecomment-975342598
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145507/
[GitHub] [spark] AmplabJenkins commented on pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34439:
URL: https://github.com/apache/spark/pull/34439#issuecomment-975342598
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145507/
[GitHub] [spark] AmplabJenkins commented on pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34439:
URL: https://github.com/apache/spark/pull/34439#issuecomment-975400554
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/49979/
[GitHub] [spark] dchvn commented on pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
dchvn commented on pull request #34439:
URL: https://github.com/apache/spark/pull/34439#issuecomment-971114804
@ueshin Thanks for your review! I updated this PR following your comments.
[GitHub] [spark] zero323 commented on pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
zero323 commented on pull request #34439:
URL: https://github.com/apache/spark/pull/34439#issuecomment-1007619850
Could you please resolve the conflicts @dchvn?
[GitHub] [spark] SparkQA commented on pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34439:
URL: https://github.com/apache/spark/pull/34439#issuecomment-983272567
Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50254/
[GitHub] [spark] dchvn commented on a change in pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
dchvn commented on a change in pull request #34439:
URL: https://github.com/apache/spark/pull/34439#discussion_r745239197
##########
File path: python/pyspark/broadcast.py
##########
@@ -177,28 +201,28 @@ def destroy(self, blocking=False):
         self._jbroadcast.destroy(blocking)
         os.unlink(self._path)
-    def __reduce__(self):
+    def __reduce__(self) -> Tuple[Callable[[int], T], Tuple[int]]:
         if self._jbroadcast is None:
             raise RuntimeError("Broadcast can only be serialized in driver")
-        self._pickle_registry.add(self)
-        return _from_id, (self._jbroadcast.id(),)
+        cast(Any, self._pickle_registry).add(self)
+        return cast(Tuple[Callable[[int], T], Tuple[int]], (_from_id, (self._jbroadcast.id(),)))
Review comment:
I use `cast` to match the type hint of this function.
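As a rough sketch of what `cast` does in a method like this: it returns its argument unchanged at runtime and only changes what the type checker assumes. The classes `Registry` and `Holder` and the function `_lookup` below are hypothetical stand-ins, not the code in `broadcast.py`.

```python
from typing import Any, Callable, List, Optional, Tuple, cast


class Registry:
    # Hypothetical stand-in for a pickle registry.
    def __init__(self) -> None:
        self.items: List[Any] = []

    def add(self, item: Any) -> None:
        self.items.append(item)


def _lookup(bid: int) -> str:
    # Hypothetical stand-in for a lookup-by-id function.
    return "broadcast-%d" % bid


class Holder:
    def __init__(self, registry: Optional[Registry], bid: int) -> None:
        self._registry = registry
        self._bid = bid

    def __reduce__(self) -> Tuple[Callable[[int], str], Tuple[int]]:
        # The attribute is Optional, so a checker would flag .add() on it.
        # cast(Any, ...) silences that without any runtime check.
        cast(Any, self._registry).add(self)
        return _lookup, (self._bid,)


registry = Registry()
holder = Holder(registry, 7)
func, args = holder.__reduce__()
result = func(*args)
```

Because `cast` performs no validation, an `assert self._registry is not None` would be an alternative that narrows the type and also fails loudly at runtime; which one fits better is a design choice for the reviewers.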
##########
File path: python/pyspark/broadcast.py
##########
@@ -177,28 +201,28 @@ def destroy(self, blocking=False):
         self._jbroadcast.destroy(blocking)
         os.unlink(self._path)
-    def __reduce__(self):
+    def __reduce__(self) -> Tuple[Callable[[int], T], Tuple[int]]:
         if self._jbroadcast is None:
             raise RuntimeError("Broadcast can only be serialized in driver")
-        self._pickle_registry.add(self)
-        return _from_id, (self._jbroadcast.id(),)
+        cast(Any, self._pickle_registry).add(self)
+        return cast(Tuple[Callable[[int], T], Tuple[int]], (_from_id, (self._jbroadcast.id(),)))
 class BroadcastPickleRegistry(threading.local):
     """ Thread-local registry for broadcast variables that have been pickled
     """
-    def __init__(self):
+    def __init__(self) -> None:
         self.__dict__.setdefault("_registry", set())
-    def __iter__(self):
+    def __iter__(self) -> Generator[Broadcast, None, None]:
Review comment:
This differs from `broadcast.pyi` because a method that uses `yield` returns a `Generator`.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34439:
URL: https://github.com/apache/spark/pull/34439#issuecomment-954695002
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/144752/
[GitHub] [spark] SparkQA commented on pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34439:
URL: https://github.com/apache/spark/pull/34439#issuecomment-954684975
Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49221/
[GitHub] [spark] dchvn commented on pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
dchvn commented on pull request #34439:
URL: https://github.com/apache/spark/pull/34439#issuecomment-957181189
CC @HyukjinKwon @zero323 @ueshin too. Many thanks!
[GitHub] [spark] ueshin commented on a change in pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
ueshin commented on a change in pull request #34439:
URL: https://github.com/apache/spark/pull/34439#discussion_r750715677
##########
File path: python/pyspark/broadcast.py
##########
@@ -21,28 +21,43 @@
 from tempfile import NamedTemporaryFile
 import threading
 import pickle
+from typing import (
+    cast,
+    Any,
+    Callable,
+    Dict,
+    Generator,
+    Generic,
+    IO,
+    Optional,
+    Tuple,
+    TypeVar,
+    Union,
+)
 from pyspark.java_gateway import local_connect_and_auth
 from pyspark.serializers import ChunkedStream, pickle_protocol
-from pyspark.util import print_exec
+from pyspark.util import print_exec # type: ignore[attr-defined]
Review comment:
I think we don't need this change anymore.
##########
File path: python/pyspark/broadcast.py
##########
@@ -62,7 +77,14 @@ class Broadcast(object):
     >>> large_broadcast = sc.broadcast(range(10000))
     """
-    def __init__(self, sc=None, value=None, pickle_registry=None, path=None, sock_file=None):
+    def __init__(
+        self,
+        sc: Optional[Any] = None,
Review comment:
I guess `Optional[SparkContext]`?
##########
File path: python/pyspark/broadcast.py
##########
@@ -62,7 +77,14 @@ class Broadcast(object):
     >>> large_broadcast = sc.broadcast(range(10000))
     """
-    def __init__(self, sc=None, value=None, pickle_registry=None, path=None, sock_file=None):
+    def __init__(
+        self,
+        sc: Optional[Any] = None,
+        value: Optional[T] = None,
+        pickle_registry: Optional[Any] = None,
Review comment:
I guess `Optional[BroadcastPickleRegistry]`?
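A minimal sketch of how the two suggested annotations can be written when the referenced classes are not available at definition time: a string forward reference for a class defined later in the same module, and a `TYPE_CHECKING`-only import for a class from another module. The class names `Holder` and `Registry` are hypothetical, and whether `broadcast.py` needs the guarded import is an assumption.

```python
from typing import TYPE_CHECKING, Optional

if TYPE_CHECKING:
    # Seen by the type checker only; never executed at runtime, which is one
    # common way to avoid import cycles between modules.
    from pyspark import SparkContext


class Holder:
    def __init__(
        self,
        sc: Optional["SparkContext"] = None,
        registry: Optional["Registry"] = None,  # defined further down
    ) -> None:
        self._sc = sc
        self._registry = registry


class Registry:
    # Hypothetical stand-in for a registry class defined after its first use.
    pass


holder = Holder(registry=Registry())
```

The quoted names are not evaluated when the function is defined, so this runs even where `pyspark` is not importable.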
##########
File path: python/pyspark/broadcast.py
##########
@@ -177,28 +201,28 @@ def destroy(self, blocking=False):
         self._jbroadcast.destroy(blocking)
         os.unlink(self._path)
-    def __reduce__(self):
+    def __reduce__(self) -> Tuple[Callable[[int], T], Tuple[int]]:
         if self._jbroadcast is None:
             raise RuntimeError("Broadcast can only be serialized in driver")
-        self._pickle_registry.add(self)
-        return _from_id, (self._jbroadcast.id(),)
+        cast(Any, self._pickle_registry).add(self)
+        return cast(Tuple[Callable[[int], T], Tuple[int]], (_from_id, (self._jbroadcast.id(),)))
 class BroadcastPickleRegistry(threading.local):
     """ Thread-local registry for broadcast variables that have been pickled
     """
-    def __init__(self):
+    def __init__(self) -> None:
         self.__dict__.setdefault("_registry", set())
-    def __iter__(self):
+    def __iter__(self) -> Generator[Broadcast, None, None]:
Review comment:
I guess we can use `Iterator` instead of `Generator`.
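A small sketch of why `Iterator` is sufficient here: a method whose body uses `yield` returns a generator object, and every generator is also an iterator, so the simpler annotation is valid when callers never use `send()` or the generator's return value. The `Registry` class below is a hypothetical, simplified stand-in that stores strings.

```python
import threading
from typing import Iterator, Set


class Registry(threading.local):
    # Hypothetical, simplified stand-in for a thread-local registry.
    _registry: Set[str]  # annotation only; the value is set in __init__

    def __init__(self) -> None:
        self.__dict__.setdefault("_registry", set())

    def add(self, item: str) -> None:
        self._registry.add(item)

    def __iter__(self) -> Iterator[str]:
        # Equivalent to Generator[str, None, None], but shorter to read.
        for item in self._registry:
            yield item


registry = Registry()
registry.add("b1")
items = list(registry)
```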
[GitHub] [spark] SparkQA commented on pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34439:
URL: https://github.com/apache/spark/pull/34439#issuecomment-971161510
Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49772/
[GitHub] [spark] SparkQA commented on pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34439:
URL: https://github.com/apache/spark/pull/34439#issuecomment-975290186
**[Test build #145507 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145507/testReport)** for PR 34439 at commit [`c1cb255`](https://github.com/apache/spark/commit/c1cb255a87f219e28315598c8820f4b1c9cdd765).
[GitHub] [spark] zero323 commented on a change in pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
zero323 commented on a change in pull request #34439:
URL: https://github.com/apache/spark/pull/34439#discussion_r753858983
##########
File path: python/pyspark/broadcast.py
##########
@@ -62,35 +81,44 @@ class Broadcast(object):
     >>> large_broadcast = sc.broadcast(range(10000))
     """
-    def __init__(self, sc=None, value=None, pickle_registry=None, path=None, sock_file=None):
+    def __init__(
+        self,
+        sc: Optional["SparkContext"] = None,
+        value: Optional[T] = None,
+        pickle_registry: Optional["BroadcastPickleRegistry"] = None,
+        path: Optional[Any] = None,
+        sock_file: Optional[Any] = None,
Review comment:
Probably `Optional[BinaryIO]`, but more eyes on this would be good.
[GitHub] [spark] SparkQA commented on pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34439:
URL: https://github.com/apache/spark/pull/34439#issuecomment-975320529
**[Test build #145507 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145507/testReport)** for PR 34439 at commit [`c1cb255`](https://github.com/apache/spark/commit/c1cb255a87f219e28315598c8820f4b1c9cdd765).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
* `class Broadcast(Generic[T]):`
[GitHub] [spark] AmplabJenkins commented on pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34439:
URL: https://github.com/apache/spark/pull/34439#issuecomment-971172257
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/49772/
[GitHub] [spark] SparkQA commented on pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34439:
URL: https://github.com/apache/spark/pull/34439#issuecomment-983224101
**[Test build #145781 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145781/testReport)** for PR 34439 at commit [`1308016`](https://github.com/apache/spark/commit/13080168236f3084cb2250478215acd525b85701).
[GitHub] [spark] SparkQA commented on pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34439:
URL: https://github.com/apache/spark/pull/34439#issuecomment-983227579
**[Test build #145782 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145782/testReport)** for PR 34439 at commit [`cc65b8c`](https://github.com/apache/spark/commit/cc65b8c26c9e33ada63ade7218d2ea869010d0a6).
[GitHub] [spark] SparkQA removed a comment on pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #34439:
URL: https://github.com/apache/spark/pull/34439#issuecomment-983227579
**[Test build #145782 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145782/testReport)** for PR 34439 at commit [`cc65b8c`](https://github.com/apache/spark/commit/cc65b8c26c9e33ada63ade7218d2ea869010d0a6).
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34439:
URL: https://github.com/apache/spark/pull/34439#issuecomment-983250327
[GitHub] [spark] AmplabJenkins commented on pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34439:
URL: https://github.com/apache/spark/pull/34439#issuecomment-983275972
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50254/
[GitHub] [spark] dchvn commented on pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
dchvn commented on pull request #34439:
URL: https://github.com/apache/spark/pull/34439#issuecomment-992061230
cc @ueshin @zero323! Please take a look if you have time. Thanks :smiley:
[GitHub] [spark] zero323 commented on a change in pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
zero323 commented on a change in pull request #34439:
URL: https://github.com/apache/spark/pull/34439#discussion_r781534293
##########
File path: python/pyspark/broadcast.py
##########
@@ -53,18 +51,18 @@
 # Holds broadcasted data received from Java, keyed by its id.
-_broadcastRegistry: Dict[int, "Broadcast[Any]"] = {}
+_broadcastRegistry: Dict[int, "Broadcast"] = {}
-def _from_id(bid: int) -> "Broadcast[Any]":
+def _from_id(bid: int) -> "Broadcast":
     from pyspark.broadcast import _broadcastRegistry
     if bid not in _broadcastRegistry:
         raise RuntimeError("Broadcast variable '%s' not loaded!" % bid)
     return _broadcastRegistry[bid]
-class Broadcast(Generic[T]):
Review comment:
Oh, we cannot do that. `Generic[T]` has to go back.
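One likely reason, sketched under assumptions: without `Generic[T]` the class can no longer be parameterized, so annotations such as `Broadcast[int]` stop working for both the type checker and the runtime. The classes `Plain` and `Box` below are hypothetical and only demonstrate the difference.

```python
from typing import Generic, TypeVar

T = TypeVar("T")


class Plain:
    # Not generic: Plain[int] is rejected.
    def __init__(self, value):
        self.value = value


class Box(Generic[T]):
    # Generic over T: Box[int] is a valid annotation and alias.
    def __init__(self, value: T) -> None:
        self.value = value


def unwrap(box: "Box[int]") -> int:
    # The checker knows box.value is an int because Box is parameterized.
    return box.value


try:
    Plain[int]  # raises TypeError: a plain class is not subscriptable
    plain_subscript_ok = True
except TypeError:
    plain_subscript_ok = False

boxed = Box[int](3)  # subscripting a Generic class works at runtime
```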
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] dchvn commented on pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py
Posted by GitBox <gi...@apache.org>.
dchvn commented on pull request #34439:
URL: https://github.com/apache/spark/pull/34439#issuecomment-1011947810
Thanks all! :smile: