Posted to reviews@spark.apache.org by "bjornjorgensen (via GitHub)" <gi...@apache.org> on 2023/04/08 20:07:14 UTC
[GitHub] [spark] bjornjorgensen opened a new pull request, #40716: Change `gRPC` to `grpcio` when it is not installed.
bjornjorgensen opened a new pull request, #40716:
URL: https://github.com/apache/spark/pull/40716
### What changes were proposed in this pull request?
Change `gRPC` to `grpcio`
This changes ONLY the printed error message, for users who haven't installed `gRPC`.
### Why are the changes needed?
Users who don't have `gRPC` installed will get this error when starting Connect.
ModuleNotFoundError Traceback (most recent call last)
File /opt/spark/python/pyspark/sql/connect/utils.py:45, in require_minimum_grpc_version()
44 try:
---> 45 import grpc
46 except ImportError as error:
ModuleNotFoundError: No module named 'grpc'
The above exception was the direct cause of the following exception:
ImportError Traceback (most recent call last)
Cell In[1], line 11
9 import pyarrow
10 from pyspark import SparkConf, SparkContext
---> 11 from pyspark import pandas as ps
12 from pyspark.sql import SparkSession
13 from pyspark.sql.functions import col, concat, concat_ws, expr, lit, trim
File /opt/spark/python/pyspark/pandas/__init__.py:59
50 warnings.warn(
51 "'PYARROW_IGNORE_TIMEZONE' environment variable was not set. It is required to "
52 "set this environment variable to '1' in both driver and executor sides if you use "
(...)
55 "already launched."
56 )
57 os.environ["PYARROW_IGNORE_TIMEZONE"] = "1"
---> 59 from pyspark.pandas.frame import DataFrame
60 from pyspark.pandas.indexes.base import Index
61 from pyspark.pandas.indexes.category import CategoricalIndex
File /opt/spark/python/pyspark/pandas/frame.py:88
85 from pyspark.sql.window import Window
87 from pyspark import pandas as ps # For running doctests and reference resolution in PyCharm.
---> 88 from pyspark.pandas._typing import (
89 Axis,
90 DataFrameOrSeries,
91 Dtype,
92 Label,
93 Name,
94 Scalar,
95 T,
96 GenericColumn,
97 )
98 from pyspark.pandas.accessors import PandasOnSparkFrameMethods
99 from pyspark.pandas.config import option_context, get_option
File /opt/spark/python/pyspark/pandas/_typing.py:25
22 from pandas.api.extensions import ExtensionDtype
24 from pyspark.sql.column import Column as PySparkColumn
---> 25 from pyspark.sql.connect.column import Column as ConnectColumn
26 from pyspark.sql.dataframe import DataFrame as PySparkDataFrame
27 from pyspark.sql.connect.dataframe import DataFrame as ConnectDataFrame
File /opt/spark/python/pyspark/sql/connect/column.py:19
1 #
2 # Licensed to the Apache Software Foundation (ASF) under one or more
3 # contributor license agreements. See the NOTICE file distributed with
(...)
15 # limitations under the License.
16 #
17 from pyspark.sql.connect.utils import check_dependencies
---> 19 check_dependencies(__name__)
21 import datetime
22 import decimal
File /opt/spark/python/pyspark/sql/connect/utils.py:35, in check_dependencies(mod_name)
33 require_minimum_pandas_version()
34 require_minimum_pyarrow_version()
---> 35 require_minimum_grpc_version()
File /opt/spark/python/pyspark/sql/connect/utils.py:47, in require_minimum_grpc_version()
45 import grpc
46 except ImportError as error:
---> 47 raise ImportError(
48 "grpc >= %s must be installed; however, " "it was not found." % minimum_grpc_version
49 ) from error
50 if LooseVersion(grpc.__version__) < LooseVersion(minimum_grpc_version):
51 raise ImportError(
52 "gRPC >= %s must be installed; however, "
53 "your version was %s." % (minimum_grpc_version, grpc.__version__)
54 )
ImportError: grpc >= 1.48.1 must be installed; however, it was not found.
The last line says that a module named `grpc` is missing, which suggests installing it with:
`pip install grpc`
Collecting grpc
Downloading grpc-1.0.0.tar.gz (5.2 kB)
Preparing metadata (setup.py) ... error
error: subprocess-exited-with-error
× python setup.py egg_info did not run successfully.
│ exit code: 1
╰─> [6 lines of output]
Traceback (most recent call last):
File "<string>", line 2, in <module>
File "<pip-setuptools-caller>", line 34, in <module>
File "/tmp/pip-install-vp4d8s4c/grpc_c0f1992ad8f7456b8ac09ecbaeb81750/setup.py", line 33, in <module>
raise RuntimeError(HINT)
RuntimeError: Please install the official package with: pip install grpcio
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed
× Encountered error while generating package metadata.
╰─> See above for output.
note: This is an issue with the package mentioned above, not pip.
hint: See above for details.
Note: you may need to restart the kernel to use updated packages.
The correct way to install it is `pip install grpcio`.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Passes GA.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] HyukjinKwon commented on pull request #40716: [SPARK-43075][CONNECT] Change `gRPC` to `grpcio` when it is not installed.
Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon commented on PR #40716:
URL: https://github.com/apache/spark/pull/40716#issuecomment-1501048066
Merged to master and branch-3.4.
[GitHub] [spark] HyukjinKwon closed pull request #40716: [SPARK-43075][CONNECT] Change `gRPC` to `grpcio` when it is not installed.
Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon closed pull request #40716: [SPARK-43075][CONNECT] Change `gRPC` to `grpcio` when it is not installed.
URL: https://github.com/apache/spark/pull/40716