Posted to reviews@spark.apache.org by "bjornjorgensen (via GitHub)" <gi...@apache.org> on 2023/04/08 20:07:14 UTC

[GitHub] [spark] bjornjorgensen opened a new pull request, #40716: Change `gRPC` to `grpcio` when it is not installed.

bjornjorgensen opened a new pull request, #40716:
URL: https://github.com/apache/spark/pull/40716

   ### What changes were proposed in this pull request?
   
   Change `gRPC` to `grpcio`.
   This ONLY changes the printed error message shown to users who haven't installed `gRPC`.
   
   ### Why are the changes needed?
   Users who don't have `gRPC` installed get this error when starting Spark Connect.
   
   ModuleNotFoundError                       Traceback (most recent call last)
   File /opt/spark/python/pyspark/sql/connect/utils.py:45, in require_minimum_grpc_version()
        44 try:
   ---> 45     import grpc
        46 except ImportError as error:
   
   ModuleNotFoundError: No module named 'grpc'
   
   The above exception was the direct cause of the following exception:
   
   ImportError                               Traceback (most recent call last)
   Cell In[1], line 11
         9 import pyarrow
        10 from pyspark import SparkConf, SparkContext
   ---> 11 from pyspark import pandas as ps
        12 from pyspark.sql import SparkSession
        13 from pyspark.sql.functions import col, concat, concat_ws, expr, lit, trim
   
   File /opt/spark/python/pyspark/pandas/__init__.py:59
        50     warnings.warn(
        51         "'PYARROW_IGNORE_TIMEZONE' environment variable was not set. It is required to "
        52         "set this environment variable to '1' in both driver and executor sides if you use "
      (...)
        55         "already launched."
        56     )
        57     os.environ["PYARROW_IGNORE_TIMEZONE"] = "1"
   ---> 59 from pyspark.pandas.frame import DataFrame
        60 from pyspark.pandas.indexes.base import Index
        61 from pyspark.pandas.indexes.category import CategoricalIndex
   
   File /opt/spark/python/pyspark/pandas/frame.py:88
        85 from pyspark.sql.window import Window
        87 from pyspark import pandas as ps  # For running doctests and reference resolution in PyCharm.
   ---> 88 from pyspark.pandas._typing import (
        89     Axis,
        90     DataFrameOrSeries,
        91     Dtype,
        92     Label,
        93     Name,
        94     Scalar,
        95     T,
        96     GenericColumn,
        97 )
        98 from pyspark.pandas.accessors import PandasOnSparkFrameMethods
        99 from pyspark.pandas.config import option_context, get_option
   
   File /opt/spark/python/pyspark/pandas/_typing.py:25
        22 from pandas.api.extensions import ExtensionDtype
        24 from pyspark.sql.column import Column as PySparkColumn
   ---> 25 from pyspark.sql.connect.column import Column as ConnectColumn
        26 from pyspark.sql.dataframe import DataFrame as PySparkDataFrame
        27 from pyspark.sql.connect.dataframe import DataFrame as ConnectDataFrame
   
   File /opt/spark/python/pyspark/sql/connect/column.py:19
         1 #
         2 # Licensed to the Apache Software Foundation (ASF) under one or more
         3 # contributor license agreements.  See the NOTICE file distributed with
      (...)
        15 # limitations under the License.
        16 #
        17 from pyspark.sql.connect.utils import check_dependencies
   ---> 19 check_dependencies(__name__)
        21 import datetime
        22 import decimal
   
   File /opt/spark/python/pyspark/sql/connect/utils.py:35, in check_dependencies(mod_name)
        33 require_minimum_pandas_version()
        34 require_minimum_pyarrow_version()
   ---> 35 require_minimum_grpc_version()
   
   File /opt/spark/python/pyspark/sql/connect/utils.py:47, in require_minimum_grpc_version()
        45     import grpc
        46 except ImportError as error:
   ---> 47     raise ImportError(
        48         "grpc >= %s must be installed; however, " "it was not found." % minimum_grpc_version
        49     ) from error
        50 if LooseVersion(grpc.__version__) < LooseVersion(minimum_grpc_version):
        51     raise ImportError(
        52         "gRPC >= %s must be installed; however, "
        53         "your version was %s." % (minimum_grpc_version, grpc.__version__)
        54     )
   
   ImportError: grpc >= 1.48.1 must be installed; however, it was not found.
   
   
   The last line says that a module named `grpc` is missing, which suggests running:
   
    `pip install grpc`
   
   Collecting grpc
     Downloading grpc-1.0.0.tar.gz (5.2 kB)
     Preparing metadata (setup.py) ... error
     error: subprocess-exited-with-error
     
     × python setup.py egg_info did not run successfully.
     │ exit code: 1
     ╰─> [6 lines of output]
         Traceback (most recent call last):
           File "<string>", line 2, in <module>
           File "<pip-setuptools-caller>", line 34, in <module>
           File "/tmp/pip-install-vp4d8s4c/grpc_c0f1992ad8f7456b8ac09ecbaeb81750/setup.py", line 33, in <module>
             raise RuntimeError(HINT)
         RuntimeError: Please install the official package with: pip install grpcio
         [end of output]
     
     note: This error originates from a subprocess, and is likely not a problem with pip.
   error: metadata-generation-failed
   
   × Encountered error while generating package metadata.
   ╰─> See above for output.
   
   note: This is an issue with the package mentioned above, not pip.
   hint: See above for details.
   Note: you may need to restart the kernel to use updated packages.
   
   That package is not the official one. The right way to install it is `pip install grpcio`.
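
A minimal sketch of the idea, not the actual patch: the module is imported as `grpc`, but the "not found" message names the pip package `grpcio`, so the user can copy it straight into `pip install`. The `import_name` and `pip_name` parameters and the `MINIMUM_GRPC_VERSION` constant are hypothetical and exist only to keep the example self-contained; the version string and message wording are taken from the traceback above. The version comparison from the original function is left out.

```python
import importlib

# Hypothetical constant for this sketch; the value is the one shown in the traceback.
MINIMUM_GRPC_VERSION = "1.48.1"


def require_minimum_grpc_version(import_name="grpc", pip_name="grpcio"):
    """Import the gRPC module, naming the pip package if it is missing."""
    try:
        # The import name (`grpc`) differs from the pip package name (`grpcio`).
        return importlib.import_module(import_name)
    except ImportError as error:
        # Report the name that `pip install` expects, and chain the original
        # error so the underlying ModuleNotFoundError stays in the traceback.
        raise ImportError(
            "%s >= %s must be installed; however, it was not found."
            % (pip_name, MINIMUM_GRPC_VERSION)
        ) from error
```

With this wording, a user without the package would see `grpcio >= 1.48.1 must be installed; however, it was not found.` instead of a message that points at the unrelated `grpc` package.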
   
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   No.
   
   
   ### How was this patch tested?
   Pass GA.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon commented on pull request #40716: [SPARK-43075][CONNECT] Change `gRPC` to `grpcio` when it is not installed.

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon commented on PR #40716:
URL: https://github.com/apache/spark/pull/40716#issuecomment-1501048066

   Merged to master and branch-3.4.




[GitHub] [spark] HyukjinKwon closed pull request #40716: [SPARK-43075][CONNECT] Change `gRPC` to `grpcio` when it is not installed.

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon closed pull request #40716: [SPARK-43075][CONNECT] Change `gRPC` to `grpcio` when it is not installed. 
URL: https://github.com/apache/spark/pull/40716

