Posted to issues@spark.apache.org by "Hyukjin Kwon (Jira)" <ji...@apache.org> on 2023/08/06 01:25:00 UTC
[jira] [Resolved] (SPARK-44670) Fix the `test_to_excel` tests for python3.7
[ https://issues.apache.org/jira/browse/SPARK-44670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hyukjin Kwon resolved SPARK-44670.
----------------------------------
Resolution: Fixed
Issue resolved by pull request 42339
[https://github.com/apache/spark/pull/42339]
> Fix the `test_to_excel` tests for python3.7
> -------------------------------------------
>
> Key: SPARK-44670
> URL: https://issues.apache.org/jira/browse/SPARK-44670
> Project: Spark
> Issue Type: Bug
> Components: Pandas API on Spark
> Affects Versions: 3.4.1
> Reporter: Madhukar
> Assignee: Madhukar
> Priority: Minor
> Fix For: 3.4.2
>
>
> With python3.7 and openpyxl installed, `test_to_excel` fails with this error:
> ======================================================================
> ERROR: test_to_excel (pyspark.pandas.tests.test_dataframe_conversion.DataFrameConversionTest)
> Traceback (most recent call last):
>   File "/workspace/apache-spark/python/pyspark/pandas/tests/test_dataframe_conversion.py", line 102, in test_to_excel
>     dataframes = self.get_excel_dfs(pandas_on_spark_location, pandas_location)
>   File "/workspace/apache-spark/python/pyspark/pandas/tests/test_dataframe_conversion.py", line 89, in get_excel_dfs
>     "got": pd.read_excel(pandas_on_spark_location, index_col=0),
>   File "/opt/conda/lib/python3.7/site-packages/pandas/util/_decorators.py", line 296, in wrapper
>     return func(*args, **kwargs)
>   File "/opt/conda/lib/python3.7/site-packages/pandas/io/excel/_base.py", line 304, in read_excel
>     io = ExcelFile(io, engine=engine)
>   File "/opt/conda/lib/python3.7/site-packages/pandas/io/excel/_base.py", line 867, in __init__
>     self._reader = self._engines[engine](self._io)
>   File "/opt/conda/lib/python3.7/site-packages/pandas/io/excel/_xlrd.py", line 21, in __init__
>     import_optional_dependency("xlrd", extra=err_msg)
>   File "/opt/conda/lib/python3.7/site-packages/pandas/compat/_optional.py", line 110, in import_optional_dependency
>     raise ImportError(msg) from None
> ImportError: Missing optional dependency 'xlrd'. Install xlrd >= 1.0.0 for Excel support Use pip or conda to install xlrd.
> ----------------------------------------------------------------------
>
>
>
> But with xlrd 2.0.1 installed, the test fails with a different error:
> ======================================================================
> ERROR: test_to_excel (pyspark.pandas.tests.test_dataframe_conversion.DataFrameConversionTest)
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>   File "/workspace/apache-spark/python/pyspark/pandas/tests/test_dataframe_conversion.py", line 102, in test_to_excel
>     dataframes = self.get_excel_dfs(pandas_on_spark_location, pandas_location)
>   File "/workspace/apache-spark/python/pyspark/pandas/tests/test_dataframe_conversion.py", line 89, in get_excel_dfs
>     "got": pd.read_excel(pandas_on_spark_location, index_col=0),
>   File "/opt/conda/lib/python3.7/site-packages/pandas/util/_decorators.py", line 296, in wrapper
>     return func(*args, **kwargs)
>   File "/opt/conda/lib/python3.7/site-packages/pandas/io/excel/_base.py", line 304, in read_excel
>     io = ExcelFile(io, engine=engine)
>   File "/opt/conda/lib/python3.7/site-packages/pandas/io/excel/_base.py", line 867, in __init__
>     self._reader = self._engines[engine](self._io)
>   File "/opt/conda/lib/python3.7/site-packages/pandas/io/excel/_xlrd.py", line 22, in __init__
>     super().__init__(filepath_or_buffer)
>   File "/opt/conda/lib/python3.7/site-packages/pandas/io/excel/_base.py", line 353, in __init__
>     self.book = self.load_workbook(filepath_or_buffer)
>   File "/opt/conda/lib/python3.7/site-packages/pandas/io/excel/_xlrd.py", line 37, in load_workbook
>     return open_workbook(filepath_or_buffer)
>   File "/opt/conda/lib/python3.7/site-packages/xlrd/__init__.py", line 170, in open_workbook
>     raise XLRDError(FILE_FORMAT_DESCRIPTIONS[file_format]+'; not supported')
> xlrd.biffh.XLRDError: Excel xlsx file; not supported
> ----------------------------------------------------------------------
>
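
Both tracebacks show `pd.read_excel` being called without an `engine` argument and the pandas build in that environment dispatching to its xlrd reader, which xlrd 2.0.1 then rejects for an xlsx file. One way to sidestep this, sketched here under assumptions, is to name the reader engine explicitly. The helpers `choose_excel_engine` and `read_excel_explicit` are hypothetical names made up for this illustration; they are not part of Spark or pandas, and this is not necessarily the change made in pull request 42339.

```python
from pathlib import Path

# Hypothetical mapping (an assumption for illustration): file suffix to the
# pandas reader engine that can parse that format.
_ENGINE_BY_SUFFIX = {
    ".xlsx": "openpyxl",  # xlrd 2.x raises "Excel xlsx file; not supported"
    ".xlsm": "openpyxl",
    ".xls": "xlrd",       # legacy binary workbook format
}


def choose_excel_engine(path):
    """Return the engine name to pass to pd.read_excel for this path."""
    suffix = Path(str(path)).suffix.lower()
    if suffix not in _ENGINE_BY_SUFFIX:
        raise ValueError(f"unrecognized Excel suffix: {suffix!r}")
    return _ENGINE_BY_SUFFIX[suffix]


def read_excel_explicit(path, **kwargs):
    """Read an Excel file with the engine chosen from its suffix.

    pandas is imported lazily so that choose_excel_engine stays usable
    without it. The chosen engine package (openpyxl or xlrd) must be installed.
    """
    import pandas as pd

    return pd.read_excel(path, engine=choose_excel_engine(path), **kwargs)
```

With such a helper, the failing call in `get_excel_dfs` could be written as `read_excel_explicit(pandas_on_spark_location, index_col=0)`, assuming the location has an `.xlsx` suffix and openpyxl is installed.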
--
This message was sent by Atlassian Jira
(v8.20.10#820010)