Posted to issues@spark.apache.org by "Sean R. Owen (Jira)" <ji...@apache.org> on 2023/03/14 13:31:00 UTC

[jira] [Resolved] (SPARK-42752) Unprintable IllegalArgumentException with Hive catalog enabled in "Hadoop Free" distribution

     [ https://issues.apache.org/jira/browse/SPARK-42752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean R. Owen resolved SPARK-42752.
----------------------------------
    Fix Version/s: 3.5.0
         Assignee: Gera Shegalov
       Resolution: Fixed

Resolved by https://github.com/apache/spark/pull/40372

> Unprintable IllegalArgumentException with Hive catalog enabled in "Hadoop Free" distribution
> -------------------------------------------------------------------------------------------
>
>                 Key: SPARK-42752
>                 URL: https://issues.apache.org/jira/browse/SPARK-42752
>             Project: Spark
>          Issue Type: Improvement
>          Components: PySpark, SQL
>    Affects Versions: 3.1.3, 3.2.4, 3.3.3, 3.4.1, 3.5.0
>         Environment: local
>            Reporter: Gera Shegalov
>            Assignee: Gera Shegalov
>            Priority: Minor
>             Fix For: 3.5.0
>
>
> Reproduction steps:
> 1. Download a standard "Hadoop Free" build
> 2. Start the pyspark REPL with Hive support:
> {code:java}
> SPARK_DIST_CLASSPATH=$(~/dist/hadoop-3.4.0-SNAPSHOT/bin/hadoop classpath) ~/dist/spark-3.2.3-bin-without-hadoop/bin/pyspark --conf spark.sql.catalogImplementation=hive
> {code}
> 3. Execute any simple DataFrame operation:
> {code:java}
> >>> spark.range(100).show()
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "/home/user/dist/spark-3.2.3-bin-without-hadoop/python/pyspark/sql/session.py", line 416, in range
>     jdf = self._jsparkSession.range(0, int(start), int(step), int(numPartitions))
>   File "/home/user/dist/spark-3.2.3-bin-without-hadoop/python/lib/py4j-0.10.9.5-src.zip/py4j/java_gateway.py", line 1321, in __call__
>   File "/home/user/dist/spark-3.2.3-bin-without-hadoop/python/pyspark/sql/utils.py", line 117, in deco
>     raise converted from None
> pyspark.sql.utils.IllegalArgumentException: <exception str() failed>
> {code}
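> The "<exception str() failed>" marker comes from CPython itself: when printing a traceback, the interpreter calls str() on the exception, and if that call raises, it substitutes this placeholder. A minimal sketch in plain Python (no Spark required; the class is hypothetical) reproduces the symptom:
> {code:python}
> # CPython prints "<exception str() failed>" when an exception's __str__
> # itself raises, which mirrors PySpark's captured exception delegating
> # str() to a failing JVM-side call.
> class Unprintable(Exception):
>     def __str__(self):
>         raise RuntimeError("broken")  # simulates the failing JVM round-trip
>
> raise Unprintable()
> # Traceback (most recent call last):
> #   ...
> # __main__.Unprintable: <exception str() failed>
> {code}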
> 4. In fact, just accessing spark.conf is enough to trigger the issue:
> {code:java}
> >>> spark.conf
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> ...
> {code}
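> Until a fix is available, the underlying cause can still be inspected by catching the exception and reading the captured fields directly, since only str() is broken. A hedged sketch (the desc/stackTrace attribute names match pyspark 3.2's pyspark/sql/utils.py; treating them as stable across versions is an assumption):
> {code:python}
> # Bypass the failing __str__ and print the raw captured JVM details.
> try:
>     spark.range(100).show()
> except Exception as e:
>     print(type(e).__name__)                # IllegalArgumentException
>     print(getattr(e, "desc", None))        # JVM-side message, if captured
>     print(getattr(e, "stackTrace", None))  # JVM-side stack trace, if captured
> {code}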
> There are probably two issues here:
> 1) Hive support should be gracefully disabled if the dependency is not on the classpath, as claimed by https://spark.apache.org/docs/latest/sql-data-sources-hive-tables.html (see the sketch below)
> 2) at the very least, the user should be able to see the exception so they can understand the issue and take action
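> For 1), a sketch of the kind of classpath probe that graceful fallback implies, done from the PySpark side (the class name org.apache.spark.sql.hive.HiveSessionStateBuilder is what the Scala builder checks; assuming it stays stable across versions):
> {code:python}
> from pyspark.sql import SparkSession
>
> # Probe for Hive support before requesting the hive catalog, mirroring
> # the graceful-fallback behavior the Hive tables doc describes.
> def hive_classes_present(sc):
>     try:
>         sc._jvm.java.lang.Class.forName(
>             "org.apache.spark.sql.hive.HiveSessionStateBuilder")
>         return True
>     except Exception:
>         return False
>
> catalog = "hive" if hive_classes_present(spark.sparkContext) else "in-memory"
> session = (SparkSession.builder
>            .config("spark.sql.catalogImplementation", catalog)
>            .getOrCreate())
> {code}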
>  


