Posted to issues@spark.apache.org by "Willi Raschkowski (Jira)" <ji...@apache.org> on 2021/09/15 16:11:00 UTC
[jira] [Updated] (SPARK-36768) Cannot resolve attribute with table reference
[ https://issues.apache.org/jira/browse/SPARK-36768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Willi Raschkowski updated SPARK-36768:
--------------------------------------
Description:
Spark seems in some cases unable to resolve attributes that contain multi-part names where the first parts reference a table. Here's a repro:
{code:python}
>>> spark.range(3).toDF("col").write.parquet("testdata")
# Single name part attribute is fine
>>> spark.sql("SELECT col FROM parquet.testdata").show()
+---+
|col|
+---+
|  1|
|  0|
|  2|
+---+
# Name part with the table reference fails
>>> spark.sql("SELECT parquet.testdata.col FROM parquet.testdata").show()
AnalysisException: cannot resolve '`parquet.testdata.col`' given input columns: [col]; line 1 pos 7;
'Project ['parquet.testdata.col]
+- Relation[col#50L] parquet
{code}
The expected behavior is that {{parquet.testdata.col}} is recognized as referring to attribute {{col}} in {{parquet.testdata}}.
This also reproduces on master at time of writing.
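To make the expected behavior concrete, here is a minimal toy model of qualified-name resolution in plain Python. It is not Spark's analyzer code: the {{Attribute}} class and the {{resolve}} function are hypothetical names invented for this sketch, and the suffix-matching rule is an assumption about how a multi-part name could be matched. It shows one plausible reading of the plan above, where the relation exposes {{col}} without any qualifier, so the leading {{parquet.testdata}} parts have nothing to match against.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class Attribute:
    """Hypothetical stand-in for one output column of a relation."""
    name: str
    # Name parts of the table the column belongs to, e.g. ("parquet", "testdata").
    # Empty when the relation carries no qualifier at all.
    qualifier: Tuple[str, ...] = ()


def resolve(name_parts: Sequence[str], attrs: Sequence[Attribute]) -> Optional[Attribute]:
    """Resolve a multi-part name against the available attributes.

    Assumed rule for this sketch: the last part must equal the column name,
    and any leading parts must equal the tail of the attribute's qualifier.
    Matching is case-insensitive. Returns None when nothing matches or when
    the name is ambiguous.
    """
    *prefix, column = [part.lower() for part in name_parts]
    matches = []
    for attr in attrs:
        if attr.name.lower() != column:
            continue
        qual = [q.lower() for q in attr.qualifier]
        # An empty prefix always matches; a longer prefix needs a long enough qualifier.
        if len(prefix) <= len(qual) and qual[len(qual) - len(prefix):] == prefix:
            matches.append(attr)
    return matches[0] if len(matches) == 1 else None


# Relation without a qualifier, as in the failing plan: input columns are just [col].
unqualified = [Attribute("col")]
# Relation whose column carries the table reference as its qualifier (expected behavior).
qualified = [Attribute("col", ("parquet", "testdata"))]

single_part = resolve(["col"], unqualified)
failing = resolve(["parquet", "testdata", "col"], unqualified)
expected = resolve(["parquet", "testdata", "col"], qualified)
```

In this model {{single_part}} resolves, {{failing}} is {{None}} (mirroring the {{AnalysisException}}), and {{expected}} resolves because the attribute carries {{("parquet", "testdata")}} as its qualifier.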
was:
Spark seems in some cases unable to resolve attributes that contain multi-part names where the first parts reference a table. Here's a repro:
{code:python}
>>> spark.range(3).toDF("col").write.parquet("testdata")
# Single name part attribute is fine
>>> spark.sql("SELECT col FROM parquet.testdata").show()
+---+
|col|
+---+
|  1|
|  0|
|  2|
+---+
# Name part with the table reference fails
>>> spark.sql("SELECT parquet.testdata.col FROM parquet.testdata").show()
AnalysisException: cannot resolve '`parquet.testdata.col`' given input columns: [col]; line 1 pos 7;
'Project ['parquet.testdata.col]
+- Relation[col#50L] parquet
{code}
This also reproduces on master at time of writing.
> Cannot resolve attribute with table reference
> ---------------------------------------------
>
> Key: SPARK-36768
> URL: https://issues.apache.org/jira/browse/SPARK-36768
> Project: Spark
> Issue Type: Task
> Components: SQL
> Affects Versions: 2.4.7, 3.0.3, 3.1.2
> Reporter: Willi Raschkowski
> Priority: Major