Posted to issues@spark.apache.org by "Willi Raschkowski (Jira)" <ji...@apache.org> on 2021/09/15 16:11:00 UTC

[jira] [Updated] (SPARK-36768) Cannot resolve attribute with table reference

     [ https://issues.apache.org/jira/browse/SPARK-36768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Willi Raschkowski updated SPARK-36768:
--------------------------------------
    Description: 
In some cases Spark seems unable to resolve an attribute given as a multi-part name whose leading parts reference a table. Here's a repro:
{code:python}
>>> spark.range(3).toDF("col").write.parquet("testdata")

# Single name part attribute is fine
>>> spark.sql("SELECT col FROM parquet.testdata").show()
+---+
|col|
+---+
|  1|
|  0|
|  2|
+---+

# Name part with the table reference fails
>>> spark.sql("SELECT parquet.testdata.col FROM parquet.testdata").show()

AnalysisException: cannot resolve '`parquet.testdata.col`' given input columns: [col]; line 1 pos 7;
'Project ['parquet.testdata.col]
+- Relation[col#50L] parquet
{code}

The expected behavior is that {{parquet.testdata.col}} is recognized as referring to attribute {{col}} in {{parquet.testdata}}.

This also reproduces on master at time of writing.
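
To make the expected behavior concrete, here is a minimal toy model of qualifier matching. It is only a sketch and not Spark's analyzer code: the {{resolve}} helper and its arguments are hypothetical names introduced for illustration, and it assumes the relation read from {{parquet.testdata}} carries the qualifier {{[parquet, testdata]}}. It also ignores case-insensitive matching and nested fields.

```python
from typing import Optional, Sequence


def resolve(
    name_parts: Sequence[str],
    qualifier: Sequence[str],
    columns: Sequence[str],
) -> Optional[str]:
    """Resolve a multi-part name against a single relation (toy model).

    The last name part is treated as the column. Any leading parts must
    match the tail of the relation's qualifier. Returns the column name
    on success and None if the name cannot be resolved.
    """
    *prefix, column = name_parts
    if column not in columns:
        return None
    if len(prefix) > len(qualifier):
        return None
    # An unqualified name matches any relation; a qualified one must
    # agree with the trailing parts of the relation's qualifier.
    if prefix and list(qualifier[-len(prefix):]) != prefix:
        return None
    return column


# Assumed qualifier and schema for the relation in the repro above.
relation_qualifier = ["parquet", "testdata"]
relation_columns = ["col"]

print(resolve(["col"], relation_qualifier, relation_columns))
print(resolve(["parquet", "testdata", "col"], relation_qualifier, relation_columns))
print(resolve(["other", "col"], relation_qualifier, relation_columns))
```

Under this model both {{col}} and {{parquet.testdata.col}} resolve to {{col}}, while a name with a non-matching qualifier such as {{other.col}} does not resolve. The failure reported here corresponds to the second case returning nothing.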


> Cannot resolve attribute with table reference
> ---------------------------------------------
>
>                 Key: SPARK-36768
>                 URL: https://issues.apache.org/jira/browse/SPARK-36768
>             Project: Spark
>          Issue Type: Task
>          Components: SQL
>    Affects Versions: 2.4.7, 3.0.3, 3.1.2
>            Reporter: Willi Raschkowski
>            Priority: Major


