Posted to issues@spark.apache.org by "Shuai Zheng (JIRA)" <ji...@apache.org> on 2015/04/23 21:08:38 UTC
[jira] [Issue Comment Deleted] (SPARK-6273) Got error when one
table's alias name is the same with other table's column name
[ https://issues.apache.org/jira/browse/SPARK-6273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Shuai Zheng updated SPARK-6273:
-------------------------------
Comment: was deleted
(was: I am using 1.3.1 and have a similar issue, so the problem is still there.
I am using only the DataFrame API with a Spark SQLContext, not a HiveContext:
DataFrame df3 = df1.join(df2, df1.col(col).equalTo(df2.col(col))).select(col);
Because df1 and df2 are joined on the same key column, I can't reference that key column ("id") afterwards:
Exception in thread "main" org.apache.spark.sql.AnalysisException: Reference 'id' is ambiguous, could be: id#8L, id#0L.;
It looks like the join key can't be referenced either by name or through the df1.col(name) pattern.
https://issues.apache.org/jira/browse/SPARK-5278 refers to a Hive case, so I am not sure whether it is the same issue, but I still see the problem in the latest code.
Does the result of a join not keep the parent DataFrame information anywhere?)
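The ambiguity described in the comment above can be illustrated with a small, self-contained Java sketch. It is a toy model, not Spark's analyzer: the class AmbiguousReferenceSketch, its Attribute type, the resolve method, and the "df1"/"df2" source labels are all hypothetical names introduced for illustration, and assigning id#8L to df1 and id#0L to df2 is an assumption (the exception does not say which side each attribute came from). The sketch only shows why a bare column name that exists on both sides of a join matches two attributes, while a reference that also names its source matches one.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only; this is NOT Spark's analyzer. All names here are hypothetical.
final class AmbiguousReferenceSketch {

    // One output column: the DataFrame it came from, its name, and a unique id,
    // rendered like the attributes in the exception (for example id#8L).
    static final class Attribute {
        final String source;
        final String name;
        final long exprId;

        Attribute(String source, String name, long exprId) {
            this.source = source;
            this.name = name;
            this.exprId = exprId;
        }

        @Override
        public String toString() {
            return name + "#" + exprId + "L";
        }
    }

    // Resolves "name" or "source.name" against the output columns of a join.
    static Attribute resolve(String reference, List<Attribute> output) {
        int dot = reference.indexOf('.');
        String source = dot < 0 ? null : reference.substring(0, dot);
        String name = dot < 0 ? reference : reference.substring(dot + 1);

        List<Attribute> matches = new ArrayList<>();
        for (Attribute a : output) {
            if (a.name.equals(name) && (source == null || a.source.equals(source))) {
                matches.add(a);
            }
        }
        if (matches.isEmpty()) {
            throw new IllegalArgumentException("Cannot resolve '" + reference + "'");
        }
        if (matches.size() > 1) {
            // More than one candidate: report all of them, as the exception does.
            StringBuilder candidates = new StringBuilder();
            for (Attribute a : matches) {
                if (candidates.length() > 0) {
                    candidates.append(", ");
                }
                candidates.append(a);
            }
            throw new IllegalStateException(
                "Reference '" + reference + "' is ambiguous, could be: " + candidates);
        }
        return matches.get(0);
    }

    // Output of the join: both inputs contribute a column named "id".
    // Which id belongs to which side is assumed for the example.
    static List<Attribute> joinedOutput() {
        List<Attribute> output = new ArrayList<>();
        output.add(new Attribute("df1", "id", 8L));
        output.add(new Attribute("df2", "id", 0L));
        return output;
    }

    public static void main(String[] args) {
        List<Attribute> joined = joinedOutput();
        try {
            resolve("id", joined); // bare name matches both columns
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
        System.out.println(resolve("df1.id", joined)); // source narrows it to one
    }
}
```

Running main prints the ambiguous-reference message for "id" and then id#8L for "df1.id". Whether Spark 1.3.1 keeps enough information after a join to disambiguate this way is exactly what the comment questions, so the sketch should not be read as a confirmed workaround.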
> Got error when one table's alias name is the same with other table's column name
> --------------------------------------------------------------------------------
>
> Key: SPARK-6273
> URL: https://issues.apache.org/jira/browse/SPARK-6273
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 1.2.1, 1.3.1
> Reporter: Jeff
>
> When one table's alias is the same as another table's column name,
> the query fails with an "Ambiguous references" error:
> {code}
> Error: org.apache.spark.sql.catalyst.errors.package$TreeNodeException: Ambiguous references to salary.pay_date: (pay_date#34749,List()),(salary#34792,List(pay_date)), tree:
> 'Filter (((('salary.pay_date = 'time_by_day.the_date) && ('time_by_day.the_year = 1997.0)) && ('salary.employee_id = 'employee.employee_id)) && ('employee.store_id = 'store.store_id))
> Join Inner, None
> Join Inner, None
> Join Inner, None
> MetastoreRelation yxqtest, time_by_day, Some(time_by_day)
> MetastoreRelation yxqtest, salary, Some(salary)
> MetastoreRelation yxqtest, store, Some(store)
> MetastoreRelation yxqtest, employee, Some(employee) (state=,code=0)
> {code}