Posted to issues@spark.apache.org by "Jianshi Huang (JIRA)" <ji...@apache.org> on 2014/08/07 02:19:11 UTC
[jira] [Created] (SPARK-2890) Spark SQL should allow SELECT with duplicated columns
Jianshi Huang created SPARK-2890:
------------------------------------
Summary: Spark SQL should allow SELECT with duplicated columns
Key: SPARK-2890
URL: https://issues.apache.org/jira/browse/SPARK-2890
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 1.1.0
Reporter: Jianshi Huang
Spark reports a java.lang.IllegalArgumentException with the following message and stack trace:
java.lang.IllegalArgumentException: requirement failed: Found fields with the same name.
at scala.Predef$.require(Predef.scala:233)
at org.apache.spark.sql.catalyst.types.StructType.<init>(dataTypes.scala:317)
at org.apache.spark.sql.catalyst.types.StructType$.fromAttributes(dataTypes.scala:310)
at org.apache.spark.sql.parquet.ParquetTypesConverter$.convertToString(ParquetTypes.scala:306)
at org.apache.spark.sql.parquet.ParquetTableScan.execute(ParquetTableOperations.scala:83)
at org.apache.spark.sql.execution.Filter.execute(basicOperators.scala:57)
at org.apache.spark.sql.execution.SparkPlan.executeCollect(SparkPlan.scala:85)
at org.apache.spark.sql.SchemaRDD.collect(SchemaRDD.scala:433)
After some trial and error, it appears the failure is caused by duplicated columns in my SELECT clause.
The duplication is intentional, so that my downstream code parses the results correctly. I think users should be allowed to specify duplicated columns as return values.
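A minimal sketch of the failing pattern, assuming a Parquet-backed table and column (the table name "people" and column name "name" are hypothetical, chosen only for illustration):

    // Spark SQL 1.1-era API; sc is an existing SparkContext.
    import org.apache.spark.sql.SQLContext

    val sqlContext = new SQLContext(sc)

    // Selecting the same column twice yields a result schema with two
    // fields of the same name. StructType's constructor rejects such a
    // schema ("requirement failed: Found fields with the same name.")
    // when the ParquetTableScan materializes it, as in the trace above.
    val result = sqlContext.sql("SELECT name, name FROM people")
    result.collect()  // throws java.lang.IllegalArgumentException

A workaround is to alias one of the duplicated columns (e.g. SELECT name, name AS name2 FROM people) so the schema's field names are unique, but that should not be required just to read the same column twice.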
Jianshi