Posted to dev@spark.apache.org by PierreB <pi...@realimpactanalytics.com> on 2015/01/26 12:15:42 UTC
[SQL] Self join with ArrayType columns problems
Using Spark 1.2.0, we are seeing some weird behaviour when performing a self
join on a table that has an ArrayType field.
(Potential bug?)
I have set up a minimal non-working example here:
https://gist.github.com/pierre-borckmans/4853cd6d0b2f2388bf4f
In a nutshell, if the ArrayType column used for the pivot is declared
manually in the StructType definition, everything works as expected.
However, if the ArrayType pivot column is produced by a SQL query (whether by
using an "array" wrapper or a collect_list operator, for instance),
then the results of the self join are completely off.
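To make "works as expected" concrete, here is a minimal plain-Python sketch of
the result an equality self join on an array-valued column should produce. It
is not Spark code and is not taken from the gist: the sample rows and the
helper name self_join_on_array are hypothetical, chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical rows: (id, values), where `values` stands in for the
# ArrayType column used as the join key.
rows = [
    (1, [1, 2]),
    (2, [1, 2]),
    (3, [3]),
]


def self_join_on_array(rows):
    """Return sorted (left_id, right_id) pairs whose array columns are equal."""
    by_key = defaultdict(list)
    for row_id, values in rows:
        # Lists are unhashable, so group on an equivalent tuple.
        by_key[tuple(values)].append(row_id)

    pairs = []
    for ids in by_key.values():
        # Every row matches every row sharing the same array, itself included.
        for left in ids:
            for right in ids:
                pairs.append((left, right))
    return sorted(pairs)


print(self_join_on_array(rows))
```

Under these assumptions, rows 1 and 2 should pair with each other and with
themselves, and row 3 only with itself, however the array column was built.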
Could anyone have a look? This really is a blocking issue for us.
Thanks!
Cheers
P.
--
View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/SQL-Self-join-with-ArrayType-columns-problems-tp10269.html
Sent from the Apache Spark Developers List mailing list archive at Nabble.com.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
For additional commands, e-mail: dev-help@spark.apache.org
Re: [SQL] Self join with ArrayType columns problems
Posted by PierreB <pi...@realimpactanalytics.com>.
Should I file a JIRA for this?
--
View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/SQL-Self-join-with-ArrayType-columns-problems-tp10269p10322.html