Posted to issues@spark.apache.org by "Hyukjin Kwon (Jira)" <ji...@apache.org> on 2019/10/29 03:17:00 UTC
[jira] [Created] (SPARK-29627) array_contains should allow column instances in PySpark
Hyukjin Kwon created SPARK-29627:
------------------------------------
Summary: array_contains should allow column instances in PySpark
Key: SPARK-29627
URL: https://issues.apache.org/jira/browse/SPARK-29627
Project: Spark
Issue Type: Bug
Components: PySpark, SQL
Affects Versions: 3.0.0
Reporter: Hyukjin Kwon
The Scala API works well with column instances:
{code}
import org.apache.spark.sql.functions._
val df = Seq(Array("a", "b", "c"), Array.empty[String]).toDF("data")
df.select(array_contains($"data", lit("a"))).collect()
{code}
{code}
Array[org.apache.spark.sql.Row] = Array([true], [false])
{code}
However, it seems the PySpark one does not:
{code}
from pyspark.sql.functions import array_contains, lit
df = spark.createDataFrame([(["a", "b", "c"],), ([],)], ['data'])
df.select(array_contains(df.data, lit("a"))).show()
{code}
{code}
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/.../spark/python/pyspark/sql/functions.py", line 1950, in array_contains
    return Column(sc._jvm.functions.array_contains(_to_java_column(col), value))
  File "/.../spark/python/lib/py4j-0.10.8.1-src.zip/py4j/java_gateway.py", line 1277, in __call__
  File "/.../spark/python/lib/py4j-0.10.8.1-src.zip/py4j/java_gateway.py", line 1241, in _build_args
  File "/.../spark/python/lib/py4j-0.10.8.1-src.zip/py4j/java_gateway.py", line 1228, in _get_args
  File "/.../spark/python/lib/py4j-0.10.8.1-src.zip/py4j/java_collections.py", line 500, in convert
  File "/.../spark/python/pyspark/sql/column.py", line 344, in __iter__
    raise TypeError("Column is not iterable")
TypeError: Column is not iterable
{code}
{code}
We should let the PySpark array_contains accept column instances as the value argument too, as the Scala API does.
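
The traceback suggests that the Python Column object is handed to Py4J as-is, and that Py4J then tries to iterate over it as if it were a collection. A rough sketch of one possible direction follows: unwrap column values before they cross the gateway and leave plain literals alone. It does not use PySpark at all. FakeColumn, convert_like_a_list and unwrap are hypothetical stand-ins invented for illustration, and the converter behaviour is an assumption inferred from the traceback, not a copy of Py4J's code.

```python
class FakeColumn:
    """Hypothetical stand-in for pyspark.sql.Column; not the real class."""

    def __init__(self, jc):
        self._jc = jc  # stand-in for the wrapped JVM-side column handle

    def __iter__(self):
        # Mirrors the error message seen in the traceback.
        raise TypeError("Column is not iterable")


def convert_like_a_list(arg):
    """Hypothetical model of a list converter: iterate anything with __iter__."""
    if hasattr(arg, "__iter__") and not isinstance(arg, str):
        return [element for element in arg]
    return arg


def unwrap(value):
    """Pass the wrapped handle for column-like values; leave literals unchanged."""
    return value._jc if isinstance(value, FakeColumn) else value


col = FakeColumn("jvm-handle-for-lit-a")

# Passing the column object directly reproduces the failure mode.
try:
    convert_like_a_list(col)
    failure = None
except TypeError as err:
    failure = str(err)

# Unwrapping first avoids the iteration attempt for columns and literals alike.
passed_column = convert_like_a_list(unwrap(col))
passed_literal = convert_like_a_list(unwrap("a"))
```

If this matches what happens in array_contains, the change on the Python side could be small: check whether value is a column and, if so, pass its underlying Java column instead of the Python wrapper.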