Posted to issues@spark.apache.org by "kevin yu (JIRA)" <ji...@apache.org> on 2015/12/09 08:53:11 UTC
[jira] [Commented] (SPARK-12231) Failed to generate predicate Error when using dropna
[ https://issues.apache.org/jira/browse/SPARK-12231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15048237#comment-15048237 ]
kevin yu commented on SPARK-12231:
----------------------------------
Hello Yahsuan: I am looking at this problem now, and I can recreate it. But when you say 'if write data without partitionBy, the error won't happen', are you trying it like this?
df1.write.parquet('./data')
df2 = sqlc.read.parquet('./data')
df2.dropna()
df2.count()
I tried without partitionBy, using:
df2 = sqlc.read.parquet('./data')
df2.dropna().count()
I still get the exception.
I will update with my progress. Thanks.
> Failed to generate predicate Error when using dropna
> ----------------------------------------------------
>
> Key: SPARK-12231
> URL: https://issues.apache.org/jira/browse/SPARK-12231
> Project: Spark
> Issue Type: Bug
> Components: PySpark, SQL
> Affects Versions: 1.5.2
> Environment: python version: 2.7.9
> os: ubuntu 14.04
> Reporter: yahsuan, chang
>
> code to reproduce error
> # write.py
> import pyspark
> sc = pyspark.SparkContext()
> sqlc = pyspark.SQLContext(sc)
> df = sqlc.range(10)
> df1 = df.withColumn('a', df['id'] * 2)
> df1.write.partitionBy('id').parquet('./data')
> # read.py
> import pyspark
> sc = pyspark.SparkContext()
> sqlc = pyspark.SQLContext(sc)
> df2 = sqlc.read.parquet('./data')
> df2.dropna().count()
> $ spark-submit write.py
> $ spark-submit read.py
> # error message
> 15/12/08 17:20:34 ERROR Filter: Failed to generate predicate, fallback to interpreted org.apache.spark.sql.catalyst.errors.package$TreeNodeException: Binding attribute, tree: a#0L
> ...
> If write data without partitionBy, the error won't happen