Posted to issues@spark.apache.org by "Sean Owen (JIRA)" <ji...@apache.org> on 2019/03/01 21:22:00 UTC

[jira] [Resolved] (SPARK-26943) Weird behaviour with `.cache()`

     [ https://issues.apache.org/jira/browse/SPARK-26943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Owen resolved SPARK-26943.
-------------------------------
    Resolution: Cannot Reproduce

I don't think this is a bug; at the least, I can think of other explanations for this behaviour.

Your transformation and/or data have a problem (see the error: the value "(N/A)" cannot be parsed as a number). It doesn't surface in a plain .count() because, for example, Spark can avoid parsing the data at all when you only want to know how many rows there are. Caching requires persisting a representation of the data in memory, which means actually parsing it, and that is why the error appears only after .cache().
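To make the reasoning above concrete, here is a toy sketch in plain Python. It is only an analogy under stated assumptions, not Spark's implementation: `LazyColumn`, its methods, and the sample values are hypothetical names invented for illustration. It shows how counting can succeed without parsing, while counting after a cache request forces every value to be parsed and hits the bad one.

```python
from typing import Callable, List, Optional


class LazyColumn:
    """Toy model of a lazily parsed column (an analogy, not Spark internals)."""

    def __init__(self, raw: List[str], parse: Callable[[str], float]) -> None:
        self._raw = raw
        self._parse = parse
        self._cache_requested = False
        self._cached: Optional[List[float]] = None

    def cache(self) -> "LazyColumn":
        # Only marks the data for caching; nothing is parsed yet.
        self._cache_requested = True
        return self

    def count(self) -> int:
        if self._cache_requested:
            if self._cached is None:
                # Filling the cache needs every value in parsed form,
                # so a malformed value raises here.
                self._cached = [self._parse(v) for v in self._raw]
            return len(self._cached)
        # Without a cache request, the raw records can be counted unparsed.
        return len(self._raw)


# Hypothetical sample data containing one unparseable value.
raw_values = ["1.5", "2", "(N/A)"]

column = LazyColumn(raw_values, float)
plain_count = column.count()  # succeeds: no value is parsed

try:
    column.cache().count()  # fails: caching forces parsing
    cached_error = None
except ValueError as exc:
    cached_error = str(exc)

print(plain_count)
print(cached_error)
```

In this sketch the failure shows up on the count() that follows cache(), not on cache() itself, which matches the error in the report being raised while calling count. The practical fix is still to clean or handle values such as "(N/A)" before parsing them as numbers.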

> Weird behaviour with `.cache()`
> -------------------------------
>
>                 Key: SPARK-26943
>                 URL: https://issues.apache.org/jira/browse/SPARK-26943
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark
>    Affects Versions: 2.1.0
>            Reporter: Will Uto
>            Priority: Major
>
>  
> {code:python}
> sdf.count()
> {code}
>  
> works fine. However:
>  
> {code:python}
> sdf = sdf.cache()
> sdf.count()
> {code}
> does not, and produces the following error:
> {code:java}
> Py4JJavaError: An error occurred while calling o314.count.
> : org.apache.spark.SparkException: Job aborted due to stage failure: Task 75 in stage 8.0 failed 4 times, most recent failure: Lost task 75.3 in stage 8.0 (TID 438, uat-datanode-02, executor 1): java.text.ParseException: Unparseable number: "(N/A)"
> 	at java.text.NumberFormat.parse(NumberFormat.java:350)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
