Posted to issues@spark.apache.org by "Iverson Hu (JIRA)" <ji...@apache.org> on 2018/09/08 06:20:00 UTC
[jira] [Created] (SPARK-25377) spark sql dataframe cache is invalid
Iverson Hu created SPARK-25377:
----------------------------------
Summary: spark sql dataframe cache is invalid
Key: SPARK-25377
URL: https://issues.apache.org/jira/browse/SPARK-25377
Project: Spark
Issue Type: Bug
Components: Spark Core
Affects Versions: 2.3.0
Environment: spark version 2.3.0
scala version 2.1.8
Reporter: Iverson Hu
When I use a Spark SQL DataFrame in my application, dataframe.cache appears to have no effect: the first action, such as count(), took 40 seconds, and the second execution of the same action took just as long. When I use dataframe.rdd.cache instead, the second execution is faster than the first. I think this is a bug in Spark SQL DataFrame caching.
This is my code and console log; I had already cached the result DataFrame beforehand. !image-2018-09-08-14-18-36-780.png!
!image-2018-09-08-14-18-07-759.png!
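Since the code above is only attached as screenshots, here is a minimal sketch of the kind of timing check described, for anyone who wants to reproduce it. It is not the reporter's code: the object name CacheTimingSketch, the timed helper, the local master, and the generated `result` DataFrame are all assumptions made for illustration. It relies on two documented behaviours: cache() is lazy, so the first action pays for both computing and storing the data, and the cache is only used when later actions run against the same cached DataFrame (or one with an equivalent plan).

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.storage.StorageLevel

object CacheTimingSketch {
  // Evaluates `body` once and returns its result together with the
  // elapsed wall-clock time in milliseconds.
  def timed[A](body: => A): (A, Long) = {
    val start = System.nanoTime()
    val result = body
    (result, (System.nanoTime() - start) / 1000000L)
  }

  def main(args: Array[String]): Unit = {
    // Assumption: a local session, used only to keep the sketch self-contained.
    val spark = SparkSession.builder()
      .appName("cache-timing-sketch")
      .master("local[*]")
      .getOrCreate()

    // Hypothetical stand-in for the `result` DataFrame from the report.
    val result = spark.range(0L, 5000000L).selectExpr("id", "id % 100 AS bucket")

    // cache() only marks the plan for caching; nothing is computed yet.
    val cached = result.cache()

    // The first action computes the data and fills the cache.
    val (firstCount, firstMs) = timed(cached.count())
    // The second action on the same DataFrame can read from the cache.
    val (secondCount, secondMs) = timed(cached.count())

    // A level other than NONE means the DataFrame is registered for caching.
    println(s"registered for caching: ${cached.storageLevel != StorageLevel.NONE}")
    println(s"first count:  $firstCount rows in $firstMs ms")
    println(s"second count: $secondCount rows in $secondMs ms")

    // Prints the physical plan, where a cached scan would be visible.
    cached.explain()

    cached.unpersist()
    spark.stop()
  }
}
```

If the second count is not faster in a setup like this, it may be worth checking whether the second action runs on the exact DataFrame that was cached or on one rebuilt through a different plan, and whether the Storage tab of the Spark UI lists the cached data.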
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)