Posted to issues@spark.apache.org by "Dustin Smith (Jira)" <ji...@apache.org> on 2020/06/22 07:04:00 UTC

[jira] [Created] (SPARK-32046) current_timestamp called in a cache dataframe freezes the time for all future calls

Dustin Smith created SPARK-32046:
------------------------------------

             Summary: current_timestamp called in a cache dataframe freezes the time for all future calls
                 Key: SPARK-32046
                 URL: https://issues.apache.org/jira/browse/SPARK-32046
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 2.4.4, 2.3.0
            Reporter: Dustin Smith


If I call current_timestamp three times, caching each resulting dataframe in order to freeze that dataframe's time, the time of the 3rd dataframe and of every later one (4th, 5th, ...) is frozen to the 2nd dataframe's time. The 1st and 2nd dataframes differ in time, but from the 3rd usage onward the value stays static.

 
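{code:java}
// Scala, run in spark-shell (which provides `spark`).
import org.apache.spark.sql.functions.current_timestamp

// 1st dataframe: cache it, then materialize the cache with count.
val df1 = spark.range(1).select(current_timestamp as "datetime").cache
df1.count

df1.show(false)

Thread.sleep(9500)

// 2nd dataframe: its time differs from df1's.
val df2 = spark.range(1).select(current_timestamp as "datetime").cache
df2.count

df2.show(false)

Thread.sleep(9500)

// 3rd dataframe: shows df2's time instead of a new one.
val df3 = spark.range(1).select(current_timestamp as "datetime").cache
df3.count

df3.show(false)
{code}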
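
To compare the values programmatically instead of reading them off the show output, a minimal sketch along the following lines may help. It assumes a spark-shell session where `spark` is already defined. `cachedTimestamp` is a hypothetical helper introduced here for illustration only; it is not part of Spark or of the reproduction above.

{code:scala}
import java.sql.Timestamp
import org.apache.spark.sql.functions.current_timestamp

// Hypothetical helper (not a Spark API): build a one-row dataframe, cache it,
// materialize the cache with count, and return the timestamp it holds.
def cachedTimestamp(): Timestamp = {
  val df = spark.range(1).select(current_timestamp as "datetime").cache
  df.count
  df.head.getTimestamp(0)
}

val first = cachedTimestamp()
Thread.sleep(9500)
val second = cachedTimestamp()
Thread.sleep(9500)
val third = cachedTimestamp()

// Print the gaps between consecutive dataframes in milliseconds.
println(s"second - first = ${second.getTime - first.getTime} ms")
println(s"third - second = ${third.getTime - second.getTime} ms")
{code}

If the behavior described above carries over to this form, the second difference would be 0 ms while the first is roughly the sleep interval. I have not verified that the helper triggers the problem in exactly the same way as the top-level vals.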


