Posted to issues@spark.apache.org by "Sean R. Owen (Jira)" <ji...@apache.org> on 2019/10/26 23:08:00 UTC

[jira] [Resolved] (SPARK-28575) Time lag between two consecutive spark actions using Spark 2.3.1

     [ https://issues.apache.org/jira/browse/SPARK-28575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean R. Owen resolved SPARK-28575.
----------------------------------
    Resolution: Invalid

There's not enough information here. We don't even know what your code is doing in between. There could be other sources of legitimate difference here.

> Time lag between two consecutive spark actions using Spark 2.3.1
> ----------------------------------------------------------------
>
>                 Key: SPARK-28575
>                 URL: https://issues.apache.org/jira/browse/SPARK-28575
>             Project: Spark
>          Issue Type: Bug
>          Components: Scheduler
>    Affects Versions: 2.3.1
>            Reporter: Kushal Mahajan
>            Priority: Major
>         Attachments: spark_2.1_screenshot.PNG, spark_2.3_screenshot.PNG
>
>
> Steps to reproduce:
>  # Read a directory (consisting of txt files) using the SparkContext's wholeTextFiles method
>  # Perform a transformation on the resulting pair RDD
>  # Perform an action (foreach) on each entry corresponding to each txt file
>  # A time lag can be seen between these actions in the Spark UI.
> The action itself does not take much time. There is a time lag between the start times of consecutive actions (excluding the time taken by the job itself). Kindly refer to the attachments.
> PS: This time lag is not seen when running the job on Spark 2.1.1
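
One way to put a number on the lag described above is to compute the idle time between consecutive jobs from their submission and completion times, for example as read off the Spark UI jobs page. The sketch below is a minimal illustration under that assumption, not code from the reporter: `JobSpan`, `idle_gaps`, and the sample timestamps are all hypothetical and are not taken from the attachments.

```python
from datetime import datetime, timedelta
from typing import List, NamedTuple, Tuple


class JobSpan(NamedTuple):
    """Hypothetical record of one Spark job's lifetime."""
    job_id: int
    submitted: datetime
    completed: datetime


def idle_gaps(jobs: List[JobSpan]) -> List[Tuple[int, int, timedelta]]:
    """Return (previous job id, next job id, idle time) for consecutive jobs.

    Idle time is the next job's submission time minus the previous job's
    completion time, so the duration of the jobs themselves is excluded.
    """
    ordered = sorted(jobs, key=lambda j: j.submitted)
    return [
        (prev.job_id, nxt.job_id, nxt.submitted - prev.completed)
        for prev, nxt in zip(ordered, ordered[1:])
    ]


# Made-up timestamps, for illustration only.
example = [
    JobSpan(0, datetime(2019, 1, 1, 12, 0, 0), datetime(2019, 1, 1, 12, 0, 2)),
    JobSpan(1, datetime(2019, 1, 1, 12, 0, 9), datetime(2019, 1, 1, 12, 0, 10)),
    JobSpan(2, datetime(2019, 1, 1, 12, 0, 11), datetime(2019, 1, 1, 12, 0, 15)),
]

for prev_id, next_id, gap in idle_gaps(example):
    print(f"job {prev_id} -> job {next_id}: idle {gap.total_seconds():.1f}s")
```

Running the same calculation on the job times from both Spark versions would show whether the extra time falls between jobs or inside them. It does not say what the driver is doing during a gap.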


