Posted to issues@spark.apache.org by "Marcelo Vanzin (JIRA)" <ji...@apache.org> on 2019/02/07 18:41:00 UTC

[jira] [Resolved] (SPARK-21755) Spark 2.1.1 UI page not displaying any dynamic updates on job progress after showing progress for initial few minutes of job run.

     [ https://issues.apache.org/jira/browse/SPARK-21755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marcelo Vanzin resolved SPARK-21755.
------------------------------------
    Resolution: Not A Bug

I'm closing this for a few reasons.

- There have been a lot of changes to the history server's handling of event logs since 2.2 that may make this better.
- From your description, it seems that "UI" here means the history server. It's normal for it to not see many updates, depending on how the event logs are written.
- For example, if your event logs are on S3 or in some other storage where the read side doesn't necessarily see updates from the write side, you can get into this situation.

I'd suggest working with the EMR team first, and if they identify a Spark issue, then file a bug here.
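
As a rough illustration of the storage point above, the sketch below parses spark-defaults.conf style text and flags event log locations that sit on S3. The property names (spark.eventLog.enabled, spark.eventLog.dir, spark.history.fs.logDirectory, spark.history.fs.update.interval) are standard Spark settings. The helper names, the sample values, the bucket name, and the set of URI schemes treated as object stores are assumptions made for this example; none of them come from the reporter's cluster. The parser also assumes whitespace-separated key/value pairs.

```python
from urllib.parse import urlparse

# Assumption for this sketch: URI schemes treated as object stores where a
# reader may not see a writer's in-progress updates.
OBJECT_STORE_SCHEMES = {"s3", "s3a", "s3n"}

# Hypothetical sample values, not taken from the reporter's cluster.
SAMPLE_CONF = """
# event logging
spark.eventLog.enabled            true
spark.eventLog.dir                s3://example-bucket/spark-events/
spark.history.fs.logDirectory     s3://example-bucket/spark-events/
spark.history.fs.update.interval  10s
"""


def parse_spark_conf(text):
    """Parse whitespace-separated 'key value' lines into a dict."""
    conf = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split(None, 1)
        conf[parts[0]] = parts[1] if len(parts) > 1 else ""
    return conf


def event_log_warnings(conf):
    """Return human-readable warnings about the event log settings."""
    warnings = []
    if conf.get("spark.eventLog.enabled", "false").lower() != "true":
        warnings.append("spark.eventLog.enabled is not true; "
                        "the history server has no event log to read")
    for key in ("spark.eventLog.dir", "spark.history.fs.logDirectory"):
        scheme = urlparse(conf.get(key, "")).scheme
        if scheme in OBJECT_STORE_SCHEMES:
            warnings.append(f"{key} uses '{scheme}://'; the history server "
                            "may not see updates until the log is finalized")
    return warnings


if __name__ == "__main__":
    for warning in event_log_warnings(parse_spark_conf(SAMPLE_CONF)):
        print("WARNING:", warning)
```

A check like this only points at configuration worth examining. It does not show whether a given storage layer actually delays visibility of in-progress event logs.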

> Spark 2.1.1 UI page not displaying any dynamic updates on job progress after showing progress for initial few minutes of job run.
> ---------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-21755
>                 URL: https://issues.apache.org/jira/browse/SPARK-21755
>             Project: Spark
>          Issue Type: Bug
>          Components: Web UI
>    Affects Versions: 2.1.1
>         Environment: The issue was reproduced on an EMR cluster with the following configuration:
> ### EMR Release label:      emr-5.6.0
> ### Hadoop distribution:      Amazon 2.7.3
> ### Applications installed:   Hive 2.1.1, Spark 2.1.1
>            Reporter: Ankur
>            Priority: Major
>
> When a Spark SQL job is run, the Spark application's web console (UI) is updated intermittently for the first few minutes (~10-15 minutes); after that there are no updates on job progress, even after job execution completes. As soon as the "Spark SQL" session is terminated, the Spark UI is updated with the job summary.
> The issue was reproduced using spark-sql on a data set of around 1.2 TB. Here are the steps:
> Step 1> An EMR cluster is launched (release emr-5.6.0 with applications Hive 2.1.1 and Spark 2.1.1).
> Step 2> The following command is run:
> spark-sql> CREATE TABLE total_flights USING com.databricks.spark.csv OPTIONS (path "s3://bucket/test_web_UI/flight/", header "true", inferSchema "true");
> Data set used: flight history in CSV files provided by the US Department of Transportation, Bureau of Transportation Statistics - https://www.transtats.bts.gov/DL_SelectFields.asp?Table_ID=236&DB_Short_Name=On-Time
> Step 3> There were no updates on the Web UI after the initial ~10 minutes. The Web UI was not updated even a few hours later, after the job had completed successfully.
> Step 4> Once the spark-sql session ended, the Spark UI was updated with the job summary as expected.
> I have verified that "spark.history.fs.update.interval" is set to its default value of 10 seconds, as described in this document: "https://spark.apache.org/docs/latest/monitoring.html".



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
