Posted to issues@spark.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2018/12/13 14:56:00 UTC

[jira] [Commented] (SPARK-26363) Remove redundant field `executorLogs` in TaskData

    [ https://issues.apache.org/jira/browse/SPARK-26363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16720259#comment-16720259 ] 

ASF GitHub Bot commented on SPARK-26363:
----------------------------------------

gengliangwang opened a new pull request #23310: [SPARK-26363][WebUI] Remove redundant field `executorLogs` in TaskData
URL: https://github.com/apache/spark/pull/23310
 
 
   ## What changes were proposed in this pull request?
   
   In https://github.com/apache/spark/pull/21688, a new field `executorLogs` was added to `TaskData` in `api.scala`. This has three problems:
   1. The field should not belong to `TaskData` (judging by its name, it is executor data, not task data).
   2. It is redundant with `ExecutorSummary`.
   3. For each row in the task table, the executor log value is looked up in the KV store every time, which can be avoided for better performance.
   ![image](https://user-images.githubusercontent.com/1097932/49946230-841c7680-ff29-11e8-8b83-d8f7553bfe5e.png)
   
   
   This PR proposes to reuse the executor details returned by the "/allexecutors" request, so that we have a cleaner API data structure and avoid redundant KV store queries.
   (Before https://github.com/apache/spark/pull/21688, the stage page used a hash map to avoid duplicate executor log lookups. But I think reusing the result of "allexecutors" is better.)
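
To make the idea concrete, here is a minimal sketch (not the code in this PR) of joining task rows against a single "/allexecutors" result instead of querying the store once per row. The payloads are trimmed-down, hypothetical stand-ins, and the field names `id`, `executorId`, `taskId` and the log URLs are assumptions for illustration only.

```python
from typing import Dict, List

# Hypothetical, trimmed-down payloads. Real responses carry many more fields,
# and the exact field names used here are assumptions for illustration.
ALL_EXECUTORS: List[dict] = [
    {"id": "1", "executorLogs": {"stdout": "http://host-a:8042/stdout",
                                 "stderr": "http://host-a:8042/stderr"}},
    {"id": "2", "executorLogs": {"stdout": "http://host-b:8042/stdout",
                                 "stderr": "http://host-b:8042/stderr"}},
]
TASKS: List[dict] = [
    {"taskId": 0, "executorId": "1"},
    {"taskId": 1, "executorId": "2"},
    {"taskId": 2, "executorId": "1"},
]


def index_executor_logs(executors: List[dict]) -> Dict[str, Dict[str, str]]:
    """Build an executor-id -> log-links map in a single pass."""
    return {e["id"]: e.get("executorLogs", {}) for e in executors}


def task_table_rows(tasks: List[dict],
                    logs_by_executor: Dict[str, Dict[str, str]]) -> List[dict]:
    """Attach log links to each task row by dictionary lookup (no per-row store query)."""
    return [
        {**task, "executorLogs": logs_by_executor.get(task["executorId"], {})}
        for task in tasks
    ]


rows = task_table_rows(TASKS, index_executor_logs(ALL_EXECUTORS))
```

With this shape, the executor list is read once per page load, and each task row costs only an in-memory lookup; tasks whose executor is unknown simply get an empty set of log links.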
   ## How was this patch tested?
   
   Manual check

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


> Remove redundant field `executorLogs` in TaskData
> -------------------------------------------------
>
>                 Key: SPARK-26363
>                 URL: https://issues.apache.org/jira/browse/SPARK-26363
>             Project: Spark
>          Issue Type: Improvement
>          Components: Web UI
>    Affects Versions: 3.0.0
>            Reporter: Gengliang Wang
>            Priority: Major
>
> In https://github.com/apache/spark/pull/21688, a new field `executorLogs` was added to `TaskData` in `api.scala`. This has three problems:
> 1. The field should not belong to `TaskData` (judging by its name, it is executor data, not task data).
> 2. It is redundant with `ExecutorSummary`.
> 3. For each row in the task table, the executor log value is looked up in the KV store every time, which can be avoided for better performance at large scale.
> This PR proposes to reuse the executor details returned by the "/allexecutors" request, so that we have a cleaner API data structure and avoid redundant KV store queries.


