Posted to mapreduce-issues@hadoop.apache.org by "Wangda Tan (JIRA)" <ji...@apache.org> on 2018/11/13 18:02:02 UTC

[jira] [Assigned] (MAPREDUCE-7158) Inefficient Flush Logic in JobHistory EventWriter

     [ https://issues.apache.org/jira/browse/MAPREDUCE-7158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wangda Tan reassigned MAPREDUCE-7158:
-------------------------------------

    Assignee: Zichen Sun

> Inefficient Flush Logic in JobHistory EventWriter
> -------------------------------------------------
>
>                 Key: MAPREDUCE-7158
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7158
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 3.2.0
>            Reporter: Zichen Sun
>            Assignee: Zichen Sun
>            Priority: Major
>         Attachments: MAPREDUCE-7158-001.patch
>
>
> In HDFS, if flush is implemented to send a server request that actually commits the pending writes on the storage service side, benchmark runs show that MR jobs take much longer. Investigation shows that the current implementation for writing events does not look right:
> EventWriter#write()
> The flush in this method is redundant and the statement should be removed. It defeats the purpose of having a separate flush function.
> Encoder.flush calls flush on the underlying output stream.
> With the fix applied, the MR jobs complete normally. Please find the patch attached.
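
The cost described above can be illustrated with a small, self-contained sketch. It is not the Hadoop EventWriter code or the attached patch: FlushCountDemo, CountingStream, and SketchWriter are hypothetical names, and the counting stream only stands in for an output stream where every flush would be a server round trip. The sketch compares a writer that flushes inside write() with one that leaves flushing to a separate flush() call.

```java
import java.io.ByteArrayOutputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class FlushCountDemo {

    /** Counts flush() calls; stands in for a stream where each flush is costly. */
    static class CountingStream extends FilterOutputStream {
        int flushes = 0;

        CountingStream(OutputStream out) {
            super(out);
        }

        @Override
        public void flush() throws IOException {
            flushes++;
            super.flush();
        }
    }

    /** Hypothetical event writer; flushOnWrite models the redundant flush in write(). */
    static class SketchWriter {
        private final OutputStream out;
        private final boolean flushOnWrite;

        SketchWriter(OutputStream out, boolean flushOnWrite) {
            this.out = out;
            this.flushOnWrite = flushOnWrite;
        }

        void write(String event) throws IOException {
            out.write(event.getBytes(StandardCharsets.UTF_8));
            out.write('\n');
            if (flushOnWrite) {
                out.flush(); // the per-event flush the report calls redundant
            }
        }

        void flush() throws IOException {
            out.flush(); // the separate flush the caller invokes when needed
        }
    }

    /** Writes the given number of events, flushes once at the end, returns total flush calls. */
    static int countFlushes(boolean flushOnWrite, int events) {
        try {
            CountingStream stream = new CountingStream(new ByteArrayOutputStream());
            SketchWriter writer = new SketchWriter(stream, flushOnWrite);
            for (int i = 0; i < events; i++) {
                writer.write("event-" + i);
            }
            writer.flush();
            return stream.flushes;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("flush inside write(): " + countFlushes(true, 1000));
        System.out.println("separate flush only:  " + countFlushes(false, 1000));
    }
}
```

In this sketch, 1000 events produce 1001 flush calls when write() flushes and a single call when it does not. How much that matters in practice depends on what the underlying stream does on flush, which is the condition the report describes.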



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
