Posted to mapreduce-issues@hadoop.apache.org by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org> on 2010/07/06 13:18:52 UTC
[jira] Resolved: (MAPREDUCE-583) get rid of excessive flushes from PipeMapper/Reducer
[ https://issues.apache.org/jira/browse/MAPREDUCE-583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Amareshwari Sriramadasu resolved MAPREDUCE-583.
-----------------------------------------------
Resolution: Duplicate
Fixed by HADOOP-3429
> get rid of excessive flushes from PipeMapper/Reducer
> ----------------------------------------------------
>
> Key: MAPREDUCE-583
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-583
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: contrib/streaming
> Reporter: Joydeep Sen Sarma
>
> There is a flush on the buffered output streams in the mapper/reducer for every row of data:
> // 2/4 Hadoop to Tool
> if (numExceptions_ == 0) {
>   if (!this.ignoreKey) {
>     write(key);
>     clientOut_.write('\t');
>   }
>   write(value);
>   if (!this.skipNewline) {
>     clientOut_.write('\n');
>   }
>   clientOut_.flush();
> } else {
>   numRecSkipped_++;
> }
> I tried to measure the impact of removing this flush. The number of context switches (the cs column) reported by vmstat shows a marked decline.
> with flush (10 second intervals):
>  r  b swpd   free  buff   cache si so   bi    bo   in    cs us sy id wa
>  4  2  784  23140 83352 3114648  0  0 4819 32397 1175 13220 59 11 13 17
>  1  2  784 129724 80704 3075696  0  0 4614 27196 1156 14797 49 11 19 21
>  4  0  784  24160 83440 3174880  0  0   96 36070 1337 10976 67 11  9 12
>  5  0  784 155872 84400 3158840  0  0  125 44084 1280 11044 68 14 10  8
>  2  1  784 365128 87048 2892032  0  0  119 38472 1317 11610 69 14 10  7
> without flush (same columns):
>  5  0  784  24652 56056 3217864  0  0  310 29499 1379  7603 76  9  7  8
>  5  3  784 118456 54568 3209992  0  0 3249 33426 1173  6828 63 11 12 14
>  0  2  784 227628 54820 3198560  0  0 7840 30063 1146  8899 60 10 15 15
>  3  1  784  25608 55048 3313512  0  0 3251 36276 1194  7915 60 10 15 15
>  1  2  784 197324 49968 3194572  0  0 4714 35479 1281  8204 62 13 12 13
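> cs goes down by about 20-30% (the five samples listed here average roughly 12,300 with the flush and 7,900 without, about 36% lower). I am having trouble measuring the overall speed improvement, though: there are too many variables, such as speculative execution, so a better benchmark is needed.
> Either way, removing the flush can't hurt.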
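
The effect described in the report can be illustrated outside Hadoop. The following is a minimal sketch, not the streaming code itself: the class FlushDemo, the CountingSink helper, the countSinkWrites method, and the 8192-byte buffer size are all hypothetical names and values chosen for illustration. It writes tab-separated records through a java.io.BufferedOutputStream and counts how often the buffered layer hands data to the underlying stream, with and without a flush after every record.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class FlushDemo {

    /** Hypothetical sink that counts how many writes reach the underlying stream. */
    static final class CountingSink extends OutputStream {
        int writeCalls = 0;

        @Override
        public void write(int b) {
            writeCalls++;
        }

        @Override
        public void write(byte[] b, int off, int len) {
            writeCalls++;
        }
    }

    /**
     * Writes numRecords lines of the form "k\tv\n" through a buffered stream
     * and returns the number of write calls that reached the sink.
     */
    public static int countSinkWrites(int numRecords, boolean flushPerRecord) {
        CountingSink sink = new CountingSink();
        // 8192 bytes is an assumed buffer size, not taken from the issue.
        OutputStream clientOut = new BufferedOutputStream(sink, 8192);
        byte[] key = "k".getBytes(StandardCharsets.UTF_8);
        byte[] value = "v".getBytes(StandardCharsets.UTF_8);
        try {
            for (int i = 0; i < numRecords; i++) {
                clientOut.write(key);
                clientOut.write('\t');
                clientOut.write(value);
                clientOut.write('\n');
                if (flushPerRecord) {
                    clientOut.flush(); // mirrors the per-row flush in the report
                }
            }
            clientOut.flush(); // one final flush so no buffered data is lost
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.writeCalls;
    }

    public static void main(String[] args) {
        System.out.println("flush per record: " + countSinkWrites(1000, true));
        System.out.println("flush at end:     " + countSinkWrites(1000, false));
    }
}
```

With 1000 four-byte records, the per-record flush produces one underlying write per record, while flushing only at the end produces a single write because all 4000 bytes fit in the buffer. When the sink is a pipe to another process, as in streaming, each underlying write is typically a system call, which is a plausible source of the extra context switches seen in the vmstat output.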