Posted to mapreduce-commits@hadoop.apache.org by cd...@apache.org on 2009/12/09 10:14:06 UTC
svn commit: r888743 - in /hadoop/mapreduce/branches/branch-0.21: CHANGES.txt
src/docs/src/documentation/content/xdocs/mapred_tutorial.xml
Author: cdouglas
Date: Wed Dec 9 09:14:00 2009
New Revision: 888743
URL: http://svn.apache.org/viewvc?rev=888743&view=rev
Log:
MAPREDUCE-1074. Document Reducer mark/reset functionality.
Contributed by Jothi Padmanabhan
Modified:
hadoop/mapreduce/branches/branch-0.21/CHANGES.txt
hadoop/mapreduce/branches/branch-0.21/src/docs/src/documentation/content/xdocs/mapred_tutorial.xml
Modified: hadoop/mapreduce/branches/branch-0.21/CHANGES.txt
URL: http://svn.apache.org/viewvc/hadoop/mapreduce/branches/branch-0.21/CHANGES.txt?rev=888743&r1=888742&r2=888743&view=diff
==============================================================================
--- hadoop/mapreduce/branches/branch-0.21/CHANGES.txt (original)
+++ hadoop/mapreduce/branches/branch-0.21/CHANGES.txt Wed Dec 9 09:14:00 2009
@@ -859,3 +859,5 @@
MAPREDUCE-754. Fix NPE in expiry thread when a TT is lost. (Amar Kamat
via sharad)
+ MAPREDUCE-1074. Document Reducer mark/reset functionality. (Jothi
+ Padmanabhan via cdouglas)
Modified: hadoop/mapreduce/branches/branch-0.21/src/docs/src/documentation/content/xdocs/mapred_tutorial.xml
URL: http://svn.apache.org/viewvc/hadoop/mapreduce/branches/branch-0.21/src/docs/src/documentation/content/xdocs/mapred_tutorial.xml?rev=888743&r1=888742&r2=888743&view=diff
==============================================================================
--- hadoop/mapreduce/branches/branch-0.21/src/docs/src/documentation/content/xdocs/mapred_tutorial.xml (original)
+++ hadoop/mapreduce/branches/branch-0.21/src/docs/src/documentation/content/xdocs/mapred_tutorial.xml Wed Dec 9 09:14:00 2009
@@ -947,6 +947,177 @@
map-outputs before writing them out to the <code>FileSystem</code>.
</p>
</section>
+
+ <section>
+ <title>Mark-Reset</title>
+
+ <p>While applications iterate through the values for a given key, it is
+ possible to mark the current position and later reset the iterator to
+ this position and continue the iteration process. The corresponding
+ methods are <code>mark()</code> and <code>reset()</code>.
+ </p>
+
+ <p><code>mark()</code> and <code>reset()</code> can be called any
+ number of times during the iteration cycle. Each call to
+ <code>reset()</code> returns the iterator to the position recorded by
+ the most recent call to <code>mark()</code>.
+ </p>
+
+ <p>This functionality is available only with the new context-based
+ reduce iterator.
+ </p>
+
+ <p> The following code snippet demonstrates the use of this
+ functionality.
+ </p>
+
+ <section>
+ <title>Source Code</title>
+
+ <table>
+ <tr><td>
+ <code>
+ public void reduce(IntWritable key,
+ Iterable&lt;IntWritable&gt; values,
+ Context context) throws IOException, InterruptedException {
+ </code>
+ </td></tr>
+
+ <tr><td></td></tr>
+
+ <tr><td>
+ <code>
+
+ MarkableIterator&lt;IntWritable&gt; mitr =
+ new MarkableIterator&lt;IntWritable&gt;(values.iterator());
+ </code>
+ </td></tr>
+
+ <tr><td></td></tr>
+
+ <tr><td>
+ <code>
+
+ // Mark the position
+ </code>
+ </td></tr>
+
+ <tr><td>
+ <code>
+
+ mitr.mark();
+ </code>
+ </td></tr>
+
+ <tr><td></td></tr>
+
+ <tr><td>
+ <code>
+
+ while (mitr.hasNext()) {
+ </code>
+ </td></tr>
+
+ <tr><td>
+ <code>
+
+ IntWritable i = mitr.next();
+ </code>
+ </td></tr>
+
+ <tr><td>
+ <code>
+
+ // Do the necessary processing
+ </code>
+ </td></tr>
+
+ <tr><td>
+ <code>
+
+ }
+ </code>
+ </td></tr>
+
+ <tr><td></td></tr>
+
+ <tr><td>
+ <code>
+
+ // Reset
+ </code>
+ </td></tr>
+
+ <tr><td>
+ <code>
+
+ mitr.reset();
+ </code>
+ </td></tr>
+
+ <tr><td></td></tr>
+
+ <tr><td>
+ <code>
+
+ // Iterate all over again. Since mark was called before the first
+ </code>
+ </td></tr>
+
+ <tr><td>
+ <code>
+
+ // call to mitr.next() in this example, we will iterate over all
+ </code>
+ </td></tr>
+
+ <tr><td>
+ <code>
+
+ // the values now
+ </code>
+ </td></tr>
+
+ <tr><td>
+ <code>
+
+ while (mitr.hasNext()) {
+ </code>
+ </td></tr>
+
+ <tr><td>
+ <code>
+
+ IntWritable i = mitr.next();
+ </code>
+ </td></tr>
+
+ <tr><td>
+ <code>
+
+ // Do the necessary processing
+ </code>
+ </td></tr>
+
+ <tr><td>
+ <code>
+
+ }
+ </code>
+ </td></tr>
+
+ <tr><td></td></tr>
+
+ <tr><td>
+ <code>
+ }
+ </code>
+ </td></tr>
+
+ </table>
+ </section>
+
+ </section>
</section>
<section>
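
Note on the mark/reset semantics documented in the diff above: the behaviour can be sketched without a Hadoop dependency. The class below is a hypothetical stand-in named BufferingMarkableIterator, written only for illustration. It is not Hadoop's MarkableIterator and makes no claim about how Hadoop buffers values internally. It assumes that mark() records the position just before the next unread value and that reset() replays from that position.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for illustration only; NOT Hadoop's MarkableIterator.
final class BufferingMarkableIterator<T> implements Iterator<T> {
    private final Iterator<T> source;                  // single-pass underlying iterator
    private final List<T> buffer = new ArrayList<>();  // values read since the last mark()
    private int replayPos = 0;                         // index of the next buffered value to replay
    private boolean marked = false;

    BufferingMarkableIterator(Iterator<T> source) {
        this.source = source;
    }

    // Record the current position: forget values already consumed,
    // keep any buffered values that have not been replayed yet.
    void mark() {
        buffer.subList(0, replayPos).clear();
        replayPos = 0;
        marked = true;
    }

    // Return to the position recorded by the most recent mark().
    void reset() {
        if (!marked) {
            throw new IllegalStateException("reset() called before mark()");
        }
        replayPos = 0;
    }

    @Override
    public boolean hasNext() {
        return replayPos < buffer.size() || source.hasNext();
    }

    @Override
    public T next() {
        if (replayPos < buffer.size()) {
            return buffer.get(replayPos++);            // replay a buffered value
        }
        T value = source.next();                       // throws NoSuchElementException when exhausted
        if (marked) {
            buffer.add(value);                         // remember it for a later reset()
            replayPos++;
        }
        return value;
    }
}

final class MarkResetSketch {
    // Mirrors the tutorial snippet: mark before the first next(), iterate,
    // reset, then iterate over all the values again.
    static List<Integer> twoPasses(Iterable<Integer> values) {
        BufferingMarkableIterator<Integer> mitr =
            new BufferingMarkableIterator<>(values.iterator());
        List<Integer> seen = new ArrayList<>();
        mitr.mark();
        while (mitr.hasNext()) {
            seen.add(mitr.next());
        }
        mitr.reset();
        while (mitr.hasNext()) {
            seen.add(mitr.next());
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(twoPasses(Arrays.asList(1, 2, 3))); // prints [1, 2, 3, 1, 2, 3]
    }
}
```

Because mark() is called before the first next(), the second loop sees every value again. Calling mark() after some values have been consumed would make reset() replay only the values read after that point.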