Posted to commits@pig.apache.org by ch...@apache.org on 2014/02/19 19:27:17 UTC
svn commit: r1569868 - in /pig/trunk: CHANGES.txt
src/docs/src/documentation/content/xdocs/perf.xml
Author: cheolsoo
Date: Wed Feb 19 18:27:17 2014
New Revision: 1569868
URL: http://svn.apache.org/r1569868
Log:
PIG-3740: Document direct fetch optimization (lbendig via cheolsoo)
Modified:
pig/trunk/CHANGES.txt
pig/trunk/src/docs/src/documentation/content/xdocs/perf.xml
Modified: pig/trunk/CHANGES.txt
URL: http://svn.apache.org/viewvc/pig/trunk/CHANGES.txt?rev=1569868&r1=1569867&r2=1569868&view=diff
==============================================================================
--- pig/trunk/CHANGES.txt (original)
+++ pig/trunk/CHANGES.txt Wed Feb 19 18:27:17 2014
@@ -89,6 +89,8 @@ OPTIMIZATIONS
BUG FIXES
+PIG-3740: Document direct fetch optimization (lbendig via cheolsoo)
+
PIG-3746: NPE is thrown if Pig fails before PigStats is intialized (cheolsoo)
PIG-3747: Update skewed join documentation (cheolsoo)
Modified: pig/trunk/src/docs/src/documentation/content/xdocs/perf.xml
URL: http://svn.apache.org/viewvc/pig/trunk/src/docs/src/documentation/content/xdocs/perf.xml?rev=1569868&r1=1569867&r2=1569868&view=diff
==============================================================================
--- pig/trunk/src/docs/src/documentation/content/xdocs/perf.xml (original)
+++ pig/trunk/src/docs/src/documentation/content/xdocs/perf.xml Wed Feb 19 18:27:17 2014
@@ -1047,6 +1047,34 @@ java -cp $PIG_HOME/pig.jar
</ul>
<p></p>
</section>
+
+<!-- +++++++++++++++++++++++++++++++ -->
+<section id="direct-fetch">
+<title>Direct Fetch</title>
+<p>When the <a href="test.html#dump">DUMP</a> operator is used to execute Pig Latin statements, Pig can minimize latency by reading data directly from HDFS rather than launching MapReduce jobs.</p>
+
+<p>
+The result is fetched directly if the query contains only the following operators:
+<a href="basic.html#filter">FILTER</a>,
+<a href="basic.html#foreach">FOREACH</a>,
+<a href="basic.html#limit">LIMIT</a>,
+<a href="basic.html#stream">STREAM</a>,
+<a href="basic.html#union">UNION</a>.
+<br></br>
+Fetching is disabled in the case of:
+</p>
+<ul>
+ <li>the presence of other operators, <a href="http://pig.apache.org/docs/r0.13.0/api/org/apache/pig/impl/builtin/SampleLoader.html">sample loaders</a> and scalar expressions</li>
+ <li>implicit splits</li>
+</ul>
+
+<p>
+To check whether a query can be fetched, run EXPLAIN. The MapReduce part of the plan should show "No MR jobs. Fetch only."
+<br></br>
+Direct fetch is turned on by default. To turn it off, set the property opt.fetch to false or start Pig with the "-N" or "-no_fetch" option.
+</p>
+
+</section>
</section>
<!-- ==================================================================== -->