Posted to commits@pig.apache.org by ch...@apache.org on 2014/02/19 19:27:17 UTC

svn commit: r1569868 - in /pig/trunk: CHANGES.txt src/docs/src/documentation/content/xdocs/perf.xml

Author: cheolsoo
Date: Wed Feb 19 18:27:17 2014
New Revision: 1569868

URL: http://svn.apache.org/r1569868
Log:
PIG-3740: Document direct fetch optimization (lbendig via cheolsoo)

Modified:
    pig/trunk/CHANGES.txt
    pig/trunk/src/docs/src/documentation/content/xdocs/perf.xml

Modified: pig/trunk/CHANGES.txt
URL: http://svn.apache.org/viewvc/pig/trunk/CHANGES.txt?rev=1569868&r1=1569867&r2=1569868&view=diff
==============================================================================
--- pig/trunk/CHANGES.txt (original)
+++ pig/trunk/CHANGES.txt Wed Feb 19 18:27:17 2014
@@ -89,6 +89,8 @@ OPTIMIZATIONS
  
 BUG FIXES
 
+PIG-3740: Document direct fetch optimization (lbendig via cheolsoo)
+
 PIG-3746: NPE is thrown if Pig fails before PigStats is initialized (cheolsoo)
 
 PIG-3747: Update skewed join documentation (cheolsoo)

Modified: pig/trunk/src/docs/src/documentation/content/xdocs/perf.xml
URL: http://svn.apache.org/viewvc/pig/trunk/src/docs/src/documentation/content/xdocs/perf.xml?rev=1569868&r1=1569867&r2=1569868&view=diff
==============================================================================
--- pig/trunk/src/docs/src/documentation/content/xdocs/perf.xml (original)
+++ pig/trunk/src/docs/src/documentation/content/xdocs/perf.xml Wed Feb 19 18:27:17 2014
@@ -1047,6 +1047,34 @@ java -cp $PIG_HOME/pig.jar 
 </ul>
 <p></p>
 </section>
+
+<!-- +++++++++++++++++++++++++++++++ -->
+<section id="direct-fetch">
+<title>Direct Fetch</title>
+<p>When the <a href="test.html#dump">DUMP</a> operator is used to execute Pig Latin statements, Pig can minimize latency by reading data directly from HDFS rather than launching MapReduce jobs.</p>
+
+<p>
+The result is fetched directly if the query contains no operators other than the following: 
+<a href="basic.html#filter">FILTER</a>, 
+<a href="basic.html#foreach">FOREACH</a>, 
+<a href="basic.html#limit">LIMIT</a>, 
+<a href="basic.html#stream">STREAM</a>, 
+<a href="basic.html#union">UNION</a>.
+<br></br>
+Fetching is disabled in case of:
+</p>
+<ul>
+  <li>the presence of other operators, <a href="http://pig.apache.org/docs/r0.13.0/api/org/apache/pig/impl/builtin/SampleLoader.html">sample loaders</a> and scalar expressions</li>
+  <li>implicit splits</li>
+</ul>
+
+<p>
+To check whether a query can be fetched, run EXPLAIN. For a fetchable query, the MapReduce part of the plan shows "No MR jobs. Fetch only."
+<br></br>
+Direct fetch is turned on by default. To turn it off, set the property opt.fetch to false or start Pig with the "-N" or "-no_fetch" option.
+</p>
+
+</section>
 </section>
   
 <!-- ==================================================================== -->
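
As an illustration of the direct fetch section added in this commit, here is a minimal Pig Latin sketch of a query that should qualify for fetching, since it uses only FILTER, FOREACH and LIMIT between LOAD and DUMP. The input file name, schema, and aliases are hypothetical and are not part of the commit.

```pig
-- Hypothetical input: a tab-separated file with name and age columns.
students = LOAD 'students.txt' AS (name:chararray, age:int);

-- FILTER, FOREACH and LIMIT are all on the list of fetchable operators.
adults = FILTER students BY age >= 18;
names = FOREACH adults GENERATE name;
first_ten = LIMIT names 10;

-- Per the documentation, the MapReduce part of the plan should read
-- "No MR jobs. Fetch only." if the query can be fetched.
EXPLAIN first_ten;

-- With direct fetch enabled (the default), DUMP reads the data directly
-- instead of launching a MapReduce job.
DUMP first_ten;
```

Starting Pig with the "-N" or "-no_fetch" option should disable the optimization for the same script. Adding an operator outside the list, such as a GROUP or JOIN, is also expected to disable it, because the documentation lists the presence of other operators among the disabling conditions.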