Posted to commits@crunch.apache.org by bu...@apache.org on 2015/01/11 18:27:58 UTC

svn commit: r935831 - in /websites/staging/crunch/trunk/content: ./ user-guide.html

Author: buildbot
Date: Sun Jan 11 17:27:58 2015
New Revision: 935831

Log:
Staging update by buildbot for crunch

Modified:
    websites/staging/crunch/trunk/content/   (props changed)
    websites/staging/crunch/trunk/content/user-guide.html

Propchange: websites/staging/crunch/trunk/content/
------------------------------------------------------------------------------
--- cms:source-revision (original)
+++ cms:source-revision Sun Jan 11 17:27:58 2015
@@ -1 +1 @@
-1650712
+1650928

Modified: websites/staging/crunch/trunk/content/user-guide.html
==============================================================================
--- websites/staging/crunch/trunk/content/user-guide.html (original)
+++ websites/staging/crunch/trunk/content/user-guide.html Sun Jan 11 17:27:58 2015
@@ -1579,15 +1579,23 @@ computations that combine custom DoFns w
 implementation to create test data sets that we can easily verify by hand, and then this same logic can be executed on
 a distributed data set using either the <a href="#mrpipeline">MRPipeline</a> or <a href="#sparkpipeline">SparkPipeline</a> implementations.</p>
 <h3 id="pipeline-execution-plan-visualizations">Pipeline execution plan visualizations</h3>
-<p>Crunch provides tools to visualize the pipeline execution plans. The <a href="apidocs/0.10.0/org/apache/crunch/PipelineExecution.html">PipelineExecution</a>  <code>String getPlanDotFile()</code> method returns an execution plan visualization in DOT format. When the output folder property is set, Crunch produces a DOT file after each pipeline run. </p>
-<p>Additional aspects of the execution plans are provided when the DOT file debug mode is enabled. Then Crunch provides 4 additional DOT diagrams visualizing different internal stages of the execution plan. Such plans include PCollection lineage, Base graph plan, Split graph plans, Run-time nodes. 
-Note: To enable the debug mode you should set an output folder first. The following snippet switches on the DOT file debug mode. As a result 5 DOT diagrams are generated in the output folder after each Pipeline execution:</p>
-<div class="codehilite"><pre>    <span class="n">Configuration</span> <span class="n">conf</span> <span class="p">=</span> <span class="p">...</span>
-    <span class="n">String</span> <span class="n">dotfileDir</span> <span class="p">=</span> <span class="p">...</span>
+<p>Crunch provides tools to visualize the pipeline execution plan. The <a href="apidocs/0.10.0/org/apache/crunch/PipelineExecution.html">PipelineExecution</a> <code>String getPlanDotFile()</code> method returns a DOT format visualization of the execution plan. Furthermore, if the output folder is set, Crunch saves the dotfile diagram on each pipeline execution:</p>
+<div class="codehilite"><pre>    <span class="n">Configuration</span> <span class="n">conf</span> <span class="p">=...;</span>     
+    <span class="n">String</span> <span class="n">dotfileDir</span> <span class="p">=...;</span>
 
+    <span class="o">//</span> <span class="n">Set</span> <span class="n">DOT</span> <span class="n">files</span> <span class="n">output</span> <span class="n">folder</span> <span class="n">path</span>
     <span class="n">DotfileUtills</span><span class="p">.</span><span class="n">setPipelineDotfileOutputDir</span><span class="p">(</span><span class="n">conf</span><span class="p">,</span> <span class="n">dotfileDir</span><span class="p">);</span>
+</pre></div>
+
+
+<p>Additional details of the Crunch execution plan can be exposed by enabling the dotfile debug mode like this:</p>
+<div class="codehilite"><pre>    <span class="o">//</span> <span class="n">Requires</span> <span class="n">the</span> <span class="n">output</span> <span class="n">folder</span> <span class="n">to</span> <span class="n">be</span> <span class="n">set</span><span class="p">.</span>
     <span class="n">DotfileUtills</span><span class="p">.</span><span class="n">enableDebugDotfiles</span><span class="p">(</span><span class="n">conf</span><span class="p">);</span>
 </pre></div>
+
+
+<p>This will produce (and save) 4 additional diagrams that visualize the internal stages of the pipeline execution plan: the PCollection lineage, the pipeline base and split graphs, and the run-time node (RTNode) representation.</p>
+<p>(Note: The debug mode requires the output folder to be set.)</p>
         </div> <!-- /span -->
 
       </div> <!-- /row-fluid -->