Posted to commits@pig.apache.org by Apache Wiki <wi...@apache.org> on 2009/05/04 19:05:50 UTC

[Pig Wiki] Update of "PigMultiQueryPerformanceSpecification" by RichardDing

The following page has been changed by RichardDing:
http://wiki.apache.org/pig/PigMultiQueryPerformanceSpecification

------------------------------------------------------------------------------
  
  Will be executed as:
  
- attachment:mapreduce.png
+ [TBD]
  
  If a split happens in a reduce plan, splittees have to be map-only jobs to be merged into the splitter.
  If there are map-reduce splittees the reduce will result in a tmp store and the splittees are run in separate
@@ -570, +570 @@

  The merging of splittees into a splitter consists of:
  
     * Creating a split operator in the map or reduce and setting the splittee plans as nested plans of the split
-    * If it needs to merge combiners it will introduce a Demux operator to route the input from mixed split branches in the mapper to the right combine plan. The separate combiner plans are the nested plans of the Demux operator
+    * If it needs to merge combiners it will introduce a Demux operator to route the input from mixed split branches in the mapper to the right combine plan. The separate combiner plans are the nested plans of the Demux operator   
-    * If a map reduce operator does not have a combiner it will insert a FakeLocalRearrange operator to simply route the input through.
     * If it needs to merge reduce plans, it will do so using the Demux operator the same way the combiner is merged.
+    * In the cases where some splittees have combiners and some do not have combiners, the optimizer chooses either the subset of splittees with combiners or the subset of splittees without combiners--depending on which subset is larger--and merges these splittees into the splitter.
  
  Note: The end result of this merging is Split or Demux operators with multiple stores tucked away in their nested plans.
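
The rule for splittees with mixed combiner usage can be illustrated with a small Python sketch. This is not Pig's optimizer code: the `Splittee` type, the `has_combiner` flag, and the function name are hypothetical, and the tie-breaking choice (prefer the group without combiners when both subsets are the same size) is an assumption, since the text above does not say what happens on a tie.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Splittee:
    """Hypothetical stand-in for a map-reduce splittee job."""
    name: str
    has_combiner: bool


def choose_splittees_to_merge(splittees: List[Splittee]) -> List[Splittee]:
    """Return the larger of the two subsets (with / without combiners).

    Assumption: on a tie the subset without combiners is returned;
    the specification does not define this case.
    """
    with_combiner = [s for s in splittees if s.has_combiner]
    without_combiner = [s for s in splittees if not s.has_combiner]
    if len(with_combiner) > len(without_combiner):
        return with_combiner
    return without_combiner


jobs = [
    Splittee("a", has_combiner=True),
    Splittee("b", has_combiner=False),
    Splittee("c", has_combiner=True),
]
merged = choose_splittees_to_merge(jobs)
merged_names = [s.name for s in merged]
```

In this example two of the three splittees have combiners, so those two would be merged into the splitter; under the rule as described, the remaining one is not part of that merge.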
  
@@ -636, +636 @@

  [[Anchor(DemuxOperator)]]
  ===== Demux Operator =====
  
- The demux operator is used in combiners and reducers where the input is a mix of different split plans of the mapper. It will decide which of its nested plans a record belongs to and then attach it to that particular plan.
+ The demux operator is used in combiners and reducers where the input is a mix of different split plans of the mapper. The outputs of the split plans are indexed, and based on that index the demux operator decides which of its nested plans a record belongs to and then attaches the record to that plan.
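
The index-based routing described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Pig's Demux implementation: the `Demux` class, its `route` method, and the representation of a record as an `(index, value)` pair are all assumptions made for the example.

```python
from typing import Any, Callable, Dict, Iterable, List, Tuple

# Assumption: each record leaving the mapper is tagged with the index
# of the split plan that produced it.
IndexedRecord = Tuple[int, Any]


class Demux:
    """Hypothetical demultiplexer: one nested plan per split index."""

    def __init__(self, nested_plans: Dict[int, Callable[[Any], Any]]):
        self.nested_plans = nested_plans

    def route(self, records: Iterable[IndexedRecord]) -> Dict[int, List[Any]]:
        """Attach each record to the nested plan selected by its index."""
        outputs: Dict[int, List[Any]] = {i: [] for i in self.nested_plans}
        for index, value in records:
            if index not in self.nested_plans:
                raise ValueError(f"no nested plan for index {index}")
            plan = self.nested_plans[index]
            outputs[index].append(plan(value))
        return outputs


# Two nested plans standing in for two merged combine or reduce plans.
demux = Demux({0: lambda v: v * 2, 1: lambda v: str(v)})
routed = demux.route([(0, 1), (1, 2), (0, 3)])
```

Here the records tagged with index 0 go through the first plan and the record tagged with index 1 goes through the second, even though all three arrive interleaved in a single input stream.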
+ 
  
  [[Anchor(Local_Execution_engine)]]
  ==== Local Execution Engine ====