Posted to commits@pig.apache.org by Apache Wiki <wi...@apache.org> on 2009/11/19 18:55:03 UTC

[Pig Wiki] Update of "PigSkewedJoinSpec" by ThejasNair

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Pig Wiki" for change notification.

The "PigSkewedJoinSpec" page has been changed by ThejasNair.
http://wiki.apache.org/pig/PigSkewedJoinSpec?action=diff&rev1=12&rev2=13

--------------------------------------------------

  
  In order to use skewed join,
  
-    * Skewed join currently works with tow-table inner join.
+    * Skewed join currently works with two-table inner join.
     * Append 'using "skewed"' construct to the join to force Pig to use skewed join.
     * pig.skewedjoin.reduce.memusage specifies the fraction of heap available for the reducer to perform the join. A low fraction forces Pig to use more reducers but increases copying cost. For PigMix tests, we have seen good performance with values in the range 0.1 - 0.4. However, this range is only a rough guide: the best value depends on the amount of heap available for the operation, the number of columns in the input and the skew, and is best found by experiment. The default value is =0.5=.
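
  As an illustration of the points above, a two-table inner join using skewed join might look like the sketch below. The relation names, file paths and schemas are hypothetical and not taken from the wiki page. The 'using "skewed"' quoting follows the wording on this page; other Pig releases may expect different quoting. pig.skewedjoin.reduce.memusage is not set in the script itself. It is assumed to be supplied as a Java parameter when Pig is launched, and how that is done depends on the release.

```
-- Hypothetical inputs: 'page_views' is assumed to be large and skewed on user_id.
views = LOAD 'page_views' AS (user_id:chararray, url:chararray);
users = LOAD 'users' AS (user_id:chararray, name:chararray);

-- Two-table inner join; the trailing construct asks Pig to use skewed join.
joined = JOIN views BY user_id, users BY user_id USING "skewed";

STORE joined INTO 'views_with_users';
```

  When tuning, one would rerun the same script with different pig.skewedjoin.reduce.memusage values (for example within the 0.1 - 0.4 range mentioned above) and compare run times.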