Posted to dev@nemo.apache.org by GitBox <gi...@apache.org> on 2018/06/10 14:37:13 UTC

[GitHub] jeongyooneo opened a new pull request #30: [NEMO-63] On-demand dynamic optimization metric aggregation

jeongyooneo opened a new pull request #30: [NEMO-63] On-demand dynamic optimization metric aggregation
URL: https://github.com/apache/incubator-nemo/pull/30
 
 
   JIRA: [NEMO-63: On-demand dynamic optimization metric aggregation](https://issues.apache.org/jira/projects/NEMO/issues/NEMO-63)
   
   **Major changes:**
   - The metric data used by DataSkewRuntimePass was sent via NCSMessage and piled up until stage completion, which caused OOM errors in 100-200 GB-scale skew experiments. This PR aggregates the metric data on demand to avoid that.
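
   As a rough illustration of the idea above, the sketch below folds each incoming partial metric report into a running total instead of buffering every report until the stage completes. It is not Nemo's actual code: the class `OnDemandAggregator`, the method `onMetricMessage`, and the assumption that a report maps an integer key to a byte count are all hypothetical, chosen only to show the pattern.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of on-demand aggregation; not Nemo's actual classes. */
public class OnDemandAggregator {
  // Running total per key, updated as each report arrives (assumed: key -> bytes).
  private final Map<Integer, Long> aggregated = new HashMap<>();

  /** Folds one partial report into the running totals; the report itself is not retained. */
  public void onMetricMessage(final Map<Integer, Long> partial) {
    partial.forEach((key, size) -> aggregated.merge(key, size, Long::sum));
  }

  /** Returns a copy of the totals aggregated so far. */
  public Map<Integer, Long> snapshot() {
    return new HashMap<>(aggregated);
  }

  public static void main(final String[] args) {
    final OnDemandAggregator aggregator = new OnDemandAggregator();
    aggregator.onMetricMessage(Map.of(0, 10L, 1, 5L));
    aggregator.onMetricMessage(Map.of(1, 7L, 2, 3L));
    System.out.println(aggregator.snapshot());
  }
}
```

   With this pattern, memory use grows with the number of distinct keys rather than with the number of reports received.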
   
   **Minor changes to note:**
   - Added a MapReduceSkew Beam example, which generates skewed key-value tuples from skewed input in the map stage and calculates the median of the per-key values in the reduce stage.
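
   The reduce-stage computation described above can be sketched in plain Java collections, deliberately without the Beam API. This is not the MapReduceSkew example itself: the class `PerKeyMedian`, its method names, and the choice to average the two middle values for an even count are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch of a per-key median; not the MapReduceSkew example itself. */
public class PerKeyMedian {
  /** Median of a non-empty list; an even count averages the two middle values (assumption). */
  static double median(final List<Long> values) {
    if (values.isEmpty()) {
      throw new IllegalArgumentException("median of an empty list is undefined");
    }
    final List<Long> sorted = new ArrayList<>(values);
    Collections.sort(sorted);
    final int n = sorted.size();
    final int mid = n / 2;
    return n % 2 == 1 ? sorted.get(mid) : (sorted.get(mid - 1) + sorted.get(mid)) / 2.0;
  }

  /** Groups values by key (the "shuffle"), then computes one median per key (the "reduce"). */
  static Map<String, Double> medianPerKey(final List<Map.Entry<String, Long>> pairs) {
    final Map<String, List<Long>> grouped = new TreeMap<>();
    for (final Map.Entry<String, Long> pair : pairs) {
      grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
    }
    final Map<String, Double> result = new TreeMap<>();
    grouped.forEach((key, values) -> result.put(key, median(values)));
    return result;
  }

  public static void main(final String[] args) {
    // A skewed input: the key "hot" appears more often than "cold".
    final List<Map.Entry<String, Long>> pairs = List.of(
        Map.entry("hot", 1L), Map.entry("hot", 9L), Map.entry("hot", 4L),
        Map.entry("cold", 2L), Map.entry("cold", 8L));
    System.out.println(medianPerKey(pairs));
  }
}
```

   Because a median needs all values of a key in one place, a heavily skewed key concentrates work on a single reducer, which is what makes this kind of workload useful for exercising skew handling.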
   
   **Tests for the changes:**
   - Adapted DataSkewRuntimePassTest to match the change.
   
   **Other comments:**
   - N/A
   
   resolves [NEMO-63](https://issues.apache.org/jira/projects/NEMO/issues/NEMO-63)
   
