Posted to issues@flink.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2015/08/17 05:33:45 UTC

[jira] [Commented] (FLINK-2534) Improve execution code in CompactingHashTable.java

    [ https://issues.apache.org/jira/browse/FLINK-2534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14698969#comment-14698969 ] 

ASF GitHub Bot commented on FLINK-2534:
---------------------------------------

GitHub user HuangWHWHW opened a pull request:

    https://github.com/apache/flink/pull/1029

    [FLINK-2534][RUNTIME]Improve in CompactingHashTable.java

    1. Remove the variable currentForwardPointer, which is unused in the code; this slightly reduces memory cost.
    2. Move numInSegment++ out of the if branch and remove the else branch to reduce the complexity of the loop.
    3. Replace the "while" loop with a formula:
    numBuckets += numPartitions - numBuckets % numPartitions;
    
    In addition, I found some places that simply use a "for" loop to find a maximum or minimum, such as the following code:
    'private int getMaxPartition() {
    		int maxPartition = 0;
    		for(InMemoryPartition<T> p1 : this.partitions) {
    			if(p1.getBlockCount() > maxPartition) {
    				maxPartition = p1.getBlockCount();
    			}
    		}
    		return maxPartition;
    	}'
    This hurts performance somewhat.
    Should we use an algorithm or data structure (e.g. RMQ, a segment tree, or a heap) to optimize this?


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/HuangWHWHW/flink FLINK-2534

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/1029.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1029
    
----
commit 4303d6541cd539715d64780b1c4dc2b883ead99e
Author: HuangWHWHW <40...@qq.com>
Date:   2015-08-17T03:31:02Z

    [FLINK-2534][RUNTIME]Improve in CompactingHashTable.java

----


> Improve execution code in CompactingHashTable.java
> --------------------------------------------------
>
>                 Key: FLINK-2534
>                 URL: https://issues.apache.org/jira/browse/FLINK-2534
>             Project: Flink
>          Issue Type: Improvement
>          Components: Local Runtime
>    Affects Versions: 0.10
>            Reporter: Huang Wei
>             Fix For: 0.10
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> I found some code in CompactingHashTable.java that can be improved; this code is executed many times when Flink runs.
> In my opinion, some of the code in "for" and "while" loops can be optimized to reduce the number of iterations, which should improve performance.
> For example, the following code:
> 'while(numBuckets % numPartitions != 0) {
> 			numBuckets++;
> 		}'
> can be optimized into a formula:
> numBuckets += numPartitions - (numBuckets % numPartitions);
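
To make the comparison between the loop and the formula concrete, here is a minimal Java sketch, assuming numBuckets is non-negative and numPartitions is positive. The class RoundUpSketch and its method names are hypothetical and are not part of Flink. It suggests that the formula as written matches the loop only when numBuckets is not already a multiple of numPartitions; a guarded variant with one extra modulo appears to match the loop in both cases.

```java
// Sketch only: RoundUpSketch and its methods are hypothetical, not Flink code.
// Assumes numBuckets >= 0 and numPartitions > 0.
final class RoundUpSketch {

    // The original loop: increment until numBuckets is a multiple of numPartitions.
    static int roundUpLoop(int numBuckets, int numPartitions) {
        while (numBuckets % numPartitions != 0) {
            numBuckets++;
        }
        return numBuckets;
    }

    // The formula as proposed in the issue. When the remainder is 0 it still
    // adds numPartitions, whereas the loop leaves numBuckets unchanged.
    static int roundUpProposed(int numBuckets, int numPartitions) {
        return numBuckets + numPartitions - (numBuckets % numPartitions);
    }

    // Guarded variant: the extra modulo turns an adjustment of numPartitions into 0.
    static int roundUpGuarded(int numBuckets, int numPartitions) {
        int remainder = numBuckets % numPartitions;
        return numBuckets + (numPartitions - remainder) % numPartitions;
    }

    public static void main(String[] args) {
        // Not yet a multiple: all three variants agree.
        System.out.println(roundUpLoop(10, 4));     // 12
        System.out.println(roundUpProposed(10, 4)); // 12
        System.out.println(roundUpGuarded(10, 4));  // 12

        // Already a multiple: the proposed formula overshoots by numPartitions.
        System.out.println(roundUpLoop(12, 4));     // 12
        System.out.println(roundUpProposed(12, 4)); // 16
        System.out.println(roundUpGuarded(12, 4));  // 12
    }
}
```

Whether the extra buckets in the already-aligned case matter depends on how the table uses numBuckets, which this sketch does not model.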



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)