Posted to issues@flink.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2015/08/17 05:33:45 UTC
[jira] [Commented] (FLINK-2534) Improve execution code in CompactingHashTable.java
[ https://issues.apache.org/jira/browse/FLINK-2534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14698969#comment-14698969 ]
ASF GitHub Bot commented on FLINK-2534:
---------------------------------------
GitHub user HuangWHWHW opened a pull request:
https://github.com/apache/flink/pull/1029
[FLINK-2534][RUNTIME]Improve in CompactingHashTable.java
1. Remove the variable currentForwardPointer, which is unused in the code; this slightly reduces memory cost.
2. Move numInSegment++ out of the if branch and remove the else branch to simplify the loop.
3. Replace the "while" loop with a formula:
numBuckets += numPartitions - numBuckets % numPartitions;
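As a minimal sketch for comparing the two forms, the snippet below puts the original loop (which increments numBuckets until it is a multiple of numPartitions) next to the proposed formula. The class and method names are hypothetical and not part of Flink, and the sketch assumes numBuckets >= 0 and numPartitions > 0. One thing it seems to show: when numBuckets is already a multiple of numPartitions, the loop adds nothing, while the formula as written adds a full numPartitions. A guarded variant is included as one possible way to keep the loop's behavior.

```java
// Hypothetical helper class for illustration only; not part of Flink.
final class BucketRounding {

    // Original approach: increment until numBuckets is a multiple of numPartitions.
    static int roundUpLoop(int numBuckets, int numPartitions) {
        while (numBuckets % numPartitions != 0) {
            numBuckets++;
        }
        return numBuckets;
    }

    // Formula exactly as proposed in the pull request description.
    static int roundUpFormula(int numBuckets, int numPartitions) {
        return numBuckets + numPartitions - (numBuckets % numPartitions);
    }

    // Guarded variant (an assumption, not taken from the PR):
    // only adds the missing amount when there is a remainder.
    static int roundUpGuarded(int numBuckets, int numPartitions) {
        int remainder = numBuckets % numPartitions;
        return remainder == 0 ? numBuckets : numBuckets + (numPartitions - remainder);
    }

    public static void main(String[] args) {
        // With a remainder, all three agree.
        System.out.println(roundUpLoop(10, 4));    // 12
        System.out.println(roundUpFormula(10, 4)); // 12
        System.out.println(roundUpGuarded(10, 4)); // 12

        // Already a multiple: the unguarded formula overshoots by numPartitions.
        System.out.println(roundUpLoop(12, 4));    // 12
        System.out.println(roundUpFormula(12, 4)); // 16
        System.out.println(roundUpGuarded(12, 4)); // 12
    }
}
```

If the extra numPartitions buckets in the already-divisible case are not intended, the guarded form keeps the constant-time cost while matching the loop for every input in the assumed range.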
In addition, I found some places that simply use a "for" loop to find a max or min value, like the following code:
    private int getMaxPartition() {
        int maxPartition = 0;
        for (InMemoryPartition<T> p1 : this.partitions) {
            if (p1.getBlockCount() > maxPartition) {
                maxPartition = p1.getBlockCount();
            }
        }
        return maxPartition;
    }
Each call scans every partition, so its cost grows linearly with the number of partitions, which may hurt performance if it is called often.
Should we use a data structure (e.g. RMQ, segment tree, or heap) to optimize this?
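To make the question concrete, here is a minimal sketch of one such alternative: a sorted multiset of block counts backed by java.util.TreeMap, so that updating a count is O(log n) and reading the maximum is a lastKey() lookup. The class BlockCountTracker and its methods are hypothetical and do not exist in Flink, and the sketch assumes the caller reports every change in a partition's block count. Whether this beats the linear scan depends on how many partitions there are and how often the maximum is queried; for a small partition list the plain loop may well be cheaper.

```java
import java.util.TreeMap;

// Hypothetical sketch for illustration only; not part of Flink.
final class BlockCountTracker {

    // Maps a block count to the number of partitions currently having that count.
    private final TreeMap<Integer, Integer> counts = new TreeMap<>();

    // Registers a partition with the given block count.
    void add(int blockCount) {
        counts.merge(blockCount, 1, Integer::sum);
    }

    // Records that one partition's block count changed from oldCount to newCount.
    void update(int oldCount, int newCount) {
        remove(oldCount);
        add(newCount);
    }

    // Drops one occurrence of blockCount; removes the key when it reaches zero.
    private void remove(int blockCount) {
        counts.computeIfPresent(blockCount, (k, v) -> v == 1 ? null : v - 1);
    }

    // Returns the largest tracked block count, or 0 if nothing is tracked
    // (matching the initial value used by getMaxPartition()).
    int max() {
        return counts.isEmpty() ? 0 : counts.lastKey();
    }

    public static void main(String[] args) {
        BlockCountTracker tracker = new BlockCountTracker();
        tracker.add(3);
        tracker.add(7);
        tracker.add(5);
        System.out.println(tracker.max()); // 7

        tracker.update(7, 2);              // the largest partition shrinks
        System.out.println(tracker.max()); // 5
    }
}
```

The trade-off is extra bookkeeping on every block-count change in exchange for a cheaper maximum query, so it only pays off if getMaxPartition() is called much more often than partitions grow or shrink.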
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/HuangWHWHW/flink FLINK-2534
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/flink/pull/1029.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #1029
----
commit 4303d6541cd539715d64780b1c4dc2b883ead99e
Author: HuangWHWHW <40...@qq.com>
Date: 2015-08-17T03:31:02Z
[FLINK-2534][RUNTIME]Improve in CompactingHashTable.java
----
> Improve execution code in CompactingHashTable.java
> --------------------------------------------------
>
> Key: FLINK-2534
> URL: https://issues.apache.org/jira/browse/FLINK-2534
> Project: Flink
> Issue Type: Improvement
> Components: Local Runtime
> Affects Versions: 0.10
> Reporter: Huang Wei
> Fix For: 0.10
>
> Original Estimate: 168h
> Remaining Estimate: 168h
>
> I found some code in CompactingHashTable.java that can be improved; this code is executed many times when Flink runs.
> In my opinion, some code in "for" and "while" loops can be optimized to reduce the number of iterations, which should improve performance.
> For example, the following code:
>     while (numBuckets % numPartitions != 0) {
>         numBuckets++;
>     }
> can be optimized into a formula:
> numBuckets += numPartitions - (numBuckets % numPartitions);
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)