Posted to issues@flink.apache.org by "Lisheng Sun (Jira)" <ji...@apache.org> on 2020/05/31 11:47:00 UTC

[jira] [Comment Edited] (FLINK-12693) Store state per key-group in CopyOnWriteStateTable

    [ https://issues.apache.org/jira/browse/FLINK-12693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17120452#comment-17120452 ] 

Lisheng Sun edited comment on FLINK-12693 at 5/31/20, 11:46 AM:
----------------------------------------------------------------

Hi [~banmoy],

According to the test results, computing the hash in CopyOnWriteStateMap is much slower than in the JDK HashMap.

Could you explain what the new hash algorithm is for? Is it meant to reduce hash collisions? Thank you.

CopyOnWriteStateMap#computeHashForOperationAndDoIncrementalRehash#compositeHash#bitMix

 
{code:java}
public static int bitMix(int in) {
   in ^= in >>> 16;
   in *= 0x85ebca6b;
   in ^= in >>> 13;
   in *= 0xc2b2ae35;
   in ^= in >>> 16;
   return in;
}
{code}
HashMap#hash
{code:java}
static final int hash(Object key) {
    int h;
    return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}
{code}
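To make the collision question concrete, here is a minimal sketch that applies both mixing steps to the same hash codes and counts the distinct buckets they land in. The class name HashSpreadDemo, the helper distinctBuckets, the table size of 16, and the choice of hash codes (multiples of 16) are assumptions made for illustration. They are not taken from Flink or from the test mentioned above, and the sketch measures distribution only, not speed.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.function.IntUnaryOperator;

// Hypothetical demo class; not part of Flink or the JDK.
final class HashSpreadDemo {

    // Same steps as the bitMix snippet quoted above.
    static int bitMix(int in) {
        in ^= in >>> 16;
        in *= 0x85ebca6b;
        in ^= in >>> 13;
        in *= 0xc2b2ae35;
        in ^= in >>> 16;
        return in;
    }

    // Same spreading step as HashMap#hash, applied to an already computed hashCode.
    static int jdkSpread(int h) {
        return h ^ (h >>> 16);
    }

    // Counts the distinct buckets hit by the hash codes 0, stride, 2 * stride, ...
    // tableSize is assumed to be a power of two, so masking selects the bucket.
    static int distinctBuckets(IntUnaryOperator mixer, int tableSize, int keyCount, int stride) {
        Set<Integer> buckets = new HashSet<>();
        for (int i = 0; i < keyCount; i++) {
            int hashCode = i * stride;
            buckets.add(mixer.applyAsInt(hashCode) & (tableSize - 1));
        }
        return buckets.size();
    }

    public static void main(String[] args) {
        System.out.println("HashMap-style spread: "
                + distinctBuckets(HashSpreadDemo::jdkSpread, 16, 16, 16));
        System.out.println("bitMix: "
                + distinctBuckets(HashSpreadDemo::bitMix, 16, 16, 16));
    }
}
```

With these inputs the HashMap-style spread leaves every hash code unchanged (all are below 2^16 and multiples of 16), so all 16 keys fall into a single bucket, while bitMix distributes them over more than one. The two extra multiplications are presumably where the additional cost comes from, so the trade-off is per-lookup hashing time against collisions for poorly distributed hash codes.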
 


was (Author: leosun08):
Hi [~banmoy],

According to the test results, computing the hash in CopyOnWriteStateMap is much slower than in the JDK HashMap.

Could you explain what the new hash algorithm is for? Thank you.

CopyOnWriteStateMap#computeHashForOperationAndDoIncrementalRehash#compositeHash#bitMix

 
{code:java}
public static int bitMix(int in) {
   in ^= in >>> 16;
   in *= 0x85ebca6b;
   in ^= in >>> 13;
   in *= 0xc2b2ae35;
   in ^= in >>> 16;
   return in;
}
{code}
HashMap#hash
{code:java}
static final int hash(Object key) {
    int h;
    return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}
{code}
 

> Store state per key-group in CopyOnWriteStateTable
> --------------------------------------------------
>
>                 Key: FLINK-12693
>                 URL: https://issues.apache.org/jira/browse/FLINK-12693
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Runtime / State Backends
>            Reporter: Yu Li
>            Assignee: PengFei Li
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Since we propose to use KeyGroup as the unit of spilling/loading, the first step is to store state per key-group. Currently {{NestedMapsStateTable}} natively supports this, so we only need to refine {{CopyOnWriteStateTable}}.
> The main effort required here is to extract the customized hash map out of {{CopyOnWriteStateTable}} and then use such a hash map as the state holder for each KeyGroup. Afterwards, we could extract some common logic into {{StateTable}}.
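
The layout described in the issue, one map per key-group held by a table, could look roughly like the following sketch. Everything here is hypothetical: the class PerKeyGroupTable, its methods, the use of java.util.HashMap in place of the extracted copy-on-write map, and the modulo-based key-group assignment are simplifications for illustration, not Flink's actual classes or assignment logic.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of a table that keeps one map per key-group.
final class PerKeyGroupTable<K, V> {

    // Index = key-group id; a plain HashMap stands in for the customized hash map.
    private final List<Map<K, V>> mapsByKeyGroup = new ArrayList<>();

    PerKeyGroupTable(int numKeyGroups) {
        for (int i = 0; i < numKeyGroups; i++) {
            mapsByKeyGroup.add(new HashMap<>());
        }
    }

    // Simplified assignment (assumption): hashCode modulo the number of key-groups.
    int keyGroupFor(K key) {
        return Math.floorMod(key.hashCode(), mapsByKeyGroup.size());
    }

    void put(K key, V value) {
        mapsByKeyGroup.get(keyGroupFor(key)).put(key, value);
    }

    V get(K key) {
        return mapsByKeyGroup.get(keyGroupFor(key)).get(key);
    }

    int sizeOfKeyGroup(int keyGroup) {
        return mapsByKeyGroup.get(keyGroup).size();
    }

    // Detaches a whole key-group, e.g. as the unit handed to a spill step,
    // and leaves an empty map in its place.
    Map<K, V> detachKeyGroup(int keyGroup) {
        return mapsByKeyGroup.set(keyGroup, new HashMap<>());
    }
}
```

Because each key-group owns its own map, a single group can be detached or restored without touching the entries of the others, which is what makes it usable as the unit of spilling/loading.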


