Posted to issues@flink.apache.org by sihuazhou <gi...@git.apache.org> on 2018/03/05 16:58:24 UTC
[GitHub] flink pull request #5641: [FLINK-8601][WIP] Introduce PartitionedBloomFilter...
GitHub user sihuazhou opened a pull request:
https://github.com/apache/flink/pull/5641
[FLINK-8601][WIP] Introduce PartitionedBloomFilter for Approximate calculation and other situations of performance optimization
This PR introduces PartitionedBloomFilter, which supports rescaling and can handle data skew properly.
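As a rough illustration of the general idea only (this is not the code in this PR), a Bloom filter can be split into independent partitions so that each partition can be moved as a unit when the job is rescaled. The sketch below assumes one small bit set per partition and picks the partition from the key's hash. The class name `PartitionedBloomSketch`, its constructor parameters, and the hashing scheme are all hypothetical, and the sketch does not attempt to address data skew.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.BitSet;

/** Hypothetical sketch: one small Bloom filter per partition. Not the PR's implementation. */
final class PartitionedBloomSketch {
    private final BitSet[] partitions;   // one independent bit set per partition
    private final int bitsPerPartition;  // size of each partition's bit set
    private final int numHashes;         // number of bit positions set per key

    PartitionedBloomSketch(int numPartitions, int bitsPerPartition, int numHashes) {
        this.partitions = new BitSet[numPartitions];
        for (int i = 0; i < numPartitions; i++) {
            partitions[i] = new BitSet(bitsPerPartition);
        }
        this.bitsPerPartition = bitsPerPartition;
        this.numHashes = numHashes;
    }

    // Assumption: the partition is derived from the key's hash,
    // so a key always maps to the same partition.
    private int partitionOf(String key) {
        return Math.floorMod(key.hashCode(), partitions.length);
    }

    // Double hashing: the i-th bit position is (h1 + i * h2) mod bitsPerPartition.
    private int bitIndex(byte[] bytes, int i) {
        int h1 = Arrays.hashCode(bytes);
        int h2 = fnv1a(bytes);
        return Math.floorMod(h1 + i * h2, bitsPerPartition);
    }

    // 32-bit FNV-1a, used here only as a second, cheap hash function.
    private static int fnv1a(byte[] data) {
        int hash = 0x811c9dc5;
        for (byte b : data) {
            hash ^= (b & 0xff);
            hash *= 0x01000193;
        }
        return hash;
    }

    void add(String key) {
        byte[] bytes = key.getBytes(StandardCharsets.UTF_8);
        BitSet bits = partitions[partitionOf(key)];
        for (int i = 0; i < numHashes; i++) {
            bits.set(bitIndex(bytes, i));
        }
    }

    // May return true for a key that was never added (false positive),
    // but never returns false for a key that was added.
    boolean mightContain(String key) {
        byte[] bytes = key.getBytes(StandardCharsets.UTF_8);
        BitSet bits = partitions[partitionOf(key)];
        for (int i = 0; i < numHashes; i++) {
            if (!bits.get(bitIndex(bytes, i))) {
                return false;
            }
        }
        return true;
    }
}
```

Because each partition only ever sees the keys hashed to it, a partition's bit set could in principle be handed to a different task without touching the others; how the PR itself maps partitions to tasks is described in the linked design doc, not here.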
## Brief change log
- Introduce PartitionedBloomFilter for approximate calculation and other performance optimization scenarios.
## Verifying this change
This change can be verified by the unit tests in the following files:
- PartitionedBloomFilterTest.java
- LinkedBloomFilterTest.java
- LinkedBloomFilterNodeTest.java
- PartitionedBloomFilterManagerTest.java
## Does this pull request potentially affect one of the following parts:
- Dependencies (does it add or upgrade a dependency): (no)
- The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (no)
- The serializers: (no)
- The runtime per-record code paths (performance sensitive): (no)
- Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (no)
- The S3 file system connector: (no)
## Documentation
- Does this pull request introduce a new feature? (yes)
doc: [google doc](https://docs.google.com/document/d/1s8w2dkNFDM9Fb2zoHwHY0hJRrqatAFta42T97nDXmqc/edit?usp=sharing)
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/sihuazhou/flink bloomfilter_state
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/flink/pull/5641.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #5641
----
commit 5429abe0031a93596b12dada6e9696f3179eb4e8
Author: summerleafs <su...@...>
Date: 2018-02-06T16:47:25Z
introduce bloom filter state.
commit 2d1f66c10fbf74272be76283b909b290ae55d4fd
Author: summerleafs <su...@...>
Date: 2018-02-07T14:52:22Z
add unit tests for bloom filter state.
commit 433370a12814f7bd80127d4508e1dd0812a9d3fe
Author: summerleafs <su...@...>
Date: 2018-02-07T18:12:13Z
add general type support.
commit 5e05ee84353516fe7ff6eb7dd3a01dfdb3337bc5
Author: summerleafs <su...@...>
Date: 2018-02-09T15:10:11Z
this is a tmp commit.
commit 6e4ff0cebed853c598e0647e9f8aa56b5b59d0cc
Author: summerleafs <su...@...>
Date: 2018-02-10T14:30:13Z
this is a tmp commit.
commit aa672e6e1e89b185722fde44a9b4044b87010c99
Author: summerleafs <su...@...>
Date: 2018-02-10T15:32:01Z
this is a tmp commit.
commit 3b04502ba277cad2a7b0bc381fb192d18b56f17d
Author: summerleafs <su...@...>
Date: 2018-02-11T11:34:54Z
fix build.
commit 775d6aaf354de35c7ddff242f8e006e13e9a0e76
Author: summerleafs <su...@...>
Date: 2018-02-12T03:52:43Z
add annotation for classes.
commit b7f04303aa1ec1fbe9696bb58b13838b6a74a7ae
Author: summerleafs <su...@...>
Date: 2018-02-12T03:53:19Z
a temp commit.
commit 28222bf5fc352a26082f2aee19be70ca5f9aa9d9
Author: sihuazhou <su...@...>
Date: 2018-03-05T16:48:15Z
fix build.
----
---
[GitHub] flink issue #5641: [FLINK-8601][WIP] Introduce PartitionedBloomFilter for Ap...
Posted by sihuazhou <gi...@git.apache.org>.
Github user sihuazhou commented on the issue:
https://github.com/apache/flink/pull/5641
For the sake of easier discussion later.
---