Posted to issues@flink.apache.org by aljoscha <gi...@git.apache.org> on 2016/05/02 14:03:39 UTC

[GitHub] flink pull request: [FLINK-3581] [FLINK-3582] State Iterator and A...

GitHub user aljoscha opened a pull request:

    https://github.com/apache/flink/pull/1957

    [FLINK-3581] [FLINK-3582] State Iterator and Aligned Time Windows

    This adds two things. The first is a new call `getPartitionedStateForAllKeys` on `AbstractStateBackend` that returns a `StateIterator`, which can be used to iterate over all keys with their respective state in a partition.
    
    The second addition is a new special-purpose window operator, `AlignedEventTimeWindowOperator`, that behaves like `WindowOperator` when a `Sliding/TumblingEventTimeWindows` assigner is used with an `EventTimeTrigger`. The new operator does this without keeping state/timers per window and key; only one timer is kept per window. Upon firing, the new `StateIterator` is used to traverse all keys and emit windows.
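
As a rough illustration of the idea described above (this is not the code in the pull request), here is a minimal, self-contained Java sketch of tumbling event-time windows that keeps one timer per window and, on firing, walks over all keys of that window. The class `AlignedWindowSketch` and its methods are hypothetical, plain `java.util` maps stand in for the state backend and the `StateIterator`, and the nested-map state layout is an assumption, since the thread does not show how the real operator stores its state. Sliding windows are not covered.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch, not Flink code: tumbling event-time windows, one timer per window.
class AlignedWindowSketch {
    private final long windowSize;

    // Window start -> (key -> running sum). The sorted window starts double as
    // the timer set: one entry per window, however many keys the window holds.
    private final TreeMap<Long, Map<String, Long>> windows = new TreeMap<>();

    AlignedWindowSketch(long windowSize) {
        this.windowSize = windowSize;
    }

    // Adds a value to the window that contains the given timestamp.
    void add(String key, long timestamp, long value) {
        long start = timestamp - Math.floorMod(timestamp, windowSize);
        windows.computeIfAbsent(start, s -> new HashMap<>()).merge(key, value, Long::sum);
    }

    // Fires every window whose last timestamp (start + size - 1) is at or
    // before the watermark and returns "windowStart:key=sum" per key.
    List<String> advanceWatermark(long watermark) {
        List<String> out = new ArrayList<>();
        while (!windows.isEmpty() && windows.firstKey() + windowSize - 1 <= watermark) {
            Map.Entry<Long, Map<String, Long>> fired = windows.pollFirstEntry();
            // Stand-in for the StateIterator: traverse all keys of the fired
            // window (sorted here only to make the output deterministic).
            for (Map.Entry<String, Long> e : new TreeMap<>(fired.getValue()).entrySet()) {
                out.add(fired.getKey() + ":" + e.getKey() + "=" + e.getValue());
            }
        }
        return out;
    }
}
```

With a window size of 10, adding `("a", 1, 1)`, `("b", 3, 2)`, `("a", 7, 4)` and `("a", 12, 5)` and then calling `advanceWatermark(9)` emits `0:a=5` and `0:b=2`; the window starting at 10 stays pending until the watermark reaches 19.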

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/aljoscha/flink window-aligned-special-case-as-operator

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/1957.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1957
    
----
commit 57af40ec71da20e5a9a5052959a3d1f5750c21f8
Author: Aljoscha Krettek <al...@gmail.com>
Date:   2016-03-07T16:06:34Z

    [FLINK-3582] Add Iterator over State for All Keys in Partitioned State

commit 65a0fff2ceca04b7ccfdd8a3632e0a98300d3509
Author: Aljoscha Krettek <al...@gmail.com>
Date:   2016-03-07T16:06:34Z

    [FLINK-3581] Add Special Aligned Event-Time WindowOperator

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-3581] [FLINK-3582] State Iterator and A...

Posted by aljoscha <gi...@git.apache.org>.
Github user aljoscha commented on the pull request:

    https://github.com/apache/flink/pull/1957#issuecomment-216258540
  
    @gyfora I thought that was possible, yes, but I don't think so anymore. Only after I finished my implementation did I find this document, which describes what I'm doing here: https://github.com/facebook/rocksdb/wiki/Delete-A-Range-Of-Keys
    The more optimized approaches they describe are only possible in the C++ API, if I'm not mistaken.
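
For readers following along, deleting a namespace by prefix in a sorted key-value store can be done by seeking to the first key with that prefix and deleting keys one by one until the prefix no longer matches. The following is a minimal sketch of that scan-and-delete pattern, assuming a `java.util.TreeMap` as a stand-in for RocksDB's sorted key space. No RocksDB API is used, and `deletePrefix` and the `namespace/key` layout are hypothetical.

```java
import java.util.Iterator;
import java.util.NavigableMap;

// Hypothetical sketch: prefix-based range delete over a sorted map.
class PrefixDeleteSketch {
    // Removes every entry whose key starts with the prefix and returns
    // how many entries were deleted.
    static int deletePrefix(NavigableMap<String, String> db, String prefix) {
        int deleted = 0;
        // tailMap(prefix, true) plays the role of seeking an iterator to the
        // first key that is greater than or equal to the prefix.
        Iterator<String> it = db.tailMap(prefix, true).keySet().iterator();
        while (it.hasNext()) {
            if (!it.next().startsWith(prefix)) {
                break; // keys are sorted, so no later key can match
            }
            it.remove(); // removes the entry from the backing map
            deleted++;
        }
        return deleted;
    }
}
```

The cost is one delete per key in the namespace, which is why a cheaper way of dropping a whole range at once would be attractive.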


---

[GitHub] flink pull request: [FLINK-3581] [FLINK-3582] State Iterator and A...

Posted by gyfora <gi...@git.apache.org>.
Github user gyfora commented on the pull request:

    https://github.com/apache/flink/pull/1957#issuecomment-216354689
  
    Is that a problem? Maybe we could do some periodic garbage collection on the empty column families.


---

[GitHub] flink pull request: [FLINK-3581] [FLINK-3582] State Iterator and A...

Posted by gyfora <gi...@git.apache.org>.
Github user gyfora commented on the pull request:

    https://github.com/apache/flink/pull/1957#issuecomment-216259593
  
    The other possibility would be to store them in different column families. I'm not sure about the performance there, though.


---

[GitHub] flink pull request: [FLINK-3581] [FLINK-3582] State Iterator and A...

Posted by aljoscha <gi...@git.apache.org>.
Github user aljoscha commented on the pull request:

    https://github.com/apache/flink/pull/1957#issuecomment-216223941
  
    CC @tillrohrmann for review since you are also working on state
    CC @StephanEwen for review as original requestor of this feature


---

[GitHub] flink pull request: [FLINK-3581] [FLINK-3582] State Iterator and A...

Posted by aljoscha <gi...@git.apache.org>.
Github user aljoscha commented on the pull request:

    https://github.com/apache/flink/pull/1957#issuecomment-216466778
  
    It is a problem because we would be trashing the db, right. I thought about garbage collection but this would be yet another layer of complexity. 


---

[GitHub] flink pull request: [FLINK-3581] [FLINK-3582] State Iterator and A...

Posted by gyfora <gi...@git.apache.org>.
Github user gyfora commented on the pull request:

    https://github.com/apache/flink/pull/1957#issuecomment-216257051
  
    Hi,
    So just a quick question regarding the namespace dropping in rocks. I thought you said it would be possible to do this by using prefixes in rocks. Are there some limitations of this approach?


---

[GitHub] flink pull request: [FLINK-3581] [FLINK-3582] State Iterator and A...

Posted by aljoscha <gi...@git.apache.org>.
Github user aljoscha commented on the pull request:

    https://github.com/apache/flink/pull/1957#issuecomment-216261831
  
    I had a version that was doing this. The problem there is that you don't know when you can drop column families. To know when a column family is empty, you would have to keep a Set of the keys for which you have state in memory, which somewhat defeats the purpose of the RocksDB backend.
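
To make the drawback concrete, here is a minimal sketch of the bookkeeping such an approach would need, assuming one column family per namespace. The class `ColumnFamilyTrackerSketch` and its methods are hypothetical and no RocksDB API is involved; the point is only that the tracking set lives in memory and grows with the number of keys.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: tracking which keys still have state per namespace.
class ColumnFamilyTrackerSketch {
    // Namespace (one column family each) -> keys that still have state there.
    // This set is held on the heap and grows with the number of keys.
    private final Map<String, Set<String>> liveKeys = new HashMap<>();

    // Records that the key now has state in the namespace.
    void put(String namespace, String key) {
        liveKeys.computeIfAbsent(namespace, n -> new HashSet<>()).add(key);
    }

    // Records that the key's state was cleared. Returns true when the
    // namespace just became empty, so its column family could be dropped.
    boolean clear(String namespace, String key) {
        Set<String> keys = liveKeys.get(namespace);
        if (keys == null) {
            return false; // nothing tracked for this namespace
        }
        keys.remove(key);
        if (keys.isEmpty()) {
            liveKeys.remove(namespace);
            return true;
        }
        return false;
    }
}
```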

