You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@flink.apache.org by kl0u <gi...@git.apache.org> on 2017/05/22 12:02:59 UTC

[GitHub] flink pull request #3961: [FLINK-6656] [cep] Change element PriorityQueue to...

GitHub user kl0u opened a pull request:

    https://github.com/apache/flink/pull/3961

    [FLINK-6656] [cep] Change element PriorityQueue to MapState.

    This is to leverage the fact that RocksDB already returns the
    keys sorted. So now elements, instead of being stores in a PQ
    and all of them being deserialized and serialized at each incoming
    element, the are stored in a MapState with the key being the
    timestamp and the value, a List of elements that refer to the
    same timestamp.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/kl0u/flink cep-map-state-pr

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/3961.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #3961
    
----

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink issue #3961: [FLINK-6656] [cep] Change element PriorityQueue to MapSta...

Posted by kl0u <gi...@git.apache.org>.
Github user kl0u commented on the issue:

    https://github.com/apache/flink/pull/3961
  
    Hi @dawidwys ! Your understanding is correct in the sense that it does not leverage to the fullest the fact of keys being returned in order. The reason is that this is valid only in RocksDB state backend. 
    
    But when it comes to insertion to the PriorityQueue, this can come in handy as inserting will be fast. In addition, as you said, we only deserialize the values we need, when we need them, instead of everything all the time.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request #3961: [FLINK-6656] [cep] Change element PriorityQueue to...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/flink/pull/3961


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink issue #3961: [FLINK-6656] [cep] Change element PriorityQueue to MapSta...

Posted by kl0u <gi...@git.apache.org>.
Github user kl0u commented on the issue:

    https://github.com/apache/flink/pull/3961
  
    The re-ordering is just to verify that at watermark time, we actually order
    by event-time.
    
    On May 22, 2017 2:57 PM, "Aljoscha Krettek" <no...@github.com>
    wrote:
    
    > This looks good! Why is the element processing reordered in places in the
    > tests, though?
    >
    > —
    > You are receiving this because you authored the thread.
    > Reply to this email directly, view it on GitHub
    > <https://github.com/apache/flink/pull/3961#issuecomment-303092277>, or mute
    > the thread
    > <https://github.com/notifications/unsubscribe-auth/ACS1qPxn5NjNu9Z9HWNxNf5klLZau-fdks5r8YY8gaJpZM4NiQwf>
    > .
    >



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink issue #3961: [FLINK-6656] [cep] Change element PriorityQueue to MapSta...

Posted by rmetzger <gi...@git.apache.org>.
Github user rmetzger commented on the issue:

    https://github.com/apache/flink/pull/3961
  
    @kl0u asked me to merge this PR to the release-1.3 branch. It seems the change is good to merge.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink issue #3961: [FLINK-6656] [cep] Change element PriorityQueue to MapSta...

Posted by dawidwys <gi...@git.apache.org>.
Github user dawidwys commented on the issue:

    https://github.com/apache/flink/pull/3961
  
    Just a question from my side to understand this change better. Does it really leverage the fact of keys being returned in order? To evaluate `getSortedTimestamps` you need to iterate whole keySet anyway.
    
    I see though it benefits from laziness of value deserialization and partitioning of the incoming events (by timestamp). 
    
    Is my understanding correct?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink issue #3961: [FLINK-6656] [cep] Change element PriorityQueue to MapSta...

Posted by dawidwys <gi...@git.apache.org>.
Github user dawidwys commented on the issue:

    https://github.com/apache/flink/pull/3961
  
    Thanks for clarification!
    
    Z pozdrowieniami! / Cheers!
    
    Dawid Wysakowicz
    
    *Data/Software Engineer*
    
    Skype: dawid_wys | Twitter: @OneMoreCoder
    
    <http://getindata.com/>
    
    2017-05-22 18:01 GMT+02:00 Kostas Kloudas <no...@github.com>:
    
    > Hi @dawidwys <https://github.com/dawidwys> ! Your understanding is
    > correct in the sense that it does not leverage to the fullest the fact of
    > keys being returned in order. The reason is that this is valid only in
    > RocksDB state backend.
    >
    > But when it comes to insertion to the PriorityQueue, this can come in
    > handy as inserting will be fast. In addition, as you said, we only
    > deserialize the values we need, when we need them, instead of everything
    > all the time.
    >
    > —
    > You are receiving this because you were mentioned.
    > Reply to this email directly, view it on GitHub
    > <https://github.com/apache/flink/pull/3961#issuecomment-303143858>, or mute
    > the thread
    > <https://github.com/notifications/unsubscribe-auth/AF8_0_gLnOlm1xuG-VHVJTnEJx9KxVpSks5r8bFogaJpZM4NiQwf>
    > .
    >



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink issue #3961: [FLINK-6656] [cep] Change element PriorityQueue to MapSta...

Posted by aljoscha <gi...@git.apache.org>.
Github user aljoscha commented on the issue:

    https://github.com/apache/flink/pull/3961
  
    This looks good! Why is the element processing reordered in places in the tests, though?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---