You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@storm.apache.org by srdo <gi...@git.apache.org> on 2017/05/22 19:04:53 UTC

[GitHub] storm pull request #2090: STORM-2489: Overlap and data loss on WindowedBolt ...

Github user srdo commented on a diff in the pull request:

    https://github.com/apache/storm/pull/2090#discussion_r117821993
  
    --- Diff: storm-client/src/jvm/org/apache/storm/windowing/TimeTriggerPolicy.java ---
    @@ -62,7 +62,9 @@ public void reset() {
     
         @Override
         public void start() {
    -        executorFuture = executor.scheduleAtFixedRate(newTriggerTask(), duration, duration, TimeUnit.MILLISECONDS);
    +        // initial delay is slightly less than the duration so that the initial tuples wont't expire due to time drift
    +        long initialDelay = duration - Math.min((long) (duration * .1), 10);
    +        executorFuture = executor.scheduleAtFixedRate(newTriggerTask(), initialDelay, duration, TimeUnit.MILLISECONDS);
    --- End diff --
    
    Won't this still be vulnerable to losing initial tuples if the first execution is delayed longer than expected for some reason?
    
    I'm wondering if we could make this more reliable by tracking the time we consider expired instead of/in addition to tracking what time it is now. Tuples between the prevExpired timestamp and the nowExpired timestamp are tuples we can't be sure have been processed, so they should be included in the window. Since the expired timestamps also move ~duration every onTrigger, we still get the tuples expired we expect. For the first onTrigger, we can just process all the tuples because we can tell it's the first call since the prevExpired timestamp isn't set yet.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---